| #
9369b4a9 |
| 08-Apr-2026 |
Andreas Gohr <andi@splitbrain.org> |
SearchIndex: rector, phpcs, type hint fixes
|
| #
1148921d |
| 08-Apr-2026 |
Andreas Gohr <andi@splitbrain.org> |
SearchIndex: unify CollectionSearch API and optimize search pipeline
- Remove separate lookup() API from CollectionSearch. All searches now use addTerm()/execute() with a single unified pipeline.
- Add matches() predicate to Term using efficient string functions (===, str_starts_with, str_ends_with, str_contains) instead of regex.
- Add caseInsensitive() support on CollectionSearch and Term for metadata/title searches where indexed values preserve case.
- Remove callback support from MetadataSearch::lookupKey(); the only real usage (case-insensitive substring) is replaced by caseInsensitive() + wildcards.
- Remove min-length validation from Term. Add Tokenizer::isValidSearchTerm() for callers that need it (FulltextSearch, Indexer::lookup).
- Optimize execute() from 4 group passes to 2: scan tokens + resolve frequencies in one pass per group, batch entity name resolution, then populate Terms.
- Store full match detail in Term: entity → token → frequency. New accessors getMatches(), getEntityTokens(), getEntityFrequencies() derive different views from this single data structure.
- Term no longer used as scratch pad by CollectionSearch. Index-internal data (token IDs, entity IDs) stays local to execute(). Terms receive only final resolved results.
- Use title from search results in MetadataSearch::pageLookupCallBack() instead of re-fetching via p_get_first_heading().
- Update concept.txt documentation.
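The regex-free matches() predicate mentioned above can be illustrated with a minimal sketch. This is a Python illustration of the idea, not the project's PHP code; the function signature and the convention that a leading or trailing `*` marks a wildcard are assumptions made for the example.

```python
def matches(pattern: str, token: str, case_insensitive: bool = False) -> bool:
    """Hypothetical sketch: match a token against a search pattern without regex.

    Assumption: '*' at the start and/or end of the pattern is a wildcard.
    """
    if case_insensitive:
        # Fold both sides so case-preserving index values still match.
        pattern, token = pattern.casefold(), token.casefold()

    leading = pattern.startswith("*")
    trailing = len(pattern) > 1 and pattern.endswith("*")
    start = 1 if leading else 0
    end = len(pattern) - 1 if trailing else len(pattern)
    core = pattern[start:end]

    if leading and trailing:
        return core in token           # substring match
    if leading:
        return token.endswith(core)    # suffix match
    if trailing:
        return token.startswith(core)  # prefix match
    return token == core               # exact match


print(matches("wiki*", "wikipedia"))
print(matches("Wiki*", "wikipedia", case_insensitive=True))
```

Each branch is a single built-in string comparison, which is the reason such a predicate can be cheaper than compiling and running a regular expression per token.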
|
| #
e1272c08 |
| 07-Apr-2026 |
Andreas Gohr <andi@splitbrain.org> |
SearchIndex: add backward compatibility wrappers
Add deprecated wrappers for idx_* and ft_* functions that were removed when inc/indexer.php and inc/fulltext.php were replaced by the new Search classes. These wrappers delegate to the new architecture and ensure existing plugins continue to work.
Deprecated standalone functions: idx_get_indexer, idx_getIndex, idx_lookup, idx_listIndexLengths, idx_indexLengths, ft_pageSearch, ft_backlinks, ft_mediause, ft_pageLookup, ft_snippet, ft_pagesorter, ft_snippet_re_preprocess, ft_queryParser.
Deprecated methods on Indexer: lookupKey, getPages, addMetaKeys, renameMetaValue, getPID, lookup.
Also migrates remaining core callers (Ajax, FeedCreator, ApiCore) to use the new classes directly and fixes a UTF-8 case folding bug in MetadataSearch title lookups.
|
| #
6734bb8c |
| 07-Apr-2026 |
Andreas Gohr <andi@splitbrain.org> |
SearchIndex: rewrite MetadataSearch to use Collection classes
Replace MetadataIndex usage in MetadataSearch with the new Collection/Index architecture. This completes the read-path migration so data written by the Collection-based Indexer is read back correctly using TupleOps tuple format.
Generalize FrequencyCollectionSearch into CollectionSearch that works with any AbstractCollection type (Frequency, Lookup, Direct) and handles both split-by-length and non-split index layouts transparently. DirectCollection participates via resolveTokenFrequencies() which maps token RID = entity RID.
Key changes:
- AbstractCollection gains isSplitByLength(), resolveTokenFrequencies(), getEntitiesWithData(), and groupToSuffix() with validation
- Index groups are now int (0 = non-split, positive = token length)
- CollectionSearch provides both addTerm()/execute() for fulltext and lookup() for metadata-style search (exact/wildcard/callback)
- MetadataSearch delegates entirely to collection APIs
- Shared filterPages() replaces duplicated page filtering logic
- All callers updated from MetadataIndex to MetadataSearch
- Tests moved to Search namespace with full coverage for new APIs
|
| #
0b1bbbbb |
| 06-Apr-2026 |
Andreas Gohr <andi@splitbrain.org> |
SearchIndex: rewrite FulltextSearch to use FrequencyCollectionSearch
Replace FulltextIndex->lookupWords() with FrequencyCollectionSearch which correctly handles the compact tuple format written by the new Indexer.
Introduce QueryEvaluator with typed stack entries (PageSet, NamespacePredicate, NegatedEntry) for RPN query evaluation. NOT wraps its operand instead of computing a universe complement, so AND with a negated operand becomes efficient set subtraction. The full page index is only loaded for standalone negative or namespace-only queries.
Move QueryParser and QueryEvaluator into the new Search\Query namespace along with the stack entry types.
Simplify FulltextSearch to orchestration: parse query, look up words, evaluate, filter, sort. Replace FT_SNIPPET_NUMBER constant with maxSnippets property. Combine ACL/existence/time filtering into a single pass.
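The NOT handling described in this commit (wrapping the operand instead of computing a universe complement) can be sketched as follows. This is a Python illustration under stated assumptions, not the project's PHP implementation: the class names `PageSet` and `Negated`, the RPN token spelling, and the `lookup`/`load_all_pages` callbacks are hypothetical, and only AND and NOT are covered.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable, List, Union


@dataclass(frozen=True)
class PageSet:
    pages: FrozenSet[str]


@dataclass(frozen=True)
class Negated:
    inner: PageSet  # "everything except these pages", kept unevaluated


Entry = Union[PageSet, Negated]


def op_not(a: Entry) -> Entry:
    # Double negation unwraps; otherwise just wrap, no universe needed.
    return a.inner if isinstance(a, Negated) else Negated(a)


def op_and(a: Entry, b: Entry) -> Entry:
    if isinstance(a, Negated) and isinstance(b, Negated):
        # NOT a AND NOT b == NOT (a OR b)
        return Negated(PageSet(a.inner.pages | b.inner.pages))
    if isinstance(a, Negated):
        return PageSet(b.pages - a.inner.pages)  # set subtraction
    if isinstance(b, Negated):
        return PageSet(a.pages - b.inner.pages)  # set subtraction
    return PageSet(a.pages & b.pages)


def evaluate(
    rpn: List[str],
    lookup: Callable[[str], Iterable[str]],
    load_all_pages: Callable[[], Iterable[str]],
) -> FrozenSet[str]:
    stack: List[Entry] = []
    for tok in rpn:
        if tok == "NOT":
            stack.append(op_not(stack.pop()))
        elif tok == "AND":
            b, a = stack.pop(), stack.pop()
            stack.append(op_and(a, b))
        else:
            stack.append(PageSet(frozenset(lookup(tok))))
    result = stack.pop()
    if isinstance(result, Negated):
        # Only a standalone negative query needs the full page list.
        return frozenset(load_all_pages()) - result.inner.pages
    return result.pages
```

With this shape, a query such as `wiki AND NOT draft` never touches the full page list; it is loaded only when the final stack entry is still negated.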
|
| #
a02395a1 |
| 29-Nov-2021 |
Satoshi Sahara <sahara.satoshi@gmail.com> |
catch up #3115 Sort with collator
|
| #
fab81cc8 |
| 29-Nov-2021 |
Satoshi Sahara <sahara.satoshi@gmail.com> |
added missing 'notns' related code
|
| #
cc3a3cde |
| 26-Sep-2020 |
Satoshi Sahara <sahara.satoshi@gmail.com> |
change MetadataSearch and FulltextSearch to non-singleton
A singleton is not effective at reducing multiple instantiations, especially for MetadataSearch, which is frequently used in Ajax calls.
|
| #
a32da6dd |
| 25-Sep-2020 |
Satoshi Sahara <sahara.satoshi@gmail.com> |
change Index objects to non-singleton
Indexer, FulltextIndex, and MetadataIndex use a common directory to store *.idx files, but this does not mean they should be singleton objects to avoid lock conflicts.
|
| #
9329b002 |
| 02-Feb-2020 |
Satoshi Sahara <sahara.satoshi@gmail.com> |
change static methods into instance methods
|
| #
4a90f94b |
| 02-Feb-2020 |
Satoshi Sahara <sahara.satoshi@gmail.com> |
move backlinks() and mediause() into MediaIndex class
|
| #
02361d2a |
| 20-Jan-2020 |
Satoshi Sahara <sahara.satoshi@gmail.com> |
define getPages() in AbstractIndex instead of PageIndex
getPages() is inherited by each subclass of AbstractIndex, but MetadataIndex::getPages() overrides the inherited method.
|
| #
be5c1ea2 |
| 19-Jan-2020 |
Satoshi Sahara <sahara.satoshi@gmail.com> |
move lookup() to PagewordIndex class, reduce term 'Indexer'
make MetadataIndex::lookupKey() and PagewordIndex::lookup() similar
|
| #
46b83514 |
| 19-Jan-2020 |
Satoshi Sahara <sahara.satoshi@gmail.com> |
resolve conflicts, CodeSniffer errors
|
| #
185796b3 |
| 13-Jan-2020 |
Satoshi Sahara <sahara.satoshi@gmail.com> |
allow public access to getIndex(), etc.
|
| #
86fc7283 |
| 07-Jan-2020 |
Satoshi Sahara <sahara.satoshi@gmail.com> |
separate methods into metadata, Pageword, Page index classes
|
| #
fe2d1da1 |
| 05-Jan-2020 |
Satoshi Sahara <sahara.satoshi@gmail.com> |
change class name to MetadataSearch
|