| #
5d034a75 |
| 08-Apr-2026 |
Andreas Gohr <andi@splitbrain.org> |
SearchIndex: increase index version
|
| #
9369b4a9 |
| 08-Apr-2026 |
Andreas Gohr <andi@splitbrain.org> |
SearchIndex: rector, phpcs, type hint fixes
|
| #
1148921d |
| 08-Apr-2026 |
Andreas Gohr <andi@splitbrain.org> |
SearchIndex: unify CollectionSearch API and optimize search pipeline
- Remove separate lookup() API from CollectionSearch. All searches now use addTerm()/execute() with a single unified pipeline.
- Add matches() predicate to Term using efficient string functions (===, str_starts_with, str_ends_with, str_contains) instead of regex.
- Add caseInsensitive() support on CollectionSearch and Term for metadata/title searches where indexed values preserve case.
- Remove callback support from MetadataSearch::lookupKey(); the only real usage (case-insensitive substring) is replaced by caseInsensitive() + wildcards.
- Remove min-length validation from Term. Add Tokenizer::isValidSearchTerm() for callers that need it (FulltextSearch, Indexer::lookup).
- Optimize execute() from 4 group passes to 2: scan tokens + resolve frequencies in one pass per group, batch entity name resolution, then populate Terms.
- Store full match detail in Term: entity → token → frequency. New accessors getMatches(), getEntityTokens(), getEntityFrequencies() derive different views from this single data structure.
- Term no longer used as scratch pad by CollectionSearch. Index-internal data (token IDs, entity IDs) stays local to execute(). Terms receive only final resolved results.
- Use title from search results in MetadataSearch::pageLookupCallBack() instead of re-fetching via p_get_first_heading().
- Update concept.txt documentation.
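The matches() change above can be pictured with a small sketch. This is an illustrative Python analogue, not the PHP code from the commit: the function name `term_matches`, the `*` wildcard convention, and the `case_insensitive` flag are assumptions made for the example. It shows how exact, prefix, suffix, and substring matching can be done with plain string operations instead of a regular expression.

```python
def term_matches(pattern: str, token: str, case_insensitive: bool = False) -> bool:
    """Hypothetical stand-in for a Term::matches()-style predicate.

    Assumes a leading and/or trailing '*' marks a wildcard; this
    convention is an assumption of the sketch, not taken from the commit.
    """
    if case_insensitive:
        # Fold both sides so indexed values that preserve case still match.
        pattern, token = pattern.casefold(), token.casefold()

    starts_wild = pattern.startswith("*")
    ends_wild = len(pattern) > 1 and pattern.endswith("*")
    core = pattern.strip("*")  # the literal part between the wildcards

    if starts_wild and ends_wild:
        return core in token            # substring match
    if starts_wild:
        return token.endswith(core)     # suffix match
    if ends_wild:
        return token.startswith(core)   # prefix match
    return token == core                # exact match


if __name__ == "__main__":
    for pat in ("wiki", "*wiki", "doku*", "*kuwi*"):
        print(pat, term_matches(pat, "dokuwiki"))
```

Each branch is a single linear-time string comparison, which is the kind of saving the commit message attributes to avoiding regex.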
|
| #
e1272c08 |
| 07-Apr-2026 |
Andreas Gohr <andi@splitbrain.org> |
SearchIndex: add backward compatibility wrappers
Add deprecated wrappers for idx_* and ft_* functions that were removed when inc/indexer.php and inc/fulltext.php were replaced by the new Search classes. These wrappers delegate to the new architecture and ensure existing plugins continue to work.
Deprecated standalone functions: idx_get_indexer, idx_getIndex, idx_lookup, idx_listIndexLengths, idx_indexLengths, ft_pageSearch, ft_backlinks, ft_mediause, ft_pageLookup, ft_snippet, ft_pagesorter, ft_snippet_re_preprocess, ft_queryParser.
Deprecated methods on Indexer: lookupKey, getPages, addMetaKeys, renameMetaValue, getPID, lookup.
Also migrates remaining core callers (Ajax, FeedCreator, ApiCore) to use the new classes directly and fixes a UTF-8 case folding bug in MetadataSearch title lookups.
|
| #
21fbd01b |
| 07-Apr-2026 |
Andreas Gohr <andi@splitbrain.org> |
SearchIndex: add integrity checking to Collection architecture
Add checkIntegrity() to AbstractCollection and DirectCollection that verifies paired indexes have matching line counts (token==frequency, entity==reverse, entity==token for direct collections). Throws IndexIntegrityException on the first inconsistency found.
Add Countable interface to AbstractIndex with count() implementations in MemoryIndex and FileIndex. Add Indexer::checkIntegrity() and Indexer::isIndexEmpty() to orchestrate checks across all collections.
Update infoutils.php to use the new Indexer API instead of the old FulltextIndex/MetadataIndex classes.
Fix range(1, 0) bug in three places that produced [1, 0] instead of an empty array when split-by-length indexes were empty.
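The line-count check described in this commit can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the PHP implementation: `IndexIntegrityError`, `check_integrity`, and the dictionary of line counts are hypothetical names standing in for IndexIntegrityException, checkIntegrity(), and the count() results of the real index classes.

```python
class IndexIntegrityError(Exception):
    """Hypothetical analogue of IndexIntegrityException."""


def check_integrity(line_counts: dict[str, int], pairs: list[tuple[str, str]]) -> None:
    """Raise on the first pair of indexes whose line counts differ.

    line_counts maps an index name to its number of lines; pairs lists
    the indexes that must stay in step (for example token/frequency).
    """
    for left, right in pairs:
        if line_counts[left] != line_counts[right]:
            raise IndexIntegrityError(
                f"{left} has {line_counts[left]} lines, "
                f"{right} has {line_counts[right]}"
            )


if __name__ == "__main__":
    counts = {"token": 3, "frequency": 3, "entity": 5, "reverse": 4}
    try:
        check_integrity(counts, [("token", "frequency"), ("entity", "reverse")])
    except IndexIntegrityError as err:
        print("integrity check failed:", err)
```

Stopping at the first mismatch mirrors the behaviour described in the commit message; a variant that collects all mismatches would be a different design.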
|
| #
83b3accc |
| 06-Apr-2026 |
Andreas Gohr <andi@splitbrain.org> |
SearchIndex: rewrite Indexer to use Collection classes
Replace the intermediate #2943 classes (FulltextIndex, MetadataIndex) with the new Collection-based architecture. The Indexer is now a thin stateless orchestrator that delegates all index work to collections.
Key changes:
- Indexer no longer extends AbstractIndex; page name passed to methods
- addPage/deletePage/clear use PageTitleCollection, PageFulltextCollection, and PageMetaCollection
- New PageMetaCollection replaces separate ReferencesCollection and MediaCollection with a single class that handles arbitrary metadata keys dynamically
- Shared writable FileIndex('page') passed to all collections
- Logger callback replaces verbose parameter
- Methods return void instead of bool
- Index classes implement IteratorAggregate for clean data access
- Indexer tests consolidated into namespaced IndexerTest.php
- All callers updated to new stateless API
|
| #
3df1553d |
| 29-Nov-2021 |
Satoshi Sahara <sahara.satoshi@gmail.com> |
use Logger::debug() instead of deprecated dbglog()
|
| #
5792814c |
| 25-Sep-2020 |
Satoshi Sahara <sahara.satoshi@gmail.com> |
fix scrutinizer claims
|
| #
725e8e5f |
| 25-Sep-2020 |
Satoshi Sahara <sahara.satoshi@gmail.com> |
instantiate *Index with numeric page id
will reduce access to static $pidCache
|
| #
a32da6dd |
| 25-Sep-2020 |
Satoshi Sahara <sahara.satoshi@gmail.com> |
change Index objects to non-singleton
Indexer, FulltextIndex, and MetadataIndex use a common directory to store *.idx files, but this does not mean they should be singleton objects to avoid lock conflicts.
|
| #
a16bd548 |
| 21-Sep-2020 |
Satoshi Sahara <sahara.satoshi@gmail.com> |
remove unnecessary if blocks
getPID(), saveIndex(), saveIndexKey(), and getPageWords() always return true or throw an exception.
|
| #
15f699ac |
| 10-Sep-2020 |
Andreas Gohr <andi@splitbrain.org> |
replace user errors with exceptions
Exceptions are better to handle than errors. What I don't like is that we now have an unfortunate mix of return code and exception signalling for errors. Some methods still return false for errors while others now throw exceptions (always returning true otherwise).
|
| #
abb227bc |
| 13-Mar-2020 |
Satoshi Sahara <sahara.satoshi@gmail.com> |
add comment for $requireLock argument
|
| #
51ddbadd |
| 20-Feb-2020 |
Satoshi Sahara <sahara.satoshi@gmail.com> |
move histogram() into MetadataIndex class
|
| #
5f9bd525 |
| 20-Feb-2020 |
Satoshi Sahara <sahara.satoshi@gmail.com> |
Revert "histogram() change args order"
This reverts commit 4d04b7bbfe9c97673b0f22586d88e161aca34f70.
|
| #
11d2e7d0 |
| 20-Feb-2020 |
Satoshi Sahara <sahara.satoshi@gmail.com> |
simplify dispatch()
|
| #
653b91a2 |
| 01-Feb-2020 |
Satoshi Sahara <sahara.satoshi@gmail.com> |
AbstractIndex class const INDEX_MARK_DELETED
|
| #
743c9a28 |
| 31-Jan-2020 |
Satoshi Sahara <sahara.satoshi@gmail.com> |
rename PagewordIndex to FulltextIndex
|
| #
5237d405 |
| 31-Jan-2020 |
Satoshi Sahara <sahara.satoshi@gmail.com> |
trigger error when lock/unlock index directory failed
|
| #
4d04b7bb |
| 31-Jan-2020 |
Satoshi Sahara <sahara.satoshi@gmail.com> |
histogram() change args order
Indexer::histogram() is only used in the indexer_histogram.test.php file.
|
| #
a2f39162 |
| 30-Jan-2020 |
Satoshi Sahara <sahara.satoshi@gmail.com> |
fix typo
|
| #
4027a91a |
| 30-Jan-2020 |
Satoshi Sahara <sahara.satoshi@gmail.com> |
use Indexer.php instead of PageIndex.php
|
| #
6225b270 |
| 28-Dec-2019 |
Michael Große <mic.grosse@googlemail.com> |
Extract dokuwiki\Search\Indexer class
Not sure why Doku_Indexer caused phpcs to complain about the class name not being in PascalCase, but Doku_Handler didn't.
The namespace and new class name were selected to be compatible with the upcoming changes in #2943. This should hopefully reduce the overall hassle of touching the same code base.
|