| #
06053dca |
| 10-Apr-2026 |
Andreas Gohr <andi@splitbrain.org> |
SearchIndex: remove write side effect from retrieveRow()
retrieveRow() padded the index file when the requested RID was beyond the current length. This was an optimization for subsequent changeRow() calls, but changeRow() already handles padding on its own. The side effect was also inconsistent with retrieveRows(), which is a pure read.
Added a cross-index integration test verifying RID consistency across entity, token, frequency and reverse indexes when multiple entities share tokens.
|
| #
21fbd01b |
| 07-Apr-2026 |
Andreas Gohr <andi@splitbrain.org> |
SearchIndex: add integrity checking to Collection architecture
Add checkIntegrity() to AbstractCollection and DirectCollection that verifies paired indexes have matching line counts (token==frequency, entity==reverse, entity==token for direct collections). Throws IndexIntegrityException on the first inconsistency found.
Add Countable interface to AbstractIndex with count() implementations in MemoryIndex and FileIndex. Add Indexer::checkIntegrity() and Indexer::isIndexEmpty() to orchestrate checks across all collections.
Update infoutils.php to use the new Indexer API instead of the old FulltextIndex/MetadataIndex classes.
Fix range(1, 0) bug in three places that produced [1, 0] instead of an empty array when split-by-length indexes were empty.
|
| #
6734bb8c |
| 07-Apr-2026 |
Andreas Gohr <andi@splitbrain.org> |
SearchIndex: rewrite MetadataSearch to use Collection classes
Replace MetadataIndex usage in MetadataSearch with the new Collection/Index architecture. This completes the read-path migration so data written by the Collection-based Indexer is read back correctly using TupleOps tuple format.
Generalize FrequencyCollectionSearch into CollectionSearch that works with any AbstractCollection type (Frequency, Lookup, Direct) and handles both split-by-length and non-split index layouts transparently. DirectCollection participates via resolveTokenFrequencies() which maps token RID = entity RID.
Key changes:
- AbstractCollection gains isSplitByLength(), resolveTokenFrequencies(), getEntitiesWithData(), and groupToSuffix() with validation
- Index groups are now int (0 = non-split, positive = token length)
- CollectionSearch provides both addTerm()/execute() for fulltext and lookup() for metadata-style search (exact/wildcard/callback)
- MetadataSearch delegates entirely to collection APIs
- Shared filterPages() replaces duplicated page filtering logic
- All callers updated from MetadataIndex to MetadataSearch
- Tests moved to Search namespace with full coverage for new APIs
|
| #
ede46466 |
| 06-Apr-2026 |
Andreas Gohr <andi@splitbrain.org> |
SearchIndex: reorganize and expand test suite
Move all Search tests from _test/tests/inc/Search/ to _test/tests/Search/ to match the dokuwiki\test autoloader convention. Fix namespaces from tests\* to dokuwiki\test\* so all tests work in isolation.
Extract inline test helpers into separate autoloadable mock files: TestDirectCollection → MockDirectCollection, TestLookupCollection → MockLookupCollection, TestFrequencyCollection → MockFrequencyCollection.
Rename AbstractIndexTest → AbstractIndexTestCase to fix PHPUnit warning about abstract classes with Test suffix.
Replace dead xxxRealWord() with proper testWildcardSearch() verifying exact token matches and frequencies for all three wildcard types. Add testTokenizedPageSearch() using a dedicated test data file. Add testNoMatchReturnsEmptyFrequencies() which exposed a bug in Term where uninitialized $tokens/$frequencies caused crashes on zero-match terms.
Replace fulltext_query.test.php with modern QueryParserTest in the Search\Query namespace.
Add new test files:
- LockTest: acquire/release, reference counting, stale lock override, foreign lock rejection, releaseAll, independent locks
- NamespacePredicateTest: filter/exclude, sub-namespaces, partial prefix safety, empty sets, score preservation
- PageSetTest: intersect, unite, subtract, isEmpty
- QueryEvaluatorTest: word lookups, AND/OR/NOT, namespace filtering, combined queries, partial namespace prefix safety
Fix Term.php: initialize $tokens and $frequencies to [] instead of null.
|