History log of /dokuwiki/inc/Search/Collection/Term.php (Results 1 – 6 of 6)
Revision Date Author Comments
# 9369b4a9 08-Apr-2026 Andreas Gohr <andi@splitbrain.org>

SearchIndex: rector, phpcs, type hint fixes


# 1148921d 08-Apr-2026 Andreas Gohr <andi@splitbrain.org>

SearchIndex: unify CollectionSearch API and optimize search pipeline

- Remove separate lookup() API from CollectionSearch. All searches now
use addTerm()/execute() with a single unified pipeline.
- Add matches() predicate to Term using efficient string functions
(===, str_starts_with, str_ends_with, str_contains) instead of regex.
- Add caseInsensitive() support on CollectionSearch and Term for
metadata/title searches where indexed values preserve case.
- Remove callback support from MetadataSearch::lookupKey() — the only
real usage (case-insensitive substring) is replaced by
caseInsensitive() + wildcards.
- Remove min-length validation from Term. Add Tokenizer::isValidSearchTerm()
for callers that need it (FulltextSearch, Indexer::lookup).
- Optimize execute() from 4 group passes to 2: scan tokens + resolve
frequencies in one pass per group, batch entity name resolution, then
populate Terms.
- Store full match detail in Term: entity → token → frequency. New
accessors getMatches(), getEntityTokens(), getEntityFrequencies()
derive different views from this single data structure.
- Term no longer used as scratch pad by CollectionSearch. Index-internal
data (token IDs, entity IDs) stays local to execute(). Terms receive
only final resolved results.
- Use title from search results in MetadataSearch::pageLookupCallBack()
instead of re-fetching via p_get_first_heading().
- Update concept.txt documentation.
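
As an illustration of the matches() and caseInsensitive() items above, here is a minimal sketch of a regex-free predicate. It is not the actual Term class: the class name SketchTerm, the constructor, and the use of a leading or trailing `*` as the wildcard marker are assumptions made for this example. Only the string functions (===, str_starts_with, str_ends_with, str_contains) come from the commit message. It requires PHP 8.0 or later.

```php
<?php
// Hypothetical sketch; NOT the real dokuwiki\Search\Collection\Term.
final class SketchTerm
{
    private string $needle;
    private bool $wildStart;
    private bool $wildEnd;
    private bool $caseInsensitive = false;

    public function __construct(string $term)
    {
        // Assumption: '*' at either end of the term marks a wildcard.
        $this->wildStart = str_starts_with($term, '*');
        $this->wildEnd = strlen($term) > 1 && str_ends_with($term, '*');
        $this->needle = trim($term, '*');
    }

    public function caseInsensitive(bool $flag = true): self
    {
        $this->caseInsensitive = $flag;
        return $this;
    }

    public function matches(string $token): bool
    {
        $needle = $this->needle;
        if ($this->caseInsensitive) {
            // strtolower() only folds ASCII; enough for a sketch.
            $needle = strtolower($needle);
            $token = strtolower($token);
        }
        if ($this->wildStart && $this->wildEnd) {
            return str_contains($token, $needle);    // *foo*
        }
        if ($this->wildStart) {
            return str_ends_with($token, $needle);   // *foo
        }
        if ($this->wildEnd) {
            return str_starts_with($token, $needle); // foo*
        }
        return $token === $needle;                   // exact match
    }
}

// Usage
var_dump((new SketchTerm('doku*'))->matches('dokuwiki'));                    // bool(true)
var_dump((new SketchTerm('Start'))->matches('start'));                       // bool(false)
var_dump((new SketchTerm('Start'))->caseInsensitive()->matches('start'));    // bool(true)
```

Choosing the comparison from two boolean flags avoids compiling and escaping a regular expression for every term, which is presumably the efficiency the commit refers to.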


# 6734bb8c 07-Apr-2026 Andreas Gohr <andi@splitbrain.org>

SearchIndex: rewrite MetadataSearch to use Collection classes

Replace MetadataIndex usage in MetadataSearch with the new Collection/Index
architecture. This completes the read-path migration so data written by the
Collection-based Indexer is read back correctly using TupleOps tuple format.

Generalize FrequencyCollectionSearch into CollectionSearch that works with any
AbstractCollection type (Frequency, Lookup, Direct) and handles both
split-by-length and non-split index layouts transparently. DirectCollection
participates via resolveTokenFrequencies() which maps token RID = entity RID.

Key changes:
- AbstractCollection gains isSplitByLength(), resolveTokenFrequencies(),
getEntitiesWithData(), and groupToSuffix() with validation
- Index groups are now int (0 = non-split, positive = token length)
- CollectionSearch provides both addTerm()/execute() for fulltext and
lookup() for metadata-style search (exact/wildcard/callback)
- MetadataSearch delegates entirely to collection APIs
- Shared filterPages() replaces duplicated page filtering logic
- All callers updated from MetadataIndex to MetadataSearch
- Tests moved to Search namespace with full coverage for new APIs


# ede46466 06-Apr-2026 Andreas Gohr <andi@splitbrain.org>

SearchIndex: reorganize and expand test suite

Move all Search tests from _test/tests/inc/Search/ to _test/tests/Search/
to match the dokuwiki\test autoloader convention. Fix namespaces from
tests\* to dokuwiki\test\* so all tests work in isolation.

Extract inline test helpers into separate autoloadable mock files:
TestDirectCollection → MockDirectCollection, TestLookupCollection →
MockLookupCollection, TestFrequencyCollection → MockFrequencyCollection.

Rename AbstractIndexTest → AbstractIndexTestCase to fix PHPUnit warning
about abstract classes with Test suffix.

Replace dead xxxRealWord() with proper testWildcardSearch() verifying
exact token matches and frequencies for all three wildcard types.
Add testTokenizedPageSearch() using a dedicated test data file. Add
testNoMatchReturnsEmptyFrequencies() which exposed a bug in Term where
uninitialized $tokens/$frequencies caused crashes on zero-match terms.

Replace fulltext_query.test.php with modern QueryParserTest in the
Search\Query namespace.

Add new test files:
- LockTest: acquire/release, reference counting, stale lock override,
foreign lock rejection, releaseAll, independent locks
- NamespacePredicateTest: filter/exclude, sub-namespaces, partial prefix
safety, empty sets, score preservation
- PageSetTest: intersect, unite, subtract, isEmpty
- QueryEvaluatorTest: word lookups, AND/OR/NOT, namespace filtering,
combined queries, partial namespace prefix safety

Fix Term.php: initialize $tokens and $frequencies to [] instead of null.
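
The following sketch shows one way a null default can crash on a zero-match term under PHP 8, and why an empty-array default avoids it. Both classes and the use of count() are hypothetical stand-ins. The commit message does not say which call actually failed in Term.

```php
<?php
// Hypothetical stand-in for the pre-fix state: property defaults to null.
final class NullDefaultTerm
{
    /** @var array<string,int>|null */
    private $frequencies = null;

    public function getFrequencies()
    {
        return $this->frequencies;
    }
}

// Hypothetical stand-in for the post-fix state: property defaults to [].
final class ArrayDefaultTerm
{
    /** @var array<string,int> */
    private array $frequencies = [];

    public function getFrequencies(): array
    {
        return $this->frequencies;
    }
}

// On PHP 8, count(null) throws a TypeError instead of returning 0.
try {
    count((new NullDefaultTerm())->getFrequencies());
    $crashed = false;
} catch (TypeError $e) {
    $crashed = true;
}

// With the [] default, a term without matches simply yields zero entries.
$emptyCount = count((new ArrayDefaultTerm())->getFrequencies());
```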


# e05998d5 30-Oct-2025 Andreas Gohr <gohr@cosmocode.de>

SearchIndex: more Term tests


# 596d5287 11-May-2023 Andreas Gohr <andi@splitbrain.org>

Working fulltext collection and search

This finalizes the FulltextCollection and FulltextCollectionSearch
classes. Proper locking is implemented, tests have been enhanced.

It should be possible to reimplement the page full text search on top of
it.
