| #
f2bbffb5 |
| 05-Apr-2026 |
Andreas Gohr <andi@splitbrain.org> |
SearchIndex: extract Collection base class hierarchy
Introduce AbstractCollection as the shared base for all index collections, with FrequencyCollection and LookupCollection as the two abstract subclasses differing only in how tokens are counted (frequency vs dedup).
Key design decisions:
- splitByLength is a constructor parameter on AbstractCollection controlling whether token/frequency indexes use length-based file splitting. This is independent of the collection type.
- The reverse index format is self-describing: entries with * have a group prefix (split), entries without don't (non-split). No branching needed in parse/format methods.
- addEntity, resolveTokens, updateIndexes, and reverse index handling all live in AbstractCollection. Subclasses only implement countTokens().
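The self-describing reverse index entries can be illustrated with a small sketch. This is Python rather than the project's PHP, and it is not the code from this commit: the `:` separator, the function names, and the exact `group*id` layout are assumptions made for illustration. The point it shows is that the parser looks only at the entry itself (does it contain `*`?) and never needs to know whether the collection was created with splitByLength.

```python
from typing import List, Optional, Tuple

# Assumed separator between entries of one reverse-index record (hypothetical).
SEPARATOR = ":"

Entry = Tuple[Optional[str], str]  # (group prefix or None, token id)


def parse_record(record: str) -> List[Entry]:
    """Parse a record into (group, token_id) pairs.

    An entry containing '*' carries a group prefix (split index);
    an entry without '*' has no group (non-split index).
    """
    entries: List[Entry] = []
    for raw in record.split(SEPARATOR):
        if not raw:
            continue  # skip empty pieces, e.g. from an empty record
        group, star, token_id = raw.partition("*")
        entries.append((group, token_id) if star else (None, raw))
    return entries


def format_record(entries: List[Entry]) -> str:
    """Inverse of parse_record: the entry decides its own shape."""
    return SEPARATOR.join(
        token_id if group is None else f"{group}*{token_id}"
        for group, token_id in entries
    )
```

Under these assumptions, a split record such as `3*12:5*7` and a non-split record such as `12:7` both pass through the same two functions unchanged.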
Concrete collections: PageFulltextCollection (frequency, split), MediaCollection and ReferencesCollection (lookup, non-split).
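The hierarchy described above can be sketched roughly as follows. This is a Python illustration, not the PHP code from the commit: the method bodies, the snake_case names, and the return type of the counting method are assumptions. Only the class names, the splitByLength constructor parameter, and the frequency vs. dedup distinction come from the commit message. The two intermediate classes are abstract in the commit; they are left instantiable here for brevity.

```python
from abc import ABC, abstractmethod
from collections import Counter
from typing import Dict, List


class AbstractCollection(ABC):
    """Shared base; split_by_length is independent of the collection type."""

    def __init__(self, split_by_length: bool = False) -> None:
        self.split_by_length = split_by_length

    @abstractmethod
    def count_tokens(self, tokens: List[str]) -> Dict[str, int]:
        """The only hook subclasses implement (countTokens() in the commit)."""


class FrequencyCollection(AbstractCollection):
    def count_tokens(self, tokens: List[str]) -> Dict[str, int]:
        # Frequency: keep how often each token occurs.
        return dict(Counter(tokens))


class LookupCollection(AbstractCollection):
    def count_tokens(self, tokens: List[str]) -> Dict[str, int]:
        # Dedup: each distinct token is recorded once.
        return {token: 1 for token in tokens}


class PageFulltextCollection(FrequencyCollection):
    def __init__(self) -> None:
        super().__init__(split_by_length=True)  # frequency, split


class MediaCollection(LookupCollection):
    def __init__(self) -> None:
        super().__init__(split_by_length=False)  # lookup, non-split


class ReferencesCollection(LookupCollection):
    def __init__(self) -> None:
        super().__init__(split_by_length=False)  # lookup, non-split
```

With the token list `["wiki", "page", "wiki"]`, the fulltext sketch would count `wiki` twice, while the media and references sketches would record each token once.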
Renames FulltextCollection -> PageFulltextCollection and FulltextCollectionSearch -> FrequencyCollectionSearch.
|