| c66b5ec6 | 05-Apr-2026 |
Andreas Gohr <andi@splitbrain.org> |
SearchIndex: rewrite Lock as static registry with reference counting
Replace the instance-based Lock class with a static registry that tracks held locks per-process with reference counting. This solves three problems:
- Split indexes (w3, w4, ...) share a single lock name and now coordinate naturally via the registry
- Multiple callers can acquire the same lock without conflict
- Indexes enforce their own writability through lock()/unlock() methods on AbstractIndex
The Lock registry manages both the filesystem lock (mkdir) and the in-process tracking. The first acquire creates the directory, subsequent acquires increment the refcount. Release decrements, and only removes the directory when the count reaches zero.
Note: I am not sure if implementing this as a static object is a great idea or if we should pass an instance through the collection to the indexes...
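The acquire/release behaviour described above could look roughly like the following. This is a minimal sketch, not the committed code: the class name `LockRegistrySketch`, its method names, the `$baseDir` parameter, and the `.lock` directory suffix are all assumptions made for illustration.

```php
<?php
// Hypothetical sketch of a static, reference-counted lock registry.
// Names and the ".lock" suffix are illustrative, not the real Lock API.
final class LockRegistrySketch
{
    /** @var array<string,int> lock name => number of holders in this process */
    private static array $held = [];

    public static function acquire(string $baseDir, string $name): bool
    {
        // Already held by this process: only bump the reference count.
        if (isset(self::$held[$name])) {
            self::$held[$name]++;
            return true;
        }
        // First acquire: mkdir() fails if the directory already exists,
        // which is what makes it usable as a cross-process lock.
        if (!@mkdir("$baseDir/$name.lock")) {
            return false;
        }
        self::$held[$name] = 1;
        return true;
    }

    public static function release(string $baseDir, string $name): void
    {
        if (!isset(self::$held[$name])) {
            return; // not held by this process, nothing to do
        }
        // Only the last release removes the directory again.
        if (--self::$held[$name] === 0) {
            unset(self::$held[$name]);
            @rmdir("$baseDir/$name.lock");
        }
    }

    public static function count(string $name): int
    {
        return self::$held[$name] ?? 0;
    }
}
```

In this sketch two callers that both acquire the lock name `w` share one directory, and the directory disappears only after both have released it.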
|
| d92c078c | 05-Apr-2026 |
Andreas Gohr <andi@splitbrain.org> |
SearchIndex: add DirectCollection for 1:1 entity-token mappings
Introduce DirectCollection as a third collection type alongside FrequencyCollection and LookupCollection. Direct collections store exactly one token per entity at the entity's position in the token index (entity.RID === token.RID), with no frequency or reverse indexes.
AbstractCollection now accepts optional frequency/reverse index names (default to '') and skips locking empty index names.
Adds PageTitleCollection as the first concrete direct collection for the page -> title mapping.
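To illustrate the 1:1 mapping, here is a small sketch that uses plain PHP arrays in place of index files. The function names are hypothetical and do not come from the commit; the only property taken from the message is that the token sits at the same row ID (RID) as its entity.

```php
<?php
// Hypothetical sketch: arrays stand in for the entity and token index files.
// The array key plays the role of the RID (row ID).

/** Store exactly one token for an entity, at the entity's own RID. */
function setDirectSketch(array &$entities, array &$tokens, string $entity, string $token): int
{
    $rid = array_search($entity, $entities, true);
    if ($rid === false) {
        // Unknown entity: append it, its position becomes the RID.
        $entities[] = $entity;
        $rid = count($entities) - 1;
    }
    // entity.RID === token.RID, so no frequency or reverse index is needed.
    $tokens[$rid] = $token;
    return $rid;
}

/** Look up the single token stored for an entity, or null if there is none. */
function getDirectSketch(array $entities, array $tokens, string $entity): ?string
{
    $rid = array_search($entity, $entities, true);
    return $rid === false ? null : ($tokens[$rid] ?? null);
}
```

Setting a token for an entity that already exists overwrites the old value at the same position, which matches the "exactly one token per entity" rule.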
|
| f2bbffb5 | 05-Apr-2026 |
Andreas Gohr <andi@splitbrain.org> |
SearchIndex: extract Collection base class hierarchy
Introduce AbstractCollection as the shared base for all index collections, with FrequencyCollection and LookupCollection as the two abstract subclasses differing only in how tokens are counted (frequency vs dedup).
Key design decisions:
- splitByLength is a constructor parameter on AbstractCollection controlling whether token/frequency indexes use length-based file splitting. This is independent of the collection type.
- The reverse index format is self-describing: entries with * have a group prefix (split), entries without don't (non-split). No branching needed in parse/format methods.
- addEntity, resolveTokens, updateIndexes, and reverse index handling all live in AbstractCollection. Subclasses only implement countTokens().
Concrete collections: PageFulltextCollection (frequency, split), MediaCollection and ReferencesCollection (lookup, non-split).
Renames FulltextCollection -> PageFulltextCollection and FulltextCollectionSearch -> FrequencyCollectionSearch.
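The self-describing reverse index entries can be pictured with the sketch below. It is an illustration under stated assumptions, not the committed parser: the function names are hypothetical, and the exact layout (group before the `*`, row ID after it) is inferred only from the phrase "group prefix" in the message.

```php
<?php
// Hypothetical sketch of a self-describing reverse index entry.
// Assumption: a split entry looks like "<group>*<rid>", a non-split one like "<rid>".

/** @return array{group: ?string, rid: int} */
function parseReverseEntrySketch(string $entry): array
{
    $parts = explode('*', $entry, 2);
    if (count($parts) === 2) {
        // A "*" is present: the entry carries its group prefix (split index).
        return ['group' => $parts[0], 'rid' => (int)$parts[1]];
    }
    // No "*": plain row ID from a non-split index.
    return ['group' => null, 'rid' => (int)$parts[0]];
}

function formatReverseEntrySketch(?string $group, int $rid): string
{
    return $group === null ? (string)$rid : "$group*$rid";
}
```

Because each entry says by itself whether it has a group, the same pair of functions serves split and non-split collections without consulting the splitByLength setting.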
|
| 2f70db90 | 04-Dec-2025 |
WillForan <willforan@gmail.com> |
fix: 32bit IP tests w/string of decimal representation, overflows
Math in PHP is hard! sprintf("%.0f",0x7FFFFFFFFFFFFFFF) == sprintf("%.0f",0x7FFFFFFFFFFFFF00)
Changes
* 32bit gets own version of tests where expected values are strings
* decimalToBinary32 to replace `sprintf("%032b%032b"...)`, avoids overflow
* overflow check in ipv4 too
* refactor
* partsTo64 for 32bit parts into dec value as str (bcmath)
* Ip32::$b32 as class constant
* condition always PHP_INT_SIZE == 4 for 32bit (instead of == 8 for 64)
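The precision problem and the string-based workaround can be reproduced with a short sketch. The helper name `partsToDecimalSketch` is hypothetical and is not the `partsTo64` function from the commit; it only shows the general idea of joining two 32-bit parts into a decimal string with bcmath, and it assumes the bcmath extension is loaded.

```php
<?php
// Hypothetical sketch; requires the bcmath extension.

/**
 * Combine the high and low 32-bit parts of a 64-bit value into a decimal
 * string. Parts are passed as decimal strings so that nothing has to fit
 * into a native integer on 32-bit PHP.
 */
function partsToDecimalSketch(string $high, string $low): string
{
    // value = high * 2^32 + low, computed with arbitrary precision
    return bcadd(bcmul($high, '4294967296'), $low);
}

// Float formatting cannot tell these two values apart: a double has only
// 53 bits of mantissa, so both round to 2^63.
var_dump(sprintf('%.0f', 0x7FFFFFFFFFFFFFFF) === sprintf('%.0f', 0x7FFFFFFFFFFFFF00)); // bool(true)

// The string arithmetic keeps every digit.
echo partsToDecimalSketch('2147483647', '4294967295'), "\n"; // 9223372036854775807
echo partsToDecimalSketch('2147483647', '4294967040'), "\n"; // 9223372036854775552
```

Comparing expected values as strings, as the 32-bit tests now do, avoids going through a float at any point.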
|
| bc997a9d | 30-Oct-2025 |
Andreas Gohr <gohr@cosmocode.de> |
SearchIndex: TupleOps now work with frequencies of 1
We have indexes where we simply track that a relation between entity and token exists, but there is no frequency. The frequency is always 1. For those indexes we do not store *1 as frequency but omit it completely.
The TupleOps class can now read such indexes and also writes frequencies of 1 in this shortened form.
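As an illustration of the shortened form, the sketch below parses and formats a tuple record in which a missing frequency means 1. The function names are hypothetical and are not the TupleOps API; the colon-separated `id*frequency` layout is an assumption about the record format.

```php
<?php
// Hypothetical sketch, assuming a record of colon-separated "id*frequency" entries.

/** @return array<int,int> id => frequency; a missing "*n" counts as 1 */
function parseTuplesSketch(string $record): array
{
    $result = [];
    foreach (explode(':', $record) as $entry) {
        if ($entry === '') {
            continue; // empty record or stray separator
        }
        $parts = explode('*', $entry, 2);
        $result[(int)$parts[0]] = isset($parts[1]) ? (int)$parts[1] : 1;
    }
    return $result;
}

/** @param array<int,int> $tuples id => frequency */
function formatTuplesSketch(array $tuples): string
{
    $entries = [];
    foreach ($tuples as $id => $freq) {
        if ($freq < 1) {
            continue; // a frequency of 0 means the relation no longer exists
        }
        // A frequency of 1 is written as the bare id, without "*1".
        $entries[] = $freq === 1 ? (string)$id : "$id*$freq";
    }
    return implode(':', $entries);
}
```

With these helpers a record such as `3:7*2:12` round-trips unchanged, and a record written the long way (`3*1`) is normalized to the short form on output.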
|