History log of /dokuwiki/inc/Search/Indexer.php (Results 1 – 23 of 23)
Revision Date Author Comments
# 5d034a75 08-Apr-2026 Andreas Gohr <andi@splitbrain.org>

SearchIndex: increase index version


# 9369b4a9 08-Apr-2026 Andreas Gohr <andi@splitbrain.org>

SearchIndex: rector, phpcs, type hint fixes


# 1148921d 08-Apr-2026 Andreas Gohr <andi@splitbrain.org>

SearchIndex: unify CollectionSearch API and optimize search pipeline

- Remove separate lookup() API from CollectionSearch. All searches now
use addTerm()/execute() with a single unified pipeline.
- Add matches() predicate to Term using efficient string functions
(===, str_starts_with, str_ends_with, str_contains) instead of regex.
- Add caseInsensitive() support on CollectionSearch and Term for
metadata/title searches where indexed values preserve case.
- Remove callback support from MetadataSearch::lookupKey() — the only
real usage (case-insensitive substring) is replaced by
caseInsensitive() + wildcards.
- Remove min-length validation from Term. Add Tokenizer::isValidSearchTerm()
for callers that need it (FulltextSearch, Indexer::lookup).
- Optimize execute() from 4 group passes to 2: scan tokens + resolve
frequencies in one pass per group, batch entity name resolution, then
populate Terms.
- Store full match detail in Term: entity → token → frequency. New
accessors getMatches(), getEntityTokens(), getEntityFrequencies()
derive different views from this single data structure.
- Term no longer used as scratch pad by CollectionSearch. Index-internal
data (token IDs, entity IDs) stays local to execute(). Terms receive
only final resolved results.
- Use title from search results in MetadataSearch::pageLookupCallBack()
instead of re-fetching via p_get_first_heading().
- Update concept.txt documentation.
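
The regex-free matches() predicate and the caseInsensitive() flag described above could look roughly like the following. This is a language-neutral sketch in Python, not the PHP code from the commit: the `Term` class shape, the `raw` and `case_insensitive` fields, and the use of `*` as the wildcard marker are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    """Hypothetical stand-in for a search term; '*' marks a wildcard end."""

    raw: str
    case_insensitive: bool = False

    def matches(self, token: str) -> bool:
        """Compare a token against the term using plain string functions."""
        needle = self.raw
        if self.case_insensitive:
            # casefold() is Unicode-aware, unlike an ASCII-only lowercase.
            needle, token = needle.casefold(), token.casefold()

        leading = needle.startswith("*")
        trailing = needle.endswith("*") and len(needle) > 1
        core = needle.strip("*")

        if leading and trailing:
            return core in token           # *foo* -> substring match
        if leading:
            return token.endswith(core)    # *foo  -> suffix match
        if trailing:
            return token.startswith(core)  # foo*  -> prefix match
        return token == core               # foo   -> exact match
```

Each branch maps to one of the four comparisons named in the commit message (exact, prefix, suffix, substring), so no pattern compilation is needed per token.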


# e1272c08 07-Apr-2026 Andreas Gohr <andi@splitbrain.org>

SearchIndex: add backward compatibility wrappers

Add deprecated wrappers for idx_* and ft_* functions that were removed
when inc/indexer.php and inc/fulltext.php were replaced by the new
Search classes. These wrappers delegate to the new architecture and
ensure existing plugins continue to work.

Deprecated standalone functions: idx_get_indexer, idx_getIndex,
idx_lookup, idx_listIndexLengths, idx_indexLengths, ft_pageSearch,
ft_backlinks, ft_mediause, ft_pageLookup, ft_snippet, ft_pagesorter,
ft_snippet_re_preprocess, ft_queryParser.

Deprecated methods on Indexer: lookupKey, getPages, addMetaKeys,
renameMetaValue, getPID, lookup.

Also migrates remaining core callers (Ajax, FeedCreator, ApiCore) to
use the new classes directly and fixes a UTF-8 case folding bug in
MetadataSearch title lookups.
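
The wrapper pattern described in this commit (an old function name that warns and then delegates to the new class) can be sketched as follows. The sketch is in Python for illustration only; the `backlinks` method name and the placeholder data are assumptions, and only the names `ft_backlinks` and `MetadataSearch` come from the commit message.

```python
import warnings


class MetadataSearch:
    """Hypothetical new-style class that owns the real implementation."""

    def backlinks(self, page_id: str) -> list[str]:
        # Placeholder data; a real implementation would consult the index.
        links = {"start": ["wiki:syntax", "wiki:welcome"]}
        return links.get(page_id, [])


def ft_backlinks(page_id: str) -> list[str]:
    """Deprecated wrapper kept so that older callers continue to work."""
    warnings.warn(
        "ft_backlinks() is deprecated; use MetadataSearch.backlinks()",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not the wrapper
    )
    return MetadataSearch().backlinks(page_id)
```

The wrapper holds no logic of its own, so behavior stays in one place while plugin authors get a visible hint to migrate.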


# 21fbd01b 07-Apr-2026 Andreas Gohr <andi@splitbrain.org>

SearchIndex: add integrity checking to Collection architecture

Add checkIntegrity() to AbstractCollection and DirectCollection that
verifies paired indexes have matching line counts (token==frequency,
entity==reverse, entity==token for direct collections). Throws
IndexIntegrityException on the first inconsistency found.

Add Countable interface to AbstractIndex with count() implementations
in MemoryIndex and FileIndex. Add Indexer::checkIntegrity() and
Indexer::isIndexEmpty() to orchestrate checks across all collections.

Update infoutils.php to use the new Indexer API instead of the old
FulltextIndex/MetadataIndex classes.

Fix range(1, 0) bug in three places that produced [1, 0] instead of
an empty array when split-by-length indexes were empty.
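
The paired line-count check that checkIntegrity() performs can be illustrated with a small sketch. It is written in Python rather than PHP, and the function signature, the dictionary layout, and the `IndexIntegrityError` name are assumptions; the commit itself only states that paired indexes must have matching line counts and that the first inconsistency raises an exception.

```python
class IndexIntegrityError(Exception):
    """Raised on the first mismatch between two paired indexes."""


def check_integrity(
    indexes: dict[str, list[str]],
    pairs: list[tuple[str, str]],
) -> None:
    """Verify that each pair of indexes has the same number of lines."""
    for left, right in pairs:
        n_left, n_right = len(indexes[left]), len(indexes[right])
        if n_left != n_right:
            # Stop at the first inconsistency instead of collecting all.
            raise IndexIntegrityError(
                f"{left} has {n_left} lines but {right} has {n_right}"
            )
```

A caller would pass pairs such as ("token", "frequency") and ("entity", "reverse"), mirroring the pairings listed in the commit message.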


# 83b3accc 06-Apr-2026 Andreas Gohr <andi@splitbrain.org>

SearchIndex: rewrite Indexer to use Collection classes

Replace the intermediate #2943 classes (FulltextIndex, MetadataIndex)
with the new Collection-based architecture. The Indexer is now a thin
stateless orchestrator that delegates all index work to collections.

Key changes:
- Indexer no longer extends AbstractIndex; page name passed to methods
- addPage/deletePage/clear use PageTitleCollection,
PageFulltextCollection, and PageMetaCollection
- New PageMetaCollection replaces separate ReferencesCollection and
MediaCollection with a single class that handles arbitrary metadata
keys dynamically
- Shared writable FileIndex('page') passed to all collections
- Logger callback replaces verbose parameter
- Methods return void instead of bool
- Index classes implement IteratorAggregate for clean data access
- Indexer tests consolidated into namespaced IndexerTest.php
- All callers updated to new stateless API
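
As a rough illustration of the "thin stateless orchestrator" idea, the sketch below shows an indexer that keeps no per-page state, takes the page name as a method argument, reports through a logger callback, and returns nothing. It is Python, not the PHP implementation; `MemoryCollection`, the `add`/`delete` method names, and the data layout are hypothetical.

```python
from typing import Callable, Optional, Protocol


class Collection(Protocol):
    def add(self, page: str, data: dict) -> None: ...
    def delete(self, page: str) -> None: ...


class MemoryCollection:
    """Hypothetical in-memory collection storing one field per page."""

    def __init__(self, field: str) -> None:
        self.field = field
        self.entries: dict[str, object] = {}

    def add(self, page: str, data: dict) -> None:
        if self.field in data:
            self.entries[page] = data[self.field]

    def delete(self, page: str) -> None:
        self.entries.pop(page, None)


class Indexer:
    """Keeps no per-page state; the page name is passed to every call."""

    def __init__(
        self,
        collections: list[Collection],
        logger: Optional[Callable[[str], None]] = None,
    ) -> None:
        self.collections = collections
        # A logger callback takes the place of a verbose flag.
        self.logger = logger or (lambda message: None)

    def add_page(self, page: str, data: dict) -> None:
        for collection in self.collections:
            collection.add(page, data)
        self.logger(f"indexed {page}")

    def delete_page(self, page: str) -> None:
        for collection in self.collections:
            collection.delete(page)
        self.logger(f"removed {page}")
```

Because the orchestrator only loops over its collections, adding a new kind of index means adding a collection rather than changing the indexer.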


# 3df1553d 29-Nov-2021 Satoshi Sahara <sahara.satoshi@gmail.com>

use Logger::debug() instead of deprecated dbglog()


# 5792814c 25-Sep-2020 Satoshi Sahara <sahara.satoshi@gmail.com>

fix scrutinizer claims


# 725e8e5f 25-Sep-2020 Satoshi Sahara <sahara.satoshi@gmail.com>

instantiate *Index with numeric page id

This reduces access to the static $pidCache.


# a32da6dd 25-Sep-2020 Satoshi Sahara <sahara.satoshi@gmail.com>

change Index objects to non-singleton

Indexer, FulltextIndex, and MetadataIndex use a common directory to store *.idx files, but this does not mean they should be singleton objects to avoid lock conflicts.


# a16bd548 21-Sep-2020 Satoshi Sahara <sahara.satoshi@gmail.com>

remove unnecessary if blocks

getPID(), saveIndex(), saveIndexKey(), and getPageWords() always return true or otherwise throw an exception.


# 15f699ac 10-Sep-2020 Andreas Gohr <andi@splitbrain.org>

replace user errors with exceptions

Exceptions are better to handle than errors. What I don't like is that
we now have an unfortunate mix of return code and exception signalling
for errors. Some methods still return false for errors while others
now throw exceptions (always returning true otherwise).


# abb227bc 13-Mar-2020 Satoshi Sahara <sahara.satoshi@gmail.com>

add comment for $requireLock argument


# 51ddbadd 20-Feb-2020 Satoshi Sahara <sahara.satoshi@gmail.com>

move histogram() into MetadataIndex class


# 5f9bd525 20-Feb-2020 Satoshi Sahara <sahara.satoshi@gmail.com>

Revert "histogram() change args order"

This reverts commit 4d04b7bbfe9c97673b0f22586d88e161aca34f70.


# 11d2e7d0 20-Feb-2020 Satoshi Sahara <sahara.satoshi@gmail.com>

simplify dispatch()


# 653b91a2 01-Feb-2020 Satoshi Sahara <sahara.satoshi@gmail.com>

AbstractIndex class const INDEX_MARK_DELETED


# 743c9a28 31-Jan-2020 Satoshi Sahara <sahara.satoshi@gmail.com>

rename PagewordIndex to FulltextIndex


# 5237d405 31-Jan-2020 Satoshi Sahara <sahara.satoshi@gmail.com>

trigger error when lock/unlock index directory failed


# 4d04b7bb 31-Jan-2020 Satoshi Sahara <sahara.satoshi@gmail.com>

histogram() change args order

Indexer::histogram() is only used in the indexer_histogram.test.php file.


# a2f39162 30-Jan-2020 Satoshi Sahara <sahara.satoshi@gmail.com>

fix typo


# 4027a91a 30-Jan-2020 Satoshi Sahara <sahara.satoshi@gmail.com>

use Indexer.php instead of PageIndex.php


# 6225b270 28-Dec-2019 Michael Große <mic.grosse@googlemail.com>

Extract dokuwiki\Search\Indexer class

Not sure why Doku_Indexer caused phpcs to complain about the class name
not being in PascalCase, but Doku_Handler didn't.

The namespace and new class name were selected to be compatible with the
upcoming changes in #2943. This should hopefully reduce the overall
hassle of touching the same code base.
