History log of /dokuwiki/inc/Search/Indexer.php (Results 1 – 25 of 27)
Revision Date Author Comments
# 2cda0166 17-Jun-2026 Andreas Gohr <gohr@cosmocode.de>

Indexer: signal nothing-to-do via boolean return instead of void

The TaskRunner runs indexing, sitemap, digest and changelog-trim tasks
in sequence and relies on each task returning false when it did no work
so the next one is tried. The indexer rewrite changed addPage(),
deletePage() and renamePage() to return void and only abort via
exceptions, breaking that contract: indexing always looked like work was
done and the following tasks never ran.

Restore the boolean return on these three methods (true when work was
done, false when there was nothing to do) while still using exceptions
to signal errors, and propagate it through TaskRunner::runIndexer().
runIndexer() also no longer forces reindexing on every call.

The legacy compatibility layer is adjusted to match: LegacyIndexer and
idx_addPage() forward the boolean, mapping SearchExceptions back to the
historic error-message/false returns. LegacyIndexer::renamePage()
restores the 'page is not in index' message that the move plugin expects.

Closes #4661
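
The task contract described above can be sketched as follows. This is a minimal illustration, not DokuWiki's TaskRunner: `runFirstPendingTask()` and the closures standing in for tasks are hypothetical names, and the only thing taken from the commit message is the rule that a task returns false when it had nothing to do so the next one is tried.

```php
<?php
declare(strict_types=1);

/**
 * Run tasks in order and stop at the first one that reports work.
 * Hypothetical helper for illustration only.
 *
 * @param array<string, callable(): bool> $tasks task name => task callback
 * @return string|null name of the task that did work, or null if none did
 */
function runFirstPendingTask(array $tasks): ?string
{
    foreach ($tasks as $name => $task) {
        // true: work was done, stop here; false: nothing to do, try the next
        if ($task()) {
            return $name;
        }
    }
    return null;
}

// The indexer has nothing to do, so the sitemap task gets its turn.
$ran = runFirstPendingTask([
    'indexer' => fn(): bool => false,
    'sitemap' => fn(): bool => true,
    'digest'  => fn(): bool => true,
]);
```

Errors stay outside this boolean channel: a task that fails would throw, which keeps "nothing to do" and "something went wrong" distinguishable.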


# 79dae64d 17-Jun-2026 Andreas Gohr <gohr@cosmocode.de>

Indexer: treat same-second save and index as up to date

needsIndexing() compared the .indexed tag mtime against the page mtime
with <=, so a page that was saved and indexed within the same second was
always reported as still needing indexing. Require the page to be
strictly newer than the index tag instead, so an equal mtime correctly
counts as up to date.
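
The change in comparison can be sketched with two small functions. The function names and the plain integer timestamps are assumptions made for illustration; the real needsIndexing() reads the mtimes from the page file and its .indexed tag.

```php
<?php
declare(strict_types=1);

// Old check as described above: an equal mtime still counted as stale.
function needsIndexingOld(int $pageMtime, int $tagMtime): bool
{
    return $tagMtime <= $pageMtime;
}

// New check: the page must be strictly newer than the index tag.
function needsIndexingNew(int $pageMtime, int $tagMtime): bool
{
    return $pageMtime > $tagMtime;
}
```

With a page saved and indexed in the same second (both mtimes equal), the old check reports that indexing is needed while the new one reports the page as up to date.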


# 2ff7e61c 10-Jun-2026 Andreas Gohr <gohr@cosmocode.de>

fix(indexer): explicitly handle renames

In an attempt to simplify the index handling, the newly refactored
indexer implemented a rename as delete+add sequence.

This had unintended consequences for the move plugin which may move
several pages at once, requiring a working index even while some pages
have already been moved while others still remain at their old location.

Related to #4646


# 6e39b4e3 28-May-2026 Andreas Gohr <andi@splitbrain.org>

refactor(search): extract LegacyIndexer wrapper for BC contract

Move the deprecated helpers (lookupKey, addMetaKeys, renameMetaValue,
getPID, lookup) off Indexer and into a new LegacyIndexer wrapper. The
wrapper also restores the Doku_Indexer return contract (true|string)
around addPage/deletePage/renamePage/clear so plugins using the legacy
API keep working without try/catch.

idx_get_indexer() now returns the LegacyIndexer; getPages stays on
Indexer because plugins call it directly on Indexer instances.

fixes #4645
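
The idea of restoring a true|string return contract around exception-throwing methods can be sketched as below. `legacyResult()` is a hypothetical helper and not part of LegacyIndexer; the sketch only assumes that the new methods signal errors by throwing and that legacy callers expect true on success or an error message string.

```php
<?php
declare(strict_types=1);

/**
 * Call an exception-throwing operation and map the outcome to true|string.
 * Hypothetical helper for illustration only.
 */
function legacyResult(callable $operation): bool|string
{
    try {
        $operation();
        return true;
    } catch (Exception $e) {
        // Legacy callers get the message instead of having to catch
        return $e->getMessage();
    }
}

$ok = legacyResult(static function (): void {
    // pretend the page was indexed successfully
});
$failed = legacyResult(static function (): void {
    throw new RuntimeException('page is not in index');
});
```

A plugin written against the old API can then keep checking the return value without a try/catch block.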


# 5d034a75 08-Apr-2026 Andreas Gohr <andi@splitbrain.org>

SearchIndex: increase index version


# 9369b4a9 08-Apr-2026 Andreas Gohr <andi@splitbrain.org>

SearchIndex: rector, phpcs, type hint fixes


# 1148921d 08-Apr-2026 Andreas Gohr <andi@splitbrain.org>

SearchIndex: unify CollectionSearch API and optimize search pipeline

- Remove separate lookup() API from CollectionSearch. All searches now
use addTerm()/execute() with a single unified pipeline.
- Add matches() predicate to Term using efficient string functions
(===, str_starts_with, str_ends_with, str_contains) instead of regex.
- Add caseInsensitive() support on CollectionSearch and Term for
metadata/title searches where indexed values preserve case.
- Remove callback support from MetadataSearch::lookupKey() — the only
real usage (case-insensitive substring) is replaced by
caseInsensitive() + wildcards.
- Remove min-length validation from Term. Add Tokenizer::isValidSearchTerm()
for callers that need it (FulltextSearch, Indexer::lookup).
- Optimize execute() from 4 group passes to 2: scan tokens + resolve
frequencies in one pass per group, batch entity name resolution, then
populate Terms.
- Store full match detail in Term: entity → token → frequency. New
accessors getMatches(), getEntityTokens(), getEntityFrequencies()
derive different views from this single data structure.
- Term no longer used as scratch pad by CollectionSearch. Index-internal
data (token IDs, entity IDs) stays local to execute(). Terms receive
only final resolved results.
- Use title from search results in MetadataSearch::pageLookupCallBack()
instead of re-fetching via p_get_first_heading().
- Update concept.txt documentation.
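
A predicate built on plain string functions instead of regular expressions, as the matches() item above describes, might look roughly like this. The function name, the `$mode` parameter and its values are assumptions made for illustration and do not reflect the actual Term API; only the four comparison functions are taken from the commit message.

```php
<?php
declare(strict_types=1);

/**
 * Compare an indexed value against a search term without using regex.
 * Hypothetical sketch; $mode stands in for the term's wildcard placement.
 */
function matchesSketch(string $value, string $term, string $mode): bool
{
    return match ($mode) {
        'exact'    => $value === $term,
        'prefix'   => str_starts_with($value, $term),
        'suffix'   => str_ends_with($value, $term),
        'contains' => str_contains($value, $term),
        default    => throw new InvalidArgumentException("unknown mode: $mode"),
    };
}
```

The `str_*` functions require PHP 8.0 or later.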


# e1272c08 07-Apr-2026 Andreas Gohr <andi@splitbrain.org>

SearchIndex: add backward compatibility wrappers

Add deprecated wrappers for idx_* and ft_* functions that were removed
when inc/indexer.php and inc/fulltext.php were replaced by the new
Search classes. These wrappers delegate to the new architecture and
ensure existing plugins continue to work.

Deprecated standalone functions: idx_get_indexer, idx_getIndex,
idx_lookup, idx_listIndexLengths, idx_indexLengths, ft_pageSearch,
ft_backlinks, ft_mediause, ft_pageLookup, ft_snippet, ft_pagesorter,
ft_snippet_re_preprocess, ft_queryParser.

Deprecated methods on Indexer: lookupKey, getPages, addMetaKeys,
renameMetaValue, getPID, lookup.

Also migrates remaining core callers (Ajax, FeedCreator, ApiCore) to
use the new classes directly and fixes a UTF-8 case folding bug in
MetadataSearch title lookups.


# 21fbd01b 07-Apr-2026 Andreas Gohr <andi@splitbrain.org>

SearchIndex: add integrity checking to Collection architecture

Add checkIntegrity() to AbstractCollection and DirectCollection that
verifies paired indexes have matching line counts (token==frequency,
entity==reverse, entity==token for direct collections). Throws
IndexIntegrityException on the first inconsistency found.

Add Countable interface to AbstractIndex with count() implementations
in MemoryIndex and FileIndex. Add Indexer::checkIntegrity() and
Indexer::isIndexEmpty() to orchestrate checks across all collections.

Update infoutils.php to use the new Indexer API instead of the old
FulltextIndex/MetadataIndex classes.

Fix range(1, 0) bug in three places that produced [1, 0] instead of
an empty array when split-by-length indexes were empty.
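
The range(1, 0) pitfall is easy to reproduce: PHP's range() counts downwards when the start is greater than the end, so it never yields an empty array. The guard below is one possible way to avoid it; `lengthsUpTo()` is a hypothetical helper and not necessarily how the fix was implemented.

```php
<?php
declare(strict_types=1);

// range() counts down when start > end, so this is [1, 0], not [].
$surprise = range(1, 0);

/**
 * Return [1, 2, ..., $max], or an empty array when $max is below 1.
 * Hypothetical helper for illustration only.
 */
function lengthsUpTo(int $max): array
{
    return $max < 1 ? [] : range(1, $max);
}
```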


# 83b3accc 06-Apr-2026 Andreas Gohr <andi@splitbrain.org>

SearchIndex: rewrite Indexer to use Collection classes

Replace the intermediate #2943 classes (FulltextIndex, MetadataIndex)
with the new Collection-based architecture. The Indexer is now a thin
stateless orchestrator that delegates all index work to collections.

Key changes:
- Indexer no longer extends AbstractIndex; page name passed to methods
- addPage/deletePage/clear use PageTitleCollection,
PageFulltextCollection, and PageMetaCollection
- New PageMetaCollection replaces separate ReferencesCollection and
MediaCollection with a single class that handles arbitrary metadata
keys dynamically
- Shared writable FileIndex('page') passed to all collections
- Logger callback replaces verbose parameter
- Methods return void instead of bool
- Index classes implement IteratorAggregate for clean data access
- Indexer tests consolidated into namespaced IndexerTest.php
- All callers updated to new stateless API


# 3df1553d 29-Nov-2021 Satoshi Sahara <sahara.satoshi@gmail.com>

use Logger::debug() instead of deprecated dbglog()


# 5792814c 25-Sep-2020 Satoshi Sahara <sahara.satoshi@gmail.com>

fix scrutinizer claims


# 725e8e5f 25-Sep-2020 Satoshi Sahara <sahara.satoshi@gmail.com>

instantiate *Index with numeric page id

will reduce access to static $pidCache


# a32da6dd 25-Sep-2020 Satoshi Sahara <sahara.satoshi@gmail.com>

change Index objects to non-singleton

Indexer, FulltextIndex and MetadataIndex use a common directory to store *.idx files, but this does not mean they should be singleton objects to avoid lock conflicts.


# a16bd548 21-Sep-2020 Satoshi Sahara <sahara.satoshi@gmail.com>

remove unnecessary if blocks

getPID(), saveIndex(), saveIndexKey() and getPageWords() always return true or throw an exception.


# 15f699ac 10-Sep-2020 Andreas Gohr <andi@splitbrain.org>

replace user errors with exceptions

Exceptions are better to handle than errors. What I don't like is that
we now have an unfortunate mix of return code and exception signalling
for errors. Some methods still return false for errors while others
now throw exceptions (always returning true otherwise).


# abb227bc 13-Mar-2020 Satoshi Sahara <sahara.satoshi@gmail.com>

add comment for $requireLock argument


# 51ddbadd 20-Feb-2020 Satoshi Sahara <sahara.satoshi@gmail.com>

move histogram() into MetadataIndex class


# 5f9bd525 20-Feb-2020 Satoshi Sahara <sahara.satoshi@gmail.com>

Revert "histogram() change args order"

This reverts commit 4d04b7bbfe9c97673b0f22586d88e161aca34f70.


# 11d2e7d0 20-Feb-2020 Satoshi Sahara <sahara.satoshi@gmail.com>

simplify dispatch()


# 653b91a2 01-Feb-2020 Satoshi Sahara <sahara.satoshi@gmail.com>

AbstractIndex class const INDEX_MARK_DELETED


# 743c9a28 31-Jan-2020 Satoshi Sahara <sahara.satoshi@gmail.com>

rename PagewordIndex to FulltextIndex


# 5237d405 31-Jan-2020 Satoshi Sahara <sahara.satoshi@gmail.com>

trigger error when lock/unlock index directory failed


# 4d04b7bb 31-Jan-2020 Satoshi Sahara <sahara.satoshi@gmail.com>

histogram() change args order

Indexer::histogram() is only used in the indexer_histogram.test.php file.


# a2f39162 30-Jan-2020 Satoshi Sahara <sahara.satoshi@gmail.com>

fix typo

