| c3755410 | 20-Apr-2026 |
Andreas Gohr <gohr@cosmocode.de> |
require non-whitespace adjacency for inline formatting delimiters
An opening delimiter must now be followed by a non-whitespace character, and a closing delimiter must be preceded by one. Empty delimiter pairs (****, ____, '''', <sub></sub>, <sup></sup>, <del></del>) no longer match and stay literal.
Rationale: this matches Markdown's flanking-delimiter rules and eliminates accidental bolding of sequences like `** note**` at the start of a sentence. Well-formed uses (**bold**, //italic//, __underline__) are unchanged.
Affected modes: Strong, Emphasis, Underline, Monospace, Subscript, Superscript, Deleted.
BREAKING: content that was already malformed but previously rendered as formatted (e.g. `**foo bar **`) now stays literal.
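The adjacency rule above can be sketched with a small regular expression. This is only an illustration in Python: the pattern `STRONG` and the helper `is_bold` are hypothetical and are not DokuWiki's actual lexer patterns.

```python
import re

# Hypothetical pattern for illustration only; the real Strong mode differs.
# (?=\S)  : the opening ** must be followed by a non-whitespace character
# (?<=\S) : the closing ** must be preceded by a non-whitespace character
STRONG = re.compile(r"\*\*(?=\S)(.+?)(?<=\S)\*\*", re.DOTALL)


def is_bold(text: str) -> bool:
    """Return True if the text contains a well-formed **...** pair."""
    return STRONG.search(text) is not None
```

Under these assumptions `**bold**` is recognised, while `** note**`, `**foo bar **` and the empty pair `****` stay literal.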
|
| 10fb3d65 | 20-Apr-2026 |
Andreas Gohr <gohr@cosmocode.de> |
prevent inline formatting from matching across paragraph boundaries
The Lexer compiles all patterns with the `s` (DOTALL) flag via ParallelRegex::getPerlMatchingFlags(), which makes `.` match newlines. Inline formatting modes use lookaheads like `\*\*(?=.*\*\*)` to verify a closing delimiter exists, so with DOTALL a lone `**` happily matched its "closer" many paragraphs later, swallowing blank lines into a single <strong> run.
Add CONTENT_UNTIL_PARA on AbstractMode — a regex snippet matching any character unless it would start a paragraph break (blank line, possibly with horizontal whitespace). Update all inline formatting entry patterns (Strong, Emphasis, Underline, Monospace, Subscript, Superscript, Deleted) to use it in their closing-delimiter lookaheads.
Emphasis also gets a real closing-`//` check; its previous lookahead just verified "content exists with a non-colon char" without requiring the closing delimiter at all.
Single newlines inside a delimiter pair still match (multi-line formatting); only blank lines end it.
BREAKING: You can no longer mark multiple paragraphs as bold or strike them out. On the other hand, this prevents accidentally breaking the page layout through a missing closing delimiter, as reported many times over the years, e.g. #1025, #3588, #1056.
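The effect of the DOTALL flag on the closing-delimiter lookahead can be reproduced in a few lines. The sketch below uses Python's `re` module rather than the PHP lexer, and the `CONTENT_UNTIL_PARA` string is an assumed stand-in for the snippet described above, not a copy of the real constant.

```python
import re

# Naive entry pattern: with DOTALL, '.' also matches newlines, so the
# lookahead can find a "closer" many paragraphs later.
NAIVE = re.compile(r"\*\*(?=.*\*\*)", re.DOTALL)

# Assumed stand-in for CONTENT_UNTIL_PARA: consume any character unless a
# blank line (newline, optional spaces or tabs, newline) would start here.
CONTENT_UNTIL_PARA = r"(?:(?!\n[ \t]*\n).)*?"
BOUNDED = re.compile(r"\*\*(?=" + CONTENT_UNTIL_PARA + r"\*\*)", re.DOTALL)


def naive_opens(text: str) -> bool:
    """True if the naive pattern would enter bold mode somewhere in text."""
    return NAIVE.search(text) is not None


def bounded_opens(text: str) -> bool:
    """True if the paragraph-bounded pattern would enter bold mode."""
    return BOUNDED.search(text) is not None
```

With the input `"a lone ** here\n\nnext paragraph closes ** there"` the naive pattern matches and the bounded one does not, while `"**bold\nstill bold**"` (a single newline) still matches the bounded pattern.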
|
| 17c6179b | 20-Apr-2026 |
Andreas Gohr <gohr@cosmocode.de> |
add $conf['syntax'] setting and conditional mode loading in ModeRegistry
Introduce a new 'syntax' configuration setting (dokuwiki, markdown, dw+md, md+dw) that controls which parser modes are loaded. Built-in modes are split into always-loaded (no Markdown equivalent), DW-only, and MD-only groups. Refactor getModes() into focused sub-methods for each group.
No Gfm mode classes exist yet, so only 'dokuwiki' is functional. The change is a strict no-op for existing behavior.
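One way to picture the grouping is the Python sketch below. It is not the ModeRegistry code: the mode names are placeholders, and the idea that `dw+md` and `md+dw` differ only in which group is loaded first is an assumption made for the example.

```python
# Placeholder mode names; the real groups live in ModeRegistry.
ALWAYS = ["always_mode"]
DW_ONLY = ["dw_mode"]
MD_ONLY = ["md_mode"]

# Assumption: the order of the names in the setting is the load order.
GROUPS_BY_SYNTAX = {
    "dokuwiki": [DW_ONLY],
    "markdown": [MD_ONLY],
    "dw+md": [DW_ONLY, MD_ONLY],
    "md+dw": [MD_ONLY, DW_ONLY],
}


def modes_for(syntax: str) -> list:
    """Return the mode names to load for a given 'syntax' setting."""
    if syntax not in GROUPS_BY_SYNTAX:
        raise ValueError(f"unknown syntax setting: {syntax}")
    modes = list(ALWAYS)  # always-loaded modes come first
    for group in GROUPS_BY_SYNTAX[syntax]:
        modes.extend(group)
    return modes
```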
|
| 90c2f6e3 | 18-Apr-2026 |
Andreas Gohr <andi@splitbrain.org> |
Clean up stale realip references after client_ip_header rename
Update docblocks in Ip.php and common.php, fix old tests to use the new config key, remove outdated translations, fix method casing in test, and add example to English config description.
|
| 504c13e8 | 18-Apr-2026 |
Andreas Gohr <andi@splitbrain.org> |
migrate parser tests to modern namespaced classes
Move old-style parser tests from _test/tests/inc/parser/ to namespaced test classes under _test/tests/Parsing/ParserMode/ and _test/tests/Parsing/Lexer/.
- Add ParserTestBase with assertCalls() helper for comparing handler call sequences with byte index stripping
- Split lexer.test.php into ParallelRegexTest, StateStackTest, and LexerTest with a RecordingHandler that extends Handler
- Merge handler_parse_highlight_options tests into CodeTest
- Add new tests for previously untested modes: Nocache, Notoc, Rss, and all individual formatting modes (Strong, Emphasis, etc.)
- Modernize test code: [] syntax, lowercase null, correct assertEquals argument order, replace deprecated withConsecutive and string callables
- Renderer tests remain in old location (renderers not yet migrated)
|
| 71096e46 | 18-Apr-2026 |
Andreas Gohr <andi@splitbrain.org> |
move handler methods into ParserMode classes and rename Handler
Each ParserMode class now implements handle() from ModeInterface, containing the token handling logic that previously lived as individual methods on Doku_Handler.
The Handler class (formerly Doku_Handler) is the single dispatch point: Lexer passes tokens to Handler::handleToken() which routes to mode objects, plugins, or returns false. The Lexer only tokenizes and resolves mapHandler aliases.
Key changes:
- Add handle() to ModeInterface, implemented by all mode classes
- Move Doku_Handler to dokuwiki\Parsing\Handler namespace
- File extends Code (shared parsing via $type property)
- Quotes uses mapHandler() + Handler::getModeName() for sub-modes
- Media::parseMedia() replaces Doku_Handler_Parse_Media()
- Code::parseHighlightOptions() replaces parse_highlight_options()
- Per-parse state (footnote, doublequote) stays on Handler
- Deprecated wrappers kept for base/header/internallink/media
- Class alias and rector rules added for backward compatibility
|
| 7958e698 | 16-Apr-2026 |
Andreas Gohr <andi@splitbrain.org> |
decouple hardcoded mode names in Eol and Preformatted
Eol.php hardcoded ['listblock', 'table'] as modes to skip, and Preformatted.php hardcoded [\*\-] as a negative lookahead for list markers. Both embed knowledge that belongs to the respective block modes, not to Eol/Preformatted. Adding a new block mode that handles its own EOL or uses different line start markers would require editing these unrelated files — a hidden coupling.
Listblock and Table now register themselves on ModeRegistry during preConnect(). Eol queries getBlockEolModes() and Preformatted queries getLineStartMarkers() to build its lookahead dynamically. Each mode owns its own data, and new block modes can participate without touching unrelated files.
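The registration idea can be sketched as follows. This is a Python illustration, not the PHP implementation: the method names are snake_case renderings of the getters named above, the `register_*` methods are hypothetical, and the preformatted pattern (newline plus two spaces) is an assumption.

```python
import re


class ModeRegistry:
    """Minimal stand-in: block modes register their own data here."""

    def __init__(self):
        self._block_eol_modes = []
        self._line_start_markers = []

    def register_block_eol_mode(self, name: str) -> None:
        self._block_eol_modes.append(name)

    def register_line_start_marker(self, marker: str) -> None:
        self._line_start_markers.append(marker)

    def get_block_eol_modes(self) -> list:
        return list(self._block_eol_modes)

    def get_line_start_markers(self) -> list:
        return list(self._line_start_markers)


def preformatted_pattern(registry: ModeRegistry) -> str:
    """Build the entry pattern with a lookahead derived from the registry."""
    markers = "".join(re.escape(m) for m in registry.get_line_start_markers())
    # Without registered markers there is nothing to exclude.
    lookahead = f"(?![{markers}])" if markers else ""
    return r"\n  " + lookahead
```

A list mode would register `*` and `-` once, and an indented line starting with either marker would then no longer be treated as preformatted text, without the Preformatted code naming the list mode.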
|
| dba14ea3 | 16-Apr-2026 |
Andreas Gohr <andi@splitbrain.org> |
add unit tests for ModeRegistry |
| 1f443476 | 16-Apr-2026 |
Andreas Gohr <andi@splitbrain.org> |
split Formatting into individual classes per formatting type
Introduce AbstractFormatting as a base class and seven concrete classes (Strong, Emphasis, Underline, Monospace, Subscript, Superscript, Deleted) that each define their own patterns and sort order. Delete the old Formatting class and update tests to use the new classes directly. ModeRegistry now treats formatting modes as regular built-in modes.
|
| c8dd1b9d | 16-Apr-2026 |
Andreas Gohr <andi@splitbrain.org> |
introduce ModeRegistry to encapsulate parser mode categories
Replace the global $PARSER_MODES definition in inc/parser/parser.php with a ModeRegistry singleton that initializes and manages the mode categories. The global array is still populated for backward compatibility with plugins that access it directly.
Mode constructors now use ModeRegistry::getModesForCategories() instead of accessing the global directly. p_get_parsermodes() and p_sort_modes() are moved to inc/deprecated.php as thin wrappers.
|
| 8ab4ec30 | 16-Apr-2026 |
Andreas Gohr <gohr@cosmocode.de> |
remove dead ParallelRegex::apply() method
Remove apply() which was never called from production code. Rewrite the inherited SimpleTest tests to use split() instead, and add a test for pre/post-match splitting.
|
| fe6048cc | 14-Apr-2026 |
Alexander Lehmann <alexlehm@gmail.com> |
remove realip option, add default in conf/dokuwiki.php |
| bfc167db | 11-Apr-2026 |
Andreas Gohr <andi@splitbrain.org> |
Limit namespace depth in io_createNamespace() #4613
Throw a RuntimeException when the given ID contains 128 or more colon-separated segments, preventing creation of excessively deep directory hierarchies.
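The depth check amounts to counting colon-separated segments. A minimal Python sketch, assuming the limit applies to the raw ID string; the function name and the use of `RuntimeError` in place of PHP's RuntimeException are illustrative.

```python
MAX_SEGMENTS = 128  # limit stated in the commit message


def check_namespace_depth(page_id: str) -> None:
    """Raise if the ID has 128 or more colon-separated segments."""
    segments = page_id.split(":")
    if len(segments) >= MAX_SEGMENTS:
        raise RuntimeError(
            f"ID has {len(segments)} segments, limit is {MAX_SEGMENTS - 1}"
        )
```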
|
| 894c1577 | 11-Apr-2026 |
Andreas Gohr <andi@splitbrain.org> |
skip strict types rector
Even though this rule only adds strict type checking for classes that are already fully typed, I am a bit hesitant to add the strict type declaration yet. Maybe later. |
| 867da04d | 11-Apr-2026 |
Andreas Gohr <andi@splitbrain.org> |
Less ubiquitous feed caching. addresses #4574
Instead of creating caches for each and every requested feed, only the recent feed is still cached.
The number of items is clamped to conf[recent]*5.
Plugins can influence the caching behavior via the existing FEED_OPTS_POSTPROCESS event by setting cache_allow to true and optionally adding their own cache key in cache_key
Additionally the per-namespace feed autodiscovery link from <head> pointing to list-mode feeds has been removed.
|
| 7383ed40 | 10-Apr-2026 |
Andreas Gohr <andi@splitbrain.org> |
Replace deprecated idx_addPage/ft_backlinks calls in ApiCoreTest
Use Indexer::addPage() and MetadataSearch::backlinks() directly. |
| b188a75b | 10-Apr-2026 |
Andreas Gohr <andi@splitbrain.org> |
SearchIndex: fix IntegrityTest not re-indexing between tests
The .indexed metadata tag persisted between test methods, causing needsIndexing() to skip re-indexing when saveWikiText() didn't update the wiki file (identical content). Clean the tag in setUp.
|
| 06053dca | 10-Apr-2026 |
Andreas Gohr <andi@splitbrain.org> |
SearchIndex: remove write side effect from retrieveRow()
retrieveRow() padded the index file when the requested RID was beyond the current length. This was an optimization for subsequent changeRow() calls, but changeRow() already handles padding on its own. The side effect was also inconsistent with retrieveRows() which is a pure read.
Added a cross-index integration test verifying RID consistency across entity, token, frequency and reverse indexes when multiple entities share tokens.
|
| b2c5d210 | 10-Apr-2026 |
Andreas Gohr <andi@splitbrain.org> |
updated dependencies (core and test)
Using our own php-ixr fork until the following PRs are merged upstream:
https://github.com/kissifrot/php-ixr/pull/13 https://github.com/kissifrot/php-ixr/pull/14 |
| 1148921d | 08-Apr-2026 |
Andreas Gohr <andi@splitbrain.org> |
SearchIndex: unify CollectionSearch API and optimize search pipeline
- Remove separate lookup() API from CollectionSearch. All searches now use addTerm()/execute() with a single unified pipeline.
- Add matches() predicate to Term using efficient string functions (===, str_starts_with, str_ends_with, str_contains) instead of regex.
- Add caseInsensitive() support on CollectionSearch and Term for metadata/title searches where indexed values preserve case.
- Remove callback support from MetadataSearch::lookupKey(); the only real usage (case-insensitive substring) is replaced by caseInsensitive() + wildcards.
- Remove min-length validation from Term. Add Tokenizer::isValidSearchTerm() for callers that need it (FulltextSearch, Indexer::lookup).
- Optimize execute() from 4 group passes to 2: scan tokens + resolve frequencies in one pass per group, batch entity name resolution, then populate Terms.
- Store full match detail in Term: entity → token → frequency. New accessors getMatches(), getEntityTokens(), getEntityFrequencies() derive different views from this single data structure.
- Term no longer used as scratch pad by CollectionSearch. Index-internal data (token IDs, entity IDs) stays local to execute(). Terms receive only final resolved results.
- Use title from search results in MetadataSearch::pageLookupCallBack() instead of re-fetching via p_get_first_heading().
- Update concept.txt documentation.
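The matches() predicate described in the list above can be approximated with plain string operations. The Python sketch below is not the Term class: the wildcard convention (`*` at the start, the end, or both ends of the term) is an assumption, and `term_matches` is a hypothetical name.

```python
def term_matches(term: str, token: str, case_insensitive: bool = False) -> bool:
    """Compare a search term against an indexed token without regex."""
    if case_insensitive:
        term, token = term.casefold(), token.casefold()
    starts_open = term.startswith("*")  # "*foo": token must end with foo
    ends_open = term.endswith("*")      # "foo*": token must start with foo
    needle = term.strip("*")
    if starts_open and ends_open:
        return needle in token
    if starts_open:
        return token.endswith(needle)
    if ends_open:
        return token.startswith(needle)
    return token == needle
```

For example, `term_matches("wiki*", "wikipedia")` holds as a prefix match, and `term_matches("Wiki*", "wikipedia", case_insensitive=True)` holds only because case folding is applied first.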
|
| 5e9d26e3 | 07-Apr-2026 |
Andreas Gohr <andi@splitbrain.org> |
SearchIndex: move search() function tests back to tests/inc/search/
The search.test.php file tests the search() function from inc/search.php, not the Search namespace classes. It was incorrectly moved into tests/Search/ during the test suite reorganization. Move it and its data files (ns1/, ns2/) back to their original location, keeping only searchtest.txt in tests/Search/data/ where it belongs.
|
| e1272c08 | 07-Apr-2026 |
Andreas Gohr <andi@splitbrain.org> |
SearchIndex: add backward compatibility wrappers
Add deprecated wrappers for idx_* and ft_* functions that were removed when inc/indexer.php and inc/fulltext.php were replaced by the new Search classes. These wrappers delegate to the new architecture and ensure existing plugins continue to work.
Deprecated standalone functions: idx_get_indexer, idx_getIndex, idx_lookup, idx_listIndexLengths, idx_indexLengths, ft_pageSearch, ft_backlinks, ft_mediause, ft_pageLookup, ft_snippet, ft_pagesorter, ft_snippet_re_preprocess, ft_queryParser.
Deprecated methods on Indexer: lookupKey, getPages, addMetaKeys, renameMetaValue, getPID, lookup.
Also migrates remaining core callers (Ajax, FeedCreator, ApiCore) to use the new classes directly and fixes a UTF-8 case folding bug in MetadataSearch title lookups.
|
| 74a9499c | 07-Apr-2026 |
Andreas Gohr <andi@splitbrain.org> |
SearchIndex: remove legacy intermediate classes from PR #2943
Remove FulltextIndex, MetadataIndex, and the old AbstractIndex which were introduced as a stepping stone in #2943. All callers now use the Collection/Index architecture directly.
Also fix a bug in detail.php where mediause() was called with ignore_perms=true, leaking references from hidden/protected pages to unprivileged users. This bug existed on master as well.
Old test files replaced by their modernized equivalents in tests/Search/.
|
| 21fbd01b | 07-Apr-2026 |
Andreas Gohr <andi@splitbrain.org> |
SearchIndex: add integrity checking to Collection architecture
Add checkIntegrity() to AbstractCollection and DirectCollection that verifies paired indexes have matching line counts (token==frequency, entity==reverse, entity==token for direct collections). Throws IndexIntegrityException on the first inconsistency found.
Add Countable interface to AbstractIndex with count() implementations in MemoryIndex and FileIndex. Add Indexer::checkIntegrity() and Indexer::isIndexEmpty() to orchestrate checks across all collections.
Update infoutils.php to use the new Indexer API instead of the old FulltextIndex/MetadataIndex classes.
Fix range(1, 0) bug in three places that produced [1, 0] instead of an empty array when split-by-length indexes were empty.
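The paired line-count check from this commit might look roughly like the Python sketch below. The exception class mirrors the IndexIntegrityException name; the function signature and the dictionary of line counts are assumptions made for the example.

```python
class IndexIntegrityError(Exception):
    """Stand-in for the IndexIntegrityException named in the commit."""


def check_integrity(counts: dict, pairs: list) -> None:
    """Raise on the first pair of indexes whose line counts differ.

    counts maps an index name to its number of lines; pairs lists the
    (left, right) index names that must have matching counts.
    """
    for left, right in pairs:
        if counts[left] != counts[right]:
            raise IndexIntegrityError(
                f"{left} has {counts[left]} lines, {right} has {counts[right]}"
            )
```

Called with the pairs `("token", "frequency")` and `("entity", "reverse")`, the function returns silently for consistent indexes and stops at the first mismatch otherwise.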
|
| 6734bb8c | 07-Apr-2026 |
Andreas Gohr <andi@splitbrain.org> |
SearchIndex: rewrite MetadataSearch to use Collection classes
Replace MetadataIndex usage in MetadataSearch with the new Collection/Index architecture. This completes the read-path migration so data written by the Collection-based Indexer is read back correctly using TupleOps tuple format.
Generalize FrequencyCollectionSearch into CollectionSearch that works with any AbstractCollection type (Frequency, Lookup, Direct) and handles both split-by-length and non-split index layouts transparently. DirectCollection participates via resolveTokenFrequencies() which maps token RID = entity RID.
Key changes:
- AbstractCollection gains isSplitByLength(), resolveTokenFrequencies(), getEntitiesWithData(), and groupToSuffix() with validation
- Index groups are now int (0 = non-split, positive = token length)
- CollectionSearch provides both addTerm()/execute() for fulltext and lookup() for metadata-style search (exact/wildcard/callback)
- MetadataSearch delegates entirely to collection APIs
- Shared filterPages() replaces duplicated page filtering logic
- All callers updated from MetadataIndex to MetadataSearch
- Tests moved to Search namespace with full coverage for new APIs
|