History log of /dokuwiki/_test/ (Results 51 – 75 of 1080)
Revision Date Author Comments
13a62f81 04-May-2026 Andreas Gohr <andi@splitbrain.org>

rename syntax flavors 'dokuwiki' / 'markdown' to 'dw' / 'md'

Symmetry with the existing 'dw+md' / 'md+dw' setting values.

c4bcbc2e 04-May-2026 Andreas Gohr <andi@splitbrain.org>

add GfmLinebreak for GFM hard line breaks

Two-or-more trailing spaces, or a single backslash, immediately before
a non-final newline render as a `<br/>`. Both delimiter forms share a
single SUBSTITION mode at sort 140, loaded under any MD-active syntax
(markdown, dw+md, md+dw); pure dokuwiki is unaffected.

Reuses the existing `linebreak` handler call and renderer; no new
instructions or renderer changes. SpecCompatRenderer overrides
linebreak() to emit the spec's `<br />` shape. Examples 662, 663
(line break inside a raw HTML tag) are skipped — raw HTML is not
passed through by default.
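
A minimal sketch of the delimiter rule described above, written in Python for illustration rather than in DokuWiki's PHP. The pattern and the function name `render_hard_breaks` are assumptions, not the actual GfmLinebreak pattern; "non-final newline" is read here as "a newline followed by more text".

```python
import re

# Two or more trailing spaces, or one backslash, directly before a newline
# that is followed by another character (so a final newline never matches).
HARD_BREAK = re.compile(r"(?: {2,}|\\)\n(?=[^\n])")


def render_hard_breaks(text: str) -> str:
    """Replace each hard line break delimiter with a <br/> tag."""
    return HARD_BREAK.sub("<br/>\n", text)
```

Both delimiter forms go through the same alternation, which mirrors the "single mode for both forms" idea in the commit message.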

3e6baeff 30-Apr-2026 Andreas Gohr <andi@splitbrain.org>

replace DW Hr with unified GfmHr

Single mode covers both DokuWiki (4+ dashes) and GFM (3+ of -/*/_)
horizontal rules; pattern self-narrows on $conf['syntax']. Always
loaded across all four syntax settings, mirroring the GfmQuote
replacement pattern. Same `hr` handler call so renderers and the
call API are unchanged.

Drops DW's old [ \t]* leading-whitespace tolerance — inert in
practice past 0-1 spaces (Preformatted at sort 20 intercepts
everything ≥ 2 spaces or any tab).

Spec examples 13, 20, 26-28, 224 turn green; 17, 21-24, 29, 30, 31
go to skip.php as deliberate non-implementations (whitespace
tolerance and list-precedence cases).

309a0852 30-Apr-2026 Andreas Gohr <andi@splitbrain.org>

replace DW Quote with unified GfmQuote

GfmQuote covers blockquote parsing for both DokuWiki and GFM dialects
in a single mode. Same quote_open/quote_close handler instructions; a
DW-preferred post-pass flattens sub-parsed paragraph wrapping into
linebreak calls so existing pages keep their <br/>-between-lines
rendering. MD-preferred keeps the <p>-wrapped spec shape.

Block content (lists, fenced code, tables) inside `>` quotes now
renders, since the body is sub-parsed. Headers stay excluded
(BASEONLY) — TOC and section-edit anchors don't compose with
<blockquote>, same rationale as GfmListblock.

Convert ModeRegistry's sub-parser cache into an acquire/release pool
to support same-key re-entrancy: a list inside a quote re-enters
gfm_quote during the list-item sub-parse, and the inner call needs
its own parser instance even though the exclusion key matches.
GfmListblock is updated to use the new acquire/release primitives.
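
The acquire/release idea can be sketched as follows. This is a Python illustration under stated assumptions, not ModeRegistry's PHP code: `SubParserPool`, `make_parser`, and the use of a frozenset as the exclusion key are all hypothetical.

```python
class SubParserPool:
    """Pool of idle sub-parsers, grouped by their exclusion set."""

    def __init__(self, make_parser):
        # make_parser(key) builds a fresh parser for an exclusion set
        # (hypothetical factory, stands in for the registry's setup code).
        self._make_parser = make_parser
        self._idle = {}

    def acquire(self, excluded):
        """Hand out an idle parser for this key, or build a new one."""
        key = frozenset(excluded)
        idle = self._idle.setdefault(key, [])
        return idle.pop() if idle else self._make_parser(key)

    def release(self, excluded, parser):
        """Return a parser to the pool once its sub-parse has finished."""
        self._idle.setdefault(frozenset(excluded), []).append(parser)
```

With a plain cache, a nested call using the same key would receive the parser that the outer call is still using; a pool only hands out instances that have been released.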

f7c6e4ac 30-Apr-2026 Andreas Gohr <gohr@cosmocode.de>

add listo_open_start sibling method for GFM start numbers

Reverts the listo_open signature widening from 5a2118acc and instead
adds a sibling method `listo_open_start($start = 1)` on the renderer
hierarchy. The base default delegates to listo_open() so renderers
that don't override it still produce a valid (but unnumbered) list;
xhtml's override emits <ol start="N">.

The handler now emits 'listo_open_start' only for ordered lists with
a non-default first number; plain ordered lists keep emitting the
unchanged 'listo_open' instruction. This preserves the historical
listo_open / listu_open signatures (zero-arg base, $classes-only
xhtml form from 2016) so the 17 plugin renderers found via
codesearch keep working without modification, while still
implementing GFM's "5. foo" -> <ol start="5"> rule.
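
The sibling-method arrangement can be illustrated with a small Python sketch. The class names and the `open_ordered_list` helper are hypothetical stand-ins for the PHP renderer hierarchy and the handler; only the method names `listo_open` and `listo_open_start` come from the commit message.

```python
class Renderer:
    """Stand-in for the base renderer; output is collected in self.doc."""

    def __init__(self):
        self.doc = ""

    def listo_open(self):
        # Historical zero-argument signature, left untouched.
        self.doc += "<ol>"

    def listo_open_start(self, start=1):
        # Base default: ignore the number and delegate to listo_open().
        self.listo_open()


class XhtmlRenderer(Renderer):
    def listo_open_start(self, start=1):
        self.doc += f'<ol start="{int(start)}">'


class LegacyPluginRenderer(Renderer):
    """A plugin that only knows the old listo_open() signature."""

    def listo_open(self):
        self.doc += "[ol]"


def open_ordered_list(renderer, first_number):
    """Handler side: use the new call only for a non-default start."""
    if first_number == 1:
        renderer.listo_open()
    else:
        renderer.listo_open_start(first_number)
```

A renderer that never heard of `listo_open_start` still produces a valid list through the inherited default, which is the compatibility property the commit is after.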

74031e46 28-Apr-2026 Andreas Gohr <andi@splitbrain.org>

add GfmEscape for GFM backslash escapes

Implements GFM §6.1 backslash-escape handling. GfmEscape is a sort-5
inline mode in CATEGORY_SUBSTITION that claims `\X` for any escapable
ASCII punctuation char before competing delimiters can match. The
shared character class lives on Helpers\Escape so the lexer pattern
and the post-hoc unescape stay in lockstep.

Whole-span captures (GfmCode info string, GfmLink label/URL) bypass
the lexer; those modes call Escape::unescapeBackslashes() on the
relevant slot. GfmLink skips the unescape when the URL classifies as
a windowssharelink so the leading \\host survives intact.

GfmTable cells get a separate per-cell `\|` to `|` pass in the
rewriter to honour the tables-extension rule that pipes always
unescape, even inside code spans where standard §6.1 escapes don't
fire.
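
A rough Python sketch of the two unescape passes described above. The function names are illustrative and do not reproduce the Helpers\Escape API; the character class is built from Python's `string.punctuation`, which lists the 32 ASCII punctuation characters.

```python
import re
import string

# One shared character class, so the "lexer" pattern and the post-hoc
# unescape cannot drift apart.
ESCAPABLE = "[" + re.escape(string.punctuation) + "]"
BACKSLASH_ESCAPE = re.compile(r"\\(" + ESCAPABLE + ")")


def unescape_backslashes(text: str) -> str:
    """Drop the backslash in front of any escapable punctuation char."""
    return BACKSLASH_ESCAPE.sub(r"\1", text)


def unescape_table_pipes(cell: str) -> str:
    """Separate table-cell pass: only `\\|` becomes `|`."""
    return cell.replace("\\|", "|")
```

A backslash before a letter or digit is left alone, since only punctuation is escapable.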

3dabe4e0 28-Apr-2026 Andreas Gohr <andi@splitbrain.org>

add GfmTable for GFM tables

Implements the GFM pipe-table extension as a CONTAINER mode at sort 55,
one below DW Table at 60. A lookahead-validated entry pattern asserts a
header line plus a `:?-+:?` delimiter row before consuming any input, so
non-table paragraphs containing pipes flow through unchanged. Cells are
inline-only per spec.

Handler\GfmTable rewrites the flat token stream into the canonical
table_open / tablethead_* / tabletbody_* / table_close sequence, deriving
per-column alignment from the delimiter row, padding short body rows
(spec 202), truncating long ones (spec 204), and falling back to a single
cdata when the column count mismatches (spec 203).

`tabletbody_open` / `tabletbody_close` are emitted for the first time;
they are part of the base renderer API but DW Table never used them.
Added to Block's blockOpen / blockClose lists alongside `tabletfoot_*`
for symmetry. SpecCompatRenderer gains minimal table-element overrides
so spec roundtrip output matches GFM's `<table><thead><tr><th>` shape
without DW's wrapper div, row/col counter classes, or align-as-class.
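
The alignment derivation and the row padding/truncation can be sketched in Python as below. The helper names are hypothetical, and the sketch leaves out the header/delimiter column-count mismatch fallback (spec 203) as well as escaped pipes.

```python
def column_alignments(delimiter_row):
    """Derive per-column alignment from a `:?-+:?` delimiter row."""
    cells = delimiter_row.strip().strip("|").split("|")
    aligns = []
    for cell in cells:
        cell = cell.strip()
        left, right = cell.startswith(":"), cell.endswith(":")
        if left and right:
            aligns.append("center")
        elif left:
            aligns.append("left")
        elif right:
            aligns.append("right")
        else:
            aligns.append(None)  # no explicit alignment
    return aligns


def fit_row(cells, width):
    """Pad short body rows with empty cells and truncate long ones."""
    return (cells + [""] * width)[:width]
```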

685560eb 28-Apr-2026 Andreas Gohr <andi@splitbrain.org>

add GfmListblock for GFM lists

GfmListblock captures an entire list block atomically with one
addSpecialPattern match, then walks the captured text in handle()
grouping lines into items. Each item's body is dedented to its
content column and parsed by ModeRegistry::getSubParser() so
block content (paragraphs, fenced code, blockquotes, plugin
blocks) works inside items uniformly. Sub-parsed calls are wrapped
in a Nest call before they reach the outer handler, matching the
Footnote pattern: the main handler's Block rewriter treats nest
as opaque and the renderer base class unwraps it transparently,
so multi-paragraph items don't get double-wrapped in <p>.

Marker syntax: -, *, + (unordered) or 1-9 digits followed by
. or ) (ordered). Indentation is a 2-space-multiple step starting
at 0; depth = (indent / 2) + 1, odd indents round down, tabs become
two spaces. The first ordered item's number drives the start
attribute on <ol> via the listo_open $start parameter.
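
The depth rule above amounts to a few lines of arithmetic. A Python sketch, with `list_depth` as a hypothetical helper name:

```python
def list_depth(line: str) -> int:
    """Depth of a list line: (indent / 2) + 1, odd indents round down."""
    stripped = line.lstrip(" \t")
    leading = line[: len(line) - len(stripped)]
    # Each tab in the leading whitespace counts as two spaces.
    indent = len(leading.replace("\t", "  "))
    return indent // 2 + 1
```

So a marker at column 0 is depth 1, two or three spaces give depth 2, and a single leading tab also gives depth 2.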

GfmLists subclasses AbstractListsRewriter with the GFM marker
parser; the state machine on the base class is shared with DW Lists.

GfmListblock loads only when $conf['syntax'] is markdown or md+dw.
Under those settings the DW Listblock is suppressed because the two
list models conflict — DW's mandatory 2-space indent rule vs GFM's
zero-indent top-level rule, and -/*/+ markers shared. Plugins that
relied on Listblock loading under md+dw will see it absent there.

Sub-parser exclusion set: CATEGORY_BASEONLY (no Header inside list
items) and gfm_listblock itself (defensive guard against re-entry
on pathological inputs; nested lists are handled by the outer
pattern, not by re-entry).

Tests cover marker variants, ordered start numbers, nested lists at
two and three levels, inline formatting inside items, marker-
character switches keeping one list, type switches splitting the
list, fenced code inside items, multi-paragraph (loose) items, and
two regressions on blank-line tolerance inside the captured block.
SpecCompatRenderer learns to render the list call sequence, and
spec.txt tests for digit/marker-width/lazy-continuation behavior
that GfmListblock deliberately doesn't implement are documented in
gfm-spec/skip.php with the per-bucket reasons (A-F).

Drops two now-obsolete entries from skip.php (image escapes that
land via earlier GfmLink/GfmMedia work) and inlines the Setext
explanation that previously pointed at SPEC.md. Replaces the
SPEC.md reference in GfmEmphasisTest with the inline reason.

9172eccf 28-Apr-2026 Andreas Gohr <andi@splitbrain.org>

add sub-parser support to Handler / Parser / ModeRegistry

A block mode that wants to parse the body of one of its captured
matches needs a second Parser instance configured with the active
modes minus whatever would re-enter the outer mode. Doing this by
hand is verbose and easy to get wrong — modes hold a $Lexer slot
that addMode() overwrites, so the same mode object can't be shared
between the main parser and a sub-parser.

Three small additions:

Handler::reset() — clears calls, status, currentModeName, and
installs a fresh CallWriter. Lets one Handler instance be parsed
against repeatedly without state bleed.

Parser::getHandler() — accessor; sub-parser callers need it to
reach the handler for reset() and for harvesting the produced
call list.

ModeRegistry::getSubParser($excludeCategories, $excludeModes) —
returns a cached Parser preconfigured with every active mode
except those excluded. Mode objects are cloned before being
attached so connectTo()'s assignment to $Lexer does not clobber
the main parser's references. Cache key is the exclusion-set;
default exclusion is CATEGORY_BASEONLY (no Header inside the
sub-parsed content).

Tests cover Handler::reset's full clear, sub-parser caching,
default and custom exclusions, registry-reset propagation, and
the clone-not-share invariant for $Lexer.

96d096f1 27-Apr-2026 Andreas Gohr <andi@splitbrain.org>

remove getLineStartMarkers registry — sort order already wins

Preformatted's entry pattern carried a `(?![\*\-])` negative
lookahead to defer to list modes on indented bullet lines.
0cecf9d50 (2005, "new parser added") introduced it hardcoded;
7958e6980 (2026, "decouple hardcoded mode names in Eol and
Preformatted") refactored that hardcoded knowledge into
register/getLineStartMarkers on ModeRegistry so each list mode
owned its marker chars. Both preserved the behavior verbatim;
neither documented why it was needed.

Tracing the lexer, it isn't. ParallelRegex merges all entry
patterns into one PCRE expression; PCRE returns the leftmost
match and breaks ties on expression order. Modes are added in
sort order via ModeRegistry::getModes(), so Listblock (sort 10)
always precedes Preformatted (sort 20) and wins the tie on
" - foo" without any lookahead. The only test that caught a
difference was testPreformattedList, which happened to register
modes in non-canonical order - that was a test bug.
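
The tie-breaking argument can be reproduced with any backtracking regex engine that tries alternatives in order. Below is a Python sketch; the two patterns are simplified stand-ins, not the real Listblock and Preformatted entry patterns, and `first_match` is a hypothetical helper that mimics merging patterns into one expression.

```python
import re


def first_match(patterns, subject):
    """Merge (name, regex) pairs into one alternation and report which
    named alternative produced the match."""
    combined = "|".join(f"(?P<{name}>{regex})" for name, regex in patterns)
    match = re.search(combined, subject)
    return match.lastgroup if match else None


# Simplified stand-ins: an indented bullet versus any two-space indent.
LISTBLOCK = ("listblock", r"\n {2,}[-*]")
PREFORMATTED = ("preformatted", r"\n  ")
```

With the patterns in sort order, the list pattern wins the tie on an indented bullet line without any lookahead in the preformatted pattern; reversing the order flips the result, which is the situation the test bug created.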

This patch drops the lookahead in Preformatted::connectTo, the
registerLineStartMarkers call in Listblock::preConnect, the
register/getLineStartMarkers methods on ModeRegistry, and the
three registry-API unit tests. testPreformattedList now
registers Listblock before Preformatted.

01e8d739 25-Apr-2026 Andreas Gohr <andi@splitbrain.org>

refactor(changelog): persist external-edit detection on first read

This addresses the flaky test that makes tests randomly fail (mostly on
Windows runners).

The flake in common_saveWikiText_test::test_savesequence5 came from
this line in ChangeLog::getCurrentRevisionInfo():

'date' => max($lastRev + 1, time() - 1)

The synthesized "external delete" entry was kept in memory only and
only persisted later, when saveWikiText next called detectExternalEdit.
That meant the formula was evaluated twice on different ChangeLog
instances — once during the test's inspection, and again during the
following saveWikiText — and the two evaluations could pick different
seconds depending on how long the surrounding I/O took. The test
cached the first result in $expectExternal and asserted it against the
on-disk entry written during the second call. On the slower Windows
runner the second call sometimes crossed a second boundary, producing
the off-by-one date mismatch.
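
The race can be reproduced with an injected clock. The following Python sketch is illustrative only: `ChangeLogSketch` and its dictionary `store` are hypothetical stand-ins for the ChangeLog class and the on-disk changelog, and only the `max(...)` formula is taken from the message.

```python
def synthesized_date(last_rev, now):
    """The quoted formula: max($lastRev + 1, time() - 1)."""
    return max(last_rev + 1, now - 1)


class ChangeLogSketch:
    """Stand-in for one ChangeLog instance sharing a persistent store."""

    def __init__(self, last_rev, store):
        self.last_rev = last_rev
        self.store = store

    def external_delete_date(self, now, persist_on_read):
        if "external" in self.store:
            return self.store["external"]
        date = synthesized_date(self.last_rev, now)
        if persist_on_read:
            self.store["external"] = date
        return date


def two_reads(persist_on_read):
    """Two instances evaluate the date one second apart."""
    store = {}
    first = ChangeLogSketch(1000, store).external_delete_date(2000, persist_on_read)
    second = ChangeLogSketch(1000, store).external_delete_date(2001, persist_on_read)
    return first, second
```

Without persistence the two reads disagree by one second once the clock ticks between them; persisting on the first read makes the second read return the stored value.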

The question I had was: why do we persist external file deletions
(or edits) only when a page is saved, when we are obviously already
detecting them earlier, during the changelog read?

Instead of recording the external delete at the time a new page is
written, it makes sense to record it as soon as we detect it (when the
changelog is requested by a user or a bot). This will make the recorded
timestamp closer to the actual deletion.

This patch refactors the changelog accordingly, but still tries to be
minimally invasive (I think the changelog handling would need much more
refactoring, but that's beyond the scope of this change).

To enable proper locking (when logging an external edit and copying
the attic file), locking had to be moved to the Changelog class,
duplicating some code of io_saveFile.

PageFile::detectExternalEdit() and the deprecated procedural wrapper
detectExternalEdit() in inc/common.php are removed. A codesearch.dokuwiki.org
check confirmed no plugin calls the method directly; the only external
caller of the procedural function is the farmsync plugin, which needs
a parallel update.

dd9e8e5e 23-Apr-2026 Andreas Gohr <andi@splitbrain.org>

fix EXIF-rotated images shown cropped in previews, closes #4482

JPEGs with EXIF orientation 5/6/7/8 were rendered cropped in the
mediamanager detail view, the image diff view and the fullscreen
detail page: getimagesize() / JpegMeta report raw (rotation-unaware)
pixel dimensions, and passing both w and h to fetch.php defaults to
center-crop.

- Bump splitbrain/slika to 1.1, which ships ImageInfo: a
  rotation-aware, metadata-only dimension simulator mirroring
  Adapter's fluent API (autorotate/resize/crop).
- Add fit=1 to fetch.php: when both w and h are given, route to
  media_resize_image() (bbox fit) instead of media_crop_image().
  The token hashes only (id, w, h), so existing tokens stay valid.
- Add MediaFile::getDisplayDimensions($w, $h, $crop) which delegates
  to ImageInfo and returns the dims fetch.php would produce.
- Add Display::getDetailHtml() and retire media_preview() / the old
  media_image_preview_size() helper. media_tab_view and MediaDiff
  now share the Display-based renderer.
- Rewrite tpl_img() (lib/tpl/*/detail.php) to use MediaFile dims +
  fit=1 URL; drops the manual ratio math.
- Tests: MediaFileTest covers the dims math, DisplayTest covers the
  detail-view HTML (rotated dims, rev-vs-timestamp URL selection,
  structure), fetch_imagetoken gains a fit-token compat test.
  Fixture _test/data/media/wiki/exif-orient-6.jpg: 20x30 JPEG with
  EXIF orientation 6.

1e28e406 23-Apr-2026 Andreas Gohr <andi@splitbrain.org>

split Parsing\Helpers into per-domain Link / Media / Code classes

781f5c71 23-Apr-2026 Andreas Gohr <andi@splitbrain.org>

gate monospace, unformatted, file on DokuWiki syntax

These DokuWiki specific modes should only be loaded when DokuWiki syntax
is still wanted, not in Markdown-only mode.
Expands the ModeRegistryTest data provider to cover the full always-loaded
and DW-always sets.

b1c59bed 23-Apr-2026 Andreas Gohr <andi@splitbrain.org>

add GfmCode / GfmFile for fenced code blocks

GfmCode (backticks) emits the `code` handler instruction; GfmFile
(tildes) emits `file`. Column-0 fences only, no length pairing
between opener and closer, and unclosed fences stay literal —
matching DokuWiki's `<code>` tag convention. The info string accepts
DW's full attribute vocabulary (language, filename, [options])
through a new shared `Helpers::parseCodeAttributes` that `Code`
also uses, with `html` aliased to `html4strict` and `-` meaning "no
language".

Preformatted's indent threshold is now preference-gated: 2 spaces
in DW-preferred settings, 4 spaces in MD-preferred, matching GFM's
indented code block rule. A single tab is a trigger in both.

3440a8c0 22-Apr-2026 Andreas Gohr <gohr@cosmocode.de>

add GfmMedia and extend GfmLink with image-as-label form

- New GfmMedia parses `![alt](url)` with the full DokuWiki media-parameter
vocabulary in the URL slot (?100x200, ?right, ?nolink, ?recache, …).
Adds `?left`/`?right`/`?center` align keywords shared with DW `{{…}}`
— gives pure-Markdown users a way to align inline images.
- GfmLink now also matches `[![alt](img)](target)` — the GFM equivalent
of `[[target|{{img}}]]`. Detection is post-entry, mirroring
Internallink's `^{{…}}$` check; one mode covers the whole family.
- LinkDispatch trait replaced by Helpers::classifyLink and
Helpers::parseMediaParameters — two pure static methods, shared by
DW and GFM counterparts.
- Entry patterns for GfmLink / GfmMedia simplified (permissive URL slot,
handle-time parsing), following DW's Internallink style.
- GfmSpecTest drives a test-only SpecCompatRenderer that emits bare
<img> / <a> instead of DW's wiki-wrapped HTML, recovering 13 spec
tests that previously failed/skipped only because of renderer shape.

e89aeebd 22-Apr-2026 Andreas Gohr <gohr@cosmocode.de>

add GfmLink for GFM inline links `[text](url)`

Extracts the URL-classification ladder from Internallink into a
LinkDispatch trait so both modes route identically across all six
DokuWiki link flavors (internal, external, interwiki, email,
windowsshare, local anchor). GfmLink parses the `[text](url)` form
with optional `"title"` / `'title'` and hands the URL to the trait.
The GFM title attribute is discarded — DokuWiki link instructions
have no slot for it.

8719732d 22-Apr-2026 Andreas Gohr <gohr@cosmocode.de>

add GfmHeader for ATX headings (`# text` through `###### text`)

Opener must sit at column 0. GFM tolerates 0-3 spaces before the `#`
but that collides with DokuWiki's 2-space-indent preformatted block,
so the tolerance is dropped rather than plumbed across modes.

Widen the XHTML renderer's section-node tracker from 5 slots to 6 so
h6 doesn't hit "Undefined array key 5". Extend GfmSpecTest's HTML
normalizer to strip DokuWiki's section-div wrappers, section-edit
comments, and header id/class attributes so heading spec examples
can validate semantic correctness.

8ed75a23 22-Apr-2026 Andreas Gohr <gohr@cosmocode.de>

add GfmBacktickSingle / GfmBacktickDouble for GFM inline code spans

Two new inline formatting modes covering GFM code spans in their n=1
and n=2 forms:

GfmBacktickSingle `text` → <code>text</code>
GfmBacktickDouble ``text`` → <code>text</code>

Both emit monospace_open and monospace_close around an unformatted()
call (the same instruction shape as DokuWiki's two-single-quote pair
wrapping a nowiki span), so renderers that distinguish verbatim text
from plain cdata — metadata, indexer, non-XHTML backends — treat the
body as literal.

GfmBacktickDouble extends GfmBacktickSingle to reuse handle() and the
body-normalization helper; only the delimiter length and the body
character class differ. Both share sort 165 and gate on Markdown
being loaded.

Design notes:

* The lexer has no backreferences, so each length is its own mode.
Length-boundary guards (?<!`)...(?!`) on every opener and closer
ensure a run of two-or-more backticks is never read as an n=1
delimiter and a run of three-or-more is never read as n=2. The two
modes never steal each other's input regardless of registration
order — sort can't reach this kind of cross-position constraint.

* Edge-whitespace handling and newline normalization live in handle(),
not in the regex. On DOKU_LEXER_UNMATCHED the body is normalized:
1. CR/LF and LF become single spaces (GFM line-ending rule).
2. If the body starts and ends with a space and is not entirely
whitespace, one space is stripped from each end.
That produces the right GFM output for the tricky cases without
special-casing the entry pattern:
` ` → <code> </code> (all-whitespace, no strip)
` a` → <code> a</code> (asymmetric, no strip)
` `` ` → <code>``</code> (interior run-of-2 + strip)
``foo`bar`` → <code>foo`bar</code>

* Body character classes admit exactly the runs that cannot be valid
closers for this mode's length: n=1 allows `[^`] | ``+`, n=2 allows
`[^`] | `(?!`)`. That is what lets a single-backtick span contain
a pair and a double-backtick span contain a lone backtick.

* allowedModes is empty — no other inline parsing runs inside a span.
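
The two-step body normalization from the design notes can be sketched in Python as follows. `normalize_code_span_body` is a hypothetical name; the real logic lives in the mode's handle() method in PHP.

```python
import re


def normalize_code_span_body(body: str) -> str:
    """Apply the two normalization steps to a code-span body."""
    # 1. CR/LF and LF become single spaces.
    body = re.sub(r"\r\n|\n", " ", body)
    # 2. Strip one space from each end, but only when both ends have one
    #    and the body is not entirely whitespace.
    if len(body) >= 2 and body[0] == " " and body[-1] == " " and body.strip() != "":
        body = body[1:-1]
    return body
```

The three edge cases listed in the notes fall out of the single condition in step 2: an all-whitespace body and an asymmetric body are left alone, while a symmetric one loses exactly one space per side.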

Deliberately not implemented, with skip.php entries explaining why:

351 — code-span precedence over emphasis (*foo`*` expected to render
as *foo<code>*</code>). Cross-positional: the single-pass
lexer matches leftmost-first and cannot reject an earlier
emphasis opener because a later backtick span would consume
its closer. A proper fix would need a pre-scan pass; sort
values only break ties at the same position.
353 — the trailing " outside the code span gets converted to a
curly quote by DokuWiki typography, diverging from spec HTML.
354 — raw HTML tag pass-through; DokuWiki does not render raw HTML
by default.
356 — GFM angle-bracket autolink <http://…>: not implemented.

Per-mode unit tests cover basic matching, flanking via the length-
boundary guards, interior-run support in the body, edge-space
stripping, newline normalization, all-whitespace bodies, paragraph-
boundary rejection, content-is-literal, and sort values.
ModeRegistryTest's gating data provider picks up both modes.

Net effect on GfmSpecTest: twelve previously-red code-span examples
now pass (339, 340, 341, 342, 344, 345, 346, 347, 349, 350, 357, 359
— the simple pairs, edge-space, interior-run, newline-normalization,
and mismatched-run cases). Four skipped. Three remain pending outside
the code-span scope (emphasis interactions that need GfmLink once
that lands).

864d6c6d 21-Apr-2026 Andreas Gohr <gohr@cosmocode.de>

fix Lexer so exit-pattern lookbehinds see chars consumed by prior tokens

Lexer::reduce used to hand PCRE a shrinking tail of the subject — each
matched token was chopped off the front of $raw and the next preg_match
ran on what remained. Once a token was consumed, the bytes before the
cursor were gone, and any lookbehind assertion in a subsequent pattern
silently failed.

The bug was latent for DokuWiki's entire history because literal exit
patterns like `\*\*`, `</file>`, or `%%` don't care what's behind them.
It surfaced with c3755410a ("require non-whitespace adjacency for
inline formatting delimiters"), which added `(?<=[^\s])` to Strong,
Emphasis, Underline, Monospace, Subscript, Superscript and Deleted at
once. After that commit, `**[[link]]**` stopped closing — the `]` that
would satisfy the lookbehind had just been consumed by the link match,
so Strong stayed open until end-of-section and swallowed everything
after it (list items, headings, the lot).

Fix:

* Lexer::parse and Lexer::reduce track a byte offset into $raw instead
of mutating $raw. $initialLength and the shrinking-length arithmetic
for absolute match positions are replaced by straight offset
increments; the no-progress guard and the trailing-unmatched dispatch
both shift to the same cursor.

* ParallelRegex::split takes an optional $offset and passes it to
preg_match together with PREG_OFFSET_CAPTURE. PCRE scans from the
offset forward but still sees the whole subject, so lookbehinds work
across already-consumed tokens. The secondary preg_split call used
to carve out pre/post is no longer needed — PREG_OFFSET_CAPTURE
gives the match start for free, saving one regex operation per
reduce() step.
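
The difference between chopping the subject and scanning from an offset can be shown with Python's `re` module, whose `pos` argument behaves analogously to the preg_match offset here: the search starts at the offset, but lookbehinds still see the earlier characters. This is an illustration of the principle, not the Lexer code; the variable names are made up.

```python
import re

# Closing ** must directly follow a non-whitespace character.
EXIT = re.compile(r"(?<=[^\s])\*\*")

subject = "**[[link]]**"
# Pretend the opening ** and the [[link]] token were already consumed.
cursor = subject.index("]]") + 2

# Old approach: match against the shrinking tail. The lookbehind has
# nothing to look at, so the closer is not recognised.
old_style = EXIT.match(subject[cursor:])

# New approach: keep the whole subject and pass the cursor as offset.
new_style = EXIT.match(subject, cursor)
```

Here `old_style` is `None`, while `new_style` matches the closing delimiter at the cursor position.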

Regression tests at all three layers:

* ParallelRegexTest — offset plumbing and pre/match accounting.
* LexerTest::testIndexLookbehindAcrossConsumedToken — exit-pattern
lookbehind targeting the `/>` of a self-closing `<a/>` that was
consumed as a SPECIAL token on the previous step. Fails under the
old Lexer.
* FormattingTest — `**[[link]]**` and `**foo//bar//**` round-trip
with correct open/close instructions through the full pipeline.

Also updates ListsTest::testUnorderedListStrong, whose expectations
documented the pre-fix buggy behaviour ("formatting able to spread
across list items"). With the fix, bold correctly stays within a
single list item; the expected call sequence and the comment are
updated to match.

0244be5c 21-Apr-2026 Andreas Gohr <gohr@cosmocode.de>

add GfmDeleted mode for GFM strikethrough (`~~text~~`)

Shares the deleted_open/deleted_close instructions with DW's <del> mode.
Entry/exit anchors `(?<!~)` / `(?!~)` reject runs of three or more tildes
so fenced-code markers remain untouched. Also trim redundant class-level
docblocks on sibling Gfm test files.
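
A small Python sketch of how the two anchors keep a double-tilde delimiter away from longer tilde runs. The regex combines the anchors quoted above around a literal `~~`; it is an assumption about the overall shape, not the mode's full entry/exit pattern, and `strike_delimiters` is a hypothetical helper.

```python
import re

# Exactly two tildes: not preceded and not followed by another tilde.
STRIKE_DELIM = re.compile(r"(?<!~)~~(?!~)")


def strike_delimiters(text: str) -> list:
    """Return the start offsets of every valid ~~ delimiter."""
    return [m.start() for m in STRIKE_DELIM.finditer(text)]
```

A three-tilde fence line yields no delimiter at all, so fenced-code markers are left for the fence mode.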

2bb62bca 20-Apr-2026 Andreas Gohr <gohr@cosmocode.de>

add GFM em-wrapping-strong modes for `***foo***` / `___foo___`

Two new inline formatting modes that render triple-delimiter runs as
em wrapping strong:

GfmEmphasisStrong `***text***` → <em><strong>text</strong></em>
GfmEmphasisStrongUnderscore `___text___` → same (MD-preferred only)

Only the exact 3+3 symmetric case is handled. The other long-run and
asymmetric variants (4+4, 5+5, `***foo**`, etc.) require CommonMark's
stack-based delimiter-pairing algorithm with its flanking and
multiple-of-3 rules, which is explicitly out of scope; those examples
stay skipped in gfm-spec/skip.php.

Implementation notes:

* Patterns enforce exact 3+3 via `(?<!\*)` / `(?<!_)` lookbehinds
(preventing entry at the second `*` of a `****...` run) and
`(?!\*)` / `(?!_)` lookaheads after the closing triple (rejecting
`***foo****` etc.). Combined with the existing non-whitespace
adjacency lookaheads, all asymmetric cases cleanly fall through to
other modes or stay literal.

* GfmEmphasisStrong overrides handle() to emit two instructions on
entry (emphasis_open + strong_open) and two on exit (strong_close
+ emphasis_close). GfmEmphasisStrongUnderscore inherits that
handler — only delimiters and word-boundary rules differ.

* Sort 65 — below Strong (70) and GfmEmphasis (80) so the em+strong
modes win the lexer race for `***`/`___` runs. Underscore variant
is MD-preferred-only, matching the existing gating of
GfmEmphasisUnderscore and GfmStrongUnderscore.

Per-mode unit tests cover basic matching, single-char bodies,
whitespace flanking rejection, paragraph-boundary rejection,
longer-run rejection, asymmetric rejection, multibyte intraword
protection, and sort values. ModeRegistryTest's gating data provider
picks up the two new rules.

Net effect on GfmSpecTest: example #476 (`***foo***`) now passes;
473/474/475/477 remain skipped as documented in skip.php.

bcefb8ae 20-Apr-2026 Andreas Gohr <gohr@cosmocode.de>

add GFM emphasis and underscore-delimited strong modes

Three new inline formatting modes for GitHub Flavored Markdown:

GfmEmphasis `*text*` → <em>
GfmEmphasisUnderscore `_text_` → <em> (MD-preferred only)
GfmStrongUnderscore `__text__` → <strong> (MD-preferred only)

All three emit the same handler instructions as DokuWiki's Emphasis /
Strong, so existing renderers need no changes.

Design notes:

* Lexer mode names use snake_case (gfm_emphasis, gfm_emphasis_underscore,
gfm_strong_underscore) to keep PascalCase readable at the class level.
The asterisk variant emits `emphasis_open`/`emphasis_close` via the
getInstructionName() hook, so DW's Emphasis (`//...//`) and
GfmEmphasis (`*...*`) can coexist in mixed modes without a lexer
state collision while still producing the same <em> output.

* Underscore variants gate on Markdown-preferred syntax (`markdown`,
`md+dw`) because `__` otherwise means DW underline. GfmStrongUnderscore
sorts at 70 (matching Strong) — below Underline at 90 — so when loaded
it wins the lexer race for `__` runs. Underline was already gated out
of MD-preferred modes by the previous commit.

* Entry patterns enforce the simplified CommonMark flanking rules
already shared across DW inline modes (non-whitespace adjacency,
no paragraph-boundary crossing) plus the word-boundary check for
underscore variants using NO_WORD_BEFORE / NO_WORD_AFTER. The
positive non-word-char enumeration makes them multibyte-safe without
requiring the `u` flag: `für_etwas` and `пристаням_стремятся_`
correctly stay literal.
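
The word-boundary idea in the last design note can be sketched as follows.
This is a Python illustration, not the DokuWiki PHP patterns: the NON_WORD
enumeration, the exact body pattern, and the find_underscore_em() helper are
assumptions made for the example. It matches on raw UTF-8 bytes to mimic a
regex run without the `u` flag, so every byte of a multibyte character falls
outside the enumerated non-word set and is treated as a word character.

```python
import re

# Assumption: an ASCII-only enumeration of "non-word" bytes (whitespace and
# punctuation). The underscore is deliberately left out, so it counts as a
# word character, as does every UTF-8 byte >= 0x80.
NON_WORD = rb"\s!-/:-@\[-\^`{-~"

# "Not preceded / not followed by a word character", expressed as a negative
# lookaround over the complement of the enumerated non-word set.
NO_WORD_BEFORE = rb"(?<![^" + NON_WORD + rb"])"
NO_WORD_AFTER = rb"(?![^" + NON_WORD + rb"])"

# Simplified flanking: the body starts and ends with non-whitespace and
# never contains a newline (a stricter stand-in for "no paragraph crossing").
EM_UNDERSCORE = re.compile(
    NO_WORD_BEFORE + rb"_(?!\s)([^_\n]+?)(?<!\s)_" + NO_WORD_AFTER
)


def find_underscore_em(text: str) -> list[str]:
    """Return the bodies of all `_text_` runs that would become <em>."""
    data = text.encode("utf-8")  # byte-wise matching, like PCRE without `u`
    return [m.group(1).decode("utf-8") for m in EM_UNDERSCORE.finditer(data)]
```

Under these assumptions `_text_` and `(_x_)` match, while `für_etwas_`,
`пристаням_стремятся_`, and `_foo_bar` stay literal because a word byte sits
directly next to a delimiter.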

Per-mode unit tests cover basic matching, single-char bodies,
leading/trailing-whitespace rejection, empty-delimiter rejection,
paragraph-boundary rejection, multibyte intraword protection, and
sort values. ModeRegistryTest's gating data provider picks up the
three new rules.
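
The gating described above can be modelled as a small table keyed by mode
name. The sketch below is in Python and is only an illustration: GATES, RULES,
loaded_modes() and check_rules() are hypothetical names, not the ModeRegistry
or ModeRegistryTest API. GfmEmphasis is left out because this message does not
spell out its load states.

```python
# The four syntax settings as named in this commit.
SETTINGS = ("dokuwiki", "dw+md", "md+dw", "markdown")


def dw_preferred(syntax: str) -> bool:
    return syntax in ("dokuwiki", "dw+md")


def md_preferred(syntax: str) -> bool:
    return syntax in ("md+dw", "markdown")


# Hypothetical registry: which predicate gates which mode.
GATES = {
    "underline": dw_preferred,
    "gfm_emphasis_underscore": md_preferred,
    "gfm_strong_underscore": md_preferred,
}

# Expected load state per setting, in SETTINGS order (one line per mode).
RULES = {
    "underline": (True, True, False, False),
    "gfm_emphasis_underscore": (False, False, True, True),
    "gfm_strong_underscore": (False, False, True, True),
}


def loaded_modes(syntax: str) -> set[str]:
    """Modes whose gate predicate passes for the given syntax setting."""
    return {mode for mode, gate in GATES.items() if gate(syntax)}


def check_rules() -> None:
    """Compare the registry against the expectation table, row by row."""
    for mode, expected in RULES.items():
        for syntax, want in zip(SETTINGS, expected):
            assert (mode in loaded_modes(syntax)) == want, (mode, syntax)
```

With this layout Underline and GfmStrongUnderscore are never loaded together,
which is the property that keeps `__` unambiguous for the lexer.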

35f91432 20-Apr-2026 Andreas Gohr <gohr@cosmocode.de>

gate Underline on DokuWiki-preferred syntax; tidy registry plumbing

Three related changes to ModeRegistry, prep work for the Markdown modes
that follow.

1. Underline (`__text__`) is moved out of loadAlwaysModes() and into
loadDokuWikiModes(), gated on a new `\$dwPreferred` check that
evaluates true for 'dokuwiki' and 'dw+md'. In MD-preferred settings
('markdown' and 'md+dw') `__` will mean GFM strong, so loading
Underline there would conflict at the lexer level. Underline is
unchanged in the default 'dokuwiki' setting.

2. resolveModeClass() now PascalCases every `_`-separated segment of
the mode name, so `gfm_emphasis_underscore` resolves to
`GfmEmphasisUnderscore`. Existing lowercase-compound names like
`internallink` still resolve to `Internallink` (one segment,
ucfirst-ed) — no behaviour change for current modes. This prepares
the registry to load Gfm mode classes whose PascalCase filenames
preserve word boundaries for readability.

3. ModeRegistryTest's multiple near-identical per-mode gating tests
are consolidated into a single data-provider-driven
testModeLoadingBySyntax, fed by a `\$rules` table that lists each
mode against its four-setting expected load state. Adding a new
gated mode now means one line in the provider. Currently only
Underline is listed; upcoming Gfm-mode commits will add theirs.
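
The name resolution in point 2 amounts to upper-casing the first character of
each `_`-separated segment and joining the results. A minimal Python sketch,
assuming nothing beyond that rule (the real resolveModeClass() is PHP and
presumably also deals with namespaces, which are not modelled here):

```python
def resolve_mode_class(mode_name: str) -> str:
    """PascalCase a snake_case mode name, segment by segment.

    Only the first character of each segment is upper-cased (like PHP's
    ucfirst), so the rest of the segment is left untouched.
    """
    return "".join(seg[:1].upper() + seg[1:] for seg in mode_name.split("_"))
```

So `gfm_emphasis_underscore` becomes `GfmEmphasisUnderscore`, while the
single-segment `internallink` still becomes `Internallink`.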

72b2703b 20-Apr-2026 Andreas Gohr <gohr@cosmocode.de>

add spec.txt-driven roundtrip test infrastructure for Markdown

Imports GitHub Flavored Markdown's test/spec.txt (671 examples combining
the CommonMark baseline with GFM extensions: tables, strikethrough, task
lists, autolink, disallowed raw HTML) and runs each example as a data-
provider-driven roundtrip test against DokuWiki's full parser + XHTML
renderer.

Files:

_test/tests/Parsing/Markdown/SpecReader.php
Parses the fenced-example format. Each example is ``` example
[optional-label] ... . ... ``` delimited by 10+ backticks with a
`.` separator; tracks most recent `## Heading` as section context
and numbers examples sequentially from 1 to match the spec's
rendered "Example N" labels.

_test/tests/Parsing/Markdown/SpecReaderTest.php
Hand-crafted fixtures covering ordinary examples, section tracking,
extension labels, multiline bodies, nested backticks, unclosed
fences (throws).

_test/tests/Parsing/Markdown/GfmSpecTest.php
Data-provider test. Renders each example's markdown through
p_get_instructions + p_render('xhtml') and compares to the expected
HTML with block-level-aware whitespace normalisation (DokuWiki
emits \n around block tags; inline-tag whitespace is preserved
because `<em>x</em> y` != `<em>x</em>y`).

_test/tests/Parsing/Markdown/gfm-spec/
spec.txt — verbatim from github/cmark-gfm, commit 587a12bb
LICENSE — CC-BY-SA 4.0 full legal text
README.md — upstream URL, pinned commit, resync notes
skip.php — map example-number => reason for SPEC-excluded
CommonMark behaviour (flanking-delimiter analysis,
multiple-of-3 rule, excess-drop). Unimplemented
features are NOT listed here — they show as real
failures so they remain the visible TODO list.
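
The fenced-example format that SpecReader handles can be sketched as below.
This is a Python illustration of the format as described here, not the PHP
class: SpecExample and read_spec() are hypothetical names, and details such as
the spec's tab markers are ignored.

```python
import re
from dataclasses import dataclass

# Opening fence: 10+ backticks, the word "example", an optional label.
FENCE_OPEN = re.compile(r"^(`{10,}) example(?: (\S+))?\s*$")
# Section context: the most recent "## Heading" line.
HEADING = re.compile(r"^## +(.*\S)\s*$")


@dataclass
class SpecExample:
    number: int      # sequential, starting at 1
    section: str     # most recent "## Heading" text
    label: str       # optional extension label, "" if absent
    markdown: str
    html: str


def read_spec(text: str) -> list[SpecExample]:
    examples: list[SpecExample] = []
    section = ""
    lines = text.split("\n")
    i = 0
    while i < len(lines):
        opened = FENCE_OPEN.match(lines[i])
        if not opened:
            heading = HEADING.match(lines[i])
            if heading:
                section = heading.group(1)
            i += 1
            continue
        fence, label = opened.group(1), opened.group(2) or ""
        parts: list[list[str]] = [[], []]  # [markdown lines, html lines]
        which = 0
        i += 1
        while True:
            if i >= len(lines):
                raise ValueError("unclosed example fence")
            line = lines[i]
            if line == fence:  # only the identical backtick run closes
                break
            if line == "." and which == 0:
                which = 1      # the first lone "." separates the two halves
            else:
                parts[which].append(line)
            i += 1
        i += 1  # step past the closing fence
        markdown = "".join(l + "\n" for l in parts[0])
        html = "".join(l + "\n" for l in parts[1])
        examples.append(
            SpecExample(len(examples) + 1, section, label, markdown, html)
        )
    return examples
```

Because only the identical backtick run closes an example, shorter backtick
fences inside an example body are kept as content, and an example that never
closes raises an error instead of being dropped silently.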
