History log of /dokuwiki/_test/tests/Parsing/Markdown/gfm-spec/skip.php (Results 1 – 21 of 21)
Revision Date Author Comments
# 1beb7450 12-May-2026 Andreas Gohr <andi@splitbrain.org>

GfmSpecTest: skip deferred-feature spec cases

Heading-inline syntax (#36, #46) is deferred; the existing header
instruction and downstream renderers would need rework to process
inline modes inside heading content.

Bare URL autolinking without angle brackets (#619) is a deliberate
DokuWiki feature in Externallink, not a feature we'll remove to match
the strict CommonMark §6.8 rule.

The GFM bare-email autolink extension (#629-631) is out of scope -
DokuWiki's Email mode only recognises emails inside angle brackets.


# 451f2842 05-May-2026 Andreas Gohr <andi@splitbrain.org>

clean up markdown spec test skips

Ordered by example number, same format, single line.

Some skips now actually pass - removed. A couple others should pass but
don't yet - also removed. Code fix follows.


# 198d33e8 05-May-2026 Andreas Gohr <andi@splitbrain.org>

GfmSpecTest: skip task list items extension (#279, #280)

GFM task list items (`- [ ] foo` / `- [x] foo`) are not implemented;
the literal marker stays as the first content of the list item.


# 506762f4 05-May-2026 Andreas Gohr <andi@splitbrain.org>

GfmSpecTest: skip indented-code §4.4 family and Disallowed Raw HTML #652

The 4-space indent trigger fires on paragraph-continuation lines and
exits on any blank line, both of which collide with CommonMark §4.4 —
fixing either would require paragraph-open state the single-pass lexer
cannot carry. List-interior cases additionally need the column
arithmetic documented as out of scope for the §2.2 tabs family.

#652 (Disallowed Raw HTML) is a filter on top of raw HTML pass-through,
which DokuWiki escapes by policy (see #118-160), so it has no input.


# f9d3b7bd 05-May-2026 Andreas Gohr <andi@splitbrain.org>

Externallink: add per-scheme angle-bracket autolinks for MD syntax

Adds CommonMark §6.5 <URL> autolinks to Externallink, gated to
md/md+dw/dw+md syntax via ModeRegistry::isMdPreferred(). Per-scheme
patterns share the existing conf/scheme.conf allow-list so unknown
schemes fall through to literal cdata instead of being silently
dropped by the renderer. Internal whitespace inside the brackets
disqualifies the autolink and the whole envelope is emitted as
cdata to keep the bare-URL detector off the URL.
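
The gating described above can be sketched roughly as follows. This is an illustrative Python sketch, not DokuWiki's PHP: `ALLOWED_SCHEMES` is a hypothetical stand-in for the conf/scheme.conf allow-list, and a single generic pattern plus a set lookup replaces the per-scheme patterns the commit actually registers.

```python
import re

# Hypothetical stand-in for the schemes listed in conf/scheme.conf.
ALLOWED_SCHEMES = {"http", "https", "ftp"}

# <scheme:rest> with no whitespace, "<" or ">" inside the brackets.
AUTOLINK = re.compile(r"<([A-Za-z][A-Za-z0-9+.-]*):([^\s<>]*)>")


def find_autolinks(text):
    """Return the URLs of angle-bracket autolinks whose scheme is allowed."""
    urls = []
    for match in AUTOLINK.finditer(text):
        if match.group(1).lower() in ALLOWED_SCHEMES:
            urls.append(f"{match.group(1)}:{match.group(2)}")
        # Unknown schemes are skipped here, i.e. they would stay literal text.
    return urls
```

Under these assumptions, `<https://example.com/a>` is picked up, while `<https://example.com/a b>` (internal whitespace) and `<made-up-scheme://foo>` (unregistered scheme) are not.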

LinksTest gains 5 cases covering success, internal-whitespace and
leading-whitespace disqualification, unregistered scheme fallthrough,
and the dw-only no-op path. SpecCompatRenderer URL encoder is updated
to match cmark-gfm's HREF_SAFE table (square brackets and a few other
characters move from safe to encoded). skip.php loses the obsolete
#356 entry and gains #605/#606/#607/#609 explaining the unregistered-
scheme cases that the per-scheme regex naturally rejects.


# d379b737 05-May-2026 Andreas Gohr <andi@splitbrain.org>

GfmSpecTest: neutralize DW typography for spec roundtrip

Force $conf[typography] = 0 in renderMarkdown() so the Quotes and
MultiplyEntity modes are not loaded, override entity() in
SpecCompatRenderer to emit the original match instead of the typographic
glyph, and switch _xmlEntities() from ENT_QUOTES to ENT_COMPAT so `'`
stays literal in body text while `"` is still escaped to &quot;. Drops
three skip entries (#308, #310, #353) that existed only to paper over
the same divergence and unblocks #16, #25 and #670.


# b37c6ef7 04-May-2026 Andreas Gohr <andi@splitbrain.org>

more test skips


# 6359e7fd 04-May-2026 Andreas Gohr <andi@splitbrain.org>

percent-encode URLs in SpecCompatRenderer to match spec output

CommonMark's reference renderer percent-encodes URL bytes outside the
RFC 3986 unreserved/reserved set (and existing %XX sequences pass
through unchanged). DokuWiki's XHTML renderer leaves UTF-8 and
backslashes literal in href, which is fine for live wiki output but
diverges byte-for-byte from spec.

Adds specEncodeUrl() to the spec-compat renderer and applies it in
specLink(). Same shape as the earlier `→`->`\t` substitution: a
test-harness alignment with spec convention, no production behavior
change.
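
A minimal sketch of the encoding rule as this message describes it, in Python rather than the renderer's PHP. The function name `spec_encode_url` and the exact safe-character table are assumptions for illustration; the table the real specEncodeUrl() uses may differ.

```python
import re

# Characters left untouched: RFC 3986 unreserved and reserved sets.
# (Assumed table; the real renderer may use a different one.)
SAFE = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"
    "-._~"                 # unreserved
    ":/?#[]@!$&'()*+,;="   # reserved
)
PERCENT_ESCAPE = re.compile(r"%[0-9A-Fa-f]{2}")


def spec_encode_url(url):
    """Percent-encode bytes outside SAFE; keep existing %XX sequences."""
    out = []
    i = 0
    while i < len(url):
        ch = url[i]
        if ch == "%" and PERCENT_ESCAPE.match(url, i):
            out.append(url[i:i + 3])  # already-encoded triplet passes through
            i += 3
            continue
        if ch in SAFE:
            out.append(ch)
        else:
            # Encode every UTF-8 byte of the character as %XX.
            out.extend(f"%{byte:02X}" for byte in ch.encode("utf-8"))
        i += 1
    return "".join(out)
```

With this table a backslash becomes `%5C`, `ä` becomes `%C3%A4`, and an existing `%20` is left alone.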

Unskips #510 (backslash in URL) and #511 (entity / percent-encoding in
URL); both now match spec output with the parser-side decoding from
the previous commit and the renderer-side encoding here.


# eb15e634 04-May-2026 Andreas Gohr <andi@splitbrain.org>

extract Helpers\HtmlEntity, wire into GfmCode and GfmLink URL slot

Numeric and named HTML entity decoding moves out of GfmHtmlEntity into
a pure helper, so capture-by-regex modes can apply the same decode
post-extraction (the inline lexer never reaches their bodies). Mirrors
the Helpers\Escape pattern.

Wired up in two slots:

- GfmCode info string: f&ouml;&ouml; now decodes to föö in the
language class. Clears spec example #330.

- GfmLink URL: GfmLink::extractUrl() decodes entities. URL pattern
extends from `[^)\n]+` to `(?:\\.|[^)\n])+` so an escaped \) no
longer terminates the URL early; the existing post-classify
Escape::unescapeBackslashes call strips the backslashes after
Link::classify has done its work. Clears #504, #506, #508.
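
The effect of the URL pattern change can be checked in isolation. The sketch below (Python, with the surrounding parentheses and the helper `url_slot` added purely for illustration) compares the two character classes quoted above on a URL containing an escaped `\)`.

```python
import re

# Old and new URL-slot patterns, wrapped in literal parentheses for the demo.
OLD_URL = re.compile(r"\(([^)\n]+)\)")
NEW_URL = re.compile(r"\(((?:\\.|[^)\n])+)\)")


def url_slot(pattern, text):
    """Return the captured URL slot, or None if the pattern does not match."""
    match = pattern.search(text)
    return match.group(1) if match else None


sample = r"[link](foo\)bar)"
old_capture = url_slot(OLD_URL, sample)  # stops at the escaped parenthesis
new_capture = url_slot(NEW_URL, sample)  # consumes "\)" as one unit
```

The old pattern captures only `foo\`, because the escaped parenthesis ends the match; the new one captures `foo\)bar`, leaving the backslash for a later unescape step.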

Skip #328 with a self-contained title-slot reason: the URL side now
decodes correctly, but the title attribute is still discarded
(DokuWiki link instructions have no title slot).


# b414dba2 04-May-2026 Andreas Gohr <gohr@cosmocode.de>

skip a few more spec tests

These are all deliberately unsupported cases.


# c4bcbc2e 04-May-2026 Andreas Gohr <andi@splitbrain.org>

add GfmLinebreak for GFM hard line breaks

Two-or-more trailing spaces, or a single backslash, immediately before
a non-final newline render as a `<br/>`. Both delimiter forms share a
single SUBSTITION mode at sort 140, loaded under any MD-active syntax
(markdown, dw+md, md+dw); pure dokuwiki is unaffected.
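
As a rough illustration of the rule (a Python sketch, not the mode's actual lexer pattern), both delimiter forms can be expressed as one regular expression; "non-final" is modelled here as "followed by at least one more character that is not a newline", which is an assumption.

```python
import re

# Two or more spaces, or one backslash, right before a newline that is
# followed by more text, so the last newline never produces a break.
HARD_BREAK = re.compile(r"(?: {2,}|\\)\n(?=[^\n])")


def render_breaks(paragraph):
    """Replace hard-break delimiters with a <br/> followed by a newline."""
    return HARD_BREAK.sub("<br/>\n", paragraph)
```

A single trailing space or a delimiter before the final newline is left untouched.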

Reuses the existing `linebreak` handler call and renderer; no new
instructions or renderer changes. SpecCompatRenderer overrides
linebreak() to emit the spec's `<br />` shape. Examples 662, 663
(line break inside a raw HTML tag) are skipped — raw HTML is not
passed through by default.


# 3e6baeff 30-Apr-2026 Andreas Gohr <andi@splitbrain.org>

replace DW Hr with unified GfmHr

Single mode covers both DokuWiki (4+ dashes) and GFM (3+ of -/*/_)
horizontal rules; pattern self-narrows on $conf['syntax']. Always
loaded across all four syntax settings, mirroring the GfmQuote
replacement pattern. Same `hr` handler call so renderers and the
call API are unchanged.
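
One way to picture the self-narrowing pattern is a small factory keyed on the syntax setting. This Python sketch is an assumption-laden illustration: `hr_pattern` is a hypothetical name, `"dokuwiki"` is assumed to be the DW-only value of the setting, and interior spaces between rule characters are not modelled.

```python
import re


def hr_pattern(syntax):
    """Return a line pattern for horizontal rules under the given syntax."""
    if syntax == "dokuwiki":
        # DokuWiki only: four or more dashes at column 0.
        return re.compile(r"^-{4,}$", re.MULTILINE)
    # Any Markdown-active setting: three or more of -, * or _ at column 0.
    # This also covers DokuWiki's four-or-more dashes.
    return re.compile(r"^(?:-{3,}|\*{3,}|_{3,})$", re.MULTILINE)


def is_hr(syntax, line):
    return hr_pattern(syntax).search(line) is not None
```

Leading whitespace fails in both branches, in line with the dropped tolerance described in the next paragraph.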

Drops DW's old [ \t]* leading-whitespace tolerance — inert in
practice past 0-1 spaces (Preformatted at sort 20 intercepts
everything ≥ 2 spaces or any tab).

Spec examples 13, 20, 26-28, 224 turn green; 17, 21-24, 29, 30, 31
go to skip.php as deliberate non-implementations (whitespace
tolerance and list-precedence cases).


# 309a0852 30-Apr-2026 Andreas Gohr <andi@splitbrain.org>

replace DW Quote with unified GfmQuote

GfmQuote covers blockquote parsing for both DokuWiki and GFM dialects
in a single mode. Same quote_open/quote_close handler instructions; a
DW-preferred post-pass flattens sub-parsed paragraph wrapping into
linebreak calls so existing pages keep their <br/>-between-lines
rendering. MD-preferred keeps the <p>-wrapped spec shape.

Block content (lists, fenced code, tables) inside `>` quotes now
renders, since the body is sub-parsed. Headers stay excluded
(BASEONLY) — TOC and section-edit anchors don't compose with
<blockquote>, same rationale as GfmListblock.

Convert ModeRegistry's sub-parser cache into an acquire/release pool
to support same-key re-entrancy: a list inside a quote re-enters
gfm_quote during the list-item sub-parse, and the inner call needs
its own parser instance even though the exclusion key matches.
GfmListblock is updated to use the new acquire/release primitives.
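
The difference between a one-instance-per-key cache and an acquire/release pool can be sketched as below. This is a generic Python illustration, not ModeRegistry's code; `SubParserPool`, its method names and the dictionary used as a parser stand-in are all hypothetical.

```python
class SubParserPool:
    """Hands out one parser per acquire() call, reusing released ones."""

    def __init__(self, factory):
        self._factory = factory  # builds a new parser for a key
        self._idle = {}          # key -> list of released parsers

    def acquire(self, key):
        idle = self._idle.setdefault(key, [])
        # Reuse an idle instance if there is one, otherwise build a new one.
        return idle.pop() if idle else self._factory(key)

    def release(self, key, parser):
        self._idle.setdefault(key, []).append(parser)


pool = SubParserPool(lambda key: {"key": key})  # dict stands in for a parser
outer = pool.acquire("no-header")   # e.g. the quote's sub-parse
inner = pool.acquire("no-header")   # re-entry with the same exclusion key
pool.release("no-header", inner)
pool.release("no-header", outer)
```

Because the outer instance is still checked out when the inner call arrives, the inner call gets its own instance instead of sharing state with the outer one.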


# 74031e46 28-Apr-2026 Andreas Gohr <andi@splitbrain.org>

add GfmEscape for GFM backslash escapes

Implements GFM §6.1 backslash-escape handling. GfmEscape is a sort-5
inline mode in CATEGORY_SUBSTITION that claims `\X` for any escapable
ASCII punctuation char before competing delimiters can match. The
shared character class lives on Helpers\Escape so the lexer pattern
and the post-hoc unescape stay in lockstep.

Whole-span captures (GfmCode info string, GfmLink label/URL) bypass
the lexer; those modes call Escape::unescapeBackslashes() on the
relevant slot. GfmLink skips the unescape when the URL classifies as
a windowssharelink so the leading \\host survives intact.

GfmTable cells get a separate per-cell `\|` to `|` pass in the
rewriter to honour the tables-extension rule that pipes always
unescape, even inside code spans where standard §6.1 escapes don't
fire.


# 685560eb 28-Apr-2026 Andreas Gohr <andi@splitbrain.org>

add GfmListblock for GFM lists

GfmListblock captures an entire list block atomically with one
addSpecialPattern match, then walks the captured text in handle()
grouping lines into items. Each item's body is dedented to its
content column and parsed by ModeRegistry::getSubParser() so
block content (paragraphs, fenced code, blockquotes, plugin
blocks) works inside items uniformly. Sub-parsed calls are wrapped
in a Nest call before they reach the outer handler, matching the
Footnote pattern: the main handler's Block rewriter treats nest
as opaque and the renderer base class unwraps it transparently,
so multi-paragraph items don't get double-wrapped in <p>.

Marker syntax: -, *, + (unordered) or 1-9 digits followed by
. or ) (ordered). Indentation is a 2-space-multiple step starting
at 0; depth = (indent / 2) + 1, odd indents round down, tabs become
two spaces. The first ordered item's number drives the start
attribute on <ol> via the listo_open $start parameter.
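
The marker and depth rules above reduce to a few lines of arithmetic. The following Python sketch is only an illustration of those rules; the function names and the `("ul", None)` / `("ol", start)` return shape are made up for the example.

```python
import re

# -, * or + (unordered), or 1-9 digits followed by . or ) (ordered),
# then at least one space or tab.
MARKER = re.compile(r"(?:([-*+])|(\d{1,9})([.)]))[ \t]+")


def indent_width(line):
    """Leading indentation in columns; a tab counts as two spaces."""
    width = 0
    for ch in line:
        if ch == " ":
            width += 1
        elif ch == "\t":
            width += 2
        else:
            break
    return width


def list_depth(line):
    """depth = (indent / 2) + 1, with odd indents rounding down."""
    return indent_width(line) // 2 + 1


def list_marker(line):
    """Return ("ul", None), ("ol", start_number) or None for non-list lines."""
    match = MARKER.match(line.lstrip(" \t"))
    if match is None:
        return None
    if match.group(1):
        return ("ul", None)
    return ("ol", int(match.group(2)))
```

For example, a line indented by three spaces lands at depth 2, the same as one indented by two spaces or by a single tab.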

GfmLists subclasses AbstractListsRewriter with the GFM marker
parser; the state machine on the base class is shared with DW Lists.

GfmListblock loads only when $conf['syntax'] is markdown or md+dw.
Under those settings the DW Listblock is suppressed because the two
list models conflict — DW's mandatory 2-space indent rule vs GFM's
zero-indent top-level rule, and -/*/+ markers shared. Plugins that
relied on Listblock loading under md+dw will see it absent there.

Sub-parser exclusion set: CATEGORY_BASEONLY (no Header inside list
items) and gfm_listblock itself (defensive guard against re-entry
on pathological inputs; nested lists are handled by the outer
pattern, not by re-entry).

Tests cover marker variants, ordered start numbers, nested lists at
two and three levels, inline formatting inside items, marker-
character switches keeping one list, type switches splitting the
list, fenced code inside items, multi-paragraph (loose) items, and
two regressions on blank-line tolerance inside the captured block.
SpecCompatRenderer learns to render the list call sequence, and
spec.txt tests for digit/marker-width/lazy-continuation behavior
that GfmListblock deliberately doesn't implement are documented in
gfm-spec/skip.php with the per-bucket reasons (A-F).

Drops two now-obsolete entries from skip.php (image escapes that
land via earlier GfmLink/GfmMedia work) and inlines the Setext
explanation that previously pointed at SPEC.md. Replaces the
SPEC.md reference in GfmEmphasisTest with the inline reason.


# b1c59bed 23-Apr-2026 Andreas Gohr <andi@splitbrain.org>

add GfmCode / GfmFile for fenced code blocks

GfmCode (backticks) emits the `code` handler instruction; GfmFile
(tildes) emits `file`. Column-0 fences only, no length pairing
between opener and closer, and unclosed fences stay literal —
matching DokuWiki's `<code>` tag convention. The info string accepts
DW's full attribute vocabulary (language, filename, [options])
through a new shared `Helpers::parseCodeAttributes` that `Code`
also uses, with `html` aliased to `html4strict` and `-` meaning "no
language".

Preformatted's indent threshold is now preference-gated: 2 spaces
in DW-preferred settings, 4 spaces in MD-preferred, matching GFM's
indented code block rule. A single tab is a trigger in both.


# 3440a8c0 22-Apr-2026 Andreas Gohr <gohr@cosmocode.de>

add GfmMedia and extend GfmLink with image-as-label form

- New GfmMedia parses `![alt](url)` with the full DokuWiki media-parameter
vocabulary in the URL slot (?100x200, ?right, ?nolink, ?recache, …).
Adds `?left`/`?right`/`?center` align keywords shared with DW `{{…}}`
— gives pure-Markdown users a way to align inline images.
- GfmLink now also matches `[![alt](img)](target)` — the GFM equivalent
of `[[target|{{img}}]]`. Detection is post-entry, mirroring
Internallink's `^{{…}}$` check; one mode covers the whole family.
- LinkDispatch trait replaced by Helpers::classifyLink and
Helpers::parseMediaParameters — two pure static methods, shared by
DW and GFM counterparts.
- Entry patterns for GfmLink / GfmMedia simplified (permissive URL slot,
handle-time parsing), following DW's Internallink style.
- GfmSpecTest drives a test-only SpecCompatRenderer that emits bare
<img> / <a> instead of DW's wiki-wrapped HTML, recovering 13 spec
tests that previously failed/skipped only because of renderer shape.


# e89aeebd 22-Apr-2026 Andreas Gohr <gohr@cosmocode.de>

add GfmLink for GFM inline links `[text](url)`

Extracts the URL-classification ladder from Internallink into a
LinkDispatch trait so both modes route identically across all six
DokuWiki link flavors (internal, external, interwiki, email,
windowsshare, local anchor). GfmLink parses the `[text](url)` form
with optional `"title"` / `'title'` and hands the URL to the trait.
The GFM title attribute is discarded — DokuWiki link instructions
have no slot for it.


# 8719732d 22-Apr-2026 Andreas Gohr <gohr@cosmocode.de>

add GfmHeader for ATX headings (`# text` through `###### text`)

Opener must sit at column 0. GFM tolerates 0-3 spaces before the `#`
but that collides with DokuWiki's 2-space-indent preformatted block,
so the tolerance is dropped rather than plumbed across modes.
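
A column-0 opener can be pictured with a pattern like the one below. This is a simplified Python sketch under assumptions: `parse_heading` is a hypothetical helper, at least one space or tab is required after the `#` run, and optional closing `#` sequences and empty headings are not handled.

```python
import re

# One to six "#" at column 0, whitespace, then the heading text.
ATX = re.compile(r"^(#{1,6})[ \t]+(.+?)[ \t]*$")


def parse_heading(line):
    """Return (level, text) for a column-0 ATX heading, else None."""
    match = ATX.match(line)
    if match is None:
        return None
    return len(match.group(1)), match.group(2)
```

A line such as ` # Title` with a leading space returns None here, so it remains available to other modes.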

Widen the XHTML renderer's section-node tracker from 5 slots to 6 so
h6 doesn't hit "Undefined array key 5". Extend GfmSpecTest's HTML
normalizer to strip DokuWiki's section-div wrappers, section-edit
comments, and header id/class attributes so heading spec examples
can validate semantic correctness.


# 8ed75a23 22-Apr-2026 Andreas Gohr <gohr@cosmocode.de>

add GfmBacktickSingle / GfmBacktickDouble for GFM inline code spans

Two new inline formatting modes covering GFM code spans in their n=1
and n=2 forms:

GfmBacktickSingle `text` → <code>text</code>
GfmBacktickDouble ``text`` → <code>text</code>

Both emit monospace_open and monospace_close around an unformatted()
call (the same instruction shape as DokuWiki's two-single-quote pair
wrapping a nowiki span), so renderers that distinguish verbatim text
from plain cdata — metadata, indexer, non-XHTML backends — treat the
body as literal.

GfmBacktickDouble extends GfmBacktickSingle to reuse handle() and the
body-normalization helper; only the delimiter length and the body
character class differ. Both share sort 165 and gate on Markdown
being loaded.

Design notes:

* The lexer has no backreferences, so each length is its own mode.
Length-boundary guards (?<!`)...(?!`) on every opener and closer
ensure a run of two-or-more backticks is never read as an n=1
delimiter and a run of three-or-more is never read as n=2. The two
modes never steal each other's input regardless of registration
order — sort can't reach this kind of cross-position constraint.

* Edge-whitespace handling and newline normalization live in handle(),
not in the regex. On DOKU_LEXER_UNMATCHED the body is normalized:
1. CR/LF and LF become single spaces (GFM line-ending rule).
2. If the body starts and ends with a space and is not entirely
whitespace, one space is stripped from each end.
That produces the right GFM output for the tricky cases without
special-casing the entry pattern:
` ` → <code> </code> (all-whitespace, no strip)
` a` → <code> a</code> (asymmetric, no strip)
` `` ` → <code>``</code> (interior run-of-2 + strip)
``foo`bar`` → <code>foo`bar</code>

* Body character classes admit exactly the runs that cannot be valid
closers for this mode's length: n=1 allows `[^`] | ``+`, n=2 allows
`[^`] | `(?!`)`. That is what lets a single-backtick span contain
a pair and a double-backtick span contain a lone backtick.

* allowedModes is empty — no other inline parsing runs inside a span.
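
The design notes above can be tried out with ordinary regular expressions. The Python sketch below folds opener, body and closer into one pattern per length, which is a simplification of the separate entry and exit patterns a lexer mode would register; `normalize_body` and `code_span` are illustrative names, not the modes' methods.

```python
import re

# n=1: lone backticks as delimiters; the body may hold runs of two or more.
SINGLE = re.compile(r"(?<!`)`(?!`)((?:[^`]|``+)+?)(?<!`)`(?!`)")
# n=2: exactly two backticks as delimiters; the body may hold lone backticks.
DOUBLE = re.compile(r"(?<!`)``(?!`)((?:[^`]|`(?!`))+?)(?<!`)``(?!`)")


def normalize_body(body):
    """Apply the two normalization steps described in the notes."""
    # 1. Line endings become single spaces.
    body = body.replace("\r\n", " ").replace("\n", " ")
    # 2. Strip one space from each end unless the body is all whitespace.
    if body.startswith(" ") and body.endswith(" ") and body.strip() != "":
        body = body[1:-1]
    return body


def code_span(pattern, text):
    """Return the normalized body of the first span, or None."""
    match = pattern.search(text)
    return None if match is None else normalize_body(match.group(1))
```

Run against the four tricky cases listed above, this yields a single space, ` a`, a pair of backticks, and ``foo`bar`` with its lone interior backtick intact.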

Deliberately not implemented, with skip.php entries explaining why:

351 — code-span precedence over emphasis (*foo`*` expected to render
as *foo<code>*</code>). Cross-positional: the single-pass
lexer matches leftmost-first and cannot reject an earlier
emphasis opener because a later backtick span would consume
its closer. A proper fix would need a pre-scan pass; sort
values only break ties at the same position.
353 — the trailing " outside the code span gets converted to a
curly quote by DokuWiki typography, diverging from spec HTML.
354 — raw HTML tag pass-through; DokuWiki does not render raw HTML
by default.
356 — GFM angle-bracket autolink <http://…>: not implemented.

Per-mode unit tests cover basic matching, flanking via the length-
boundary guards, interior-run support in the body, edge-space
stripping, newline normalization, all-whitespace bodies, paragraph-
boundary rejection, content-is-literal, and sort values.
ModeRegistryTest's gating data provider picks up both modes.

Net effect on GfmSpecTest: twelve previously-red code-span examples
now pass (339, 340, 341, 342, 344, 345, 346, 347, 349, 350, 357, 359
— the simple pairs, edge-space, interior-run, newline-normalization,
and mismatched-run cases). Four skipped. Three remain pending outside
the code-span scope (emphasis interactions that need GfmLink once
that lands).


# 72b2703b 20-Apr-2026 Andreas Gohr <gohr@cosmocode.de>

add spec.txt-driven roundtrip test infrastructure for Markdown

Imports GitHub Flavored Markdown's test/spec.txt (671 examples combining
the CommonMark baseline with GFM extensions: tables, strikethrough, task
lists, autolink, disallowed raw HTML) and runs each example as a data-
provider-driven roundtrip test against DokuWiki's full parser + XHTML
renderer.

Files:

_test/tests/Parsing/Markdown/SpecReader.php
Parses the fenced-example format. Each example is ``` example
[optional-label] ... . ... ``` delimited by 10+ backticks with a
`.` separator; tracks most recent `## Heading` as section context
and numbers examples sequentially from 1 to match the spec's
rendered "Example N" labels.

_test/tests/Parsing/Markdown/SpecReaderTest.php
Hand-crafted fixtures covering ordinary examples, section tracking,
extension labels, multiline bodies, nested backticks, unclosed
fences (throws).

_test/tests/Parsing/Markdown/GfmSpecTest.php
Data-provider test. Renders each example's markdown through
p_get_instructions + p_render('xhtml') and compares to the expected
HTML with block-level-aware whitespace normalisation (DokuWiki
emits \n around block tags; inline-tag whitespace is preserved
because `<em>x</em> y` != `<em>x</em>y`).

_test/tests/Parsing/Markdown/gfm-spec/
spec.txt — verbatim from github/cmark-gfm, commit 587a12bb
LICENSE — CC-BY-SA 4.0 full legal text
README.md — upstream URL, pinned commit, resync notes
skip.php — map example-number => reason for SPEC-excluded
CommonMark behaviour (flanking-delimiter analysis,
multiple-of-3 rule, excess-drop). Unimplemented
features are NOT listed here — they show as real
failures so they remain the visible TODO list.
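
The fenced-example format that SpecReader handles can be sketched as a small line-based reader. This Python version is an illustration only: the function name, the dictionary keys and the exact fence regexes are assumptions based on the description above, not a port of SpecReader.php.

```python
import re

OPEN = re.compile(r"^`{10,} example(?: (\S+))?$")  # optional extension label
CLOSE = re.compile(r"^`{10,}$")
HEADING = re.compile(r"^## +(.*)$")


def read_spec(text):
    """Return the examples in text, numbered sequentially from 1."""
    examples = []
    section = ""
    lines = iter(text.split("\n"))
    for line in lines:
        heading = HEADING.match(line)
        if heading:
            section = heading.group(1)  # most recent "## Heading"
            continue
        opener = OPEN.match(line)
        if opener is None:
            continue
        markdown, html = [], []
        target = markdown
        for body_line in lines:  # keeps consuming the same iterator
            if CLOSE.match(body_line):
                break
            if body_line == "." and target is markdown:
                target = html  # the "." line separates input from output
            else:
                target.append(body_line)
        else:
            raise ValueError("unclosed example fence")
        examples.append({
            "number": len(examples) + 1,
            "section": section,
            "label": opener.group(1),
            "markdown": "\n".join(markdown),
            "html": "\n".join(html),
        })
    return examples
```

Sharing one iterator between the outer and inner loops means headings inside an example body are never mistaken for section changes.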
