History log of /dokuwiki/inc/Parsing/ParserMode/GfmHtmlEntity.php (Results 1 – 4 of 4)
Revision Date Author Comments
# 47a02a10 04-Jun-2026 Andreas Gohr <gohr@cosmocode.de>

Parsing: make parse syntax a per-parse value, drop ModeInterface

The active parse's syntax flavour is a per-parse question, not process-
global state: within a single request a plugin can render bundled
DokuWiki-syntax text inside an otherwise-Markdown page. Yet ModeRegistry
was a singleton that read $conf['syntax'] and the $PARSER_MODES global,
and every mode reached it through ModeRegistry::getInstance() — so the
flavour lived in shared mutable state that two parses in one request
would fight over.

Make the registry a short-lived value instead:

- ModeRegistry is constructed once per parse with an explicit $syntax
and injected into Parser, Handler and every mode. getSyntax() /
isDwPreferred() / isMdPreferred() consult $this->syntax; the
DOKU_UNITTEST-gated mode-list cache hack is gone (each registry is
fresh, nothing to invalidate).
- p_get_instructions() is now the single place in the pipeline where
$conf['syntax'] is read; from there the flavour travels as a
parameter. No code under inc/Parsing/ reads $conf['syntax'] directly
anymore — the five syntax-reading modes (Preformatted, GfmHr,
GfmEscape, Externallink, GfmQuote) route through $this->registry.
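
The shape described in the two points above can be sketched roughly as follows. This is a minimal Python illustration, not the PHP code from the commit: the flavour values "md" and "dw", the optional `syntax` argument, and the tuple returned by `parse()` are assumptions made only for the example.

```python
from dataclasses import dataclass

# Stand-in for $conf['syntax']: the user's configured preference.
# The value "md" is a placeholder; the real config values are not shown in the log.
CONF = {"syntax": "md"}


@dataclass(frozen=True)
class ModeRegistry:
    """Short-lived, per-parse value that carries the active syntax flavour."""
    syntax: str

    def get_syntax(self) -> str:
        return self.syntax


class Parser:
    """Toy parser that receives its registry by injection instead of a singleton."""

    def __init__(self, registry: ModeRegistry):
        self.registry = registry

    def parse(self, text: str):
        # A real parser would build instructions; here we only record
        # which flavour this particular parse ran under.
        return (self.registry.get_syntax(), text)


def get_instructions(text: str, syntax=None):
    """Single place where the configured preference is read (hypothetical signature)."""
    if syntax is None:
        syntax = CONF["syntax"]
    registry = ModeRegistry(syntax)  # fresh registry for every parse
    return Parser(registry).parse(text)


# Two parses in one "request" with different flavours do not interfere.
page = get_instructions("# Markdown page")
bundled = get_instructions("====== DokuWiki text ======", syntax="dw")
```

Because each registry is a fresh immutable value, there is nothing to reset between parses, which is the property the commit relies on when it drops the cache hack.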

Keep the two concepts apart, as documented in the ModeRegistry and
AbstractMode docblocks: the user's configured *preference* stays in
$conf['syntax'] for UI code (toolbar, settings), while the active
parse's syntax is a parameter carried by the registry.

$PARSER_MODES is demoted to a deprecated, read-only mirror, published
during loadPluginModes() — third-party syntax plugins (columnlist,
alphalist2, phpwikify, skipentity) and the bundled info plugin read the
global directly, often from their constructors, so the taxonomy must
stay visible there. No core code reads the mirror.

Fold ModeInterface into AbstractMode while here: getSort()/handle() are
abstract, the connect callbacks carry defaults, and the public $Lexer
"FIXME should be done by setter" becomes setLexer()/getLexer() injected
by Parser::addMode() alongside the registry. Nested-content resolution
moves to the allowedCategories()/filterAllowedModes() hooks, resolved
once when the registry is attached.

Tests build their own parser/registry through ParserTestBase::setSyntax()
instead of mutating $conf and calling the removed ModeRegistry::reset().


# d331a839 12-May-2026 Andreas Gohr <andi@splitbrain.org>

GFM modes: follow CATEGORY_SUBSTITION → CATEGORY_SUBSTITUTION rename

Constant was renamed on master (the typo'd 'substition' value is kept,
but the constant name spells it correctly). Update GfmTable's use of
the constant, plus stale docblock/comment references in GfmEscape,
GfmHtmlEntity, GfmLinebreak, and GfmLinebreakTest.


# eb15e634 04-May-2026 Andreas Gohr <andi@splitbrain.org>

extract Helpers\HtmlEntity, wire into GfmCode and GfmLink URL slot

Numeric and named HTML entity decoding moves out of GfmHtmlEntity into
a pure helper, so capture-by-regex modes can apply the same decode
post-extraction (the inline lexer never reaches their bodies). Mirrors
the Helpers\Escape pattern.

Wired up in two slots:

- GfmCode info string: f&ouml;&ouml; now decodes to föö in the
language class. Clears spec example #330.

- GfmLink URL: GfmLink::extractUrl() decodes entities. URL pattern
extends from `[^)\n]+` to `(?:\\.|[^)\n])+` so an escaped \) no
longer terminates the URL early; the existing post-classify
Escape::unescapeBackslashes call strips the backslashes after
Link::classify has done its work. Clears #504, #506, #508.
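
The effect of the URL pattern change can be checked in isolation. The sketch below uses Python's `re` module rather than PHP, wraps both character classes in literal parentheses purely for the demonstration (the full GfmLink pattern is not shown in this log), and uses a hypothetical `unescape_backslashes` helper that removes a backslash before ASCII punctuation as a stand-in for Escape::unescapeBackslashes.

```python
import re
import string

# Old and new URL bodies from the commit message, wrapped in ( ... ) for the demo.
OLD_URL = re.compile(r"\(([^)\n]+)\)")
NEW_URL = re.compile(r"\(((?:\\.|[^)\n])+)\)")

# Assumed behaviour of the unescape step: drop a backslash that precedes
# an ASCII punctuation character.
ESCAPED_PUNCT = re.compile(r"\\([" + re.escape(string.punctuation) + r"])")


def unescape_backslashes(text: str) -> str:
    return ESCAPED_PUNCT.sub(r"\1", text)


source = r"(foo\)bar)"

old_url = OLD_URL.search(source).group(1)  # stops at the escaped paren
new_url = NEW_URL.search(source).group(1)  # consumes \) as one unit

final_url = unescape_backslashes(new_url)
```

With the old class the capture ends at the first `)`, leaving a dangling backslash; with the new alternation the escaped parenthesis stays inside the capture and the later unescape step yields `foo)bar`.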

Skip #328 with a self-contained title-slot reason: the URL side now
decodes correctly, but the title attribute is still discarded
(DokuWiki link instructions have no title slot).


# d2085866 04-May-2026 Andreas Gohr <andi@splitbrain.org>

extend GfmNumericEntity to HTML5 named entities, rename to GfmHtmlEntity

Numeric refs are still decoded explicitly: PHP's html_entity_decode
returns the input unchanged for U+0000, surrogates, U+10FFFF, and
BMP noncharacters where CommonMark requires U+FFFD or the literal
codepoint. Named refs delegate to html_entity_decode with ENT_HTML5,
which carries the full HTML5 named-entity table (including multi-
codepoint decodes like &ngE; -> U+2267 + U+0338).

Unknown names stay literal: the original &xxx; passes through as
cdata and the renderer's &-escaping turns it into &amp;xxx;.
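
The decoding rules above can be sketched as follows. This is a Python illustration under stated assumptions, not the PHP code: the function names are hypothetical, the digit limits follow the CommonMark definition of numeric references, and Python's `html.entities.html5` table stands in for html_entity_decode with ENT_HTML5.

```python
import re
from html.entities import html5

# CommonMark numeric references: 1-7 decimal digits or 1-6 hex digits.
NUMERIC_REF = re.compile(r"&#(?:[Xx]([0-9A-Fa-f]{1,6})|([0-9]{1,7}));")
NAMED_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")

REPLACEMENT = "\ufffd"


def decode_numeric(ref: str) -> str:
    """Decode one numeric reference; return the input unchanged if it is not one."""
    m = NUMERIC_REF.fullmatch(ref)
    if m is None:
        return ref
    hex_digits, dec_digits = m.group(1), m.group(2)
    cp = int(hex_digits, 16) if hex_digits is not None else int(dec_digits)
    # U+0000, surrogates and out-of-range values become U+FFFD;
    # noncharacters such as U+FFFE and U+10FFFF stay literal.
    if cp == 0 or 0xD800 <= cp <= 0xDFFF or cp > 0x10FFFF:
        return REPLACEMENT
    return chr(cp)


def decode_named(ref: str) -> str:
    """Decode one named reference; unknown names stay literal."""
    m = NAMED_REF.fullmatch(ref)
    if m is None:
        return ref
    # Keys in the html5 table include the trailing semicolon.
    return html5.get(m.group(1) + ";", ref)
```

For example, `decode_named("&ouml;")` gives `ö`, while `decode_named("&xxx;")` returns the input untouched so that a later escaping step can turn the ampersand into `&amp;`.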
