---
title: GitHub Flavored Markdown Spec
version: 0.29
date: '2019-04-06'
license: '[CC-BY-SA 4.0](http://creativecommons.org/licenses/by-sa/4.0/)'
...

# Introduction

## What is GitHub Flavored Markdown?

GitHub Flavored Markdown, often shortened as GFM, is the dialect of Markdown
that is currently supported for user content on GitHub.com and GitHub
Enterprise.

This formal specification, based on the CommonMark Spec, defines the syntax and
semantics of this dialect.

GFM is a strict superset of CommonMark. All the features which are supported in
GitHub user content and that are not specified in the original CommonMark Spec
are hence known as **extensions**, and highlighted as such.

While GFM supports a wide range of inputs, it's worth noting that GitHub.com
and GitHub Enterprise perform additional post-processing and sanitization after
GFM is converted to HTML to ensure security and consistency of the website.

## What is Markdown?

Markdown is a plain text format for writing structured documents,
based on conventions for indicating formatting in email
and usenet posts.  It was developed by John Gruber (with
help from Aaron Swartz) and released in 2004 in the form of a
[syntax description](http://daringfireball.net/projects/markdown/syntax)
and a Perl script (`Markdown.pl`) for converting Markdown to
HTML.  In the next decade, dozens of implementations were
developed in many languages.  Some extended the original
Markdown syntax with conventions for footnotes, tables, and
other document elements.  Some allowed Markdown documents to be
rendered in formats other than HTML.  Websites like Reddit,
StackOverflow, and GitHub had millions of people using Markdown.
And Markdown started to be used beyond the web, to author books,
articles, slide shows, letters, and lecture notes.

What distinguishes Markdown from many other lightweight markup
syntaxes, which are often easier to write, is its readability.
As Gruber writes:

> The overriding design goal for Markdown's formatting syntax is
> to make it as readable as possible. The idea is that a
> Markdown-formatted document should be publishable as-is, as
> plain text, without looking like it's been marked up with tags
> or formatting instructions.
> (<http://daringfireball.net/projects/markdown/>)

The point can be illustrated by comparing a sample of
[AsciiDoc](http://www.methods.co.nz/asciidoc/) with
an equivalent sample of Markdown.  Here is a sample of
AsciiDoc from the AsciiDoc manual:

```
1. List item one.
+
List item one continued with a second paragraph followed by an
Indented block.
+
.................
$ ls *.sh
$ mv *.sh ~/tmp
.................
+
List item continued with a third paragraph.

2. List item two continued with an open block.
+
--
This paragraph is part of the preceding list item.

a. This list is nested and does not require explicit item
continuation.
+
This paragraph is part of the preceding list item.

b. List item b.

This paragraph belongs to item two of the outer list.
--
```

And here is the equivalent in Markdown:

```
1.  List item one.

    List item one continued with a second paragraph followed by an
    Indented block.

        $ ls *.sh
        $ mv *.sh ~/tmp

    List item continued with a third paragraph.

2.  List item two continued with an open block.

    This paragraph is part of the preceding list item.

    1. This list is nested and does not require explicit item continuation.

       This paragraph is part of the preceding list item.

    2. List item b.

    This paragraph belongs to item two of the outer list.
```

The AsciiDoc version is, arguably, easier to write. You don't need
to worry about indentation.  But the Markdown version is much easier
to read.  The nesting of list items is apparent to the eye in the
source, not just in the processed document.

## Why is a spec needed?

John Gruber's [canonical description of Markdown's
syntax](http://daringfireball.net/projects/markdown/syntax)
does not specify the syntax unambiguously.  Here are some examples of
questions it does not answer:

1.  How much indentation is needed for a sublist?  The spec says that
    continuation paragraphs need to be indented four spaces, but is
    not fully explicit about sublists.  It is natural to think that
    they, too, must be indented four spaces, but `Markdown.pl` does
    not require that.  This is hardly a "corner case," and divergences
    between implementations on this issue often lead to surprises for
    users in real documents. (See [this comment by John
    Gruber](https://web.archive.org/web/20170611172104/http://article.gmane.org/gmane.text.markdown.general/1997).)

2.  Is a blank line needed before a block quote or heading?
    Most implementations do not require the blank line.  However,
    this can lead to unexpected results in hard-wrapped text, and
    also to ambiguities in parsing (note that some implementations
    put the heading inside the blockquote, while others do not).
    (John Gruber has also spoken [in favor of requiring the blank
    lines](https://web.archive.org/web/20170611172104/http://article.gmane.org/gmane.text.markdown.general/2146).)

3.  Is a blank line needed before an indented code block?
    (`Markdown.pl` requires it, but this is not mentioned in the
    documentation, and some implementations do not require it.)

    ``` markdown
    paragraph
        code?
    ```

4.  What is the exact rule for determining when list items get
    wrapped in `<p>` tags?  Can a list be partially "loose" and partially
    "tight"?  What should we do with a list like this?

    ``` markdown
    1. one

    2. two
    3. three
    ```

    Or this?

    ``` markdown
    1.  one
        - a

        - b
    2.  two
    ```

    (There are some relevant comments by John Gruber
    [here](https://web.archive.org/web/20170611172104/http://article.gmane.org/gmane.text.markdown.general/2554).)

5.  Can list markers be indented?  Can ordered list markers be right-aligned?

    ``` markdown
     8. item 1
     9. item 2
    10. item 2a
    ```

6.  Is this one list with a thematic break in its second item,
    or two lists separated by a thematic break?

    ``` markdown
    * a
    * * * * *
    * b
    ```

7.  When list markers change from numbers to bullets, do we have
    two lists or one?  (The Markdown syntax description suggests two,
    but the Perl scripts and many other implementations produce one.)

    ``` markdown
    1. fee
    2. fie
    -  foe
    -  fum
    ```

8.  What are the precedence rules for the markers of inline structure?
    For example, is the following a valid link, or does the code span
    take precedence?

    ``` markdown
    [a backtick (`)](/url) and [another backtick (`)](/url).
    ```

9.  What are the precedence rules for markers of emphasis and strong
    emphasis?  For example, how should the following be parsed?

    ``` markdown
    *foo *bar* baz*
    ```

10. What are the precedence rules between block-level and inline-level
    structure?  For example, how should the following be parsed?

    ``` markdown
    - `a long code span can contain a hyphen like this
      - and it can screw things up`
    ```

11. Can list items include section headings?  (`Markdown.pl` does not
    allow this, but does allow blockquotes to include headings.)

    ``` markdown
    - # Heading
    ```

12. Can list items be empty?

    ``` markdown
    * a
    *
    * b
    ```

13. Can link references be defined inside block quotes or list items?

    ``` markdown
    > Blockquote [foo].
    >
    > [foo]: /url
    ```

14. If there are multiple definitions for the same reference, which takes
    precedence?

    ``` markdown
    [foo]: /url1
    [foo]: /url2

    [foo][]
    ```

In the absence of a spec, early implementers consulted `Markdown.pl`
to resolve these ambiguities.  But `Markdown.pl` was quite buggy, and
gave manifestly bad results in many cases, so it was not a
satisfactory replacement for a spec.

Because there is no unambiguous spec, implementations have diverged
considerably.  As a result, users are often surprised to find that
a document that renders one way on one system (say, a GitHub wiki)
renders differently on another (say, converting to docbook using
pandoc).  To make matters worse, because nothing in Markdown counts
as a "syntax error," the divergence often isn't discovered right away.

## About this document

This document attempts to specify Markdown syntax unambiguously.
It contains many examples with side-by-side Markdown and
HTML.  These are intended to double as conformance tests.  An
accompanying script `spec_tests.py` can be used to run the tests
against any Markdown program:

    python test/spec_tests.py --spec spec.txt --program PROGRAM

Since this document describes how Markdown is to be parsed into
an abstract syntax tree, it would have made sense to use an abstract
representation of the syntax tree instead of HTML.  But HTML is capable
of representing the structural distinctions we need to make, and the
choice of HTML for the tests makes it possible to run the tests against
an implementation without writing an abstract syntax tree renderer.

This document is generated from a text file, `spec.txt`, written
in Markdown with a small extension for the side-by-side tests.
The script `tools/makespec.py` can be used to convert `spec.txt` into
HTML or CommonMark (which can then be converted into other formats).

In the examples, the `→` character is used to represent tabs.

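As a rough illustration of how the side-by-side examples can double as
tests, the following sketch pulls (Markdown, HTML) pairs out of the text
of `spec.txt`.  It is not the real `spec_tests.py`: the names
`extract_examples` and `_join` are hypothetical, and the fence of 32
backticks followed by ` example` is an assumption taken from the
examples in this file.

```python
FENCE = "`" * 32  # assumed fence length, as used by the examples in this file


def _join(lines):
    """Join lines with trailing newlines and turn the arrow marker into real tabs."""
    return "".join(line + "\n" for line in lines).replace("→", "\t")


def extract_examples(spec_text):
    """Return a list of (markdown, html) pairs found in spec_text."""
    examples = []
    state = "text"  # one of "text", "markdown", "html"
    markdown_lines, html_lines = [], []
    for line in spec_text.splitlines():
        if state == "text":
            if line.rstrip() == FENCE + " example":
                state = "markdown"
                markdown_lines, html_lines = [], []
        elif line.rstrip() == FENCE:
            # Closing fence: the example is complete.
            examples.append((_join(markdown_lines), _join(html_lines)))
            state = "text"
        elif state == "markdown" and line == ".":
            # A lone "." separates the Markdown input from the expected HTML.
            state = "html"
        elif state == "markdown":
            markdown_lines.append(line)
        else:
            html_lines.append(line)
    return examples
```
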
# Preliminaries

## Characters and lines

Any sequence of [characters] is a valid CommonMark
document.

A [character](@) is a Unicode code point.  Although some
code points (for example, combining accents) do not correspond to
characters in an intuitive sense, all code points count as characters
for purposes of this spec.

This spec does not specify an encoding; it thinks of lines as composed
of [characters] rather than bytes.  A conforming parser may be limited
to a certain encoding.

A [line](@) is a sequence of zero or more [characters]
other than newline (`U+000A`) or carriage return (`U+000D`),
followed by a [line ending] or by the end of file.

A [line ending](@) is a newline (`U+000A`), a carriage return
(`U+000D`) not followed by a newline, or a carriage return and a
following newline.

A line containing no characters, or a line containing only spaces
(`U+0020`) or tabs (`U+0009`), is called a [blank line](@).

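The definitions of line, line ending, and blank line above can be
sketched in a few lines of Python.  This is only an illustration under
the stated definitions; the helper names `split_lines` and
`is_blank_line` are hypothetical and not part of any reference
implementation.  The sketch treats a final line ending as terminating
the last line rather than starting a new empty one.

```python
import re

# A line ending is CR LF, a lone CR, or a lone LF.  Trying "\r\n" first
# keeps a CR LF pair from being counted as two line endings.
LINE_ENDING = re.compile(r"\r\n|\r|\n")


def split_lines(text):
    """Split text into lines, without their line endings."""
    lines = LINE_ENDING.split(text)
    # A trailing line ending terminates the last line; it does not
    # introduce an additional empty line.
    if lines and lines[-1] == "":
        lines.pop()
    return lines


def is_blank_line(line):
    """True if the line is empty or holds only spaces and tabs."""
    return all(ch in " \t" for ch in line)
```
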
The following definitions of character classes will be used in this spec:

A [whitespace character](@) is a space
(`U+0020`), tab (`U+0009`), newline (`U+000A`), line tabulation (`U+000B`),
form feed (`U+000C`), or carriage return (`U+000D`).

[Whitespace](@) is a sequence of one or more [whitespace
characters].

A [Unicode whitespace character](@) is
any code point in the Unicode `Zs` general category, or a tab (`U+0009`),
carriage return (`U+000D`), newline (`U+000A`), or form feed
(`U+000C`).

[Unicode whitespace](@) is a sequence of one
or more [Unicode whitespace characters].

A [space](@) is `U+0020`.

A [non-whitespace character](@) is any character
that is not a [whitespace character].

An [ASCII punctuation character](@)
is `!`, `"`, `#`, `$`, `%`, `&`, `'`, `(`, `)`,
`*`, `+`, `,`, `-`, `.`, `/` (U+0021–2F),
`:`, `;`, `<`, `=`, `>`, `?`, `@` (U+003A–0040),
`[`, `\`, `]`, `^`, `_`, `` ` `` (U+005B–0060),
`{`, `|`, `}`, or `~` (U+007B–007E).

A [punctuation character](@) is an [ASCII
punctuation character] or anything in
the general Unicode categories  `Pc`, `Pd`, `Pe`, `Pf`, `Pi`, `Po`, or `Ps`.

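One possible way to express these character classes in Python is shown
below.  The function names are hypothetical, and each function assumes
it is given a single code point.  Python's `string.punctuation` happens
to contain exactly the 32 ASCII punctuation characters listed above, and
`unicodedata.category` returns the general category of a code point.

```python
import string
import unicodedata

WHITESPACE_CHARS = frozenset(" \t\n\x0b\x0c\r")
UNICODE_WHITESPACE_EXTRAS = frozenset("\t\r\n\x0c")
ASCII_PUNCTUATION = frozenset(string.punctuation)
PUNCTUATION_CATEGORIES = frozenset({"Pc", "Pd", "Pe", "Pf", "Pi", "Po", "Ps"})


def is_whitespace_char(ch):
    """Space, tab, newline, line tabulation, form feed, or carriage return."""
    return ch in WHITESPACE_CHARS


def is_unicode_whitespace_char(ch):
    """Any Zs code point, or tab, carriage return, newline, or form feed."""
    return ch in UNICODE_WHITESPACE_EXTRAS or unicodedata.category(ch) == "Zs"


def is_ascii_punctuation_char(ch):
    return ch in ASCII_PUNCTUATION


def is_punctuation_char(ch):
    """ASCII punctuation, or a code point in one of the listed P* categories."""
    return (ch in ASCII_PUNCTUATION
            or unicodedata.category(ch) in PUNCTUATION_CATEGORIES)
```
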
## Tabs

Tabs in lines are not expanded to [spaces].  However,
in contexts where whitespace helps to define block structure,
tabs behave as if they were replaced by spaces with a tab stop
of 4 characters.

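The tab-stop rule can be made concrete with a small sketch that measures
the indentation of a line in columns without rewriting the line itself.
The function name `indent_width` and its `tab_stop` parameter are
hypothetical; the only assumption taken from the text is the tab stop
of 4.

```python
def indent_width(line, tab_stop=4):
    """Return the width, in columns, of the leading spaces and tabs of line."""
    column = 0
    for ch in line:
        if ch == " ":
            column += 1
        elif ch == "\t":
            # A tab advances to the next multiple of tab_stop.
            column += tab_stop - (column % tab_stop)
        else:
            break
    return column
```

With this measure, a leading tab and two spaces followed by a tab both
count as four columns of indentation, which is why either can open an
indented code block.
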
Thus, for example, a tab can be used instead of four spaces
in an indented code block.  (Note, however, that internal
tabs are passed through as literal tabs, not expanded to
spaces.)

```````````````````````````````` example
→foo→baz→→bim
.
<pre><code>foo→baz→→bim
</code></pre>
````````````````````````````````

```````````````````````````````` example
  →foo→baz→→bim
.
<pre><code>foo→baz→→bim
</code></pre>
````````````````````````````````

```````````````````````````````` example
    a→a
    ὐ→a
.
<pre><code>a→a
ὐ→a
</code></pre>
````````````````````````````````

In the following example, a continuation paragraph of a list
item is indented with a tab; this has exactly the same effect
as indentation with four spaces would:

```````````````````````````````` example
  - foo

→bar
.
<ul>
<li>
<p>foo</p>
<p>bar</p>
</li>
</ul>
````````````````````````````````

```````````````````````````````` example
- foo

→→bar
.
<ul>
<li>
<p>foo</p>
<pre><code>  bar
</code></pre>
</li>
</ul>
````````````````````````````````

Normally the `>` that begins a block quote may be followed
optionally by a space, which is not considered part of the
content.  In the following case `>` is followed by a tab,
which is treated as if it were expanded into three spaces.
Since one of these spaces is considered part of the
delimiter, `foo` is considered to be indented six spaces
inside the block quote context, so we get an indented
code block starting with two spaces.

```````````````````````````````` example
>→→foo
.
<blockquote>
<pre><code>  foo
</code></pre>
</blockquote>
````````````````````````````````

```````````````````````````````` example
-→→foo
.
<ul>
<li>
<pre><code>  foo
</code></pre>
</li>
</ul>
````````````````````````````````


```````````````````````````````` example
    foo
→bar
.
<pre><code>foo
bar
</code></pre>
````````````````````````````````

```````````````````````````````` example
 - foo
   - bar
→ - baz
.
<ul>
<li>foo
<ul>
<li>bar
<ul>
<li>baz</li>
</ul>
</li>
</ul>
</li>
</ul>
````````````````````````````````

```````````````````````````````` example
#→Foo
.
<h1>Foo</h1>
````````````````````````````````

```````````````````````````````` example
*→*→*→
.
<hr />
````````````````````````````````


## Insecure characters

For security reasons, the Unicode character `U+0000` must be replaced
with the REPLACEMENT CHARACTER (`U+FFFD`).

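In Python this replacement is a single string operation.  The sketch
below is only illustrative; the function name
`replace_insecure_characters` is hypothetical.

```python
def replace_insecure_characters(text):
    """Replace every U+0000 with the REPLACEMENT CHARACTER (U+FFFD)."""
    return text.replace("\u0000", "\ufffd")
```
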
# Blocks and inlines

We can think of a document as a sequence of
[blocks](@)---structural elements like paragraphs, block
quotations, lists, headings, rules, and code blocks.  Some blocks (like
block quotes and list items) contain other blocks; others (like
headings and paragraphs) contain [inline](@) content---text,
links, emphasized text, images, code spans, and so on.

## Precedence

Indicators of block structure always take precedence over indicators
of inline structure.  So, for example, the following is a list with
two items, not a list with one item containing a code span:

```````````````````````````````` example
- `one
- two`
.
<ul>
<li>`one</li>
<li>two`</li>
</ul>
````````````````````````````````


This means that parsing can proceed in two steps:  first, the block
structure of the document can be discerned; second, text lines inside
paragraphs, headings, and other block constructs can be parsed for inline
structure.  The second step requires information about link reference
definitions that will be available only at the end of the first
step.  Note that the first step requires processing lines in sequence,
but the second can be parallelized, since the inline parsing of
one block element does not affect the inline parsing of any other.

## Container blocks and leaf blocks

We can divide blocks into two types:
[container blocks](@),
which can contain other blocks, and [leaf blocks](@),
which cannot.

# Leaf blocks

This section describes the different kinds of leaf block that make up a
Markdown document.

## Thematic breaks

A line consisting of 0-3 spaces of indentation, followed by a sequence
of three or more matching `-`, `_`, or `*` characters, each followed
optionally by any number of spaces or tabs, forms a
[thematic break](@).

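The definition above can be approximated with a regular expression
applied to a single line.  This is a sketch rather than a reference
implementation: the names `THEMATIC_BREAK` and `is_thematic_break` are
hypothetical, and the check looks at the line in isolation, so it does
not account for context such as a preceding paragraph line that turns a
line of dashes into a setext heading underline.

```python
import re

# 0-3 spaces of indentation, then one of "-", "_", "*", then at least two
# more of the same character; each may be followed by spaces or tabs.
THEMATIC_BREAK = re.compile(r" {0,3}([-_*])[ \t]*(?:\1[ \t]*){2,}")


def is_thematic_break(line):
    """True if line, taken on its own, has the form of a thematic break."""
    return THEMATIC_BREAK.fullmatch(line) is not None
```
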
```````````````````````````````` example
***
---
___
.
<hr />
<hr />
<hr />
````````````````````````````````


Wrong characters:

```````````````````````````````` example
+++
.
<p>+++</p>
````````````````````````````````


```````````````````````````````` example
===
.
<p>===</p>
````````````````````````````````


Not enough characters:

```````````````````````````````` example
--
**
__
.
<p>--
**
__</p>
````````````````````````````````


An indent of one to three spaces is allowed:

```````````````````````````````` example
 ***
  ***
   ***
.
<hr />
<hr />
<hr />
````````````````````````````````


Four spaces is too many:

```````````````````````````````` example
    ***
.
<pre><code>***
</code></pre>
````````````````````````````````


```````````````````````````````` example
Foo
    ***
.
<p>Foo
***</p>
````````````````````````````````


More than three characters may be used:

```````````````````````````````` example
_____________________________________
.
<hr />
````````````````````````````````


Spaces are allowed between the characters:

```````````````````````````````` example
 - - -
.
<hr />
````````````````````````````````


```````````````````````````````` example
 **  * ** * ** * **
.
<hr />
````````````````````````````````


```````````````````````````````` example
-     -      -      -
.
<hr />
````````````````````````````````


Spaces are allowed at the end:

```````````````````````````````` example
- - - -
.
<hr />
````````````````````````````````


However, no other characters may occur in the line:

```````````````````````````````` example
_ _ _ _ a

a------

---a---
.
<p>_ _ _ _ a</p>
<p>a------</p>
<p>---a---</p>
````````````````````````````````


It is required that all of the [non-whitespace characters] be the same.
So, this is not a thematic break:

```````````````````````````````` example
 *-*
.
<p><em>-</em></p>
````````````````````````````````


Thematic breaks do not need blank lines before or after:

```````````````````````````````` example
- foo
***
- bar
.
<ul>
<li>foo</li>
</ul>
<hr />
<ul>
<li>bar</li>
</ul>
````````````````````````````````


Thematic breaks can interrupt a paragraph:

```````````````````````````````` example
Foo
***
bar
.
<p>Foo</p>
<hr />
<p>bar</p>
````````````````````````````````


If a line of dashes that meets the above conditions for being a
thematic break could also be interpreted as the underline of a [setext
heading], the interpretation as a
[setext heading] takes precedence. Thus, for example,
this is a setext heading, not a paragraph followed by a thematic break:

```````````````````````````````` example
Foo
---
bar
.
<h2>Foo</h2>
<p>bar</p>
````````````````````````````````


When both a thematic break and a list item are possible
interpretations of a line, the thematic break takes precedence:

```````````````````````````````` example
* Foo
* * *
* Bar
.
<ul>
<li>Foo</li>
</ul>
<hr />
<ul>
<li>Bar</li>
</ul>
````````````````````````````````


If you want a thematic break in a list item, use a different bullet:

```````````````````````````````` example
- Foo
- * * *
.
<ul>
<li>Foo</li>
<li>
<hr />
</li>
</ul>
````````````````````````````````


## ATX headings

An [ATX heading](@)
consists of a string of characters, parsed as inline content, between an
opening sequence of 1--6 unescaped `#` characters and an optional
closing sequence of any number of unescaped `#` characters.
The opening sequence of `#` characters must be followed by a
[space] or by the end of line. The optional closing sequence of `#`s must be
preceded by a [space] and may be followed by spaces only.  The opening
`#` character may be indented 0-3 spaces.  The raw contents of the
heading are stripped of leading and trailing spaces before being parsed
as inline content.  The heading level is equal to the number of `#`
characters in the opening sequence.

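The rule above can be sketched as a small line-level recognizer.  This
is an illustration, not the reference parser: the names `ATX_OPENING`,
`ATX_CLOSING`, and `parse_atx_heading` are hypothetical, the function
returns the raw heading content without parsing inlines or resolving
backslash escapes, and it accepts a tab as well as a space after the
opening sequence, which is an assumption based on the `#→Foo` example in
the Tabs section rather than on the wording of the definition.

```python
import re

# 0-3 spaces of indentation, 1-6 "#", then a space/tab or the end of the line.
ATX_OPENING = re.compile(r" {0,3}(#{1,6})(?=[ \t]|$)")
# A closing run of "#" that is the whole content or is preceded by whitespace.
ATX_CLOSING = re.compile(r"(?:^|[ \t]+)#+$")


def parse_atx_heading(line):
    """Return (level, raw_content) for an ATX heading line, else None."""
    match = ATX_OPENING.match(line)
    if match is None:
        return None
    level = len(match.group(1))
    raw = line[match.end():].strip(" \t")
    # Drop the optional closing sequence; an escaped "\#" is preceded by a
    # backslash, not whitespace, so it is left in place.
    raw = ATX_CLOSING.sub("", raw)
    return level, raw
```
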
Simple headings:

```````````````````````````````` example
# foo
## foo
### foo
#### foo
##### foo
###### foo
.
<h1>foo</h1>
<h2>foo</h2>
<h3>foo</h3>
<h4>foo</h4>
<h5>foo</h5>
<h6>foo</h6>
````````````````````````````````


More than six `#` characters is not a heading:

```````````````````````````````` example
####### foo
.
<p>####### foo</p>
````````````````````````````````


At least one space is required between the `#` characters and the
heading's contents, unless the heading is empty.  Note that many
implementations currently do not require the space.  However, the
space was required by the
[original ATX implementation](http://www.aaronsw.com/2002/atx/atx.py),
and it helps prevent things like the following from being parsed as
headings:

```````````````````````````````` example
#5 bolt

#hashtag
.
<p>#5 bolt</p>
<p>#hashtag</p>
````````````````````````````````


This is not a heading, because the first `#` is escaped:

```````````````````````````````` example
\## foo
.
<p>## foo</p>
````````````````````````````````


Contents are parsed as inlines:

```````````````````````````````` example
# foo *bar* \*baz\*
.
<h1>foo <em>bar</em> *baz*</h1>
````````````````````````````````


Leading and trailing [whitespace] is ignored in parsing inline content:

```````````````````````````````` example
#                  foo
.
<h1>foo</h1>
````````````````````````````````


One to three spaces of indentation are allowed:

```````````````````````````````` example
 ### foo
  ## foo
   # foo
.
<h3>foo</h3>
<h2>foo</h2>
<h1>foo</h1>
````````````````````````````````


Four spaces are too many:

```````````````````````````````` example
    # foo
.
<pre><code># foo
</code></pre>
````````````````````````````````


```````````````````````````````` example
foo
    # bar
.
<p>foo
# bar</p>
````````````````````````````````


A closing sequence of `#` characters is optional:

```````````````````````````````` example
## foo ##
  ###   bar    ###
.
<h2>foo</h2>
<h3>bar</h3>
````````````````````````````````


It need not be the same length as the opening sequence:

```````````````````````````````` example
# foo ##################################
##### foo ##
.
<h1>foo</h1>
<h5>foo</h5>
````````````````````````````````


Spaces are allowed after the closing sequence:

```````````````````````````````` example
### foo ###
.
<h3>foo</h3>
````````````````````````````````


A sequence of `#` characters with anything but [spaces] following it
is not a closing sequence, but counts as part of the contents of the
heading:

```````````````````````````````` example
### foo ### b
.
<h3>foo ### b</h3>
````````````````````````````````


The closing sequence must be preceded by a space:

```````````````````````````````` example
# foo#
.
<h1>foo#</h1>
````````````````````````````````


Backslash-escaped `#` characters do not count as part
of the closing sequence:

```````````````````````````````` example
### foo \###
## foo #\##
# foo \#
.
<h3>foo ###</h3>
<h2>foo ###</h2>
<h1>foo #</h1>
````````````````````````````````


ATX headings need not be separated from surrounding content by blank
lines, and they can interrupt paragraphs:

```````````````````````````````` example
****
## foo
****
.
<hr />
<h2>foo</h2>
<hr />
````````````````````````````````


```````````````````````````````` example
Foo bar
# baz
Bar foo
.
<p>Foo bar</p>
<h1>baz</h1>
<p>Bar foo</p>
````````````````````````````````


ATX headings can be empty:

```````````````````````````````` example
##
#
### ###
.
<h2></h2>
<h1></h1>
<h3></h3>
````````````````````````````````


## Setext headings

A [setext heading](@) consists of one or more
lines of text, each containing at least one [non-whitespace
character], with no more than 3 spaces indentation, followed by
a [setext heading underline].  The lines of text must be such
that, were they not followed by the setext heading underline,
they would be interpreted as a paragraph:  they cannot be
interpretable as a [code fence], [ATX heading][ATX headings],
[block quote][block quotes], [thematic break][thematic breaks],
[list item][list items], or [HTML block][HTML blocks].

A [setext heading underline](@) is a sequence of
`=` characters or a sequence of `-` characters, with no more than 3
spaces of indentation and any number of trailing spaces or tabs.

The heading is a level 1 heading if `=` characters are used in
the [setext heading underline], and a level 2 heading if `-`
characters are used.  The contents of the heading are the result
of parsing the preceding lines of text as CommonMark inline
content.

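The underline definition and the level rule can be sketched as follows.
The names `SETEXT_UNDERLINE` and `setext_heading_level` are
hypothetical, and the function only classifies a candidate underline
line; deciding whether the preceding lines form a paragraph is left to
the surrounding block parser.

```python
import re

# Up to 3 spaces of indentation, a run of "=" or a run of "-",
# then any number of trailing spaces or tabs.
SETEXT_UNDERLINE = re.compile(r" {0,3}(=+|-+)[ \t]*")


def setext_heading_level(line):
    """Return 1 for an "=" underline, 2 for a "-" underline, else None."""
    match = SETEXT_UNDERLINE.fullmatch(line)
    if match is None:
        return None
    return 1 if match.group(1)[0] == "=" else 2
```
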
In general, a setext heading need not be preceded or followed by a
blank line.  However, it cannot interrupt a paragraph, so when a
setext heading comes after a paragraph, a blank line is needed between
them.

Simple examples:

```````````````````````````````` example
Foo *bar*
=========

Foo *bar*
---------
.
<h1>Foo <em>bar</em></h1>
<h2>Foo <em>bar</em></h2>
````````````````````````````````


The content of the heading may span more than one line:

```````````````````````````````` example
Foo *bar
baz*
====
.
<h1>Foo <em>bar
baz</em></h1>
````````````````````````````````

The contents are the result of parsing the heading's raw
content as inlines.  The heading's raw content is formed by
concatenating the lines and removing initial and final
[whitespace].

```````````````````````````````` example
  Foo *bar
baz*→
====
.
<h1>Foo <em>bar
baz</em></h1>
````````````````````````````````


The underlining can be any length:

```````````````````````````````` example
Foo
-------------------------

Foo
=
.
<h2>Foo</h2>
<h1>Foo</h1>
````````````````````````````````


The heading content can be indented up to three spaces, and need
not line up with the underlining:

```````````````````````````````` example
   Foo
---

  Foo
-----

  Foo
  ===
.
<h2>Foo</h2>
<h2>Foo</h2>
<h1>Foo</h1>
````````````````````````````````


An indent of four spaces is too much:

```````````````````````````````` example
    Foo
    ---

    Foo
---
.
<pre><code>Foo
---

Foo
</code></pre>
<hr />
````````````````````````````````


The setext heading underline can be indented up to three spaces, and
may have trailing spaces:

```````````````````````````````` example
Foo
   ----
.
<h2>Foo</h2>
````````````````````````````````


Four spaces is too many:

```````````````````````````````` example
Foo
    ---
.
<p>Foo
---</p>
````````````````````````````````


The setext heading underline cannot contain internal spaces:

```````````````````````````````` example
Foo
= =

Foo
--- -
.
<p>Foo
= =</p>
<p>Foo</p>
<hr />
````````````````````````````````


Trailing spaces in the content line do not cause a line break:

```````````````````````````````` example
Foo
-----
.
<h2>Foo</h2>
````````````````````````````````


Nor does a backslash at the end:

```````````````````````````````` example
Foo\
----
.
<h2>Foo\</h2>
````````````````````````````````


Since indicators of block structure take precedence over
indicators of inline structure, the following are setext headings:

```````````````````````````````` example
`Foo
----
`

<a title="a lot
---
of dashes"/>
.
<h2>`Foo</h2>
<p>`</p>
<h2>&lt;a title=&quot;a lot</h2>
<p>of dashes&quot;/&gt;</p>
````````````````````````````````


The setext heading underline cannot be a [lazy continuation
line] in a list item or block quote:

```````````````````````````````` example
> Foo
---
.
<blockquote>
<p>Foo</p>
</blockquote>
<hr />
````````````````````````````````


```````````````````````````````` example
> foo
bar
===
.
<blockquote>
<p>foo
bar
===</p>
</blockquote>
````````````````````````````````


```````````````````````````````` example
- Foo
---
.
<ul>
<li>Foo</li>
</ul>
<hr />
````````````````````````````````


A blank line is needed between a paragraph and a following
setext heading, since otherwise the paragraph becomes part
of the heading's content:

```````````````````````````````` example
Foo
Bar
---
.
<h2>Foo
Bar</h2>
````````````````````````````````


But in general a blank line is not required before or after
setext headings:

```````````````````````````````` example
---
Foo
---
Bar
---
Baz
.
<hr />
<h2>Foo</h2>
<h2>Bar</h2>
<p>Baz</p>
````````````````````````````````


Setext headings cannot be empty:

```````````````````````````````` example

====
.
<p>====</p>
````````````````````````````````


Setext heading text lines must not be interpretable as block
constructs other than paragraphs.  So, the line of dashes
in these examples gets interpreted as a thematic break:

```````````````````````````````` example
---
---
.
<hr />
<hr />
````````````````````````````````


```````````````````````````````` example
- foo
-----
.
<ul>
<li>foo</li>
</ul>
<hr />
````````````````````````````````


```````````````````````````````` example
    foo
---
.
<pre><code>foo
</code></pre>
<hr />
````````````````````````````````


```````````````````````````````` example
> foo
-----
.
<blockquote>
<p>foo</p>
</blockquote>
<hr />
````````````````````````````````


1310If you want a heading with `> foo` as its literal text, you can
1311use backslash escapes:
1312
1313```````````````````````````````` example
1314\> foo
1315------
1316.
1317<h2>&gt; foo</h2>
1318````````````````````````````````
1319
1320
1321**Compatibility note:**  Most existing Markdown implementations
1322do not allow the text of setext headings to span multiple lines.
1323But there is no consensus about how to interpret
1324
1325``` markdown
1326Foo
1327bar
1328---
1329baz
1330```
1331
1332One can find four different interpretations:
1333
13341. paragraph "Foo", heading "bar", paragraph "baz"
13352. paragraph "Foo bar", thematic break, paragraph "baz"
13363. paragraph "Foo bar --- baz"
13374. heading "Foo bar", paragraph "baz"
1338
We find interpretation 4 most natural, and it increases the
expressive power of CommonMark by allowing multiline headings.
Authors who want interpretation 1 can put a blank line after the
first paragraph:
1343
1344```````````````````````````````` example
1345Foo
1346
1347bar
1348---
1349baz
1350.
1351<p>Foo</p>
1352<h2>bar</h2>
1353<p>baz</p>
1354````````````````````````````````
1355
1356
1357Authors who want interpretation 2 can put blank lines around
1358the thematic break,
1359
1360```````````````````````````````` example
1361Foo
1362bar
1363
1364---
1365
1366baz
1367.
1368<p>Foo
1369bar</p>
1370<hr />
1371<p>baz</p>
1372````````````````````````````````
1373
1374
1375or use a thematic break that cannot count as a [setext heading
1376underline], such as
1377
1378```````````````````````````````` example
1379Foo
1380bar
1381* * *
1382baz
1383.
1384<p>Foo
1385bar</p>
1386<hr />
1387<p>baz</p>
1388````````````````````````````````
1389
1390
1391Authors who want interpretation 3 can use backslash escapes:
1392
1393```````````````````````````````` example
1394Foo
1395bar
1396\---
1397baz
1398.
1399<p>Foo
1400bar
1401---
1402baz</p>
1403````````````````````````````````
1404
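The workarounds above all turn on how the line after `bar` is classified.
As a rough illustration (not the reference implementation), the following
Python sketch classifies a single line that directly follows paragraph text
at the top level of a document.  The name `classify_after_paragraph` and the
two regular expressions are assumptions made for this sketch; tabs and
container blocks are ignored.

```python
import re

# A run of "=" or a run of "-", indented at most three spaces,
# with optional trailing spaces and no internal spaces.
_SETEXT_UNDERLINE = re.compile(r"^ {0,3}(=+|-+) *$")

# Three or more matching "-", "_" or "*" characters, optionally
# separated by spaces, indented at most three spaces.
_THEMATIC_BREAK = re.compile(r"^ {0,3}([-_*])(?: *\1){2,} *$")


def classify_after_paragraph(line):
    """Classify a line that directly follows paragraph text (hypothetical helper)."""
    # The setext reading is tried first, which is why "---" after a
    # paragraph yields a heading rather than a thematic break.
    if _SETEXT_UNDERLINE.match(line):
        return "setext underline"
    if _THEMATIC_BREAK.match(line):
        return "thematic break"
    return "paragraph continuation"
```
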
1405
1406## Indented code blocks
1407
1408An [indented code block](@) is composed of one or more
1409[indented chunks] separated by blank lines.
1410An [indented chunk](@) is a sequence of non-blank lines,
1411each indented four or more spaces. The contents of the code block are
1412the literal contents of the lines, including trailing
1413[line endings], minus four spaces of indentation.
1414An indented code block has no [info string].
1415
1416An indented code block cannot interrupt a paragraph, so there must be
1417a blank line between a paragraph and a following indented code block.
1418(A blank line is not needed, however, between a code block and a following
1419paragraph.)
1420
1421```````````````````````````````` example
1422    a simple
1423      indented code block
1424.
1425<pre><code>a simple
1426  indented code block
1427</code></pre>
1428````````````````````````````````
1429
1430
1431If there is any ambiguity between an interpretation of indentation
1432as a code block and as indicating that material belongs to a [list
1433item][list items], the list item interpretation takes precedence:
1434
1435```````````````````````````````` example
1436  - foo
1437
1438    bar
1439.
1440<ul>
1441<li>
1442<p>foo</p>
1443<p>bar</p>
1444</li>
1445</ul>
1446````````````````````````````````
1447
1448
1449```````````````````````````````` example
14501.  foo
1451
1452    - bar
1453.
1454<ol>
1455<li>
1456<p>foo</p>
1457<ul>
1458<li>bar</li>
1459</ul>
1460</li>
1461</ol>
1462````````````````````````````````
1463
1464
1465
1466The contents of a code block are literal text, and do not get parsed
1467as Markdown:
1468
1469```````````````````````````````` example
1470    <a/>
1471    *hi*
1472
1473    - one
1474.
1475<pre><code>&lt;a/&gt;
1476*hi*
1477
1478- one
1479</code></pre>
1480````````````````````````````````
1481
1482
1483Here we have three chunks separated by blank lines:
1484
1485```````````````````````````````` example
1486    chunk1
1487
1488    chunk2
1489
1490
1491
1492    chunk3
1493.
1494<pre><code>chunk1
1495
1496chunk2
1497
1498
1499
1500chunk3
1501</code></pre>
1502````````````````````````````````
1503
1504
1505Any initial spaces beyond four will be included in the content, even
1506in interior blank lines:
1507
1508```````````````````````````````` example
1509    chunk1
      
      chunk2
.
<pre><code>chunk1
  
1515  chunk2
1516</code></pre>
1517````````````````````````````````
1518
1519
1520An indented code block cannot interrupt a paragraph.  (This
1521allows hanging indents and the like.)
1522
1523```````````````````````````````` example
1524Foo
1525    bar
1526
1527.
1528<p>Foo
1529bar</p>
1530````````````````````````````````
1531
1532
1533However, any non-blank line with fewer than four leading spaces ends
1534the code block immediately.  So a paragraph may occur immediately
1535after indented code:
1536
1537```````````````````````````````` example
1538    foo
1539bar
1540.
1541<pre><code>foo
1542</code></pre>
1543<p>bar</p>
1544````````````````````````````````
1545
1546
1547And indented code can occur immediately before and after other kinds of
1548blocks:
1549
1550```````````````````````````````` example
1551# Heading
1552    foo
1553Heading
1554------
1555    foo
1556----
1557.
1558<h1>Heading</h1>
1559<pre><code>foo
1560</code></pre>
1561<h2>Heading</h2>
1562<pre><code>foo
1563</code></pre>
1564<hr />
1565````````````````````````````````
1566
1567
1568The first line can be indented more than four spaces:
1569
1570```````````````````````````````` example
1571        foo
1572    bar
1573.
1574<pre><code>    foo
1575bar
1576</code></pre>
1577````````````````````````````````
1578
1579
1580Blank lines preceding or following an indented code block
1581are not included in it:
1582
1583```````````````````````````````` example
1584
1585
1586    foo
1587
1588
1589.
1590<pre><code>foo
1591</code></pre>
1592````````````````````````````````
1593
1594
1595Trailing spaces are included in the code block's content:
1596
1597```````````````````````````````` example
    foo  
.
<pre><code>foo  
1601</code></pre>
1602````````````````````````````````
1603
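The content rules illustrated in this section can be summarised in a short
Python sketch.  It is an illustration under stated assumptions rather than a
definitive implementation: the name `indented_code_content` is hypothetical,
the caller is assumed to have already decided that the given lines form one
indented code block, and tabs are not handled.

```python
def indented_code_content(lines):
    """Return the literal content of an indented code block.

    `lines` holds the candidate lines without their line endings.
    """
    # Blank lines preceding or following the block are not part of it.
    start, end = 0, len(lines)
    while start < end and lines[start].strip(" ") == "":
        start += 1
    while end > start and lines[end - 1].strip(" ") == "":
        end -= 1

    out = []
    for line in lines[start:end]:
        # Remove at most four leading spaces; anything beyond four is
        # content.  Interior blank lines may carry fewer than four.
        n = 0
        while n < 4 and n < len(line) and line[n] == " ":
            n += 1
        # Trailing spaces are kept, since only the indentation is removed.
        out.append(line[n:] + "\n")
    return "".join(out)
```
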
1604
1605
1606## Fenced code blocks
1607
1608A [code fence](@) is a sequence
1609of at least three consecutive backtick characters (`` ` ``) or
1610tildes (`~`).  (Tildes and backticks cannot be mixed.)
1611A [fenced code block](@)
1612begins with a code fence, indented no more than three spaces.
1613
1614The line with the opening code fence may optionally contain some text
1615following the code fence; this is trimmed of leading and trailing
1616whitespace and called the [info string](@). If the [info string] comes
1617after a backtick fence, it may not contain any backtick
1618characters.  (The reason for this restriction is that otherwise
1619some inline code would be incorrectly interpreted as the
1620beginning of a fenced code block.)
1621
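As a minimal sketch of the opening-fence rules just described, the Python
function below recognises an opening code fence and extracts its info string.
The name `parse_opening_fence` and its return shape are assumptions made for
illustration; the line is assumed to be passed without its line ending.

```python
import re

# Up to three spaces of indentation, then a run of three or more backticks
# or three or more tildes (never mixed), then the rest of the line.
_OPEN_FENCE = re.compile(r"^( {0,3})(`{3,}|~{3,})(.*)$")


def parse_opening_fence(line):
    """Return (indent, fence_char, fence_length, info_string), or None."""
    m = _OPEN_FENCE.match(line)
    if m is None:
        return None
    indent, fence, rest = m.groups()
    # After a backtick fence, the info string may not contain backticks;
    # otherwise some inline code would be mistaken for a fence.
    if fence[0] == "`" and "`" in rest:
        return None
    # The info string is trimmed of leading and trailing whitespace.
    return len(indent), fence[0], len(fence), rest.strip()
```
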
1622The content of the code block consists of all subsequent lines, until
1623a closing [code fence] of the same type as the code block
1624began with (backticks or tildes), and with at least as many backticks
1625or tildes as the opening code fence.  If the leading code fence is
1626indented N spaces, then up to N spaces of indentation are removed from
1627each line of the content (if present).  (If a content line is not
1628indented, it is preserved unchanged.  If it is indented less than N
1629spaces, all of the indentation is removed.)
1630
1631The closing code fence may be indented up to three spaces, and may be
1632followed only by spaces, which are ignored.  If the end of the
1633containing block (or document) is reached and no closing code fence
1634has been found, the code block contains all of the lines after the
1635opening code fence until the end of the containing block (or
1636document).  (An alternative spec would require backtracking in the
event that a closing code fence is not found.  But backtracking would make
parsing much less efficient, and there seems to be no real downside to the
1639behavior described here.)
1640
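The closing-fence and indentation rules above can be sketched as follows.
This is an illustrative Python outline, not the reference implementation:
the names `is_closing_fence` and `fenced_code_content` are hypothetical, and
the caller is assumed to supply the indentation, character, and length of
the opening fence along with the lines that follow it.

```python
import re


def is_closing_fence(line, fence_char, fence_length):
    """True if `line` closes a block opened with fence_char * fence_length."""
    # Up to three spaces, a run of fence characters, then only spaces.
    m = re.match(r"^ {0,3}(`{3,}|~{3,}) *$", line)
    return (
        m is not None
        and m.group(1)[0] == fence_char          # same type as the opener
        and len(m.group(1)) >= fence_length      # at least as long
    )


def fenced_code_content(lines, indent, fence_char, fence_length):
    """Collect the content of a fenced code block.

    `lines` are the lines after the opening fence, up to the end of the
    containing block or document, without line endings.
    """
    out = []
    for line in lines:
        if is_closing_fence(line, fence_char, fence_length):
            break
        # Remove up to `indent` leading spaces, if present.
        n = 0
        while n < indent and n < len(line) and line[n] == " ":
            n += 1
        out.append(line[n:] + "\n")
    # If no closing fence was found, the loop ran to the end of the
    # input: everything is content, and no backtracking is needed.
    return "".join(out)
```
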
1641A fenced code block may interrupt a paragraph, and does not require
1642a blank line either before or after.
1643
The content of a fenced code block is treated as literal text, not parsed
1645as inlines.  The first word of the [info string] is typically used to
1646specify the language of the code sample, and rendered in the `class`
1647attribute of the `code` tag.  However, this spec does not mandate any
1648particular treatment of the [info string].
1649
1650Here is a simple example with backticks:
1651
1652```````````````````````````````` example
1653```
1654<
1655 >
1656```
1657.
1658<pre><code>&lt;
1659 &gt;
1660</code></pre>
1661````````````````````````````````
1662
1663
1664With tildes:
1665
1666```````````````````````````````` example
1667~~~
1668<
1669 >
1670~~~
1671.
1672<pre><code>&lt;
1673 &gt;
1674</code></pre>
1675````````````````````````````````
1676
1677Fewer than three backticks is not enough:
1678
1679```````````````````````````````` example
1680``
1681foo
1682``
1683.
1684<p><code>foo</code></p>
1685````````````````````````````````
1686
1687The closing code fence must use the same character as the opening
1688fence:
1689
1690```````````````````````````````` example
1691```
1692aaa
1693~~~
1694```
1695.
1696<pre><code>aaa
1697~~~
1698</code></pre>
1699````````````````````````````````
1700
1701
1702```````````````````````````````` example
1703~~~
1704aaa
1705```
1706~~~
1707.
1708<pre><code>aaa
1709```
1710</code></pre>
1711````````````````````````````````
1712
1713
1714The closing code fence must be at least as long as the opening fence:
1715
1716```````````````````````````````` example
1717````
1718aaa
1719```
1720``````
1721.
1722<pre><code>aaa
1723```
1724</code></pre>
1725````````````````````````````````
1726
1727
1728```````````````````````````````` example
1729~~~~
1730aaa
1731~~~
1732~~~~
1733.
1734<pre><code>aaa
1735~~~
1736</code></pre>
1737````````````````````````````````
1738
1739
1740Unclosed code blocks are closed by the end of the document
1741(or the enclosing [block quote][block quotes] or [list item][list items]):
1742
1743```````````````````````````````` example
1744```
1745.
1746<pre><code></code></pre>
1747````````````````````````````````
1748
1749
1750```````````````````````````````` example
1751`````
1752
1753```
1754aaa
1755.
1756<pre><code>
1757```
1758aaa
1759</code></pre>
1760````````````````````````````````
1761
1762
1763```````````````````````````````` example
1764> ```
1765> aaa
1766
1767bbb
1768.
1769<blockquote>
1770<pre><code>aaa
1771</code></pre>
1772</blockquote>
1773<p>bbb</p>
1774````````````````````````````````
1775
1776
1777A code block can have all empty lines as its content:
1778
1779```````````````````````````````` example
1780```
1781
1782
1783```
1784.
1785<pre><code>
1786
1787</code></pre>
1788````````````````````````````````
1789
1790
1791A code block can be empty:
1792
1793```````````````````````````````` example
1794```
1795```
1796.
1797<pre><code></code></pre>
1798````````````````````````````````
1799
1800
1801Fences can be indented.  If the opening fence is indented,
1802content lines will have equivalent opening indentation removed,
1803if present:
1804
1805```````````````````````````````` example
1806 ```
1807 aaa
1808aaa
1809```
1810.
1811<pre><code>aaa
1812aaa
1813</code></pre>
1814````````````````````````````````
1815
1816
1817```````````````````````````````` example
1818  ```
1819aaa
1820  aaa
1821aaa
1822  ```
1823.
1824<pre><code>aaa
1825aaa
1826aaa
1827</code></pre>
1828````````````````````````````````
1829
1830
1831```````````````````````````````` example
1832   ```
1833   aaa
1834    aaa
1835  aaa
1836   ```
1837.
1838<pre><code>aaa
1839 aaa
1840aaa
1841</code></pre>
1842````````````````````````````````
1843
1844
Four spaces of indentation produce an indented code block:
1846
1847```````````````````````````````` example
1848    ```
1849    aaa
1850    ```
1851.
1852<pre><code>```
1853aaa
1854```
1855</code></pre>
1856````````````````````````````````
1857
1858
1859Closing fences may be indented by 0-3 spaces, and their indentation
1860need not match that of the opening fence:
1861
1862```````````````````````````````` example
1863```
1864aaa
1865  ```
1866.
1867<pre><code>aaa
1868</code></pre>
1869````````````````````````````````
1870
1871
1872```````````````````````````````` example
1873   ```
1874aaa
1875  ```
1876.
1877<pre><code>aaa
1878</code></pre>
1879````````````````````````````````
1880
1881
1882This is not a closing fence, because it is indented 4 spaces:
1883
1884```````````````````````````````` example
1885```
1886aaa
1887    ```
1888.
1889<pre><code>aaa
1890    ```
1891</code></pre>
1892````````````````````````````````
1893
1894
1895
1896Code fences (opening and closing) cannot contain internal spaces:
1897
1898```````````````````````````````` example
1899``` ```
1900aaa
1901.
1902<p><code> </code>
1903aaa</p>
1904````````````````````````````````
1905
1906
1907```````````````````````````````` example
1908~~~~~~
1909aaa
1910~~~ ~~
1911.
1912<pre><code>aaa
1913~~~ ~~
1914</code></pre>
1915````````````````````````````````
1916
1917
1918Fenced code blocks can interrupt paragraphs, and can be followed
1919directly by paragraphs, without a blank line between:
1920
1921```````````````````````````````` example
1922foo
1923```
1924bar
1925```
1926baz
1927.
1928<p>foo</p>
1929<pre><code>bar
1930</code></pre>
1931<p>baz</p>
1932````````````````````````````````
1933
1934
1935Other blocks can also occur before and after fenced code blocks
1936without an intervening blank line:
1937
1938```````````````````````````````` example
1939foo
1940---
1941~~~
1942bar
1943~~~
1944# baz
1945.
1946<h2>foo</h2>
1947<pre><code>bar
1948</code></pre>
1949<h1>baz</h1>
1950````````````````````````````````
1951
1952
1953An [info string] can be provided after the opening code fence.
1954Although this spec doesn't mandate any particular treatment of
1955the info string, the first word is typically used to specify
1956the language of the code block. In HTML output, the language is
1957normally indicated by adding a class to the `code` element consisting
1958of `language-` followed by the language name.
1959
1960```````````````````````````````` example
1961```ruby
1962def foo(x)
1963  return 3
1964end
1965```
1966.
1967<pre><code class="language-ruby">def foo(x)
1968  return 3
1969end
1970</code></pre>
1971````````````````````````````````
1972
1973
1974```````````````````````````````` example
1975~~~~    ruby startline=3 $%@#$
1976def foo(x)
1977  return 3
1978end
1979~~~~~~~
1980.
1981<pre><code class="language-ruby">def foo(x)
1982  return 3
1983end
1984</code></pre>
1985````````````````````````````````
1986
1987
1988```````````````````````````````` example
1989````;
1990````
1991.
1992<pre><code class="language-;"></code></pre>
1993````````````````````````````````
1994
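The conventional treatment shown in these examples can be sketched in
Python.  Since the spec does not mandate any particular handling of the info
string, this is only one plausible rendering; the name `code_open_tag` is
hypothetical, and the sketch ignores backslash escapes and entity references
in the info string.

```python
import html


def code_open_tag(info_string):
    """Render the opening <code> tag for a fenced code block."""
    words = info_string.split()
    if not words:
        # No info string: no class attribute.
        return "<code>"
    # Only the first word names the language; the rest is ignored here.
    language = html.escape(words[0], quote=True)
    return '<code class="language-{}">'.format(language)
```
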
1995
1996[Info strings] for backtick code blocks cannot contain backticks:
1997
1998```````````````````````````````` example
1999``` aa ```
2000foo
2001.
2002<p><code>aa</code>
2003foo</p>
2004````````````````````````````````
2005
2006
2007[Info strings] for tilde code blocks can contain backticks and tildes:
2008
2009```````````````````````````````` example
2010~~~ aa ``` ~~~
2011foo
2012~~~
2013.
2014<pre><code class="language-aa">foo
2015</code></pre>
2016````````````````````````````````
2017
2018
2019Closing code fences cannot have [info strings]:
2020
2021```````````````````````````````` example
2022```
2023``` aaa
2024```
2025.
2026<pre><code>``` aaa
2027</code></pre>
2028````````````````````````````````
2029
2030
2031
2032## HTML blocks
2033
2034An [HTML block](@) is a group of lines that is treated
2035as raw HTML (and will not be escaped in HTML output).
2036
2037There are seven kinds of [HTML block], which can be defined by their
2038start and end conditions.  The block begins with a line that meets a
[start condition](@) (after up to three spaces of optional indentation).
2040It ends with the first subsequent line that meets a matching [end
2041condition](@), or the last line of the document, or the last line of
2042the [container block](#container-blocks) containing the current HTML
2043block, if no line is encountered that meets the [end condition].  If
2044the first line meets both the [start condition] and the [end
2045condition], the block will contain just that line.
2046
20471.  **Start condition:**  line begins with the string `<script`,
2048`<pre`, or `<style` (case-insensitive), followed by whitespace,
2049the string `>`, or the end of the line.\
2050**End condition:**  line contains an end tag
2051`</script>`, `</pre>`, or `</style>` (case-insensitive; it
2052need not match the start tag).
2053
20542.  **Start condition:** line begins with the string `<!--`.\
2055**End condition:**  line contains the string `-->`.
2056
20573.  **Start condition:** line begins with the string `<?`.\
2058**End condition:** line contains the string `?>`.
2059
20604.  **Start condition:** line begins with the string `<!`
2061followed by an uppercase ASCII letter.\
2062**End condition:** line contains the character `>`.
2063
20645.  **Start condition:**  line begins with the string
2065`<![CDATA[`.\
2066**End condition:** line contains the string `]]>`.
2067
20686.  **Start condition:** line begins with the string `<` or `</`
2069followed by one of the strings (case-insensitive) `address`,
2070`article`, `aside`, `base`, `basefont`, `blockquote`, `body`,
2071`caption`, `center`, `col`, `colgroup`, `dd`, `details`, `dialog`,
2072`dir`, `div`, `dl`, `dt`, `fieldset`, `figcaption`, `figure`,
2073`footer`, `form`, `frame`, `frameset`,
2074`h1`, `h2`, `h3`, `h4`, `h5`, `h6`, `head`, `header`, `hr`,
2075`html`, `iframe`, `legend`, `li`, `link`, `main`, `menu`, `menuitem`,
2076`nav`, `noframes`, `ol`, `optgroup`, `option`, `p`, `param`,
2077`section`, `summary`, `table`, `tbody`, `td`,
2078`tfoot`, `th`, `thead`, `title`, `tr`, `track`, `ul`, followed
2079by [whitespace], the end of the line, the string `>`, or
2080the string `/>`.\
2081**End condition:** line is followed by a [blank line].
2082
20837.  **Start condition:**  line begins with a complete [open tag]
2084(with any [tag name] other than `script`,
2085`style`, or `pre`) or a complete [closing tag],
2086followed only by [whitespace] or the end of the line.\
2087**End condition:** line is followed by a [blank line].
2088
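Start conditions 1 to 6 listed above can be approximated with a handful of
regular expressions.  The Python sketch below is illustrative only: the name
`html_block_start_type` is hypothetical, and condition 7 is deliberately
left out because it depends on the full [open tag] and [closing tag]
grammar defined elsewhere in this spec.

```python
import re

# Tag names from start condition 6, copied from the list above.
_BLOCK_TAGS = (
    "address article aside base basefont blockquote body caption center col "
    "colgroup dd details dialog dir div dl dt fieldset figcaption figure "
    "footer form frame frameset h1 h2 h3 h4 h5 h6 head header hr html iframe "
    "legend li link main menu menuitem nav noframes ol optgroup option p "
    "param section summary table tbody td tfoot th thead title tr track ul"
).split()

# (type, pattern) pairs, tried in order against the de-indented line.
_START_CONDITIONS = [
    (1, re.compile(r"<(?:script|pre|style)(?:\s|>|$)", re.IGNORECASE)),
    (2, re.compile(r"<!--")),
    (3, re.compile(r"<\?")),
    (4, re.compile(r"<![A-Z]")),
    (5, re.compile(r"<!\[CDATA\[")),
    (6, re.compile(r"</?(?:%s)(?:\s|/?>|$)" % "|".join(_BLOCK_TAGS),
                   re.IGNORECASE)),
]


def html_block_start_type(line):
    """Return the HTML block type (1 to 6) that `line` starts, or None."""
    # Skip up to three spaces of optional indentation.
    rest = line[re.match(r" {0,3}", line).end():]
    if rest.startswith(" "):
        return None  # four or more spaces: not an HTML block start
    for kind, pattern in _START_CONDITIONS:
        if pattern.match(rest):
            return kind
    return None  # type 7 is not covered by this sketch
```
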
2089HTML blocks continue until they are closed by their appropriate
2090[end condition], or the last line of the document or other [container
2091block](#container-blocks).  This means any HTML **within an HTML
2092block** that might otherwise be recognised as a start condition will
2093be ignored by the parser and passed through as-is, without changing
2094the parser's state.
2095
For instance, `<pre>` within an HTML block started by `<table>` will not affect
the parser state; as the HTML block was started by start condition 6, it
will end at any blank line. This can be surprising:
2099
2100```````````````````````````````` example
2101<table><tr><td>
2102<pre>
2103**Hello**,
2104
2105_world_.
2106</pre>
2107</td></tr></table>
2108.
2109<table><tr><td>
2110<pre>
2111**Hello**,
2112<p><em>world</em>.
2113</pre></p>
2114</td></tr></table>
2115````````````````````````````````
2116
In this case, the HTML block is terminated by the blank line (the `**Hello**`
text remains verbatim), and regular parsing resumes with a paragraph
containing emphasised `world` and inline HTML, followed by block HTML.
2120
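The behaviour in this example follows from checking only the end condition
that belongs to the block's own type.  A minimal Python sketch, with the
hypothetical name `html_block_lines` and the block type supplied by the
caller, might look like this:

```python
import re

# End conditions for types 1 to 5: the block ends with the first line
# that contains the corresponding string.
_END_PATTERNS = {
    1: re.compile(r"</(?:script|pre|style)>", re.IGNORECASE),
    2: re.compile(r"-->"),
    3: re.compile(r"\?>"),
    4: re.compile(r">"),
    5: re.compile(r"\]\]>"),
}


def html_block_lines(lines, kind):
    """Return the leading lines of `lines` that form one HTML block.

    lines[0] is assumed to satisfy the start condition for type `kind`.
    """
    block = []
    for line in lines:
        if kind in (6, 7):
            # Types 6 and 7 end when a line is followed by a blank line.
            # Start conditions seen inside the block are ignored.
            if line.strip() == "":
                break
            block.append(line)
        else:
            # The line that meets the end condition is part of the block,
            # including anything after the end tag on that line.
            block.append(line)
            if _END_PATTERNS[kind].search(line):
                break
    return block
```
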
2121All types of [HTML blocks] except type 7 may interrupt
2122a paragraph.  Blocks of type 7 may not interrupt a paragraph.
2123(This restriction is intended to prevent unwanted interpretation
2124of long tags inside a wrapped paragraph as starting HTML blocks.)
2125
2126Some simple examples follow.  Here are some basic HTML blocks
2127of type 6:
2128
2129```````````````````````````````` example
2130<table>
2131  <tr>
2132    <td>
2133           hi
2134    </td>
2135  </tr>
2136</table>
2137
2138okay.
2139.
2140<table>
2141  <tr>
2142    <td>
2143           hi
2144    </td>
2145  </tr>
2146</table>
2147<p>okay.</p>
2148````````````````````````````````
2149
2150
2151```````````````````````````````` example
2152 <div>
2153  *hello*
2154         <foo><a>
2155.
2156 <div>
2157  *hello*
2158         <foo><a>
2159````````````````````````````````
2160
2161
2162A block can also start with a closing tag:
2163
2164```````````````````````````````` example
2165</div>
2166*foo*
2167.
2168</div>
2169*foo*
2170````````````````````````````````
2171
2172
2173Here we have two HTML blocks with a Markdown paragraph between them:
2174
2175```````````````````````````````` example
2176<DIV CLASS="foo">
2177
2178*Markdown*
2179
2180</DIV>
2181.
2182<DIV CLASS="foo">
2183<p><em>Markdown</em></p>
2184</DIV>
2185````````````````````````````````
2186
2187
2188The tag on the first line can be partial, as long
2189as it is split where there would be whitespace:
2190
2191```````````````````````````````` example
2192<div id="foo"
2193  class="bar">
2194</div>
2195.
2196<div id="foo"
2197  class="bar">
2198</div>
2199````````````````````````````````
2200
2201
2202```````````````````````````````` example
2203<div id="foo" class="bar
2204  baz">
2205</div>
2206.
2207<div id="foo" class="bar
2208  baz">
2209</div>
2210````````````````````````````````
2211
2212
An open tag need not be closed:

2214```````````````````````````````` example
2215<div>
2216*foo*
2217
2218*bar*
2219.
2220<div>
2221*foo*
2222<p><em>bar</em></p>
2223````````````````````````````````
2224
2225
2226
2227A partial tag need not even be completed (garbage
2228in, garbage out):
2229
2230```````````````````````````````` example
2231<div id="foo"
2232*hi*
2233.
2234<div id="foo"
2235*hi*
2236````````````````````````````````
2237
2238
2239```````````````````````````````` example
2240<div class
2241foo
2242.
2243<div class
2244foo
2245````````````````````````````````
2246
2247
2248The initial tag doesn't even need to be a valid
2249tag, as long as it starts like one:
2250
2251```````````````````````````````` example
2252<div *???-&&&-<---
2253*foo*
2254.
2255<div *???-&&&-<---
2256*foo*
2257````````````````````````````````
2258
2259
2260In type 6 blocks, the initial tag need not be on a line by
2261itself:
2262
2263```````````````````````````````` example
2264<div><a href="bar">*foo*</a></div>
2265.
2266<div><a href="bar">*foo*</a></div>
2267````````````````````````````````
2268
2269
2270```````````````````````````````` example
2271<table><tr><td>
2272foo
2273</td></tr></table>
2274.
2275<table><tr><td>
2276foo
2277</td></tr></table>
2278````````````````````````````````
2279
2280
2281Everything until the next blank line or end of document
2282gets included in the HTML block.  So, in the following
2283example, what looks like a Markdown code block
2284is actually part of the HTML block, which continues until a blank
2285line or the end of the document is reached:
2286
2287```````````````````````````````` example
2288<div></div>
2289``` c
2290int x = 33;
2291```
2292.
2293<div></div>
2294``` c
2295int x = 33;
2296```
2297````````````````````````````````
2298
2299
2300To start an [HTML block] with a tag that is *not* in the
2301list of block-level tags in (6), you must put the tag by
2302itself on the first line (and it must be complete):
2303
2304```````````````````````````````` example
2305<a href="foo">
2306*bar*
2307</a>
2308.
2309<a href="foo">
2310*bar*
2311</a>
2312````````````````````````````````
2313
2314
2315In type 7 blocks, the [tag name] can be anything:
2316
2317```````````````````````````````` example
2318<Warning>
2319*bar*
2320</Warning>
2321.
2322<Warning>
2323*bar*
2324</Warning>
2325````````````````````````````````
2326
2327
2328```````````````````````````````` example
2329<i class="foo">
2330*bar*
2331</i>
2332.
2333<i class="foo">
2334*bar*
2335</i>
2336````````````````````````````````
2337
2338
2339```````````````````````````````` example
2340</ins>
2341*bar*
2342.
2343</ins>
2344*bar*
2345````````````````````````````````
2346
2347
2348These rules are designed to allow us to work with tags that
2349can function as either block-level or inline-level tags.
2350The `<del>` tag is a nice example.  We can surround content with
2351`<del>` tags in three different ways.  In this case, we get a raw
2352HTML block, because the `<del>` tag is on a line by itself:
2353
2354```````````````````````````````` example
2355<del>
2356*foo*
2357</del>
2358.
2359<del>
2360*foo*
2361</del>
2362````````````````````````````````
2363
2364
2365In this case, we get a raw HTML block that just includes
2366the `<del>` tag (because it ends with the following blank
2367line).  So the contents get interpreted as CommonMark:
2368
2369```````````````````````````````` example
2370<del>
2371
2372*foo*
2373
2374</del>
2375.
2376<del>
2377<p><em>foo</em></p>
2378</del>
2379````````````````````````````````
2380
2381
2382Finally, in this case, the `<del>` tags are interpreted
2383as [raw HTML] *inside* the CommonMark paragraph.  (Because
2384the tag is not on a line by itself, we get inline HTML
2385rather than an [HTML block].)
2386
2387```````````````````````````````` example
2388<del>*foo*</del>
2389.
2390<p><del><em>foo</em></del></p>
2391````````````````````````````````
2392
2393
2394HTML tags designed to contain literal content
2395(`script`, `style`, `pre`), comments, processing instructions,
2396and declarations are treated somewhat differently.
2397Instead of ending at the first blank line, these blocks
2398end at the first line containing a corresponding end tag.
2399As a result, these blocks can contain blank lines:
2400
2401A pre tag (type 1):
2402
2403```````````````````````````````` example
2404<pre language="haskell"><code>
2405import Text.HTML.TagSoup
2406
2407main :: IO ()
2408main = print $ parseTags tags
2409</code></pre>
2410okay
2411.
2412<pre language="haskell"><code>
2413import Text.HTML.TagSoup
2414
2415main :: IO ()
2416main = print $ parseTags tags
2417</code></pre>
2418<p>okay</p>
2419````````````````````````````````
2420
2421
2422A script tag (type 1):
2423
2424```````````````````````````````` example
2425<script type="text/javascript">
2426// JavaScript example
2427
2428document.getElementById("demo").innerHTML = "Hello JavaScript!";
2429</script>
2430okay
2431.
2432<script type="text/javascript">
2433// JavaScript example
2434
2435document.getElementById("demo").innerHTML = "Hello JavaScript!";
2436</script>
2437<p>okay</p>
2438````````````````````````````````
2439
2440
2441A style tag (type 1):
2442
2443```````````````````````````````` example
2444<style
2445  type="text/css">
2446h1 {color:red;}
2447
2448p {color:blue;}
2449</style>
2450okay
2451.
2452<style
2453  type="text/css">
2454h1 {color:red;}
2455
2456p {color:blue;}
2457</style>
2458<p>okay</p>
2459````````````````````````````````
2460
2461
2462If there is no matching end tag, the block will end at the
2463end of the document (or the enclosing [block quote][block quotes]
2464or [list item][list items]):
2465
2466```````````````````````````````` example
2467<style
2468  type="text/css">
2469
2470foo
2471.
2472<style
2473  type="text/css">
2474
2475foo
2476````````````````````````````````
2477
2478
2479```````````````````````````````` example
2480> <div>
2481> foo
2482
2483bar
2484.
2485<blockquote>
2486<div>
2487foo
2488</blockquote>
2489<p>bar</p>
2490````````````````````````````````
2491
2492
2493```````````````````````````````` example
2494- <div>
2495- foo
2496.
2497<ul>
2498<li>
2499<div>
2500</li>
2501<li>foo</li>
2502</ul>
2503````````````````````````````````
2504
2505
2506The end tag can occur on the same line as the start tag:
2507
2508```````````````````````````````` example
2509<style>p{color:red;}</style>
2510*foo*
2511.
2512<style>p{color:red;}</style>
2513<p><em>foo</em></p>
2514````````````````````````````````
2515
2516
2517```````````````````````````````` example
2518<!-- foo -->*bar*
2519*baz*
2520.
2521<!-- foo -->*bar*
2522<p><em>baz</em></p>
2523````````````````````````````````
2524
2525
2526Note that anything on the last line after the
2527end tag will be included in the [HTML block]:
2528
2529```````````````````````````````` example
2530<script>
2531foo
2532</script>1. *bar*
2533.
2534<script>
2535foo
2536</script>1. *bar*
2537````````````````````````````````
2538
2539
2540A comment (type 2):
2541
2542```````````````````````````````` example
2543<!-- Foo
2544
2545bar
2546   baz -->
2547okay
2548.
2549<!-- Foo
2550
2551bar
2552   baz -->
2553<p>okay</p>
2554````````````````````````````````
2555
2556
2557
2558A processing instruction (type 3):
2559
2560```````````````````````````````` example
2561<?php
2562
2563  echo '>';
2564
2565?>
2566okay
2567.
2568<?php
2569
2570  echo '>';
2571
2572?>
2573<p>okay</p>
2574````````````````````````````````
2575
2576
2577A declaration (type 4):
2578
2579```````````````````````````````` example
2580<!DOCTYPE html>
2581.
2582<!DOCTYPE html>
2583````````````````````````````````
2584
2585
2586CDATA (type 5):
2587
2588```````````````````````````````` example
2589<![CDATA[
2590function matchwo(a,b)
2591{
2592  if (a < b && a < 0) then {
2593    return 1;
2594
2595  } else {
2596
2597    return 0;
2598  }
2599}
2600]]>
2601okay
2602.
2603<![CDATA[
2604function matchwo(a,b)
2605{
2606  if (a < b && a < 0) then {
2607    return 1;
2608
2609  } else {
2610
2611    return 0;
2612  }
2613}
2614]]>
2615<p>okay</p>
2616````````````````````````````````
2617
2618
2619The opening tag can be indented 1-3 spaces, but not 4:
2620
2621```````````````````````````````` example
2622  <!-- foo -->
2623
2624    <!-- foo -->
2625.
2626  <!-- foo -->
2627<pre><code>&lt;!-- foo --&gt;
2628</code></pre>
2629````````````````````````````````
2630
2631
2632```````````````````````````````` example
2633  <div>
2634
2635    <div>
2636.
2637  <div>
2638<pre><code>&lt;div&gt;
2639</code></pre>
2640````````````````````````````````
2641
2642
2643An HTML block of types 1--6 can interrupt a paragraph, and need not be
2644preceded by a blank line.
2645
2646```````````````````````````````` example
2647Foo
2648<div>
2649bar
2650</div>
2651.
2652<p>Foo</p>
2653<div>
2654bar
2655</div>
2656````````````````````````````````
2657
2658
2659However, a following blank line is needed, except at the end of
2660a document, and except for blocks of types 1--5, [above][HTML
2661block]:
2662
2663```````````````````````````````` example
2664<div>
2665bar
2666</div>
2667*foo*
2668.
2669<div>
2670bar
2671</div>
2672*foo*
2673````````````````````````````````
2674
2675
2676HTML blocks of type 7 cannot interrupt a paragraph:
2677
2678```````````````````````````````` example
2679Foo
2680<a href="bar">
2681baz
2682.
2683<p>Foo
2684<a href="bar">
2685baz</p>
2686````````````````````````````````
2687
2688
2689This rule differs from John Gruber's original Markdown syntax
2690specification, which says:
2691
2692> The only restrictions are that block-level HTML elements —
2693> e.g. `<div>`, `<table>`, `<pre>`, `<p>`, etc. — must be separated from
2694> surrounding content by blank lines, and the start and end tags of the
2695> block should not be indented with tabs or spaces.
2696
2697In some ways Gruber's rule is more restrictive than the one given
2698here:
2699
2700- It requires that an HTML block be preceded by a blank line.
2701- It does not allow the start tag to be indented.
2702- It requires a matching end tag, which it also does not allow to
2703  be indented.
2704
2705Most Markdown implementations (including some of Gruber's own) do not
2706respect all of these restrictions.
2707
There is one respect, however, in which Gruber's rule is more liberal
than the one given here, since it allows blank lines to occur inside
an HTML block.  There are two reasons for disallowing them here.
First, disallowing them removes the need to parse balanced tags, which is
expensive and can require backtracking from the end of the document
if no matching end tag is found. Second, it provides a very simple
and flexible way of including Markdown content inside HTML tags:
simply separate the Markdown from the HTML using blank lines.
2716
2717Compare:
2718
2719```````````````````````````````` example
2720<div>
2721
2722*Emphasized* text.
2723
2724</div>
2725.
2726<div>
2727<p><em>Emphasized</em> text.</p>
2728</div>
2729````````````````````````````````
2730
2731
2732```````````````````````````````` example
2733<div>
2734*Emphasized* text.
2735</div>
2736.
2737<div>
2738*Emphasized* text.
2739</div>
2740````````````````````````````````
2741
2742
2743Some Markdown implementations have adopted a convention of
2744interpreting content inside tags as text if the open tag has
2745the attribute `markdown=1`.  The rule given above seems a simpler and
2746more elegant way of achieving the same expressive power, which is also
2747much simpler to parse.
2748
2749The main potential drawback is that one can no longer paste HTML
2750blocks into Markdown documents with 100% reliability.  However,
2751*in most cases* this will work fine, because the blank lines in
2752HTML are usually followed by HTML block tags.  For example:
2753
2754```````````````````````````````` example
2755<table>
2756
2757<tr>
2758
2759<td>
2760Hi
2761</td>
2762
2763</tr>
2764
2765</table>
2766.
2767<table>
2768<tr>
2769<td>
2770Hi
2771</td>
2772</tr>
2773</table>
2774````````````````````````````````
2775
2776
There are problems, however, if the inner tags are indented
*and* separated by blank lines, as then they will be interpreted as
an indented code block:
2780
2781```````````````````````````````` example
2782<table>
2783
2784  <tr>
2785
2786    <td>
2787      Hi
2788    </td>
2789
2790  </tr>
2791
2792</table>
2793.
2794<table>
2795  <tr>
2796<pre><code>&lt;td&gt;
2797  Hi
2798&lt;/td&gt;
2799</code></pre>
2800  </tr>
2801</table>
2802````````````````````````````````
2803
2804
2805Fortunately, blank lines are usually not necessary and can be
2806deleted.  The exception is inside `<pre>` tags, but as described
2807[above][HTML blocks], raw HTML blocks starting with `<pre>`
2808*can* contain blank lines.
2809
2810## Link reference definitions
2811
2812A [link reference definition](@)
2813consists of a [link label], indented up to three spaces, followed
2814by a colon (`:`), optional [whitespace] (including up to one
2815[line ending]), a [link destination],
2816optional [whitespace] (including up to one
2817[line ending]), and an optional [link
2818title], which if it is present must be separated
2819from the [link destination] by [whitespace].
2820No further [non-whitespace characters] may occur on the line.
2821
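The single-line case of this definition can be approximated with a regular
expression.  The following is a minimal sketch, not the reference algorithm:
the name `parse_link_ref_def` is hypothetical, the whole definition is assumed
to sit on one line, backslash escapes and entities are left unprocessed, and
the label is not checked for a non-whitespace character.

```python
import re

# Simplified, single-line approximation of a link reference definition.
LINK_REF_DEF = re.compile(
    r"""
    ^[ ]{0,3}                                  # at most three spaces of indent
    \[(?P<label>(?:\\.|[^\[\]\\])+)\]:         # link label, then a colon
    [ \t]*
    (?P<dest><[^<>]*>|[^\s<]\S*)               # bracketed or bare destination
    (?:[ \t]+(?P<title>
        "(?:\\.|[^"\\])*"
      | '(?:\\.|[^'\\])*'
      | \((?:\\.|[^()\\])*\)
    ))?                                        # optional title after whitespace
    [ \t]*$                                    # nothing else on the line
    """,
    re.VERBOSE,
)


def parse_link_ref_def(line):
    """Return (label, destination, title), or None if the line does not match."""
    m = LINK_REF_DEF.match(line)
    if m is None:
        return None
    dest = m.group("dest")
    if dest.startswith("<"):
        dest = dest[1:-1]      # drop the angle brackets
    title = m.group("title")
    if title is not None:
        title = title[1:-1]    # drop the quotes or parentheses
    return m.group("label"), dest, title
```

Under these assumptions, `[foo]: /url "title"` yields a label, destination and
title, while `[foo]: /url "title" ok` and `[foo]: <bar>(baz)` are rejected.
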
2822A [link reference definition]
2823does not correspond to a structural element of a document.  Instead, it
2824defines a label which can be used in [reference links]
2825and reference-style [images] elsewhere in the document.  [Link
2826reference definitions] can come either before or after the links that use
2827them.
2828
2829```````````````````````````````` example
2830[foo]: /url "title"
2831
2832[foo]
2833.
2834<p><a href="/url" title="title">foo</a></p>
2835````````````````````````````````
2836
2837
2838```````````````````````````````` example
2839   [foo]:
2840      /url
2841           'the title'
2842
2843[foo]
2844.
2845<p><a href="/url" title="the title">foo</a></p>
2846````````````````````````````````
2847
2848
2849```````````````````````````````` example
2850[Foo*bar\]]:my_(url) 'title (with parens)'
2851
2852[Foo*bar\]]
2853.
2854<p><a href="my_(url)" title="title (with parens)">Foo*bar]</a></p>
2855````````````````````````````````
2856
2857
2858```````````````````````````````` example
2859[Foo bar]:
2860<my url>
2861'title'
2862
2863[Foo bar]
2864.
2865<p><a href="my%20url" title="title">Foo bar</a></p>
2866````````````````````````````````
2867
2868
2869The title may extend over multiple lines:
2870
2871```````````````````````````````` example
2872[foo]: /url '
2873title
2874line1
2875line2
2876'
2877
2878[foo]
2879.
2880<p><a href="/url" title="
2881title
2882line1
2883line2
2884">foo</a></p>
2885````````````````````````````````
2886
2887
2888However, it may not contain a [blank line]:
2889
2890```````````````````````````````` example
2891[foo]: /url 'title
2892
2893with blank line'
2894
2895[foo]
2896.
2897<p>[foo]: /url 'title</p>
2898<p>with blank line'</p>
2899<p>[foo]</p>
2900````````````````````````````````
2901
2902
2903The title may be omitted:
2904
2905```````````````````````````````` example
2906[foo]:
2907/url
2908
2909[foo]
2910.
2911<p><a href="/url">foo</a></p>
2912````````````````````````````````
2913
2914
2915The link destination may not be omitted:
2916
2917```````````````````````````````` example
2918[foo]:
2919
2920[foo]
2921.
2922<p>[foo]:</p>
2923<p>[foo]</p>
2924````````````````````````````````
2925
However, an empty link destination may be specified using
angle brackets:
2928
2929```````````````````````````````` example
2930[foo]: <>
2931
2932[foo]
2933.
2934<p><a href="">foo</a></p>
2935````````````````````````````````
2936
2937The title must be separated from the link destination by
2938whitespace:
2939
2940```````````````````````````````` example
2941[foo]: <bar>(baz)
2942
2943[foo]
2944.
2945<p>[foo]: <bar>(baz)</p>
2946<p>[foo]</p>
2947````````````````````````````````
2948
2949
2950Both title and destination can contain backslash escapes
2951and literal backslashes:
2952
2953```````````````````````````````` example
2954[foo]: /url\bar\*baz "foo\"bar\baz"
2955
2956[foo]
2957.
2958<p><a href="/url%5Cbar*baz" title="foo&quot;bar\baz">foo</a></p>
2959````````````````````````````````
2960
2961
2962A link can come before its corresponding definition:
2963
2964```````````````````````````````` example
2965[foo]
2966
2967[foo]: url
2968.
2969<p><a href="url">foo</a></p>
2970````````````````````````````````
2971
2972
2973If there are several matching definitions, the first one takes
2974precedence:
2975
2976```````````````````````````````` example
2977[foo]
2978
2979[foo]: first
2980[foo]: second
2981.
2982<p><a href="first">foo</a></p>
2983````````````````````````````````
2984
2985
2986As noted in the section on [Links], matching of labels is
2987case-insensitive (see [matches]).
2988
2989```````````````````````````````` example
2990[FOO]: /url
2991
2992[Foo]
2993.
2994<p><a href="/url">Foo</a></p>
2995````````````````````````````````
2996
2997
2998```````````````````````````````` example
2999[ΑΓΩ]: /φου
3000
3001[αγω]
3002.
3003<p><a href="/%CF%86%CE%BF%CF%85">αγω</a></p>
3004````````````````````````````````
3005
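The two rules above (first definition wins, labels match case-insensitively)
can be sketched as a small lookup table.  This is an illustration only: the
names `normalize_label` and `ReferenceMap` are hypothetical, and the
normalization assumes the procedure given in the section on Links (Unicode
case fold, trimming, and collapsing internal whitespace), approximated here
with Python's own notion of whitespace.

```python
import re


def normalize_label(label):
    """Trim, collapse internal whitespace, and case-fold a link label."""
    return re.sub(r"\s+", " ", label.strip()).casefold()


class ReferenceMap:
    """Maps normalized labels to destinations; the first definition wins."""

    def __init__(self):
        self._defs = {}

    def define(self, label, destination):
        # setdefault leaves an existing entry untouched, so later
        # definitions of the same label are ignored.
        self._defs.setdefault(normalize_label(label), destination)

    def lookup(self, label):
        """Return the destination for a label, or None if it is undefined."""
        return self._defs.get(normalize_label(label))
```

Because definitions are collected for the whole document before links are
resolved, a lookup can succeed even when the definition appears after the link.
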
3006
3007Here is a link reference definition with no corresponding link.
3008It contributes nothing to the document.
3009
3010```````````````````````````````` example
3011[foo]: /url
3012.
3013````````````````````````````````
3014
3015
3016Here is another one:
3017
3018```````````````````````````````` example
3019[
3020foo
3021]: /url
3022bar
3023.
3024<p>bar</p>
3025````````````````````````````````
3026
3027
3028This is not a link reference definition, because there are
3029[non-whitespace characters] after the title:
3030
3031```````````````````````````````` example
3032[foo]: /url "title" ok
3033.
3034<p>[foo]: /url &quot;title&quot; ok</p>
3035````````````````````````````````
3036
3037
3038This is a link reference definition, but it has no title:
3039
3040```````````````````````````````` example
3041[foo]: /url
3042"title" ok
3043.
3044<p>&quot;title&quot; ok</p>
3045````````````````````````````````
3046
3047
3048This is not a link reference definition, because it is indented
3049four spaces:
3050
3051```````````````````````````````` example
3052    [foo]: /url "title"
3053
3054[foo]
3055.
3056<pre><code>[foo]: /url &quot;title&quot;
3057</code></pre>
3058<p>[foo]</p>
3059````````````````````````````````
3060
3061
3062This is not a link reference definition, because it occurs inside
3063a code block:
3064
3065```````````````````````````````` example
3066```
3067[foo]: /url
3068```
3069
3070[foo]
3071.
3072<pre><code>[foo]: /url
3073</code></pre>
3074<p>[foo]</p>
3075````````````````````````````````
3076
3077
3078A [link reference definition] cannot interrupt a paragraph.
3079
3080```````````````````````````````` example
3081Foo
3082[bar]: /baz
3083
3084[bar]
3085.
3086<p>Foo
3087[bar]: /baz</p>
3088<p>[bar]</p>
3089````````````````````````````````
3090
3091
3092However, it can directly follow other block elements, such as headings
3093and thematic breaks, and it need not be followed by a blank line.
3094
3095```````````````````````````````` example
3096# [Foo]
3097[foo]: /url
3098> bar
3099.
3100<h1><a href="/url">Foo</a></h1>
3101<blockquote>
3102<p>bar</p>
3103</blockquote>
3104````````````````````````````````
3105
3106```````````````````````````````` example
3107[foo]: /url
3108bar
3109===
3110[foo]
3111.
3112<h1>bar</h1>
3113<p><a href="/url">foo</a></p>
3114````````````````````````````````
3115
3116```````````````````````````````` example
3117[foo]: /url
3118===
3119[foo]
3120.
3121<p>===
3122<a href="/url">foo</a></p>
3123````````````````````````````````
3124
3125
3126Several [link reference definitions]
3127can occur one after another, without intervening blank lines.
3128
3129```````````````````````````````` example
3130[foo]: /foo-url "foo"
3131[bar]: /bar-url
3132  "bar"
3133[baz]: /baz-url
3134
3135[foo],
3136[bar],
3137[baz]
3138.
3139<p><a href="/foo-url" title="foo">foo</a>,
3140<a href="/bar-url" title="bar">bar</a>,
3141<a href="/baz-url">baz</a></p>
3142````````````````````````````````
3143
3144
3145[Link reference definitions] can occur
3146inside block containers, like lists and block quotations.  They
3147affect the entire document, not just the container in which they
3148are defined:
3149
3150```````````````````````````````` example
3151[foo]
3152
3153> [foo]: /url
3154.
3155<p><a href="/url">foo</a></p>
3156<blockquote>
3157</blockquote>
3158````````````````````````````````
3159
3160
3161Whether something is a [link reference definition] is
3162independent of whether the link reference it defines is
3163used in the document.  Thus, for example, the following
3164document contains just a link reference definition, and
3165no visible content:
3166
3167```````````````````````````````` example
3168[foo]: /url
3169.
3170````````````````````````````````
3171
3172
3173## Paragraphs
3174
3175A sequence of non-blank lines that cannot be interpreted as other
3176kinds of blocks forms a [paragraph](@).
3177The contents of the paragraph are the result of parsing the
3178paragraph's raw content as inlines.  The paragraph's raw content
3179is formed by concatenating the lines and removing initial and final
3180[whitespace].
3181
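As a rough sketch of how the raw content might be assembled, the helper below
joins the lines of a paragraph, skips leading whitespace on each line, and
removes whitespace at the very start and end.  The name
`paragraph_raw_content` is hypothetical, and the input is assumed to be a list
of lines already known to belong to one paragraph.

```python
def paragraph_raw_content(lines):
    """Join paragraph lines into the raw content that is parsed as inlines."""
    # Leading whitespace is skipped on every line; trailing whitespace on
    # interior lines is kept, since it can signal a hard line break.
    joined = "\n".join(line.lstrip(" \t") for line in lines)
    # Initial and final whitespace of the whole paragraph is removed.
    return joined.strip(" \t\n")
```

Note that trailing spaces survive on interior lines but not on the last line,
which is why a paragraph cannot end with a hard line break.
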
3182A simple example with two paragraphs:
3183
3184```````````````````````````````` example
3185aaa
3186
3187bbb
3188.
3189<p>aaa</p>
3190<p>bbb</p>
3191````````````````````````````````
3192
3193
3194Paragraphs can contain multiple lines, but no blank lines:
3195
3196```````````````````````````````` example
3197aaa
3198bbb
3199
3200ccc
3201ddd
3202.
3203<p>aaa
3204bbb</p>
3205<p>ccc
3206ddd</p>
3207````````````````````````````````
3208
3209
Multiple blank lines between paragraphs have no effect:
3211
3212```````````````````````````````` example
3213aaa
3214
3215
3216bbb
3217.
3218<p>aaa</p>
3219<p>bbb</p>
3220````````````````````````````````
3221
3222
3223Leading spaces are skipped:
3224
3225```````````````````````````````` example
3226  aaa
3227 bbb
3228.
3229<p>aaa
3230bbb</p>
3231````````````````````````````````
3232
3233
3234Lines after the first may be indented any amount, since indented
3235code blocks cannot interrupt paragraphs.
3236
3237```````````````````````````````` example
3238aaa
3239             bbb
3240                                       ccc
3241.
3242<p>aaa
3243bbb
3244ccc</p>
3245````````````````````````````````
3246
3247
3248However, the first line may be indented at most three spaces,
3249or an indented code block will be triggered:
3250
3251```````````````````````````````` example
3252   aaa
3253bbb
3254.
3255<p>aaa
3256bbb</p>
3257````````````````````````````````
3258
3259
3260```````````````````````````````` example
3261    aaa
3262bbb
3263.
3264<pre><code>aaa
3265</code></pre>
3266<p>bbb</p>
3267````````````````````````````````
3268
3269
3270Final spaces are stripped before inline parsing, so a paragraph
3271that ends with two or more spaces will not end with a [hard line
3272break]:
3273
3274```````````````````````````````` example
3275aaa
3276bbb
3277.
3278<p>aaa<br />
3279bbb</p>
3280````````````````````````````````
3281
3282
3283## Blank lines
3284
3285[Blank lines] between block-level elements are ignored,
3286except for the role they play in determining whether a [list]
3287is [tight] or [loose].
3288
3289Blank lines at the beginning and end of the document are also ignored.
3290
3291```````````````````````````````` example
3292
3293
3294aaa
3295
3296
3297# aaa
3298
3299
3300.
3301<p>aaa</p>
3302<h1>aaa</h1>
3303````````````````````````````````
3304
3305<div class="extension">
3306
3307## Tables (extension)
3308
3309GFM enables the `table` extension, where an additional leaf block type is
3310available.
3311
3312A [table](@) is an arrangement of data with rows and columns, consisting of a
3313single header row, a [delimiter row] separating the header from the data, and
3314zero or more data rows.
3315
Each row consists of cells containing arbitrary text, in which [inlines] are
parsed, separated by pipes (`|`).  A leading and trailing pipe is also
recommended, both for clarity of reading and to avoid parsing ambiguity.
Spaces between pipes and cell content are trimmed.  Block-level elements cannot
be inserted in a table.
3321
The [delimiter row](@) consists of cells whose only content is hyphens (`-`),
and optionally, a leading or trailing colon (`:`), or both, to indicate left,
right, or center alignment respectively.
3325
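One way to recognize a delimiter row and read off its alignments is sketched
below.  The name `parse_delimiter_row` is hypothetical, and the function looks
at the row in isolation: it does not check that a header row precedes it or
that the cell counts agree.

```python
import re

# A delimiter cell: optional leading colon, one or more hyphens,
# optional trailing colon.
CELL = re.compile(r"^(:?)-+(:?)$")


def parse_delimiter_row(line):
    """Return a list of alignments (None, 'left', 'right' or 'center'),
    or None if the line is not a delimiter row."""
    text = line.strip()
    if text.startswith("|"):
        text = text[1:]
    if text.endswith("|"):
        text = text[:-1]
    alignments = []
    for cell in text.split("|"):
        m = CELL.match(cell.strip())
        if m is None:
            return None
        left, right = m.group(1) == ":", m.group(2) == ":"
        if left and right:
            alignments.append("center")
        elif left:
            alignments.append("left")
        elif right:
            alignments.append("right")
        else:
            alignments.append(None)
    return alignments
```
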
3326```````````````````````````````` example table
3327| foo | bar |
3328| --- | --- |
3329| baz | bim |
3330.
3331<table>
3332<thead>
3333<tr>
3334<th>foo</th>
3335<th>bar</th>
3336</tr>
3337</thead>
3338<tbody>
3339<tr>
3340<td>baz</td>
3341<td>bim</td>
3342</tr>
3343</tbody>
3344</table>
3345````````````````````````````````
3346
Cells in one column don't need to match length, though it's easier to read if
they do. Likewise, use of leading and trailing pipes may be inconsistent:
3349
3350```````````````````````````````` example table
3351| abc | defghi |
3352:-: | -----------:
3353bar | baz
3354.
3355<table>
3356<thead>
3357<tr>
3358<th align="center">abc</th>
3359<th align="right">defghi</th>
3360</tr>
3361</thead>
3362<tbody>
3363<tr>
3364<td align="center">bar</td>
3365<td align="right">baz</td>
3366</tr>
3367</tbody>
3368</table>
3369````````````````````````````````
3370
3371Include a pipe in a cell's content by escaping it, including inside other
3372inline spans:
3373
3374```````````````````````````````` example table
3375| f\|oo  |
3376| ------ |
3377| b `\|` az |
3378| b **\|** im |
3379.
3380<table>
3381<thead>
3382<tr>
3383<th>f|oo</th>
3384</tr>
3385</thead>
3386<tbody>
3387<tr>
3388<td>b <code>|</code> az</td>
3389</tr>
3390<tr>
3391<td>b <strong>|</strong> im</td>
3392</tr>
3393</tbody>
3394</table>
3395````````````````````````````````
3396
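Splitting a row into cells while honoring escaped pipes might look like the
sketch below.  The name `split_row` is hypothetical, and the sketch makes a
simplifying assumption: any pipe directly preceded by a backslash counts as
escaped, so a doubled backslash before a pipe is not handled specially.

```python
import re


def split_row(line):
    """Split a table row into trimmed cells, honoring backslash-escaped pipes."""
    text = line.strip()
    if text.startswith("|"):
        text = text[1:]                      # optional leading pipe
    if text.endswith("|") and not text.endswith("\\|"):
        text = text[:-1]                     # optional trailing pipe
    cells = re.split(r"(?<!\\)\|", text)     # split on unescaped pipes only
    # Trim each cell, then turn escaped pipes into literal ones.
    return [cell.strip().replace("\\|", "|") for cell in cells]
```

The unescaping happens before inline parsing, which is why the pipe also
appears literally inside the code span and the strong emphasis above.
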
The table is broken at the first empty line, or at the beginning of another
block-level structure:
3399
3400```````````````````````````````` example table
3401| abc | def |
3402| --- | --- |
3403| bar | baz |
3404> bar
3405.
3406<table>
3407<thead>
3408<tr>
3409<th>abc</th>
3410<th>def</th>
3411</tr>
3412</thead>
3413<tbody>
3414<tr>
3415<td>bar</td>
3416<td>baz</td>
3417</tr>
3418</tbody>
3419</table>
3420<blockquote>
3421<p>bar</p>
3422</blockquote>
3423````````````````````````````````
3424
3425```````````````````````````````` example table
3426| abc | def |
3427| --- | --- |
3428| bar | baz |
3429bar
3430
3431bar
3432.
3433<table>
3434<thead>
3435<tr>
3436<th>abc</th>
3437<th>def</th>
3438</tr>
3439</thead>
3440<tbody>
3441<tr>
3442<td>bar</td>
3443<td>baz</td>
3444</tr>
3445<tr>
3446<td>bar</td>
3447<td></td>
3448</tr>
3449</tbody>
3450</table>
3451<p>bar</p>
3452````````````````````````````````
3453
3454The header row must match the [delimiter row] in the number of cells.  If not,
3455a table will not be recognized:
3456
3457```````````````````````````````` example table
3458| abc | def |
3459| --- |
3460| bar |
3461.
3462<p>| abc | def |
3463| --- |
3464| bar |</p>
3465````````````````````````````````
3466
The remainder of the table's rows may vary in the number of cells.  If a row
has fewer cells than the header row, empty cells are inserted.  If it has more,
the excess is ignored:
3470
3471```````````````````````````````` example table
3472| abc | def |
3473| --- | --- |
3474| bar |
3475| bar | baz | boo |
3476.
3477<table>
3478<thead>
3479<tr>
3480<th>abc</th>
3481<th>def</th>
3482</tr>
3483</thead>
3484<tbody>
3485<tr>
3486<td>bar</td>
3487<td></td>
3488</tr>
3489<tr>
3490<td>bar</td>
3491<td>baz</td>
3492</tr>
3493</tbody>
3494</table>
3495````````````````````````````````
3496
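The two cell-count rules can be sketched as follows, assuming rows have
already been split into lists of cell strings.  The names `starts_table` and
`fit_row` are hypothetical.

```python
def starts_table(header_cells, delimiter_cells):
    """A table is recognized only if header and delimiter rows agree in width."""
    return len(header_cells) == len(delimiter_cells)


def fit_row(cells, width):
    """Pad a body row with empty cells, or drop excess cells, to match width."""
    # Appending `width` empty cells and slicing covers both cases at once.
    return (cells + [""] * width)[:width]
```
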
3497If there are no rows in the body, no `<tbody>` is generated in HTML output:
3498
3499```````````````````````````````` example table
3500| abc | def |
3501| --- | --- |
3502.
3503<table>
3504<thead>
3505<tr>
3506<th>abc</th>
3507<th>def</th>
3508</tr>
3509</thead>
3510</table>
3511````````````````````````````````
3512
3513</div>
3514
3515# Container blocks
3516
3517A [container block](#container-blocks) is a block that has other
3518blocks as its contents.  There are two basic kinds of container blocks:
3519[block quotes] and [list items].
3520[Lists] are meta-containers for [list items].
3521
3522We define the syntax for container blocks recursively.  The general
3523form of the definition is:
3524
3525> If X is a sequence of blocks, then the result of
3526> transforming X in such-and-such a way is a container of type Y
3527> with these blocks as its content.
3528
3529So, we explain what counts as a block quote or list item by explaining
3530how these can be *generated* from their contents. This should suffice
3531to define the syntax, although it does not give a recipe for *parsing*
3532these constructions.  (A recipe is provided below in the section entitled
3533[A parsing strategy](#appendix-a-parsing-strategy).)
3534
3535## Block quotes
3536
3537A [block quote marker](@)
3538consists of 0-3 spaces of initial indent, plus (a) the character `>` together
3539with a following space, or (b) a single character `>` not followed by a space.
3540
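The marker definition translates fairly directly into a regular expression.
The sketch below is illustrative only: the name `strip_block_quote_marker` is
hypothetical, and tabs are not considered.

```python
import re

# 0-3 spaces of indent, a `>`, and one following space if present.
BLOCK_QUOTE_MARKER = re.compile(r"^ {0,3}> ?")


def strip_block_quote_marker(line):
    """Return the line with one block quote marker removed, or None if the
    line does not begin with a marker."""
    m = BLOCK_QUOTE_MARKER.match(line)
    return None if m is None else line[m.end():]
```

Because the marker consumes at most one space after the `>`, a line such as
`>     code` leaves four spaces of indentation behind.
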
3541The following rules define [block quotes]:
3542
1.  **Basic case.**  If a string of lines *Ls* constitute a sequence
    of blocks *Bs*, then the result of prepending a [block quote
    marker] to the beginning of each line in *Ls*
    is a [block quote](#block-quotes) containing *Bs*.

2.  **Laziness.**  If a string of lines *Ls* constitute a [block
    quote](#block-quotes) with contents *Bs*, then the result of deleting
    the initial [block quote marker] from one or
    more lines in which the next [non-whitespace character] after the [block
    quote marker] is [paragraph continuation
    text] is a block quote with *Bs* as its content.
    [Paragraph continuation text](@) is text
    that will be parsed as part of the content of a paragraph, but does
    not occur at the beginning of the paragraph.

3.  **Consecutiveness.**  A document cannot contain two [block
    quotes] in a row unless there is a [blank line] between them.
3560
3561Nothing else counts as a [block quote](#block-quotes).
3562
3563Here is a simple example:
3564
3565```````````````````````````````` example
3566> # Foo
3567> bar
3568> baz
3569.
3570<blockquote>
3571<h1>Foo</h1>
3572<p>bar
3573baz</p>
3574</blockquote>
3575````````````````````````````````
3576
3577
3578The spaces after the `>` characters can be omitted:
3579
3580```````````````````````````````` example
3581># Foo
3582>bar
3583> baz
3584.
3585<blockquote>
3586<h1>Foo</h1>
3587<p>bar
3588baz</p>
3589</blockquote>
3590````````````````````````````````
3591
3592
3593The `>` characters can be indented 1-3 spaces:
3594
3595```````````````````````````````` example
3596   > # Foo
3597   > bar
3598 > baz
3599.
3600<blockquote>
3601<h1>Foo</h1>
3602<p>bar
3603baz</p>
3604</blockquote>
3605````````````````````````````````
3606
3607
3608Four spaces gives us a code block:
3609
3610```````````````````````````````` example
3611    > # Foo
3612    > bar
3613    > baz
3614.
3615<pre><code>&gt; # Foo
3616&gt; bar
3617&gt; baz
3618</code></pre>
3619````````````````````````````````
3620
3621
3622The Laziness clause allows us to omit the `>` before
3623[paragraph continuation text]:
3624
3625```````````````````````````````` example
3626> # Foo
3627> bar
3628baz
3629.
3630<blockquote>
3631<h1>Foo</h1>
3632<p>bar
3633baz</p>
3634</blockquote>
3635````````````````````````````````
3636
3637
3638A block quote can contain some lazy and some non-lazy
3639continuation lines:
3640
3641```````````````````````````````` example
3642> bar
3643baz
3644> foo
3645.
3646<blockquote>
3647<p>bar
3648baz
3649foo</p>
3650</blockquote>
3651````````````````````````````````
3652
3653
3654Laziness only applies to lines that would have been continuations of
3655paragraphs had they been prepended with [block quote markers].
3656For example, the `> ` cannot be omitted in the second line of
3657
3658``` markdown
3659> foo
3660> ---
3661```
3662
3663without changing the meaning:
3664
3665```````````````````````````````` example
3666> foo
3667---
3668.
3669<blockquote>
3670<p>foo</p>
3671</blockquote>
3672<hr />
3673````````````````````````````````
3674
3675
3676Similarly, if we omit the `> ` in the second line of
3677
3678``` markdown
3679> - foo
3680> - bar
3681```
3682
3683then the block quote ends after the first line:
3684
3685```````````````````````````````` example
3686> - foo
3687- bar
3688.
3689<blockquote>
3690<ul>
3691<li>foo</li>
3692</ul>
3693</blockquote>
3694<ul>
3695<li>bar</li>
3696</ul>
3697````````````````````````````````
3698
3699
3700For the same reason, we can't omit the `> ` in front of
3701subsequent lines of an indented or fenced code block:
3702
3703```````````````````````````````` example
3704>     foo
3705    bar
3706.
3707<blockquote>
3708<pre><code>foo
3709</code></pre>
3710</blockquote>
3711<pre><code>bar
3712</code></pre>
3713````````````````````````````````
3714
3715
3716```````````````````````````````` example
3717> ```
3718foo
3719```
3720.
3721<blockquote>
3722<pre><code></code></pre>
3723</blockquote>
3724<p>foo</p>
3725<pre><code></code></pre>
3726````````````````````````````````
3727
3728
3729Note that in the following case, we have a [lazy
3730continuation line]:
3731
3732```````````````````````````````` example
3733> foo
3734    - bar
3735.
3736<blockquote>
3737<p>foo
3738- bar</p>
3739</blockquote>
3740````````````````````````````````
3741
3742
3743To see why, note that in
3744
3745```markdown
3746> foo
3747>     - bar
3748```
3749
3750the `- bar` is indented too far to start a list, and can't
3751be an indented code block because indented code blocks cannot
3752interrupt paragraphs, so it is [paragraph continuation text].
3753
3754A block quote can be empty:
3755
3756```````````````````````````````` example
3757>
3758.
3759<blockquote>
3760</blockquote>
3761````````````````````````````````
3762
3763
3764```````````````````````````````` example
3765>
3766>
3767>
3768.
3769<blockquote>
3770</blockquote>
3771````````````````````````````````
3772
3773
3774A block quote can have initial or final blank lines:
3775
3776```````````````````````````````` example
3777>
3778> foo
3779>
3780.
3781<blockquote>
3782<p>foo</p>
3783</blockquote>
3784````````````````````````````````
3785
3786
3787A blank line always separates block quotes:
3788
3789```````````````````````````````` example
3790> foo
3791
3792> bar
3793.
3794<blockquote>
3795<p>foo</p>
3796</blockquote>
3797<blockquote>
3798<p>bar</p>
3799</blockquote>
3800````````````````````````````````
3801
3802
3803(Most current Markdown implementations, including John Gruber's
3804original `Markdown.pl`, will parse this example as a single block quote
3805with two paragraphs.  But it seems better to allow the author to decide
3806whether two block quotes or one are wanted.)
3807
3808Consecutiveness means that if we put these block quotes together,
3809we get a single block quote:
3810
3811```````````````````````````````` example
3812> foo
3813> bar
3814.
3815<blockquote>
3816<p>foo
3817bar</p>
3818</blockquote>
3819````````````````````````````````
3820
3821
3822To get a block quote with two paragraphs, use:
3823
3824```````````````````````````````` example
3825> foo
3826>
3827> bar
3828.
3829<blockquote>
3830<p>foo</p>
3831<p>bar</p>
3832</blockquote>
3833````````````````````````````````
3834
3835
3836Block quotes can interrupt paragraphs:
3837
3838```````````````````````````````` example
3839foo
3840> bar
3841.
3842<p>foo</p>
3843<blockquote>
3844<p>bar</p>
3845</blockquote>
3846````````````````````````````````
3847
3848
3849In general, blank lines are not needed before or after block
3850quotes:
3851
3852```````````````````````````````` example
3853> aaa
3854***
3855> bbb
3856.
3857<blockquote>
3858<p>aaa</p>
3859</blockquote>
3860<hr />
3861<blockquote>
3862<p>bbb</p>
3863</blockquote>
3864````````````````````````````````
3865
3866
3867However, because of laziness, a blank line is needed between
3868a block quote and a following paragraph:
3869
3870```````````````````````````````` example
3871> bar
3872baz
3873.
3874<blockquote>
3875<p>bar
3876baz</p>
3877</blockquote>
3878````````````````````````````````
3879
3880
3881```````````````````````````````` example
3882> bar
3883
3884baz
3885.
3886<blockquote>
3887<p>bar</p>
3888</blockquote>
3889<p>baz</p>
3890````````````````````````````````
3891
3892
3893```````````````````````````````` example
3894> bar
3895>
3896baz
3897.
3898<blockquote>
3899<p>bar</p>
3900</blockquote>
3901<p>baz</p>
3902````````````````````````````````
3903
3904
3905It is a consequence of the Laziness rule that any number
3906of initial `>`s may be omitted on a continuation line of a
3907nested block quote:
3908
3909```````````````````````````````` example
3910> > > foo
3911bar
3912.
3913<blockquote>
3914<blockquote>
3915<blockquote>
3916<p>foo
3917bar</p>
3918</blockquote>
3919</blockquote>
3920</blockquote>
3921````````````````````````````````
3922
3923
3924```````````````````````````````` example
3925>>> foo
3926> bar
3927>>baz
3928.
3929<blockquote>
3930<blockquote>
3931<blockquote>
3932<p>foo
3933bar
3934baz</p>
3935</blockquote>
3936</blockquote>
3937</blockquote>
3938````````````````````````````````
3939
3940
3941When including an indented code block in a block quote,
3942remember that the [block quote marker] includes
3943both the `>` and a following space.  So *five spaces* are needed after
3944the `>`:
3945
3946```````````````````````````````` example
3947>     code
3948
3949>    not code
3950.
3951<blockquote>
3952<pre><code>code
3953</code></pre>
3954</blockquote>
3955<blockquote>
3956<p>not code</p>
3957</blockquote>
3958````````````````````````````````
3959
3960
3961
3962## List items
3963
3964A [list marker](@) is a
3965[bullet list marker] or an [ordered list marker].
3966
3967A [bullet list marker](@)
3968is a `-`, `+`, or `*` character.
3969
3970An [ordered list marker](@)
3971is a sequence of 1--9 arabic digits (`0-9`), followed by either a
3972`.` character or a `)` character.  (The reason for the length
3973limit is that with 10 digits we start seeing integer overflows
3974in some browsers.)
3975
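A minimal sketch of recognizing these markers at the start of a line follows.
The name `parse_list_marker` is hypothetical.  The sketch assumes up to three
spaces of leading indentation and at least one space after the marker, and it
does not handle thematic breaks or a marker at the very end of a line.

```python
import re

# Optional indent, then a bullet character or 1-9 digits with `.` or `)`,
# then at least one space.
LIST_MARKER = re.compile(r"^( {0,3})([-+*]|[0-9]{1,9}[.)]) +")


def parse_list_marker(line):
    """Return ('bullet', char) or ('ordered', start_number), or None."""
    m = LIST_MARKER.match(line)
    if m is None:
        return None
    marker = m.group(2)
    if marker in ("-", "+", "*"):
        return ("bullet", marker)
    return ("ordered", int(marker[:-1]))   # drop the trailing `.` or `)`
```
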
3976The following rules define [list items]:
3977
1.  **Basic case.**  If a sequence of lines *Ls* constitute a sequence of
    blocks *Bs* starting with a [non-whitespace character], and *M* is a
    list marker of width *W* followed by 1 ≤ *N* ≤ 4 spaces, then the result
    of prepending *M* and the following spaces to the first line of
    *Ls*, and indenting subsequent lines of *Ls* by *W + N* spaces, is a
    list item with *Bs* as its contents.  The type of the list item
    (bullet or ordered) is determined by the type of its list marker.
    If the list item is ordered, then it is also assigned a start
    number, based on the ordered list marker.

    Exceptions:

    1. When the first list item in a [list] interrupts
       a paragraph---that is, when it starts on a line that would
       otherwise count as [paragraph continuation text]---then (a)
       the lines *Ls* must not begin with a blank line, and (b) if
       the list item is ordered, the start number must be 1.
    2. If any line is a [thematic break][thematic breaks] then
       that line is not a list item.
3997
3998For example, let *Ls* be the lines
3999
4000```````````````````````````````` example
4001A paragraph
4002with two lines.
4003
4004    indented code
4005
4006> A block quote.
4007.
4008<p>A paragraph
4009with two lines.</p>
4010<pre><code>indented code
4011</code></pre>
4012<blockquote>
4013<p>A block quote.</p>
4014</blockquote>
4015````````````````````````````````
4016
4017
4018And let *M* be the marker `1.`, and *N* = 2.  Then rule #1 says
4019that the following is an ordered list item with start number 1,
4020and the same contents as *Ls*:
4021
```````````````````````````````` example
1.  A paragraph
    with two lines.

        indented code

    > A block quote.
.
<ol>
<li>
<p>A paragraph
with two lines.</p>
<pre><code>indented code
</code></pre>
<blockquote>
<p>A block quote.</p>
</blockquote>
</li>
</ol>
````````````````````````````````
4042
4043
4044The most important thing to notice is that the position of
4045the text after the list marker determines how much indentation
4046is needed in subsequent blocks in the list item.  If the list
4047marker takes up two spaces, and there are three spaces between
4048the list marker and the next [non-whitespace character], then blocks
4049must be indented five spaces in order to fall under the list
4050item.
4051
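The arithmetic can be sketched as below.  The names `content_indent` and
`is_indented_enough` are hypothetical.  The sketch measures from the start of
the line, so it covers only list items that are not nested inside another
container, and it handles only the case of 1 to 4 spaces after the marker.

```python
import re

# Indent, list marker, then 1-4 spaces followed by a non-whitespace character.
ITEM_START = re.compile(r"^( {0,3})([-+*]|[0-9]{1,9}[.)])( {1,4})(?=\S)")


def content_indent(first_line):
    """Columns by which later blocks must be indented to fall under the list
    item starting on first_line, or None if the line does not start one."""
    m = ITEM_START.match(first_line)
    if m is None:
        return None
    indent, marker, spaces = m.groups()
    return len(indent) + len(marker) + len(spaces)   # indent + W + N


def is_indented_enough(line, needed):
    """True if the line starts with at least `needed` spaces."""
    return len(line) - len(line.lstrip(" ")) >= needed
```

For example, a marker of width two followed by two spaces gives a required
indentation of four columns, and a one-column indent, a one-character marker
and four spaces give six.
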
4052Here are some examples showing how far content must be indented to be
4053put under the list item:
4054
4055```````````````````````````````` example
4056- one
4057
4058 two
4059.
4060<ul>
4061<li>one</li>
4062</ul>
4063<p>two</p>
4064````````````````````````````````
4065
4066
4067```````````````````````````````` example
4068- one
4069
4070  two
4071.
4072<ul>
4073<li>
4074<p>one</p>
4075<p>two</p>
4076</li>
4077</ul>
4078````````````````````````````````
4079
4080
4081```````````````````````````````` example
4082 -    one
4083
4084     two
4085.
4086<ul>
4087<li>one</li>
4088</ul>
4089<pre><code> two
4090</code></pre>
4091````````````````````````````````
4092
4093
4094```````````````````````````````` example
4095 -    one
4096
4097      two
4098.
4099<ul>
4100<li>
4101<p>one</p>
4102<p>two</p>
4103</li>
4104</ul>
4105````````````````````````````````
4106
4107
4108It is tempting to think of this in terms of columns:  the continuation
4109blocks must be indented at least to the column of the first
4110[non-whitespace character] after the list marker. However, that is not quite right.
4111The spaces after the list marker determine how much relative indentation
4112is needed.  Which column this indentation reaches will depend on
4113how the list item is embedded in other constructions, as shown by
4114this example:
4115
4116```````````````````````````````` example
4117   > > 1.  one
4118>>
4119>>     two
4120.
4121<blockquote>
4122<blockquote>
4123<ol>
4124<li>
4125<p>one</p>
4126<p>two</p>
4127</li>
4128</ol>
4129</blockquote>
4130</blockquote>
4131````````````````````````````````
4132
4133
4134Here `two` occurs in the same column as the list marker `1.`,
4135but is actually contained in the list item, because there is
4136sufficient indentation after the last containing blockquote marker.
4137
4138The converse is also possible.  In the following example, the word `two`
4139occurs far to the right of the initial text of the list item, `one`, but
4140it is not considered part of the list item, because it is not indented
4141far enough past the blockquote marker:
4142
4143```````````````````````````````` example
4144>>- one
4145>>
4146  >  > two
4147.
4148<blockquote>
4149<blockquote>
4150<ul>
4151<li>one</li>
4152</ul>
4153<p>two</p>
4154</blockquote>
4155</blockquote>
4156````````````````````````````````
4157
4158
4159Note that at least one space is needed between the list marker and
4160any following content, so these are not list items:
4161
4162```````````````````````````````` example
4163-one
4164
41652.two
4166.
4167<p>-one</p>
4168<p>2.two</p>
4169````````````````````````````````
4170
4171
4172A list item may contain blocks that are separated by more than
4173one blank line.
4174
4175```````````````````````````````` example
4176- foo
4177
4178
4179  bar
4180.
4181<ul>
4182<li>
4183<p>foo</p>
4184<p>bar</p>
4185</li>
4186</ul>
4187````````````````````````````````
4188
4189
4190A list item may contain any kind of block:
4191
4192```````````````````````````````` example
41931.  foo
4194
4195    ```
4196    bar
4197    ```
4198
4199    baz
4200
4201    > bam
4202.
4203<ol>
4204<li>
4205<p>foo</p>
4206<pre><code>bar
4207</code></pre>
4208<p>baz</p>
4209<blockquote>
4210<p>bam</p>
4211</blockquote>
4212</li>
4213</ol>
4214````````````````````````````````
4215
4216
4217A list item that contains an indented code block will preserve
4218empty lines within the code block verbatim.
4219
4220```````````````````````````````` example
4221- Foo
4222
4223      bar
4224
4225
4226      baz
4227.
4228<ul>
4229<li>
4230<p>Foo</p>
4231<pre><code>bar
4232
4233
4234baz
4235</code></pre>
4236</li>
4237</ul>
4238````````````````````````````````
4239
Note that ordered list start numbers must be nine digits or fewer:
4241
4242```````````````````````````````` example
4243123456789. ok
4244.
4245<ol start="123456789">
4246<li>ok</li>
4247</ol>
4248````````````````````````````````
4249
4250
4251```````````````````````````````` example
42521234567890. not ok
4253.
4254<p>1234567890. not ok</p>
4255````````````````````````````````
4256
4257
4258A start number may begin with 0s:
4259
4260```````````````````````````````` example
42610. ok
4262.
4263<ol start="0">
4264<li>ok</li>
4265</ol>
4266````````````````````````````````
4267
4268
4269```````````````````````````````` example
4270003. ok
4271.
4272<ol start="3">
4273<li>ok</li>
4274</ol>
4275````````````````````````````````
4276
4277
4278A start number may not be negative:
4279
4280```````````````````````````````` example
4281-1. not ok
4282.
4283<p>-1. not ok</p>
4284````````````````````````````````
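The constraints on ordered list markers illustrated by the examples above (at most nine digits, leading zeros allowed, no sign, and at least one space before any content) can be checked with a short Python sketch. The function name `ordered_start` and the pattern are assumptions made for illustration only; tabs after the marker are not handled.

```python
import re

# Hypothetical pattern: 0-3 spaces of indentation, 1-9 ASCII digits, a "." or
# ")" delimiter, and then either a space or the end of the line.
_ORDERED = re.compile(r"^ {0,3}([0-9]{1,9})([.)])(?: |$)")


def ordered_start(line):
    """Return the start number of the ordered list item beginning on line,
    or None if the line does not begin with an ordered list marker."""
    m = _ORDERED.match(line)
    if m is None:
        return None
    # int() discards leading zeros, so "003." yields 3.
    return int(m.group(1))
```

Under these assumptions `ordered_start("003. ok")` gives 3, while `ordered_start("1234567890. not ok")` and `ordered_start("-1. not ok")` both give `None`.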
4285
4286
4287
42882.  **Item starting with indented code.**  If a sequence of lines *Ls*
4289    constitute a sequence of blocks *Bs* starting with an indented code
4290    block, and *M* is a list marker of width *W* followed by
4291    one space, then the result of prepending *M* and the following
4292    space to the first line of *Ls*, and indenting subsequent lines of
4293    *Ls* by *W + 1* spaces, is a list item with *Bs* as its contents.
4294    If a line is empty, then it need not be indented.  The type of the
4295    list item (bullet or ordered) is determined by the type of its list
4296    marker.  If the list item is ordered, then it is also assigned a
4297    start number, based on the ordered list marker.
4298
4299An indented code block will have to be indented four spaces beyond
4300the edge of the region where text will be included in the list item.
4301In the following case that is 6 spaces:
4302
4303```````````````````````````````` example
4304- foo
4305
4306      bar
4307.
4308<ul>
4309<li>
4310<p>foo</p>
4311<pre><code>bar
4312</code></pre>
4313</li>
4314</ul>
4315````````````````````````````````
4316
4317
4318And in this case it is 11 spaces:
4319
4320```````````````````````````````` example
4321  10.  foo
4322
4323           bar
4324.
4325<ol start="10">
4326<li>
4327<p>foo</p>
4328<pre><code>bar
4329</code></pre>
4330</li>
4331</ol>
4332````````````````````````````````
4333
4334
4335If the *first* block in the list item is an indented code block,
4336then by rule #2, the contents must be indented *one* space after the
4337list marker:
4338
4339```````````````````````````````` example
4340    indented code
4341
4342paragraph
4343
4344    more code
4345.
4346<pre><code>indented code
4347</code></pre>
4348<p>paragraph</p>
4349<pre><code>more code
4350</code></pre>
4351````````````````````````````````
4352
4353
4354```````````````````````````````` example
43551.     indented code
4356
4357   paragraph
4358
4359       more code
4360.
4361<ol>
4362<li>
4363<pre><code>indented code
4364</code></pre>
4365<p>paragraph</p>
4366<pre><code>more code
4367</code></pre>
4368</li>
4369</ol>
4370````````````````````````````````
4371
4372
4373Note that an additional space indent is interpreted as space
4374inside the code block:
4375
4376```````````````````````````````` example
43771.      indented code
4378
4379   paragraph
4380
4381       more code
4382.
4383<ol>
4384<li>
4385<pre><code> indented code
4386</code></pre>
4387<p>paragraph</p>
4388<pre><code>more code
4389</code></pre>
4390</li>
4391</ol>
4392````````````````````````````````
4393
4394
4395Note that rules #1 and #2 only apply to two cases:  (a) cases
4396in which the lines to be included in a list item begin with a
4397[non-whitespace character], and (b) cases in which
4398they begin with an indented code
4399block.  In a case like the following, where the first block begins with
4400a three-space indent, the rules do not allow us to form a list item by
4401indenting the whole thing and prepending a list marker:
4402
4403```````````````````````````````` example
4404   foo
4405
4406bar
4407.
4408<p>foo</p>
4409<p>bar</p>
4410````````````````````````````````
4411
4412
4413```````````````````````````````` example
4414-    foo
4415
4416  bar
4417.
4418<ul>
4419<li>foo</li>
4420</ul>
4421<p>bar</p>
4422````````````````````````````````
4423
4424
4425This is not a significant restriction, because when a block begins
4426with 1-3 spaces indent, the indentation can always be removed without
4427a change in interpretation, allowing rule #1 to be applied.  So, in
4428the above case:
4429
4430```````````````````````````````` example
4431-  foo
4432
4433   bar
4434.
4435<ul>
4436<li>
4437<p>foo</p>
4438<p>bar</p>
4439</li>
4440</ul>
4441````````````````````````````````
4442
4443
44443.  **Item starting with a blank line.**  If a sequence of lines *Ls*
4445    starting with a single [blank line] constitute a (possibly empty)
4446    sequence of blocks *Bs*, not separated from each other by more than
4447    one blank line, and *M* is a list marker of width *W*,
4448    then the result of prepending *M* to the first line of *Ls*, and
4449    indenting subsequent lines of *Ls* by *W + 1* spaces, is a list
4450    item with *Bs* as its contents.
4451    If a line is empty, then it need not be indented.  The type of the
4452    list item (bullet or ordered) is determined by the type of its list
4453    marker.  If the list item is ordered, then it is also assigned a
4454    start number, based on the ordered list marker.
4455
4456Here are some list items that start with a blank line but are not empty:
4457
4458```````````````````````````````` example
4459-
4460  foo
4461-
4462  ```
4463  bar
4464  ```
4465-
4466      baz
4467.
4468<ul>
4469<li>foo</li>
4470<li>
4471<pre><code>bar
4472</code></pre>
4473</li>
4474<li>
4475<pre><code>baz
4476</code></pre>
4477</li>
4478</ul>
4479````````````````````````````````
4480
4481When the list item starts with a blank line, the number of spaces
4482following the list marker doesn't change the required indentation:
4483
4484```````````````````````````````` example
4485-
4486  foo
4487.
4488<ul>
4489<li>foo</li>
4490</ul>
4491````````````````````````````````
4492
4493
4494A list item can begin with at most one blank line.
4495In the following example, `foo` is not part of the list
4496item:
4497
4498```````````````````````````````` example
4499-
4500
4501  foo
4502.
4503<ul>
4504<li></li>
4505</ul>
4506<p>foo</p>
4507````````````````````````````````
4508
4509
4510Here is an empty bullet list item:
4511
4512```````````````````````````````` example
4513- foo
4514-
4515- bar
4516.
4517<ul>
4518<li>foo</li>
4519<li></li>
4520<li>bar</li>
4521</ul>
4522````````````````````````````````
4523
4524
4525It does not matter whether there are spaces following the [list marker]:
4526
4527```````````````````````````````` example
4528- foo
4529-
4530- bar
4531.
4532<ul>
4533<li>foo</li>
4534<li></li>
4535<li>bar</li>
4536</ul>
4537````````````````````````````````
4538
4539
4540Here is an empty ordered list item:
4541
4542```````````````````````````````` example
45431. foo
45442.
45453. bar
4546.
4547<ol>
4548<li>foo</li>
4549<li></li>
4550<li>bar</li>
4551</ol>
4552````````````````````````````````
4553
4554
4555A list may start or end with an empty list item:
4556
4557```````````````````````````````` example
4558*
4559.
4560<ul>
4561<li></li>
4562</ul>
4563````````````````````````````````
4564
4565However, an empty list item cannot interrupt a paragraph:
4566
4567```````````````````````````````` example
4568foo
4569*
4570
4571foo
45721.
4573.
4574<p>foo
4575*</p>
4576<p>foo
45771.</p>
4578````````````````````````````````
4579
4580
45814.  **Indentation.**  If a sequence of lines *Ls* constitutes a list item
4582    according to rule #1, #2, or #3, then the result of indenting each line
4583    of *Ls* by 1-3 spaces (the same for each line) also constitutes a
4584    list item with the same contents and attributes.  If a line is
4585    empty, then it need not be indented.
4586
4587Indented one space:
4588
4589```````````````````````````````` example
4590 1.  A paragraph
4591     with two lines.
4592
4593         indented code
4594
4595     > A block quote.
4596.
4597<ol>
4598<li>
4599<p>A paragraph
4600with two lines.</p>
4601<pre><code>indented code
4602</code></pre>
4603<blockquote>
4604<p>A block quote.</p>
4605</blockquote>
4606</li>
4607</ol>
4608````````````````````````````````
4609
4610
4611Indented two spaces:
4612
4613```````````````````````````````` example
4614  1.  A paragraph
4615      with two lines.
4616
4617          indented code
4618
4619      > A block quote.
4620.
4621<ol>
4622<li>
4623<p>A paragraph
4624with two lines.</p>
4625<pre><code>indented code
4626</code></pre>
4627<blockquote>
4628<p>A block quote.</p>
4629</blockquote>
4630</li>
4631</ol>
4632````````````````````````````````
4633
4634
4635Indented three spaces:
4636
4637```````````````````````````````` example
4638   1.  A paragraph
4639       with two lines.
4640
4641           indented code
4642
4643       > A block quote.
4644.
4645<ol>
4646<li>
4647<p>A paragraph
4648with two lines.</p>
4649<pre><code>indented code
4650</code></pre>
4651<blockquote>
4652<p>A block quote.</p>
4653</blockquote>
4654</li>
4655</ol>
4656````````````````````````````````
4657
4658
4659Four spaces indent gives a code block:
4660
4661```````````````````````````````` example
4662    1.  A paragraph
4663        with two lines.
4664
4665            indented code
4666
4667        > A block quote.
4668.
4669<pre><code>1.  A paragraph
4670    with two lines.
4671
4672        indented code
4673
4674    &gt; A block quote.
4675</code></pre>
4676````````````````````````````````
4677
4678
4679
46805.  **Laziness.**  If a string of lines *Ls* constitute a [list
4681    item](#list-items) with contents *Bs*, then the result of deleting
4682    some or all of the indentation from one or more lines in which the
4683    next [non-whitespace character] after the indentation is
4684    [paragraph continuation text] is a
4685    list item with the same contents and attributes.  The unindented
4686    lines are called
4687    [lazy continuation line](@)s.
4688
4689Here is an example with [lazy continuation lines]:
4690
4691```````````````````````````````` example
4692  1.  A paragraph
4693with two lines.
4694
4695          indented code
4696
4697      > A block quote.
4698.
4699<ol>
4700<li>
4701<p>A paragraph
4702with two lines.</p>
4703<pre><code>indented code
4704</code></pre>
4705<blockquote>
4706<p>A block quote.</p>
4707</blockquote>
4708</li>
4709</ol>
4710````````````````````````````````
4711
4712
4713Indentation can be partially deleted:
4714
4715```````````````````````````````` example
4716  1.  A paragraph
4717    with two lines.
4718.
4719<ol>
4720<li>A paragraph
4721with two lines.</li>
4722</ol>
4723````````````````````````````````
4724
4725
4726These examples show how laziness can work in nested structures:
4727
4728```````````````````````````````` example
4729> 1. > Blockquote
4730continued here.
4731.
4732<blockquote>
4733<ol>
4734<li>
4735<blockquote>
4736<p>Blockquote
4737continued here.</p>
4738</blockquote>
4739</li>
4740</ol>
4741</blockquote>
4742````````````````````````````````
4743
4744
4745```````````````````````````````` example
4746> 1. > Blockquote
4747> continued here.
4748.
4749<blockquote>
4750<ol>
4751<li>
4752<blockquote>
4753<p>Blockquote
4754continued here.</p>
4755</blockquote>
4756</li>
4757</ol>
4758</blockquote>
4759````````````````````````````````
4760
4761
4762
47636.  **That's all.** Nothing that is not counted as a list item by rules
4764    #1--5 counts as a [list item](#list-items).
4765
4766The rules for sublists follow from the general rules
4767[above][List items].  A sublist must be indented the same number
4768of spaces a paragraph would need to be in order to be included
4769in the list item.
4770
4771So, in this case we need two spaces indent:
4772
4773```````````````````````````````` example
4774- foo
4775  - bar
4776    - baz
4777      - boo
4778.
4779<ul>
4780<li>foo
4781<ul>
4782<li>bar
4783<ul>
4784<li>baz
4785<ul>
4786<li>boo</li>
4787</ul>
4788</li>
4789</ul>
4790</li>
4791</ul>
4792</li>
4793</ul>
4794````````````````````````````````
4795
4796
4797One is not enough:
4798
4799```````````````````````````````` example
4800- foo
4801 - bar
4802  - baz
4803   - boo
4804.
4805<ul>
4806<li>foo</li>
4807<li>bar</li>
4808<li>baz</li>
4809<li>boo</li>
4810</ul>
4811````````````````````````````````
4812
4813
4814Here we need four, because the list marker is wider:
4815
4816```````````````````````````````` example
481710) foo
4818    - bar
4819.
4820<ol start="10">
4821<li>foo
4822<ul>
4823<li>bar</li>
4824</ul>
4825</li>
4826</ol>
4827````````````````````````````````
4828
4829
4830Three is not enough:
4831
4832```````````````````````````````` example
483310) foo
4834   - bar
4835.
4836<ol start="10">
4837<li>foo</li>
4838</ol>
4839<ul>
4840<li>bar</li>
4841</ul>
4842````````````````````````````````
4843
4844
4845A list may be the first block in a list item:
4846
4847```````````````````````````````` example
4848- - foo
4849.
4850<ul>
4851<li>
4852<ul>
4853<li>foo</li>
4854</ul>
4855</li>
4856</ul>
4857````````````````````````````````
4858
4859
4860```````````````````````````````` example
48611. - 2. foo
4862.
4863<ol>
4864<li>
4865<ul>
4866<li>
4867<ol start="2">
4868<li>foo</li>
4869</ol>
4870</li>
4871</ul>
4872</li>
4873</ol>
4874````````````````````````````````
4875
4876
4877A list item can contain a heading:
4878
4879```````````````````````````````` example
4880- # Foo
4881- Bar
4882  ---
4883  baz
4884.
4885<ul>
4886<li>
4887<h1>Foo</h1>
4888</li>
4889<li>
4890<h2>Bar</h2>
4891baz</li>
4892</ul>
4893````````````````````````````````
4894
4895
4896### Motivation
4897
4898John Gruber's Markdown spec says the following about list items:
4899
49001. "List markers typically start at the left margin, but may be indented
4901   by up to three spaces. List markers must be followed by one or more
4902   spaces or a tab."
4903
49042. "To make lists look nice, you can wrap items with hanging indents....
4905   But if you don't want to, you don't have to."
4906
49073. "List items may consist of multiple paragraphs. Each subsequent
4908   paragraph in a list item must be indented by either 4 spaces or one
4909   tab."
4910
49114. "It looks nice if you indent every line of the subsequent paragraphs,
4912   but here again, Markdown will allow you to be lazy."
4913
49145. "To put a blockquote within a list item, the blockquote's `>`
4915   delimiters need to be indented."
4916
49176. "To put a code block within a list item, the code block needs to be
4918   indented twice — 8 spaces or two tabs."
4919
4920These rules specify that a paragraph under a list item must be indented
4921four spaces (presumably, from the left margin, rather than the start of
4922the list marker, but this is not said), and that code under a list item
4923must be indented eight spaces instead of the usual four.  They also say
4924that a block quote must be indented, but not by how much; however, the
4925example given has four spaces indentation.  Although nothing is said
4926about other kinds of block-level content, it is certainly reasonable to
4927infer that *all* block elements under a list item, including other
4928lists, must be indented four spaces.  This principle has been called the
4929*four-space rule*.
4930
4931The four-space rule is clear and principled, and if the reference
4932implementation `Markdown.pl` had followed it, it probably would have
4933become the standard.  However, `Markdown.pl` allowed paragraphs and
4934sublists to start with only two spaces indentation, at least on the
4935outer level.  Worse, its behavior was inconsistent: a sublist of an
4936outer-level list needed two spaces indentation, but a sublist of this
4937sublist needed three spaces.  It is not surprising, then, that different
4938implementations of Markdown have developed very different rules for
4939determining what comes under a list item.  (Pandoc and python-Markdown,
4940for example, stuck with Gruber's syntax description and the four-space
4941rule, while discount, redcarpet, marked, PHP Markdown, and others
4942followed `Markdown.pl`'s behavior more closely.)
4943
4944Unfortunately, given the divergences between implementations, there
4945is no way to give a spec for list items that will be guaranteed not
4946to break any existing documents.  However, the spec given here should
4947correctly handle lists formatted with either the four-space rule or
4948the more forgiving `Markdown.pl` behavior, provided they are laid out
4949in a way that is natural for a human to read.
4950
4951The strategy here is to let the width and indentation of the list marker
4952determine the indentation necessary for blocks to fall under the list
4953item, rather than having a fixed and arbitrary number.  The writer can
4954think of the body of the list item as a unit which gets indented to the
4955right enough to fit the list marker (and any indentation on the list
4956marker).  (The laziness rule, #5, then allows continuation lines to be
4957unindented if needed.)
4958
4959This rule is superior, we claim, to any rule requiring a fixed level of
4960indentation from the margin.  The four-space rule is clear but
4961unnatural. It is quite unintuitive that
4962
4963``` markdown
4964- foo
4965
4966  bar
4967
4968  - baz
4969```
4970
4971should be parsed as two lists with an intervening paragraph,
4972
4973``` html
4974<ul>
4975<li>foo</li>
4976</ul>
4977<p>bar</p>
4978<ul>
4979<li>baz</li>
4980</ul>
4981```
4982
4983as the four-space rule demands, rather than a single list,
4984
4985``` html
4986<ul>
4987<li>
4988<p>foo</p>
4989<p>bar</p>
4990<ul>
4991<li>baz</li>
4992</ul>
4993</li>
4994</ul>
4995```
4996
4997The choice of four spaces is arbitrary.  It can be learned, but it is
4998not likely to be guessed, and it trips up beginners regularly.
4999
5000Would it help to adopt a two-space rule?  The problem is that such
5001a rule, together with the rule allowing 1--3 spaces indentation of the
5002initial list marker, allows text that is indented *less than* the
5003original list marker to be included in the list item. For example,
5004`Markdown.pl` parses
5005
5006``` markdown
5007   - one
5008
5009  two
5010```
5011
5012as a single list item, with `two` a continuation paragraph:
5013
5014``` html
5015<ul>
5016<li>
5017<p>one</p>
5018<p>two</p>
5019</li>
5020</ul>
5021```
5022
5023and similarly
5024
5025``` markdown
5026>   - one
5027>
5028>  two
5029```
5030
5031as
5032
5033``` html
5034<blockquote>
5035<ul>
5036<li>
5037<p>one</p>
5038<p>two</p>
5039</li>
5040</ul>
5041</blockquote>
5042```
5043
5044This is extremely unintuitive.
5045
5046Rather than requiring a fixed indent from the margin, we could require
5047a fixed indent (say, two spaces, or even one space) from the list marker (which
5048may itself be indented).  This proposal would remove the last anomaly
5049discussed.  Unlike the spec presented above, it would count the following
5050as a list item with a subparagraph, even though the paragraph `bar`
5051is not indented as far as the first paragraph `foo`:
5052
5053``` markdown
5054 10. foo
5055
5056   bar
5057```
5058
5059Arguably this text does read like a list item with `bar` as a subparagraph,
5060which may count in favor of the proposal.  However, on this proposal indented
5061code would have to be indented six spaces after the list marker.  And this
5062would break a lot of existing Markdown, which has the pattern:
5063
5064``` markdown
50651.  foo
5066
5067        indented code
5068```
5069
5070where the code is indented eight spaces.  The spec above, by contrast, will
5071parse this text as expected, since the code block's indentation is measured
5072from the beginning of `foo`.
5073
5074The one case that needs special treatment is a list item that *starts*
5075with indented code.  How much indentation is required in that case, since
5076we don't have a "first paragraph" to measure from?  Rule #2 simply stipulates
5077that in such cases, we require one space indentation from the list marker
5078(and then the normal four spaces for the indented code).  This will match the
5079four-space rule in cases where the list marker plus its initial indentation
5080takes four spaces (a common case), but diverge in other cases.
5081
5082<div class="extension">
5083
5084## Task list items (extension)
5085
5086GFM enables the `tasklist` extension, where an additional processing step is
5087performed on [list items].
5088
A [task list item](@) is a [list item][list items] where the first block in it
is a paragraph which begins with a [task list item marker] followed by at least
one whitespace character before any other content.
5092
A [task list item marker](@) consists of zero or more spaces, a left
bracket (`[`), either a whitespace character or the letter `x` in either
lowercase or uppercase, and then a right bracket (`]`).
5096
5097When rendered, the [task list item marker] is replaced with a semantic checkbox element;
5098in an HTML output, this would be an `<input type="checkbox">` element.
5099
5100If the character between the brackets is a whitespace character, the checkbox
5101is unchecked.  Otherwise, the checkbox is checked.
5102
This spec does not define how the checkbox elements are interacted with: in practice,
implementors are free to render the checkboxes as disabled or immutable elements,
or they may handle interactions dynamically (e.g. checking, unchecking) in
the final rendered document.
5107
5108```````````````````````````````` example disabled
5109- [ ] foo
5110- [x] bar
5111.
5112<ul>
5113<li><input disabled="" type="checkbox"> foo</li>
5114<li><input checked="" disabled="" type="checkbox"> bar</li>
5115</ul>
5116````````````````````````````````
5117
5118Task lists can be arbitrarily nested:
5119
5120```````````````````````````````` example disabled
5121- [x] foo
5122  - [ ] bar
5123  - [x] baz
5124- [ ] bim
5125.
5126<ul>
5127<li><input checked="" disabled="" type="checkbox"> foo
5128<ul>
5129<li><input disabled="" type="checkbox"> bar</li>
5130<li><input checked="" disabled="" type="checkbox"> baz</li>
5131</ul>
5132</li>
5133<li><input disabled="" type="checkbox"> bim</li>
5134</ul>
5135````````````````````````````````
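The task list item marker definition in this section can be sketched as a small Python check. This is a hedged illustration rather than the behavior of any particular implementation: the name `task_state` is hypothetical, and the function assumes it receives the raw text of the first paragraph of a list item.

```python
import re

# The whitespace characters as a regex character class: space, tab, newline,
# line tabulation, form feed, and carriage return.
_WS = r"[ \t\n\v\f\r]"

# Optional spaces, "[", a whitespace character or x/X, "]", and then at least
# one whitespace character before any other content.
_TASK = re.compile(r"^ *\[(" + _WS + r"|[xX])\]" + _WS)


def task_state(paragraph_text):
    """Return True for a checked item, False for an unchecked item, and None
    if the paragraph does not begin with a task list item marker."""
    m = _TASK.match(paragraph_text)
    if m is None:
        return None
    # Whitespace between the brackets means unchecked; x or X means checked.
    return m.group(1) in "xX"
```

With this sketch, `task_state("[ ] foo")` is `False`, `task_state("[x] bar")` is `True`, and `task_state("[x]bar")` is `None` because no whitespace follows the marker.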
5136
5137</div>
5138
5139## Lists
5140
5141A [list](@) is a sequence of one or more
5142list items [of the same type].  The list items
5143may be separated by any number of blank lines.
5144
5145Two list items are [of the same type](@)
5146if they begin with a [list marker] of the same type.
5147Two list markers are of the
5148same type if (a) they are bullet list markers using the same character
5149(`-`, `+`, or `*`) or (b) they are ordered list numbers with the same
5150delimiter (either `.` or `)`).
5151
5152A list is an [ordered list](@)
5153if its constituent list items begin with
5154[ordered list markers], and a
5155[bullet list](@) if its constituent list
5156items begin with [bullet list markers].
5157
5158The [start number](@)
5159of an [ordered list] is determined by the list number of
5160its initial list item.  The numbers of subsequent list items are
5161disregarded.
5162
5163A list is [loose](@) if any of its constituent
5164list items are separated by blank lines, or if any of its constituent
5165list items directly contain two block-level elements with a blank line
5166between them.  Otherwise a list is [tight](@).
5167(The difference in HTML output is that paragraphs in a loose list are
5168wrapped in `<p>` tags, while paragraphs in a tight list are not.)
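As a rough illustration of the loose/tight definition, the following Python sketch classifies a list from two per-item flags. The `Item` class and its field names are assumptions invented for this example; a real parser would derive this information from its block tree.

```python
from dataclasses import dataclass


@dataclass
class Item:
    # True if a blank line separates this item from the next item in the list.
    blank_before_next: bool = False
    # True if two block-level elements directly contained in this item have a
    # blank line between them.
    blank_between_children: bool = False


def is_loose(items):
    """Return True if the list made of items is loose, False if it is tight."""
    # A blank line after the final item does not separate two items.
    separated = any(item.blank_before_next for item in items[:-1])
    spaced_children = any(item.blank_between_children for item in items)
    return separated or spaced_children
```

For instance, `is_loose([Item(), Item(blank_before_next=True), Item()])` returns `True`, while a list whose items all keep the default flags is tight.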
5169
5170Changing the bullet or ordered list delimiter starts a new list:
5171
5172```````````````````````````````` example
5173- foo
5174- bar
5175+ baz
5176.
5177<ul>
5178<li>foo</li>
5179<li>bar</li>
5180</ul>
5181<ul>
5182<li>baz</li>
5183</ul>
5184````````````````````````````````
5185
5186
5187```````````````````````````````` example
51881. foo
51892. bar
51903) baz
5191.
5192<ol>
5193<li>foo</li>
5194<li>bar</li>
5195</ol>
5196<ol start="3">
5197<li>baz</li>
5198</ol>
5199````````````````````````````````
5200
5201
5202In CommonMark, a list can interrupt a paragraph. That is,
5203no blank line is needed to separate a paragraph from a following
5204list:
5205
5206```````````````````````````````` example
5207Foo
5208- bar
5209- baz
5210.
5211<p>Foo</p>
5212<ul>
5213<li>bar</li>
5214<li>baz</li>
5215</ul>
5216````````````````````````````````
5217
5218`Markdown.pl` does not allow this, through fear of triggering a list
5219via a numeral in a hard-wrapped line:
5220
5221``` markdown
5222The number of windows in my house is
522314.  The number of doors is 6.
5224```
5225
5226Oddly, though, `Markdown.pl` *does* allow a blockquote to
5227interrupt a paragraph, even though the same considerations might
5228apply.
5229
5230In CommonMark, we do allow lists to interrupt paragraphs, for
5231two reasons.  First, it is natural and not uncommon for people
5232to start lists without blank lines:
5233
5234``` markdown
5235I need to buy
5236- new shoes
5237- a coat
5238- a plane ticket
5239```
5240
5241Second, we are attracted to a
5242
5243> [principle of uniformity](@):
5244> if a chunk of text has a certain
5245> meaning, it will continue to have the same meaning when put into a
5246> container block (such as a list item or blockquote).
5247
5248(Indeed, the spec for [list items] and [block quotes] presupposes
5249this principle.) This principle implies that if
5250
5251``` markdown
5252  * I need to buy
5253    - new shoes
5254    - a coat
5255    - a plane ticket
5256```
5257
5258is a list item containing a paragraph followed by a nested sublist,
5259as all Markdown implementations agree it is (though the paragraph
5260may be rendered without `<p>` tags, since the list is "tight"),
5261then
5262
5263``` markdown
5264I need to buy
5265- new shoes
5266- a coat
5267- a plane ticket
5268```
5269
by itself should be a paragraph followed by a list.
5271
Since it is well-established Markdown practice to allow lists to
interrupt paragraphs inside list items, the [principle of
uniformity] requires us to allow this outside list items as
well.  ([reStructuredText](http://docutils.sourceforge.net/rst.html)
takes a different approach, requiring blank lines before lists
even inside other list items.)
5278
5279In order to solve the problem of unwanted lists in paragraphs with
5280hard-wrapped numerals, we allow only lists starting with `1` to
5281interrupt paragraphs.  Thus,
5282
5283```````````````````````````````` example
5284The number of windows in my house is
528514.  The number of doors is 6.
5286.
5287<p>The number of windows in my house is
528814.  The number of doors is 6.</p>
5289````````````````````````````````
5290
5291We may still get an unintended result in cases like
5292
5293```````````````````````````````` example
5294The number of windows in my house is
52951.  The number of doors is 6.
5296.
5297<p>The number of windows in my house is</p>
5298<ol>
5299<li>The number of doors is 6.</li>
5300</ol>
5301````````````````````````````````
5302
5303but this rule should prevent most spurious list captures.
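The interruption rule can be summarized in a short Python sketch, which also folds in the earlier restriction that an empty list item cannot interrupt a paragraph. The name `can_interrupt_paragraph` and the pattern are assumptions for illustration; the sketch ignores tabs and does not exclude thematic breaks such as `* * *`.

```python
import re

# Hypothetical pattern: 0-3 spaces, a bullet marker or an ordered marker (whose
# digits are captured), then optionally one or more spaces and some content.
_MARKER = re.compile(r"^ {0,3}(?:[-+*]|([0-9]{1,9})[.)])(?: +(\S.*))?$")


def can_interrupt_paragraph(line):
    """Return True if line, appearing right after a paragraph line, would
    start a list instead of continuing the paragraph."""
    m = _MARKER.match(line)
    if m is None:
        return False
    number, content = m.groups()
    if content is None:
        # An empty list item cannot interrupt a paragraph.
        return False
    # Bullet lists may interrupt; ordered lists only when they start with 1.
    return number is None or int(number) == 1
```

Under these assumptions, `can_interrupt_paragraph("14.  The number of doors is 6.")` is `False`, whereas the same line beginning with `1.` returns `True`, which is the unintended capture shown in the last example.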
5304
5305There can be any number of blank lines between items:
5306
5307```````````````````````````````` example
5308- foo
5309
5310- bar
5311
5312
5313- baz
5314.
5315<ul>
5316<li>
5317<p>foo</p>
5318</li>
5319<li>
5320<p>bar</p>
5321</li>
5322<li>
5323<p>baz</p>
5324</li>
5325</ul>
5326````````````````````````````````
5327
5328```````````````````````````````` example
5329- foo
5330  - bar
5331    - baz
5332
5333
5334      bim
5335.
5336<ul>
5337<li>foo
5338<ul>
5339<li>bar
5340<ul>
5341<li>
5342<p>baz</p>
5343<p>bim</p>
5344</li>
5345</ul>
5346</li>
5347</ul>
5348</li>
5349</ul>
5350````````````````````````````````
5351
5352
5353To separate consecutive lists of the same type, or to separate a
5354list from an indented code block that would otherwise be parsed
5355as a subparagraph of the final list item, you can insert a blank HTML
5356comment:
5357
5358```````````````````````````````` example
5359- foo
5360- bar
5361
5362<!-- -->
5363
5364- baz
5365- bim
5366.
5367<ul>
5368<li>foo</li>
5369<li>bar</li>
5370</ul>
5371<!-- -->
5372<ul>
5373<li>baz</li>
5374<li>bim</li>
5375</ul>
5376````````````````````````````````
5377
5378
5379```````````````````````````````` example
5380-   foo
5381
5382    notcode
5383
5384-   foo
5385
5386<!-- -->
5387
5388    code
5389.
5390<ul>
5391<li>
5392<p>foo</p>
5393<p>notcode</p>
5394</li>
5395<li>
5396<p>foo</p>
5397</li>
5398</ul>
5399<!-- -->
5400<pre><code>code
5401</code></pre>
5402````````````````````````````````
5403
5404
5405List items need not be indented to the same level.  The following
5406list items will be treated as items at the same list level,
5407since none is indented enough to belong to the previous list
5408item:
5409
5410```````````````````````````````` example
5411- a
5412 - b
5413  - c
5414   - d
5415  - e
5416 - f
5417- g
5418.
5419<ul>
5420<li>a</li>
5421<li>b</li>
5422<li>c</li>
5423<li>d</li>
5424<li>e</li>
5425<li>f</li>
5426<li>g</li>
5427</ul>
5428````````````````````````````````
5429
5430
5431```````````````````````````````` example
54321. a
5433
5434  2. b
5435
5436   3. c
5437.
5438<ol>
5439<li>
5440<p>a</p>
5441</li>
5442<li>
5443<p>b</p>
5444</li>
5445<li>
5446<p>c</p>
5447</li>
5448</ol>
5449````````````````````````````````
5450
5451Note, however, that list items may not be indented more than
5452three spaces.  Here `- e` is treated as a paragraph continuation
5453line, because it is indented more than three spaces:
5454
5455```````````````````````````````` example
5456- a
5457 - b
5458  - c
5459   - d
5460    - e
5461.
5462<ul>
5463<li>a</li>
5464<li>b</li>
5465<li>c</li>
5466<li>d
5467- e</li>
5468</ul>
5469````````````````````````````````
5470
5471And here, `3. c` is treated as an indented code block,
5472because it is indented four spaces and preceded by a
5473blank line.
5474
5475```````````````````````````````` example
54761. a
5477
5478  2. b
5479
5480    3. c
5481.
5482<ol>
5483<li>
5484<p>a</p>
5485</li>
5486<li>
5487<p>b</p>
5488</li>
5489</ol>
5490<pre><code>3. c
5491</code></pre>
5492````````````````````````````````
5493
5494
5495This is a loose list, because there is a blank line between
5496two of the list items:
5497
5498```````````````````````````````` example
5499- a
5500- b
5501
5502- c
5503.
5504<ul>
5505<li>
5506<p>a</p>
5507</li>
5508<li>
5509<p>b</p>
5510</li>
5511<li>
5512<p>c</p>
5513</li>
5514</ul>
5515````````````````````````````````
5516
5517
5518So is this, with an empty second item:
5519
5520```````````````````````````````` example
5521* a
5522*
5523
5524* c
5525.
5526<ul>
5527<li>
5528<p>a</p>
5529</li>
5530<li></li>
5531<li>
5532<p>c</p>
5533</li>
5534</ul>
5535````````````````````````````````
5536
5537
5538These are loose lists, even though there is no blank line between the items,
5539because one of the items directly contains two block-level elements
5540with a blank line between them:
5541
5542```````````````````````````````` example
5543- a
5544- b
5545
5546  c
5547- d
5548.
5549<ul>
5550<li>
5551<p>a</p>
5552</li>
5553<li>
5554<p>b</p>
5555<p>c</p>
5556</li>
5557<li>
5558<p>d</p>
5559</li>
5560</ul>
5561````````````````````````````````
5562
5563
5564```````````````````````````````` example
5565- a
5566- b
5567
5568  [ref]: /url
5569- d
5570.
5571<ul>
5572<li>
5573<p>a</p>
5574</li>
5575<li>
5576<p>b</p>
5577</li>
5578<li>
5579<p>d</p>
5580</li>
5581</ul>
5582````````````````````````````````
5583
5584
5585This is a tight list, because the blank lines are in a code block:
5586
5587```````````````````````````````` example
5588- a
5589- ```
5590  b
5591
5592
5593  ```
5594- c
5595.
5596<ul>
5597<li>a</li>
5598<li>
5599<pre><code>b
5600
5601
5602</code></pre>
5603</li>
5604<li>c</li>
5605</ul>
5606````````````````````````````````
5607
5608
5609This is a tight list, because the blank line is between two
5610paragraphs of a sublist.  So the sublist is loose while
5611the outer list is tight:
5612
5613```````````````````````````````` example
5614- a
5615  - b
5616
5617    c
5618- d
5619.
5620<ul>
5621<li>a
5622<ul>
5623<li>
5624<p>b</p>
5625<p>c</p>
5626</li>
5627</ul>
5628</li>
5629<li>d</li>
5630</ul>
5631````````````````````````````````
5632
5633
5634This is a tight list, because the blank line is inside the
5635block quote:
5636
5637```````````````````````````````` example
5638* a
5639  > b
5640  >
5641* c
5642.
5643<ul>
5644<li>a
5645<blockquote>
5646<p>b</p>
5647</blockquote>
5648</li>
5649<li>c</li>
5650</ul>
5651````````````````````````````````
5652
5653
5654This list is tight, because the consecutive block elements
5655are not separated by blank lines:
5656
5657```````````````````````````````` example
5658- a
5659  > b
5660  ```
5661  c
5662  ```
5663- d
5664.
5665<ul>
5666<li>a
5667<blockquote>
5668<p>b</p>
5669</blockquote>
5670<pre><code>c
5671</code></pre>
5672</li>
5673<li>d</li>
5674</ul>
5675````````````````````````````````
5676
5677
5678A single-paragraph list is tight:
5679
5680```````````````````````````````` example
5681- a
5682.
5683<ul>
5684<li>a</li>
5685</ul>
5686````````````````````````````````
5687
5688
5689```````````````````````````````` example
5690- a
5691  - b
5692.
5693<ul>
5694<li>a
5695<ul>
5696<li>b</li>
5697</ul>
5698</li>
5699</ul>
5700````````````````````````````````
5701
5702
5703This list is loose, because of the blank line between the
5704two block elements in the list item:
5705
5706```````````````````````````````` example
57071. ```
5708   foo
5709   ```
5710
5711   bar
5712.
5713<ol>
5714<li>
5715<pre><code>foo
5716</code></pre>
5717<p>bar</p>
5718</li>
5719</ol>
5720````````````````````````````````
5721
5722
5723Here the outer list is loose, the inner list tight:
5724
5725```````````````````````````````` example
5726* foo
5727  * bar
5728
5729  baz
5730.
5731<ul>
5732<li>
5733<p>foo</p>
5734<ul>
5735<li>bar</li>
5736</ul>
5737<p>baz</p>
5738</li>
5739</ul>
5740````````````````````````````````
5741
5742
5743```````````````````````````````` example
5744- a
5745  - b
5746  - c
5747
5748- d
5749  - e
5750  - f
5751.
5752<ul>
5753<li>
5754<p>a</p>
5755<ul>
5756<li>b</li>
5757<li>c</li>
5758</ul>
5759</li>
5760<li>
5761<p>d</p>
5762<ul>
5763<li>e</li>
5764<li>f</li>
5765</ul>
5766</li>
5767</ul>
5768````````````````````````````````
5769
5770
5771# Inlines
5772
5773Inlines are parsed sequentially from the beginning of the character
5774stream to the end (left to right, in left-to-right languages).
5775Thus, for example, in
5776
5777```````````````````````````````` example
5778`hi`lo`
5779.
5780<p><code>hi</code>lo`</p>
5781````````````````````````````````
5782
5783`hi` is parsed as code, leaving the backtick at the end as a literal
5784backtick.
5785
5786
5787## Backslash escapes
5788
5789Any ASCII punctuation character may be backslash-escaped:
5790
5791```````````````````````````````` example
5792\!\"\#\$\%\&\'\(\)\*\+\,\-\.\/\:\;\<\=\>\?\@\[\\\]\^\_\`\{\|\}\~
5793.
5794<p>!&quot;#$%&amp;'()*+,-./:;&lt;=&gt;?@[\]^_`{|}~</p>
5795````````````````````````````````
5796
5797
5798Backslashes before other characters are treated as literal
5799backslashes:
5800
5801```````````````````````````````` example
5802\→\A\a\ \3\φ\«
5803.
5804<p>\→\A\a\ \3\φ\«</p>
5805````````````````````````````````
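
The two rules above can be sketched in a few lines of Python. This is only an illustration, not part of the specification: the function name `unescape_backslashes` is hypothetical, and the sketch deliberately ignores context (code spans, code blocks, autolinks, raw HTML, and the hard line break case), so it only shows which backslashes are dropped and which are kept.

```python
import re
import string

# string.punctuation holds the 32 ASCII punctuation characters.
# Assumption: this matches the set the spec calls "ASCII punctuation".
_ESCAPABLE = re.compile(r"\\([" + re.escape(string.punctuation) + r"])")


def unescape_backslashes(text: str) -> str:
    """Drop the backslash from every backslash + ASCII punctuation pair.

    A backslash before any other character is left in place as a
    literal backslash.
    """
    return _ESCAPABLE.sub(r"\1", text)
```

Because the scan runs left to right and consumes two characters per match, an escaped backslash never "uses up" the character after it: `\\*` yields a literal backslash followed by an unescaped `*`.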
5806
5807
5808Escaped characters are treated as regular characters and do
5809not have their usual Markdown meanings:
5810
5811```````````````````````````````` example
5812\*not emphasized*
5813\<br/> not a tag
5814\[not a link](/foo)
5815\`not code`
58161\. not a list
5817\* not a list
5818\# not a heading
5819\[foo]: /url "not a reference"
5820\&ouml; not a character entity
5821.
5822<p>*not emphasized*
5823&lt;br/&gt; not a tag
5824[not a link](/foo)
5825`not code`
58261. not a list
5827* not a list
5828# not a heading
5829[foo]: /url &quot;not a reference&quot;
5830&amp;ouml; not a character entity</p>
5831````````````````````````````````
5832
5833
5834If a backslash is itself escaped, the following character is not:
5835
5836```````````````````````````````` example
5837\\*emphasis*
5838.
5839<p>\<em>emphasis</em></p>
5840````````````````````````````````
5841
5842
5843A backslash at the end of the line is a [hard line break]:
5844
5845```````````````````````````````` example
5846foo\
5847bar
5848.
5849<p>foo<br />
5850bar</p>
5851````````````````````````````````
5852
5853
5854Backslash escapes do not work in code blocks, code spans, autolinks, or
5855raw HTML:
5856
5857```````````````````````````````` example
5858`` \[\` ``
5859.
5860<p><code>\[\`</code></p>
5861````````````````````````````````
5862
5863
5864```````````````````````````````` example
5865    \[\]
5866.
5867<pre><code>\[\]
5868</code></pre>
5869````````````````````````````````
5870
5871
5872```````````````````````````````` example
5873~~~
5874\[\]
5875~~~
5876.
5877<pre><code>\[\]
5878</code></pre>
5879````````````````````````````````
5880
5881
5882```````````````````````````````` example
5883<http://example.com?find=\*>
5884.
5885<p><a href="http://example.com?find=%5C*">http://example.com?find=\*</a></p>
5886````````````````````````````````
5887
5888
5889```````````````````````````````` example
5890<a href="/bar\/)">
5891.
5892<a href="/bar\/)">
5893````````````````````````````````
5894
5895
5896But they work in all other contexts, including URLs and link titles,
5897link references, and [info strings] in [fenced code blocks]:
5898
5899```````````````````````````````` example
5900[foo](/bar\* "ti\*tle")
5901.
5902<p><a href="/bar*" title="ti*tle">foo</a></p>
5903````````````````````````````````
5904
5905
5906```````````````````````````````` example
5907[foo]
5908
5909[foo]: /bar\* "ti\*tle"
5910.
5911<p><a href="/bar*" title="ti*tle">foo</a></p>
5912````````````````````````````````
5913
5914
5915```````````````````````````````` example
5916``` foo\+bar
5917foo
5918```
5919.
5920<pre><code class="language-foo+bar">foo
5921</code></pre>
5922````````````````````````````````
5923
5924
5925
5926## Entity and numeric character references
5927
5928Valid HTML entity references and numeric character references
5929can be used in place of the corresponding Unicode character,
5930with the following exceptions:
5931
5932- Entity and character references are not recognized in code
5933  blocks and code spans.
5934
5935- Entity and character references cannot stand in place of
5936  special characters that define structural elements in
5937  CommonMark.  For example, although `&#42;` can be used
5938  in place of a literal `*` character, `&#42;` cannot replace
5939  `*` in emphasis delimiters, bullet list markers, or thematic
5940  breaks.
5941
5942Conforming CommonMark parsers need not store information about
5943whether a particular character was represented in the source
5944using a Unicode character or an entity reference.
5945
5946[Entity references](@) consist of `&` + any of the valid
5947HTML5 entity names + `;`. The
5948document <https://html.spec.whatwg.org/multipage/entities.json>
5949is used as an authoritative source for the valid entity
5950references and their corresponding code points.
5951
5952```````````````````````````````` example
5953&nbsp; &amp; &copy; &AElig; &Dcaron;
5954&frac34; &HilbertSpace; &DifferentialD;
5955&ClockwiseContourIntegral; &ngE;
5956.
5957<p>  &amp; © Æ Ď
5958¾ ℋ ⅆ
5959∲ ≧̸</p>
5960````````````````````````````````
5961
5962
5963[Decimal numeric character
5964references](@)
5965consist of `&#` + a string of 1--7 arabic digits + `;`. A
5966numeric character reference is parsed as the corresponding
5967Unicode character. Invalid Unicode code points will be replaced by
5968the REPLACEMENT CHARACTER (`U+FFFD`).  For security reasons,
5969the code point `U+0000` will also be replaced by `U+FFFD`.
5970
5971```````````````````````````````` example
5972&#35; &#1234; &#992; &#0;
5973.
5974<p># Ӓ Ϡ �</p>
5975````````````````````````````````
5976
5977
5978[Hexadecimal numeric character
5979references](@) consist of `&#` +
5980either `X` or `x` + a string of 1--6 hexadecimal digits + `;`.
5981They too are parsed as the corresponding Unicode character (this
5982time specified with a hexadecimal numeral instead of decimal).
5983
5984```````````````````````````````` example
5985&#X22; &#XD06; &#xcab;
5986.
5987<p>&quot; ആ ಫ</p>
5988````````````````````````````````
5989
5990
5991Here are some nonentities:
5992
5993```````````````````````````````` example
5994&nbsp &x; &#; &#x;
5995&#987654321;
5996&#abcdef0;
5997&ThisIsNotDefined; &hi?;
5998.
5999<p>&amp;nbsp &amp;x; &amp;#; &amp;#x;
6000&amp;#987654321;
6001&amp;#abcdef0;
6002&amp;ThisIsNotDefined; &amp;hi?;</p>
6003````````````````````````````````
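
As a rough illustration of the decimal and hexadecimal forms, the following Python sketch decodes numeric character references in a string. The names `decode_numeric_refs` and `_decode` are hypothetical. Treating surrogate code points (`U+D800` to `U+DFFF`) as invalid is an assumption of this sketch; the text above only says that invalid code points and `U+0000` become `U+FFFD`.

```python
import re

# "&#" + 1-7 decimal digits + ";"   or   "&#" + X/x + 1-6 hex digits + ";"
_NUMERIC_REF = re.compile(r"&#(?:([0-9]{1,7})|[Xx]([0-9A-Fa-f]{1,6}));")

REPLACEMENT_CHARACTER = "\ufffd"


def _decode(match: "re.Match[str]") -> str:
    decimal, hexadecimal = match.group(1), match.group(2)
    code = int(decimal, 10) if decimal is not None else int(hexadecimal, 16)
    # U+0000, out-of-range values, and (by assumption) surrogates
    # are replaced rather than emitted.
    if code == 0 or code > 0x10FFFF or 0xD800 <= code <= 0xDFFF:
        return REPLACEMENT_CHARACTER
    return chr(code)


def decode_numeric_refs(text: str) -> str:
    """Replace well-formed numeric character references; leave the rest."""
    return _NUMERIC_REF.sub(_decode, text)
```

Strings such as `&#;`, `&#x;`, `&#987654321;` (too many digits), and `&#abcdef0;` (hex digits without the `x`) do not match the pattern and are left as literal text.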
6004
6005
6006Although HTML5 does accept some entity references
6007without a trailing semicolon (such as `&copy`), these are not
6008recognized here, because allowing them would make the grammar too ambiguous:
6009
6010```````````````````````````````` example
6011&copy
6012.
6013<p>&amp;copy</p>
6014````````````````````````````````
6015
6016
6017Strings that are not on the list of HTML5 named entities are not
6018recognized as entity references either:
6019
6020```````````````````````````````` example
6021&MadeUpEntity;
6022.
6023<p>&amp;MadeUpEntity;</p>
6024````````````````````````````````
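
The two restrictions just shown (a trailing semicolon is required, and the name must be on the HTML5 list) can be sketched as follows. This is a minimal illustration, assuming that the `html5` table in Python's standard `html.entities` module is an acceptable stand-in for `entities.json`; the function name `decode_named_refs` is hypothetical.

```python
import re
from html.entities import html5

# "&" + a name + ";".  The semicolon is captured with the name because
# the keys of the html5 table include it (for example "copy;").
_NAMED_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*;)")


def decode_named_refs(text: str) -> str:
    """Replace known HTML5 entity references; leave everything else."""

    def replace(match: "re.Match[str]") -> str:
        # Unknown names fall back to the original matched text.
        return html5.get(match.group(1), match.group(0))

    return _NAMED_REF.sub(replace, text)
```

The pattern never matches `&copy` without a semicolon, even though the `html5` table also lists a few legacy names in that form, so such strings stay literal.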
6025
6026
6027Entity and numeric character references are recognized in any
6028context besides code spans or code blocks, including
6029URLs, [link titles], and [fenced code block][] [info strings]:
6030
6031```````````````````````````````` example
6032<a href="&ouml;&ouml;.html">
6033.
6034<a href="&ouml;&ouml;.html">
6035````````````````````````````````
6036
6037
6038```````````````````````````````` example
6039[foo](/f&ouml;&ouml; "f&ouml;&ouml;")
6040.
6041<p><a href="/f%C3%B6%C3%B6" title="föö">foo</a></p>
6042````````````````````````````````
6043
6044
6045```````````````````````````````` example
6046[foo]
6047
6048[foo]: /f&ouml;&ouml; "f&ouml;&ouml;"
6049.
6050<p><a href="/f%C3%B6%C3%B6" title="föö">foo</a></p>
6051````````````````````````````````
6052
6053
6054```````````````````````````````` example
6055``` f&ouml;&ouml;
6056foo
6057```
6058.
6059<pre><code class="language-föö">foo
6060</code></pre>
6061````````````````````````````````
6062
6063
6064Entity and numeric character references are treated as literal
6065text in code spans and code blocks:
6066
6067```````````````````````````````` example
6068`f&ouml;&ouml;`
6069.
6070<p><code>f&amp;ouml;&amp;ouml;</code></p>
6071````````````````````````````````
6072
6073
6074```````````````````````````````` example
6075    f&ouml;f&ouml;
6076.
6077<pre><code>f&amp;ouml;f&amp;ouml;
6078</code></pre>
6079````````````````````````````````
6080
6081
6082Entity and numeric character references cannot be used
6083in place of symbols indicating structure in CommonMark
6084documents.
6085
6086```````````````````````````````` example
6087&#42;foo&#42;
6088*foo*
6089.
6090<p>*foo*
6091<em>foo</em></p>
6092````````````````````````````````
6093
6094```````````````````````````````` example
6095&#42; foo
6096
6097* foo
6098.
6099<p>* foo</p>
6100<ul>
6101<li>foo</li>
6102</ul>
6103````````````````````````````````
6104
6105```````````````````````````````` example
6106foo&#10;&#10;bar
6107.
6108<p>foo
6109
6110bar</p>
6111````````````````````````````````
6112
6113```````````````````````````````` example
6114&#9;foo
6115.
6116<p>→foo</p>
6117````````````````````````````````
6118
6119
6120```````````````````````````````` example
6121[a](url &quot;tit&quot;)
6122.
6123<p>[a](url &quot;tit&quot;)</p>
6124````````````````````````````````
6125
6126
6127## Code spans
6128
6129A [backtick string](@)
6130is a string of one or more backtick characters (`` ` ``) that is neither
6131preceded nor followed by a backtick.
6132
6133A [code span](@) begins with a backtick string and ends with
6134a backtick string of equal length.  The contents of the code span are
6135the characters between the two backtick strings, normalized in the
6136following ways:
6137
6138- First, [line endings] are converted to [spaces].
6139- If the resulting string both begins *and* ends with a [space]
6140  character, but does not consist entirely of [space]
6141  characters, a single [space] character is removed from the
6142  front and back.  This allows you to include code that begins
6143  or ends with backtick characters, which must be separated by
6144  whitespace from the opening or closing backtick strings.
6145
6146This is a simple code span:
6147
6148```````````````````````````````` example
6149`foo`
6150.
6151<p><code>foo</code></p>
6152````````````````````````````````
6153
6154
6155Here two backticks are used, because the code contains a backtick.
6156This example also illustrates stripping of a single leading and
6157trailing space:
6158
6159```````````````````````````````` example
6160`` foo ` bar ``
6161.
6162<p><code>foo ` bar</code></p>
6163````````````````````````````````
6164
6165
6166This example shows the motivation for stripping leading and trailing
6167spaces:
6168
6169```````````````````````````````` example
6170` `` `
6171.
6172<p><code>``</code></p>
6173````````````````````````````````
6174
6175Note that only *one* space is stripped:
6176
6177```````````````````````````````` example
6178`  ``  `
6179.
6180<p><code> `` </code></p>
6181````````````````````````````````
6182
6183The stripping only happens if the space is on both
6184sides of the string:
6185
6186```````````````````````````````` example
6187` a`
6188.
6189<p><code> a</code></p>
6190````````````````````````````````
6191
6192Only [spaces], and not [Unicode whitespace] in general, are
6193stripped in this way:
6194
6195```````````````````````````````` example
6196` b `
6197.
6198<p><code> b </code></p>
6199````````````````````````````````
6200
6201No stripping occurs if the code span contains only spaces:
6202
6203```````````````````````````````` example
6204` `
6205`  `
6206.
6207<p><code> </code>
6208<code>  </code></p>
6209````````````````````````````````
6210
6211
6212[Line endings] are treated like spaces:
6213
6214```````````````````````````````` example
6215``
6216foo
6217bar
6218baz
6219``
6220.
6221<p><code>foo bar   baz</code></p>
6222````````````````````````````````
6223
6224```````````````````````````````` example
6225``
6226foo
6227``
6228.
6229<p><code>foo </code></p>
6230````````````````````````````````
6231
6232
6233Interior spaces are not collapsed:
6234
6235```````````````````````````````` example
6236`foo   bar
6237baz`
6238.
6239<p><code>foo   bar  baz</code></p>
6240````````````````````````````````
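
The normalization steps illustrated by the examples above can be sketched in Python. The function name `normalize_code_span` is hypothetical, and its argument is assumed to be the raw text between the opening and closing backtick strings, with the delimiters already removed.

```python
import re


def normalize_code_span(raw: str) -> str:
    """Normalize the raw contents of a code span.

    1. Each line ending becomes a single space.
    2. If the result begins and ends with a space, and is not made up
       entirely of spaces, one space is removed from each side.
    """
    text = re.sub(r"\r\n|\n|\r", " ", raw)
    if text.startswith(" ") and text.endswith(" ") and text.strip(" "):
        text = text[1:-1]
    return text
```

Only the ASCII space is tested, so other whitespace such as a no-break space is never stripped, and interior runs of spaces are left untouched.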
6241
6242Note that browsers will typically collapse consecutive spaces
6243when rendering `<code>` elements, so it is recommended that
6244the following CSS be used:
6245
6246    code{white-space: pre-wrap;}
6247
6248
6249Note that backslash escapes do not work in code spans. All backslashes
6250are treated literally:
6251
6252```````````````````````````````` example
6253`foo\`bar`
6254.
6255<p><code>foo\</code>bar`</p>
6256````````````````````````````````
6257
6258
6259Backslash escapes are never needed, because one can always choose a
6260string of *n* backtick characters as delimiters, where the code does
6261not contain any strings of exactly *n* backtick characters.
6262
6263```````````````````````````````` example
6264``foo`bar``
6265.
6266<p><code>foo`bar</code></p>
6267````````````````````````````````
6268
6269```````````````````````````````` example
6270` foo `` bar `
6271.
6272<p><code>foo `` bar</code></p>
6273````````````````````````````````
6274
6275
6276Code span backticks have higher precedence than any other inline
6277constructs except HTML tags and autolinks.  Thus, for example, this is
6278not parsed as emphasized text, since the second `*` is part of a code
6279span:
6280
6281```````````````````````````````` example
6282*foo`*`
6283.
6284<p>*foo<code>*</code></p>
6285````````````````````````````````
6286
6287
6288And this is not parsed as a link:
6289
6290```````````````````````````````` example
6291[not a `link](/foo`)
6292.
6293<p>[not a <code>link](/foo</code>)</p>
6294````````````````````````````````
6295
6296
6297Code spans, HTML tags, and autolinks have the same precedence.
6298Thus, this is code:
6299
6300```````````````````````````````` example
6301`<a href="`">`
6302.
6303<p><code>&lt;a href=&quot;</code>&quot;&gt;`</p>
6304````````````````````````````````
6305
6306
6307But this is an HTML tag:
6308
6309```````````````````````````````` example
6310<a href="`">`
6311.
6312<p><a href="`">`</p>
6313````````````````````````````````
6314
6315
6316And this is code:
6317
6318```````````````````````````````` example
6319`<http://foo.bar.`baz>`
6320.
6321<p><code>&lt;http://foo.bar.</code>baz&gt;`</p>
6322````````````````````````````````
6323
6324
6325But this is an autolink:
6326
6327```````````````````````````````` example
6328<http://foo.bar.`baz>`
6329.
6330<p><a href="http://foo.bar.%60baz">http://foo.bar.`baz</a>`</p>
6331````````````````````````````````
6332
6333
6334When a backtick string is not closed by a matching backtick string,
6335we just have literal backticks:
6336
6337```````````````````````````````` example
6338```foo``
6339.
6340<p>```foo``</p>
6341````````````````````````````````
6342
6343
6344```````````````````````````````` example
6345`foo
6346.
6347<p>`foo</p>
6348````````````````````````````````
6349
6350The following case also illustrates the need for opening and
6351closing backtick strings to be equal in length:
6352
6353```````````````````````````````` example
6354`foo``bar``
6355.
6356<p>`foo<code>bar</code></p>
6357````````````````````````````````
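
The matching of backtick strings can be sketched as a single left-to-right scan. This is a simplified illustration with a hypothetical function name, `find_code_spans`: it returns the raw, unnormalized contents of each code span and ignores everything else that a real inline parser must handle, such as backslash-escaped backticks outside code spans, HTML tags, and autolinks.

```python
import re


def find_code_spans(text: str) -> list[str]:
    """Return the raw contents of each code span found in text."""
    # Every maximal run of backticks is a backtick string.
    runs = [(m.start(), m.end()) for m in re.finditer(r"`+", text)]
    spans = []
    i = 0
    while i < len(runs):
        open_start, open_end = runs[i]
        length = open_end - open_start
        # Look for the next backtick string of exactly the same length.
        for j in range(i + 1, len(runs)):
            close_start, close_end = runs[j]
            if close_end - close_start == length:
                spans.append(text[open_end:close_start])
                i = j + 1  # resume after the closing backtick string
                break
        else:
            i += 1  # no closer: these backticks are literal
    return spans
```

An unmatched backtick string is skipped rather than ending the scan, which is why a later pair of equal length can still form a code span.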
6358
6359
6360## Emphasis and strong emphasis
6361
6362John Gruber's original [Markdown syntax
6363description](http://daringfireball.net/projects/markdown/syntax#em) says:
6364
6365> Markdown treats asterisks (`*`) and underscores (`_`) as indicators of
6366> emphasis. Text wrapped with one `*` or `_` will be wrapped with an HTML
6367> `<em>` tag; double `*`'s or `_`'s will be wrapped with an HTML `<strong>`
6368> tag.
6369
6370This is enough for most users, but these rules leave much undecided,
6371especially when it comes to nested emphasis.  The original
6372`Markdown.pl` test suite makes it clear that triple `***` and
6373`___` delimiters can be used for strong emphasis, and most
6374implementations have also allowed the following patterns:
6375
6376``` markdown
6377***strong emph***
6378***strong** in emph*
6379***emph* in strong**
6380**in strong *emph***
6381*in emph **strong***
6382```
6383
6384The following patterns are less widely supported, but the intent
6385is clear and they are useful (especially in contexts like bibliography
6386entries):
6387
6388``` markdown
6389*emph *with emph* in it*
6390**strong **with strong** in it**
6391```
6392
6393Many implementations have also restricted intraword emphasis to
6394the `*` forms, to avoid unwanted emphasis in words containing
6395internal underscores.  (It is best practice to put these in code
6396spans, but users often do not.)
6397
6398``` markdown
6399internal emphasis: foo*bar*baz
6400no emphasis: foo_bar_baz
6401```
6402
6403The rules given below capture all of these patterns, while allowing
6404for efficient parsing strategies that do not backtrack.
6405
6406First, some definitions.  A [delimiter run](@) is either
6407a sequence of one or more `*` characters that is not preceded or
6408followed by a non-backslash-escaped `*` character, or a sequence
6409of one or more `_` characters that is not preceded or followed by
6410a non-backslash-escaped `_` character.
6411
6412A [left-flanking delimiter run](@) is
6413a [delimiter run] that is (1) not followed by [Unicode whitespace],
6414and either (2a) not followed by a [punctuation character], or
6415(2b) followed by a [punctuation character] and
6416preceded by [Unicode whitespace] or a [punctuation character].
6417For purposes of this definition, the beginning and the end of
6418the line count as Unicode whitespace.
6419
6420A [right-flanking delimiter run](@) is
6421a [delimiter run] that is (1) not preceded by [Unicode whitespace],
6422and either (2a) not preceded by a [punctuation character], or
6423(2b) preceded by a [punctuation character] and
6424followed by [Unicode whitespace] or a [punctuation character].
6425For purposes of this definition, the beginning and the end of
6426the line count as Unicode whitespace.
6427
6428Here are some examples of delimiter runs.
6429
6430  - left-flanking but not right-flanking:
6431
6432    ```
6433    ***abc
6434      _abc
6435    **"abc"
6436     _"abc"
6437    ```
6438
6439  - right-flanking but not left-flanking:
6440
6441    ```
6442     abc***
6443     abc_
6444    "abc"**
6445    "abc"_
6446    ```
6447
6448  - both left- and right-flanking:
6449
6450    ```
6451     abc***def
6452    "abc"_"def"
6453    ```
6454
6455  - neither left- nor right-flanking:
6456
6457    ```
6458    abc *** def
6459    a _ b
6460    ```
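
The two definitions can be checked mechanically. The Python sketch below is an illustration only; its function names are hypothetical. It assumes the definitions of [Unicode whitespace] and [punctuation character] given elsewhere in this document (the Unicode `Zs` category plus tab, line feed, form feed, and carriage return; ASCII punctuation plus the Unicode `P*` categories), and it does not handle backslash-escaped delimiters.

```python
import re
import string
import unicodedata

_PUNCT_CATEGORIES = {"Pc", "Pd", "Pe", "Pf", "Pi", "Po", "Ps"}


def is_unicode_whitespace(ch):
    # None stands for the beginning or end of the line, which counts
    # as Unicode whitespace for these definitions.
    return ch is None or ch in "\t\n\f\r" or unicodedata.category(ch) == "Zs"


def is_punctuation(ch):
    if ch is None:
        return False
    return ch in string.punctuation or unicodedata.category(ch) in _PUNCT_CATEGORIES


def flanking(text, start, end):
    """Return (left_flanking, right_flanking) for the run text[start:end]."""
    before = text[start - 1] if start > 0 else None
    after = text[end] if end < len(text) else None
    left = not is_unicode_whitespace(after) and (
        not is_punctuation(after)
        or is_unicode_whitespace(before)
        or is_punctuation(before)
    )
    right = not is_unicode_whitespace(before) and (
        not is_punctuation(before)
        or is_unicode_whitespace(after)
        or is_punctuation(after)
    )
    return left, right


def first_run_flanking(text):
    """Classify the first delimiter run in text (assumes one exists)."""
    run = re.search(r"\*+|_+", text)
    return flanking(text, run.start(), run.end())
```

Applied to the examples in the list above, `first_run_flanking` should give `(True, False)` for the first group, `(False, True)` for the second, `(True, True)` for the third, and `(False, False)` for the fourth.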
6461
6462(The idea of distinguishing left-flanking and right-flanking
6463delimiter runs based on the character before and the character
6464after comes from Roopesh Chander's
6465[vfmd](http://www.vfmd.org/vfmd-spec/specification/#procedure-for-identifying-emphasis-tags).
6466vfmd uses the terminology "emphasis indicator string" instead of "delimiter
6467run," and its rules for distinguishing left- and right-flanking runs
6468are a bit more complex than the ones given here.)
6469
6470The following rules define emphasis and strong emphasis:
6471
64721.  A single `*` character [can open emphasis](@)
6473    iff (if and only if) it is part of a [left-flanking delimiter run].
6474
64752.  A single `_` character [can open emphasis] iff
6476    it is part of a [left-flanking delimiter run]
6477    and either (a) not part of a [right-flanking delimiter run]
6478    or (b) part of a [right-flanking delimiter run]
6479    preceded by punctuation.
6480
64813.  A single `*` character [can close emphasis](@)
6482    iff it is part of a [right-flanking delimiter run].
6483
64844.  A single `_` character [can close emphasis] iff
6485    it is part of a [right-flanking delimiter run]
6486    and either (a) not part of a [left-flanking delimiter run]
6487    or (b) part of a [left-flanking delimiter run]
6488    followed by punctuation.
6489
64905.  A double `**` [can open strong emphasis](@)
6491    iff it is part of a [left-flanking delimiter run].
6492
64936.  A double `__` [can open strong emphasis] iff
6494    it is part of a [left-flanking delimiter run]
6495    and either (a) not part of a [right-flanking delimiter run]
6496    or (b) part of a [right-flanking delimiter run]
6497    preceded by punctuation.
6498
64997.  A double `**` [can close strong emphasis](@)
6500    iff it is part of a [right-flanking delimiter run].
6501
65028.  A double `__` [can close strong emphasis] iff
6503    it is part of a [right-flanking delimiter run]
6504    and either (a) not part of a [left-flanking delimiter run]
6505    or (b) part of a [left-flanking delimiter run]
6506    followed by punctuation.
6507
65089.  Emphasis begins with a delimiter that [can open emphasis] and ends
6509    with a delimiter that [can close emphasis], and that uses the same
6510    character (`_` or `*`) as the opening delimiter.  The
6511    opening and closing delimiters must belong to separate
6512    [delimiter runs].  If one of the delimiters can both
6513    open and close emphasis, then the sum of the lengths of the
6514    delimiter runs containing the opening and closing delimiters
6515    must not be a multiple of 3 unless both lengths are
6516    multiples of 3.
6517
651810. Strong emphasis begins with a delimiter that
6519    [can open strong emphasis] and ends with a delimiter that
6520    [can close strong emphasis], and that uses the same character
6521    (`_` or `*`) as the opening delimiter.  The
6522    opening and closing delimiters must belong to separate
6523    [delimiter runs].  If one of the delimiters can both open
6524    and close strong emphasis, then the sum of the lengths of
6525    the delimiter runs containing the opening and closing
6526    delimiters must not be a multiple of 3 unless both lengths
6527    are multiples of 3.
6528
652911. A literal `*` character cannot occur at the beginning or end of
6530    `*`-delimited emphasis or `**`-delimited strong emphasis, unless it
6531    is backslash-escaped.
6532
653312. A literal `_` character cannot occur at the beginning or end of
6534    `_`-delimited emphasis or `__`-delimited strong emphasis, unless it
6535    is backslash-escaped.
6536
6537Where rules 1--12 above are compatible with multiple parsings,
6538the following principles resolve ambiguity:
6539
654013. The number of nestings should be minimized. Thus, for example,
6541    an interpretation `<strong>...</strong>` is always preferred to
6542    `<em><em>...</em></em>`.
6543
654414. An interpretation `<em><strong>...</strong></em>` is always
6545    preferred to `<strong><em>...</em></strong>`.
6546
654715. When two potential emphasis or strong emphasis spans overlap,
6548    so that the second begins before the first ends and ends after
6549    the first ends, the first takes precedence. Thus, for example,
6550    `*foo _bar* baz_` is parsed as `<em>foo _bar</em> baz_` rather
6551    than `*foo <em>bar* baz</em>`.
6552
655316. When there are two potential emphasis or strong emphasis spans
6554    with the same closing delimiter, the shorter one (the one that
6555    opens later) takes precedence. Thus, for example,
6556    `**foo **bar baz**` is parsed as `**foo <strong>bar baz</strong>`
6557    rather than `<strong>foo **bar baz</strong>`.
6558
655917. Inline code spans, links, images, and HTML tags group more tightly
6560    than emphasis.  So, when there is a choice between an interpretation
6561    that contains one of these elements and one that does not, the
6562    former always wins.  Thus, for example, `*[foo*](bar)` is
6563    parsed as `*<a href="bar">foo*</a>` rather than as
6564    `<em>[foo</em>](bar)`.
6565
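The length restriction at the end of rules 9 and 10 is easy to misread, so here is a small Python sketch of it. The function name and its parameters are hypothetical: `open_len` and `close_len` are the lengths of the delimiter runs that contain the opening and closing delimiters, and the two flags say whether the opener could also close and whether the closer could also open.

```python
def lengths_allow_match(open_len, close_len, opener_can_close, closer_can_open):
    """Apply the "multiple of 3" restriction from rules 9 and 10."""
    # The restriction applies only if one of the delimiters can both
    # open and close.
    if not (opener_can_close or closer_can_open):
        return True
    # A combined length that is not a multiple of 3 is always fine.
    if (open_len + close_len) % 3 != 0:
        return True
    # Otherwise both run lengths must themselves be multiples of 3.
    return open_len % 3 == 0 and close_len % 3 == 0
```
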
6566These rules can be illustrated through a series of examples.
6567
6568Rule 1:
6569
6570```````````````````````````````` example
6571*foo bar*
6572.
6573<p><em>foo bar</em></p>
6574````````````````````````````````
6575
6576
6577This is not emphasis, because the opening `*` is followed by
6578whitespace, and hence not part of a [left-flanking delimiter run]:
6579
6580```````````````````````````````` example
6581a * foo bar*
6582.
6583<p>a * foo bar*</p>
6584````````````````````````````````
6585
6586
6587This is not emphasis, because the opening `*` is preceded
6588by an alphanumeric and followed by punctuation, and hence
6589not part of a [left-flanking delimiter run]:
6590
6591```````````````````````````````` example
6592a*"foo"*
6593.
6594<p>a*&quot;foo&quot;*</p>
6595````````````````````````````````
6596
6597
6598Unicode nonbreaking spaces count as whitespace, too:
6599
6600```````````````````````````````` example
6601* a *
6602.
6603<p>* a *</p>
6604````````````````````````````````
6605
6606
6607Intraword emphasis with `*` is permitted:
6608
6609```````````````````````````````` example
6610foo*bar*
6611.
6612<p>foo<em>bar</em></p>
6613````````````````````````````````
6614
6615
6616```````````````````````````````` example
66175*6*78
6618.
6619<p>5<em>6</em>78</p>
6620````````````````````````````````
6621
6622
6623Rule 2:
6624
6625```````````````````````````````` example
6626_foo bar_
6627.
6628<p><em>foo bar</em></p>
6629````````````````````````````````
6630
6631
6632This is not emphasis, because the opening `_` is followed by
6633whitespace:
6634
6635```````````````````````````````` example
6636_ foo bar_
6637.
6638<p>_ foo bar_</p>
6639````````````````````````````````
6640
6641
6642This is not emphasis, because the opening `_` is preceded
6643by an alphanumeric and followed by punctuation:
6644
6645```````````````````````````````` example
6646a_"foo"_
6647.
6648<p>a_&quot;foo&quot;_</p>
6649````````````````````````````````
6650
6651
6652Emphasis with `_` is not allowed inside words:
6653
6654```````````````````````````````` example
6655foo_bar_
6656.
6657<p>foo_bar_</p>
6658````````````````````````````````
6659
6660
6661```````````````````````````````` example
66625_6_78
6663.
6664<p>5_6_78</p>
6665````````````````````````````````
6666
6667
6668```````````````````````````````` example
6669пристаням_стремятся_
6670.
6671<p>пристаням_стремятся_</p>
6672````````````````````````````````
6673
6674
6675Here `_` does not generate emphasis, because the first delimiter run
6676is right-flanking and the second left-flanking:
6677
6678```````````````````````````````` example
6679aa_"bb"_cc
6680.
6681<p>aa_&quot;bb&quot;_cc</p>
6682````````````````````````````````
6683
6684
6685This is emphasis, even though the opening delimiter is
6686both left- and right-flanking, because it is preceded by
6687punctuation:
6688
6689```````````````````````````````` example
6690foo-_(bar)_
6691.
6692<p>foo-<em>(bar)</em></p>
6693````````````````````````````````
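The Rule 1 and Rule 2 examples above all follow from how a delimiter run is classified by the characters on either side of it. The following is a minimal Python sketch of that classification, not part of the spec or of any reference implementation. The names `is_whitespace`, `is_punctuation`, and `classify_run` are hypothetical, and the start and end of the input are assumed to count as whitespace.

```python
import string
import unicodedata


def is_whitespace(ch):
    """Unicode whitespace; '' stands for the start or end of the line."""
    return ch == "" or ch in "\t\n\f\r" or unicodedata.category(ch) == "Zs"


def is_punctuation(ch):
    """ASCII punctuation, or any character in a Unicode P* category."""
    return ch != "" and (ch in string.punctuation
                         or unicodedata.category(ch).startswith("P"))


def classify_run(text, start, end):
    """Classify the delimiter run text[start:end] (all `*` or all `_`).

    Returns (can_open, can_close) for emphasis.
    """
    delim = text[start]
    before = text[start - 1] if start > 0 else ""
    after = text[end] if end < len(text) else ""

    # Left-flanking: not followed by whitespace, and not followed by
    # punctuation unless preceded by whitespace or punctuation.
    left = (not is_whitespace(after)
            and (not is_punctuation(after)
                 or is_whitespace(before) or is_punctuation(before)))
    # Right-flanking: the mirror image of left-flanking.
    right = (not is_whitespace(before)
             and (not is_punctuation(before)
                  or is_whitespace(after) or is_punctuation(after)))

    if delim == "*":
        return left, right
    # `_` is stricter, which is what rules out intraword emphasis.
    can_open = left and (not right or is_punctuation(before))
    can_close = right and (not left or is_punctuation(after))
    return can_open, can_close
```

Under these assumptions, `classify_run("foo*bar*", 3, 4)` gives `(True, True)`, while `classify_run("foo_bar_", 3, 4)` gives `(False, False)`, which matches the intraword examples above.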
6694
6695
6696Rule 3:
6697
6698This is not emphasis, because the closing delimiter does
6699not match the opening delimiter:
6700
6701```````````````````````````````` example
6702_foo*
6703.
6704<p>_foo*</p>
6705````````````````````````````````
6706
6707
6708This is not emphasis, because the closing `*` is preceded by
6709whitespace:
6710
6711```````````````````````````````` example
6712*foo bar *
6713.
6714<p>*foo bar *</p>
6715````````````````````````````````
6716
6717
6718A newline also counts as whitespace:
6719
6720```````````````````````````````` example
6721*foo bar
6722*
6723.
6724<p>*foo bar
6725*</p>
6726````````````````````````````````
6727
6728
6729This is not emphasis, because the second `*` is
6730preceded by punctuation and followed by an alphanumeric
6731(hence it is not part of a [right-flanking delimiter run]):
6732
6733```````````````````````````````` example
6734*(*foo)
6735.
6736<p>*(*foo)</p>
6737````````````````````````````````
6738
6739
6740The point of this restriction is more easily appreciated
6741with this example:
6742
6743```````````````````````````````` example
6744*(*foo*)*
6745.
6746<p><em>(<em>foo</em>)</em></p>
6747````````````````````````````````
6748
6749
6750Intraword emphasis with `*` is allowed:
6751
6752```````````````````````````````` example
6753*foo*bar
6754.
6755<p><em>foo</em>bar</p>
6756````````````````````````````````
6757
6758
6759
6760Rule 4:
6761
6762This is not emphasis, because the closing `_` is preceded by
6763whitespace:
6764
6765```````````````````````````````` example
6766_foo bar _
6767.
6768<p>_foo bar _</p>
6769````````````````````````````````
6770
6771
6772This is not emphasis, because the second `_` is
6773preceded by punctuation and followed by an alphanumeric:
6774
6775```````````````````````````````` example
6776_(_foo)
6777.
6778<p>_(_foo)</p>
6779````````````````````````````````
6780
6781
6782This is emphasis within emphasis:
6783
6784```````````````````````````````` example
6785_(_foo_)_
6786.
6787<p><em>(<em>foo</em>)</em></p>
6788````````````````````````````````
6789
6790
6791Intraword emphasis is disallowed for `_`:
6792
6793```````````````````````````````` example
6794_foo_bar
6795.
6796<p>_foo_bar</p>
6797````````````````````````````````
6798
6799
6800```````````````````````````````` example
6801_пристаням_стремятся
6802.
6803<p>_пристаням_стремятся</p>
6804````````````````````````````````
6805
6806
6807```````````````````````````````` example
6808_foo_bar_baz_
6809.
6810<p><em>foo_bar_baz</em></p>
6811````````````````````````````````
6812
6813
6814This is emphasis, even though the closing delimiter is
6815both left- and right-flanking, because it is followed by
6816punctuation:
6817
6818```````````````````````````````` example
6819_(bar)_.
6820.
6821<p><em>(bar)</em>.</p>
6822````````````````````````````````
6823
6824
6825Rule 5:
6826
6827```````````````````````````````` example
6828**foo bar**
6829.
6830<p><strong>foo bar</strong></p>
6831````````````````````````````````
6832
6833
6834This is not strong emphasis, because the opening delimiter is
6835followed by whitespace:
6836
6837```````````````````````````````` example
6838** foo bar**
6839.
6840<p>** foo bar**</p>
6841````````````````````````````````
6842
6843
6844This is not strong emphasis, because the opening `**` is preceded
6845by an alphanumeric and followed by punctuation, and hence
6846not part of a [left-flanking delimiter run]:
6847
6848```````````````````````````````` example
6849a**"foo"**
6850.
6851<p>a**&quot;foo&quot;**</p>
6852````````````````````````````````
6853
6854
6855Intraword strong emphasis with `**` is permitted:
6856
6857```````````````````````````````` example
6858foo**bar**
6859.
6860<p>foo<strong>bar</strong></p>
6861````````````````````````````````
6862
6863
6864Rule 6:
6865
6866```````````````````````````````` example
6867__foo bar__
6868.
6869<p><strong>foo bar</strong></p>
6870````````````````````````````````
6871
6872
6873This is not strong emphasis, because the opening delimiter is
6874followed by whitespace:
6875
6876```````````````````````````````` example
6877__ foo bar__
6878.
6879<p>__ foo bar__</p>
6880````````````````````````````````
6881
6882
6883A newline counts as whitespace:

6884```````````````````````````````` example
6885__
6886foo bar__
6887.
6888<p>__
6889foo bar__</p>
6890````````````````````````````````
6891
6892
6893This is not strong emphasis, because the opening `__` is preceded
6894by an alphanumeric and followed by punctuation:
6895
6896```````````````````````````````` example
6897a__"foo"__
6898.
6899<p>a__&quot;foo&quot;__</p>
6900````````````````````````````````
6901
6902
6903Intraword strong emphasis is forbidden with `__`:
6904
6905```````````````````````````````` example
6906foo__bar__
6907.
6908<p>foo__bar__</p>
6909````````````````````````````````
6910
6911
6912```````````````````````````````` example
69135__6__78
6914.
6915<p>5__6__78</p>
6916````````````````````````````````
6917
6918
6919```````````````````````````````` example
6920пристаням__стремятся__
6921.
6922<p>пристаням__стремятся__</p>
6923````````````````````````````````
6924
6925
6926```````````````````````````````` example
6927__foo, __bar__, baz__
6928.
6929<p><strong>foo, bar, baz</strong></p>
6930````````````````````````````````
6931
6932
6933This is strong emphasis, even though the opening delimiter is
6934both left- and right-flanking, because it is preceded by
6935punctuation:
6936
6937```````````````````````````````` example
6938foo-__(bar)__
6939.
6940<p>foo-<strong>(bar)</strong></p>
6941````````````````````````````````
6942
6943
6944
6945Rule 7:
6946
6947This is not strong emphasis, because the closing delimiter is preceded
6948by whitespace:
6949
6950```````````````````````````````` example
6951**foo bar **
6952.
6953<p>**foo bar **</p>
6954````````````````````````````````
6955
6956
6957(Nor can it be interpreted as an emphasized `*foo bar *`, because of
6958Rule 11.)
6959
6960This is not strong emphasis, because the second `**` is
6961preceded by punctuation and followed by an alphanumeric:
6962
6963```````````````````````````````` example
6964**(**foo)
6965.
6966<p>**(**foo)</p>
6967````````````````````````````````
6968
6969
6970The point of this restriction is more easily appreciated
6971with these examples:
6972
6973```````````````````````````````` example
6974*(**foo**)*
6975.
6976<p><em>(<strong>foo</strong>)</em></p>
6977````````````````````````````````
6978
6979
6980```````````````````````````````` example
6981**Gomphocarpus (*Gomphocarpus physocarpus*, syn.
6982*Asclepias physocarpa*)**
6983.
6984<p><strong>Gomphocarpus (<em>Gomphocarpus physocarpus</em>, syn.
6985<em>Asclepias physocarpa</em>)</strong></p>
6986````````````````````````````````
6987
6988
6989```````````````````````````````` example
6990**foo "*bar*" foo**
6991.
6992<p><strong>foo &quot;<em>bar</em>&quot; foo</strong></p>
6993````````````````````````````````
6994
6995
6996Intraword strong emphasis with `**` is permitted:
6997
6998```````````````````````````````` example
6999**foo**bar
7000.
7001<p><strong>foo</strong>bar</p>
7002````````````````````````````````
7003
7004
7005Rule 8:
7006
7007This is not strong emphasis, because the closing delimiter is
7008preceded by whitespace:
7009
7010```````````````````````````````` example
7011__foo bar __
7012.
7013<p>__foo bar __</p>
7014````````````````````````````````
7015
7016
7017This is not strong emphasis, because the second `__` is
7018preceded by punctuation and followed by an alphanumeric:
7019
7020```````````````````````````````` example
7021__(__foo)
7022.
7023<p>__(__foo)</p>
7024````````````````````````````````
7025
7026
7027The point of this restriction is more easily appreciated
7028with this example:
7029
7030```````````````````````````````` example
7031_(__foo__)_
7032.
7033<p><em>(<strong>foo</strong>)</em></p>
7034````````````````````````````````
7035
7036
7037Intraword strong emphasis is forbidden with `__`:
7038
7039```````````````````````````````` example
7040__foo__bar
7041.
7042<p>__foo__bar</p>
7043````````````````````````````````
7044
7045
7046```````````````````````````````` example
7047__пристаням__стремятся
7048.
7049<p>__пристаням__стремятся</p>
7050````````````````````````````````
7051
7052
7053```````````````````````````````` example
7054__foo__bar__baz__
7055.
7056<p><strong>foo__bar__baz</strong></p>
7057````````````````````````````````
7058
7059
7060This is strong emphasis, even though the closing delimiter is
7061both left- and right-flanking, because it is followed by
7062punctuation:
7063
7064```````````````````````````````` example
7065__(bar)__.
7066.
7067<p><strong>(bar)</strong>.</p>
7068````````````````````````````````
7069
7070
7071Rule 9:
7072
7073Any nonempty sequence of inline elements can be the contents of an
7074emphasized span.
7075
7076```````````````````````````````` example
7077*foo [bar](/url)*
7078.
7079<p><em>foo <a href="/url">bar</a></em></p>
7080````````````````````````````````
7081
7082
7083```````````````````````````````` example
7084*foo
7085bar*
7086.
7087<p><em>foo
7088bar</em></p>
7089````````````````````````````````
7090
7091
7092In particular, emphasis and strong emphasis can be nested
7093inside emphasis:
7094
7095```````````````````````````````` example
7096_foo __bar__ baz_
7097.
7098<p><em>foo <strong>bar</strong> baz</em></p>
7099````````````````````````````````
7100
7101
7102```````````````````````````````` example
7103_foo _bar_ baz_
7104.
7105<p><em>foo <em>bar</em> baz</em></p>
7106````````````````````````````````
7107
7108
7109```````````````````````````````` example
7110__foo_ bar_
7111.
7112<p><em><em>foo</em> bar</em></p>
7113````````````````````````````````
7114
7115
7116```````````````````````````````` example
7117*foo *bar**
7118.
7119<p><em>foo <em>bar</em></em></p>
7120````````````````````````````````
7121
7122
7123```````````````````````````````` example
7124*foo **bar** baz*
7125.
7126<p><em>foo <strong>bar</strong> baz</em></p>
7127````````````````````````````````
7128
7129```````````````````````````````` example
7130*foo**bar**baz*
7131.
7132<p><em>foo<strong>bar</strong>baz</em></p>
7133````````````````````````````````
7134
7135Note that in the preceding case, the interpretation
7136
7137``` html
7138<p><em>foo</em><em>bar<em></em>baz</em></p>
7139```
7140
7141
7142is precluded by the condition that a delimiter that
7143can both open and close (like the `*` after `foo`)
7144cannot form emphasis if the sum of the lengths of
7145the delimiter runs containing the opening and
7146closing delimiters is a multiple of 3 unless
7147both lengths are multiples of 3.
7148
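The multiple-of-3 condition just described can be written as a small predicate. This is an illustrative Python sketch only. The function name `rule_of_three_blocks` and its parameters are hypothetical, and the caller is assumed to have already worked out the run lengths and whether each run can both open and close.

```python
def rule_of_three_blocks(opener_len, closer_len,
                         opener_can_close, closer_can_open):
    """Return True if the multiple-of-3 condition forbids this pairing.

    opener_len / closer_len are the lengths of the delimiter runs that
    contain the candidate opening and closing delimiters.
    """
    # The condition only applies when one of the two runs can both
    # open and close emphasis.
    if not (opener_can_close or closer_can_open):
        return False
    # A sum that is not a multiple of 3 is always allowed.
    if (opener_len + closer_len) % 3 != 0:
        return False
    # A sum that is a multiple of 3 is allowed only when both
    # lengths are themselves multiples of 3.
    return not (opener_len % 3 == 0 and closer_len % 3 == 0)
```

For `*foo**bar**baz*`, pairing the leading `*` (length 1) with the `**` after `foo` (length 2, able to open and close) is blocked, so the `**` runs pair with each other instead. For `foo***bar***baz`, both runs have length 3, so the pairing is allowed.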
7149
7150For the same reason, we don't get two consecutive
7151emphasis sections in this example:
7152
7153```````````````````````````````` example
7154*foo**bar*
7155.
7156<p><em>foo**bar</em></p>
7157````````````````````````````````
7158
7159
7160The same condition ensures that the following
7161cases are all strong emphasis nested inside
7162emphasis, even when the interior spaces are
7163omitted:
7164
7165
7166```````````````````````````````` example
7167***foo** bar*
7168.
7169<p><em><strong>foo</strong> bar</em></p>
7170````````````````````````````````
7171
7172
7173```````````````````````````````` example
7174*foo **bar***
7175.
7176<p><em>foo <strong>bar</strong></em></p>
7177````````````````````````````````
7178
7179
7180```````````````````````````````` example
7181*foo**bar***
7182.
7183<p><em>foo<strong>bar</strong></em></p>
7184````````````````````````````````
7185
7186
7187When the lengths of the interior closing and opening
7188delimiter runs are *both* multiples of 3, though,
7189they can match to create emphasis:
7190
7191```````````````````````````````` example
7192foo***bar***baz
7193.
7194<p>foo<em><strong>bar</strong></em>baz</p>
7195````````````````````````````````
7196
7197```````````````````````````````` example
7198foo******bar*********baz
7199.
7200<p>foo<strong>bar</strong>***baz</p>
7201````````````````````````````````
7202
7203
7204Indefinite levels of nesting are possible:
7205
7206```````````````````````````````` example
7207*foo **bar *baz* bim** bop*
7208.
7209<p><em>foo <strong>bar <em>baz</em> bim</strong> bop</em></p>
7210````````````````````````````````
7211
7212
7213```````````````````````````````` example
7214*foo [*bar*](/url)*
7215.
7216<p><em>foo <a href="/url"><em>bar</em></a></em></p>
7217````````````````````````````````
7218
7219
7220There can be no empty emphasis or strong emphasis:
7221
7222```````````````````````````````` example
7223** is not an empty emphasis
7224.
7225<p>** is not an empty emphasis</p>
7226````````````````````````````````
7227
7228
7229```````````````````````````````` example
7230**** is not an empty strong emphasis
7231.
7232<p>**** is not an empty strong emphasis</p>
7233````````````````````````````````
7234
7235
7236
7237Rule 10:
7238
7239Any nonempty sequence of inline elements can be the contents of an
7240strongly emphasized span.
7241
7242```````````````````````````````` example
7243**foo [bar](/url)**
7244.
7245<p><strong>foo <a href="/url">bar</a></strong></p>
7246````````````````````````````````
7247
7248
7249```````````````````````````````` example
7250**foo
7251bar**
7252.
7253<p><strong>foo
7254bar</strong></p>
7255````````````````````````````````
7256
7257
7258In particular, emphasis and strong emphasis can be nested
7259inside strong emphasis:
7260
7261```````````````````````````````` example
7262__foo _bar_ baz__
7263.
7264<p><strong>foo <em>bar</em> baz</strong></p>
7265````````````````````````````````
7266
7267
7268```````````````````````````````` example
7269__foo __bar__ baz__
7270.
7271<p><strong>foo bar baz</strong></p>
7272````````````````````````````````
7273
7274
7275```````````````````````````````` example
7276____foo__ bar__
7277.
7278<p><strong>foo bar</strong></p>
7279````````````````````````````````
7280
7281
7282```````````````````````````````` example
7283**foo **bar****
7284.
7285<p><strong>foo bar</strong></p>
7286````````````````````````````````
7287
7288
7289```````````````````````````````` example
7290**foo *bar* baz**
7291.
7292<p><strong>foo <em>bar</em> baz</strong></p>
7293````````````````````````````````
7294
7295
7296```````````````````````````````` example
7297**foo*bar*baz**
7298.
7299<p><strong>foo<em>bar</em>baz</strong></p>
7300````````````````````````````````
7301
7302
7303```````````````````````````````` example
7304***foo* bar**
7305.
7306<p><strong><em>foo</em> bar</strong></p>
7307````````````````````````````````
7308
7309
7310```````````````````````````````` example
7311**foo *bar***
7312.
7313<p><strong>foo <em>bar</em></strong></p>
7314````````````````````````````````
7315
7316
7317Indefinite levels of nesting are possible:
7318
7319```````````````````````````````` example
7320**foo *bar **baz**
7321bim* bop**
7322.
7323<p><strong>foo <em>bar <strong>baz</strong>
7324bim</em> bop</strong></p>
7325````````````````````````````````
7326
7327
7328```````````````````````````````` example
7329**foo [*bar*](/url)**
7330.
7331<p><strong>foo <a href="/url"><em>bar</em></a></strong></p>
7332````````````````````````````````
7333
7334
7335There can be no empty emphasis or strong emphasis:
7336
7337```````````````````````````````` example
7338__ is not an empty emphasis
7339.
7340<p>__ is not an empty emphasis</p>
7341````````````````````````````````
7342
7343
7344```````````````````````````````` example
7345____ is not an empty strong emphasis
7346.
7347<p>____ is not an empty strong emphasis</p>
7348````````````````````````````````
7349
7350
7351
7352Rule 11:
7353
7354```````````````````````````````` example
7355foo ***
7356.
7357<p>foo ***</p>
7358````````````````````````````````
7359
7360
7361```````````````````````````````` example
7362foo *\**
7363.
7364<p>foo <em>*</em></p>
7365````````````````````````````````
7366
7367
7368```````````````````````````````` example
7369foo *_*
7370.
7371<p>foo <em>_</em></p>
7372````````````````````````````````
7373
7374
7375```````````````````````````````` example
7376foo *****
7377.
7378<p>foo *****</p>
7379````````````````````````````````
7380
7381
7382```````````````````````````````` example
7383foo **\***
7384.
7385<p>foo <strong>*</strong></p>
7386````````````````````````````````
7387
7388
7389```````````````````````````````` example
7390foo **_**
7391.
7392<p>foo <strong>_</strong></p>
7393````````````````````````````````
7394
7395
7396Note that when delimiters do not match evenly, Rule 11 determines
7397that the excess literal `*` characters will appear outside of the
7398emphasis, rather than inside it:
7399
7400```````````````````````````````` example
7401**foo*
7402.
7403<p>*<em>foo</em></p>
7404````````````````````````````````
7405
7406
7407```````````````````````````````` example
7408*foo**
7409.
7410<p><em>foo</em>*</p>
7411````````````````````````````````
7412
7413
7414```````````````````````````````` example
7415***foo**
7416.
7417<p>*<strong>foo</strong></p>
7418````````````````````````````````
7419
7420
7421```````````````````````````````` example
7422****foo*
7423.
7424<p>***<em>foo</em></p>
7425````````````````````````````````
7426
7427
7428```````````````````````````````` example
7429**foo***
7430.
7431<p><strong>foo</strong>*</p>
7432````````````````````````````````
7433
7434
7435```````````````````````````````` example
7436*foo****
7437.
7438<p><em>foo</em>***</p>
7439````````````````````````````````
7440
7441
7442
7443Rule 12:
7444
7445```````````````````````````````` example
7446foo ___
7447.
7448<p>foo ___</p>
7449````````````````````````````````
7450
7451
7452```````````````````````````````` example
7453foo _\__
7454.
7455<p>foo <em>_</em></p>
7456````````````````````````````````
7457
7458
7459```````````````````````````````` example
7460foo _*_
7461.
7462<p>foo <em>*</em></p>
7463````````````````````````````````
7464
7465
7466```````````````````````````````` example
7467foo _____
7468.
7469<p>foo _____</p>
7470````````````````````````````````
7471
7472
7473```````````````````````````````` example
7474foo __\___
7475.
7476<p>foo <strong>_</strong></p>
7477````````````````````````````````
7478
7479
7480```````````````````````````````` example
7481foo __*__
7482.
7483<p>foo <strong>*</strong></p>
7484````````````````````````````````
7485
7486
7487```````````````````````````````` example
7488__foo_
7489.
7490<p>_<em>foo</em></p>
7491````````````````````````````````
7492
7493
7494Note that when delimiters do not match evenly, Rule 12 determines
7495that the excess literal `_` characters will appear outside of the
7496emphasis, rather than inside it:
7497
7498```````````````````````````````` example
7499_foo__
7500.
7501<p><em>foo</em>_</p>
7502````````````````````````````````
7503
7504
7505```````````````````````````````` example
7506___foo__
7507.
7508<p>_<strong>foo</strong></p>
7509````````````````````````````````
7510
7511
7512```````````````````````````````` example
7513____foo_
7514.
7515<p>___<em>foo</em></p>
7516````````````````````````````````
7517
7518
7519```````````````````````````````` example
7520__foo___
7521.
7522<p><strong>foo</strong>_</p>
7523````````````````````````````````
7524
7525
7526```````````````````````````````` example
7527_foo____
7528.
7529<p><em>foo</em>___</p>
7530````````````````````````````````
7531
7532
7533Rule 13 implies that if you want emphasis nested directly inside
7534emphasis, you must use different delimiters:
7535
7536```````````````````````````````` example
7537**foo**
7538.
7539<p><strong>foo</strong></p>
7540````````````````````````````````
7541
7542
7543```````````````````````````````` example
7544*_foo_*
7545.
7546<p><em><em>foo</em></em></p>
7547````````````````````````````````
7548
7549
7550```````````````````````````````` example
7551__foo__
7552.
7553<p><strong>foo</strong></p>
7554````````````````````````````````
7555
7556
7557```````````````````````````````` example
7558_*foo*_
7559.
7560<p><em><em>foo</em></em></p>
7561````````````````````````````````
7562
7563
7564However, strong emphasis within strong emphasis is possible without
7565switching delimiters:
7566
7567```````````````````````````````` example
7568****foo****
7569.
7570<p><strong>foo</strong></p>
7571````````````````````````````````
7572
7573
7574```````````````````````````````` example
7575____foo____
7576.
7577<p><strong>foo</strong></p>
7578````````````````````````````````
7579
7580
7581
7582Rule 13 can be applied to arbitrarily long sequences of
7583delimiters:
7584
7585```````````````````````````````` example
7586******foo******
7587.
7588<p><strong>foo</strong></p>
7589````````````````````````````````
7590
7591
7592Rule 14:
7593
7594```````````````````````````````` example
7595***foo***
7596.
7597<p><em><strong>foo</strong></em></p>
7598````````````````````````````````
7599
7600
7601```````````````````````````````` example
7602_____foo_____
7603.
7604<p><em><strong>foo</strong></em></p>
7605````````````````````````````````
7606
7607
7608Rule 15:
7609
7610```````````````````````````````` example
7611*foo _bar* baz_
7612.
7613<p><em>foo _bar</em> baz_</p>
7614````````````````````````````````
7615
7616
7617```````````````````````````````` example
7618*foo __bar *baz bim__ bam*
7619.
7620<p><em>foo <strong>bar *baz bim</strong> bam</em></p>
7621````````````````````````````````
7622
7623
7624Rule 16:
7625
7626```````````````````````````````` example
7627**foo **bar baz**
7628.
7629<p>**foo <strong>bar baz</strong></p>
7630````````````````````````````````
7631
7632
7633```````````````````````````````` example
7634*foo *bar baz*
7635.
7636<p>*foo <em>bar baz</em></p>
7637````````````````````````````````
7638
7639
7640Rule 17:
7641
7642```````````````````````````````` example
7643*[bar*](/url)
7644.
7645<p>*<a href="/url">bar*</a></p>
7646````````````````````````````````
7647
7648
7649```````````````````````````````` example
7650_foo [bar_](/url)
7651.
7652<p>_foo <a href="/url">bar_</a></p>
7653````````````````````````````````
7654
7655
7656```````````````````````````````` example
7657*<img src="foo" title="*"/>
7658.
7659<p>*<img src="foo" title="*"/></p>
7660````````````````````````````````
7661
7662
7663```````````````````````````````` example
7664**<a href="**">
7665.
7666<p>**<a href="**"></p>
7667````````````````````````````````
7668
7669
7670```````````````````````````````` example
7671__<a href="__">
7672.
7673<p>__<a href="__"></p>
7674````````````````````````````````
7675
7676
7677```````````````````````````````` example
7678*a `*`*
7679.
7680<p><em>a <code>*</code></em></p>
7681````````````````````````````````
7682
7683
7684```````````````````````````````` example
7685_a `_`_
7686.
7687<p><em>a <code>_</code></em></p>
7688````````````````````````````````
7689
7690
7691```````````````````````````````` example
7692**a<http://foo.bar/?q=**>
7693.
7694<p>**a<a href="http://foo.bar/?q=**">http://foo.bar/?q=**</a></p>
7695````````````````````````````````
7696
7697
7698```````````````````````````````` example
7699__a<http://foo.bar/?q=__>
7700.
7701<p>__a<a href="http://foo.bar/?q=__">http://foo.bar/?q=__</a></p>
7702````````````````````````````````
7703
7704
7705<div class="extension">
7706
7707## Strikethrough (extension)
7708
7709GFM enables the `strikethrough` extension, where an additional emphasis type is
7710available.
7711
7712Strikethrough text is any text wrapped in a matching pair of double tildes (`~~`).
7713
7714```````````````````````````````` example strikethrough
7715~~Hi~~ Hello, world!
7716.
7717<p><del>Hi</del> Hello, world!</p>
7718````````````````````````````````
7719
7720As with regular emphasis delimiters, a new paragraph will cause strikethrough
7721parsing to cease:
7722
7723```````````````````````````````` example strikethrough
7724This ~~has a
7725
7726new paragraph~~.
7727.
7728<p>This ~~has a</p>
7729<p>new paragraph~~.</p>
7730````````````````````````````````
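The two strikethrough examples above can be approximated with a regular expression applied one paragraph at a time. This is a rough Python sketch, not how any GFM implementation is known to work. The function name `strike_paragraphs` is hypothetical, paragraphs are assumed to be separated by blank lines, and requiring non-whitespace just inside each `~~` is an assumption borrowed from the regular emphasis rules. HTML escaping and all other inline syntax are ignored.

```python
import re

# Two tildes, content that starts and ends with non-whitespace, two tildes.
# DOTALL lets the content span a soft line break inside one paragraph.
_STRIKE = re.compile(r"~~(?=\S)(.+?)(?<=\S)~~", re.DOTALL)


def strike_paragraphs(text):
    """Split text into paragraphs and wrap ~~...~~ spans in <del>."""
    paragraphs = re.split(r"\n[ \t]*\n", text)
    # Matching is done per paragraph, so a span can never cross a blank line.
    return [_STRIKE.sub(r"<del>\1</del>", p) for p in paragraphs]
```

With this sketch, `strike_paragraphs("~~Hi~~ Hello, world!")` returns `["<del>Hi</del> Hello, world!"]`, and the two-paragraph input is returned with its tildes left as literal text.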
7731
7732</div>
7733
7734## Links
7735
7736A link contains [link text] (the visible text), a [link destination]
7737(the URI that is the link destination), and optionally a [link title].
7738There are two basic kinds of links in Markdown.  In [inline links] the
7739destination and title are given immediately after the link text.  In
7740[reference links] the destination and title are defined elsewhere in
7741the document.
7742
7743A [link text](@) consists of a sequence of zero or more
7744inline elements enclosed by square brackets (`[` and `]`).  The
7745following rules apply:
7746
7747- Links may not contain other links, at any level of nesting. If
7748  multiple otherwise valid link definitions appear nested inside each
7749  other, the inner-most definition is used.
7750
7751- Brackets are allowed in the [link text] only if (a) they
7752  are backslash-escaped or (b) they appear as a matched pair of brackets,
7753  with an open bracket `[`, a sequence of zero or more inlines, and
7754  a close bracket `]`.
7755
7756- Backtick [code spans], [autolinks], and raw [HTML tags] bind more tightly
7757  than the brackets in link text.  Thus, for example,
7758  `` [foo`]` `` could not be a link text, since the second `]`
7759  is part of a code span.
7760
7761- The brackets in link text bind more tightly than markers for
7762  [emphasis and strong emphasis]. Thus, for example, `*[foo*](url)` is a link.
7763
7764A [link destination](@) consists of either
7765
7766- a sequence of zero or more characters between an opening `<` and a
7767  closing `>` that contains no line breaks or unescaped
7768  `<` or `>` characters, or
7769
7770- a nonempty sequence of characters that does not start with
7771  `<`, does not include ASCII space or control characters, and
7772  includes parentheses only if (a) they are backslash-escaped or
7773  (b) they are part of a balanced pair of unescaped parentheses.
7774  (Implementations may impose limits on parentheses nesting to
7775  avoid performance issues, but at least three levels of nesting
7776  should be supported.)
7777
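The two forms of link destination described above can be recognized with a short scanner. The following Python sketch is an illustration under stated assumptions rather than a reference implementation. The function name `parse_link_destination` is hypothetical, the returned destination is the raw text with backslash escapes left unresolved, and no limit on parenthesis nesting is imposed.

```python
import string


def parse_link_destination(text, pos=0):
    """Scan a link destination starting at text[pos].

    Returns (raw_destination, end_index), or None if there is none.
    """
    n = len(text)

    def is_escape(i):
        # A backslash escapes only ASCII punctuation.
        return (text[i] == "\\" and i + 1 < n
                and text[i + 1] in string.punctuation)

    if pos < n and text[pos] == "<":
        # Pointy-bracket form: no line breaks, no unescaped '<' or '>'.
        i = pos + 1
        while i < n:
            if is_escape(i):
                i += 2
            elif text[i] in "\n\r<":
                return None
            elif text[i] == ">":
                return text[pos + 1:i], i + 1
            else:
                i += 1
        return None

    # Bare form: stop at a space, a control character, or an
    # unbalanced ')'; parentheses must otherwise be balanced.
    depth = 0
    i = pos
    while i < n:
        ch = text[i]
        if is_escape(i):
            i += 2
            continue
        if ch == " " or ord(ch) < 0x20 or ord(ch) == 0x7F:
            break
        if ch == "(":
            depth += 1
        elif ch == ")":
            if depth == 0:
                break
            depth -= 1
        i += 1
    if i == pos or depth != 0:
        return None
    return text[pos:i], i
```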
7778A [link title](@)  consists of either
7779
7780- a sequence of zero or more characters between straight double-quote
7781  characters (`"`), including a `"` character only if it is
7782  backslash-escaped, or
7783
7784- a sequence of zero or more characters between straight single-quote
7785  characters (`'`), including a `'` character only if it is
7786  backslash-escaped, or
7787
7788- a sequence of zero or more characters between matching parentheses
7789  (`(...)`), including a `(` or `)` character only if it is
7790  backslash-escaped.
7791
7792Although [link titles] may span multiple lines, they may not contain
7793a [blank line].
7794
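The three title forms and the blank-line restriction can be checked with a similar scanner. This Python sketch is illustrative only. The function name `parse_link_title` is hypothetical, and the returned title is the raw text between the delimiters, with backslash escapes and character references left unresolved.

```python
import re
import string

_CLOSERS = {'"': '"', "'": "'", "(": ")"}
_BLANK_LINE = re.compile(r"\n[ \t]*\n")


def parse_link_title(text, pos=0):
    """Scan a link title starting at text[pos].

    Returns (raw_title, end_index), or None if there is none.
    """
    n = len(text)
    if pos >= n or text[pos] not in _CLOSERS:
        return None
    opener = text[pos]
    closer = _CLOSERS[opener]
    i = pos + 1
    while i < n:
        ch = text[i]
        # A backslash-escaped delimiter does not end the title.
        if ch == "\\" and i + 1 < n and text[i + 1] in string.punctuation:
            i += 2
            continue
        if ch == closer:
            title = text[pos + 1:i]
            # Titles may span lines but may not contain a blank line.
            if _BLANK_LINE.search(title):
                return None
            return title, i + 1
        if opener == "(" and ch == "(":
            # An unescaped '(' is not allowed inside a (...) title.
            return None
        i += 1
    return None
```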
7795An [inline link](@) consists of a [link text] followed immediately
7796by a left parenthesis `(`, optional [whitespace], an optional
7797[link destination], an optional [link title] separated from the link
7798destination by [whitespace], optional [whitespace], and a right
7799parenthesis `)`. The link's text consists of the inlines contained
7800in the [link text] (excluding the enclosing square brackets).
7801The link's URI consists of the link destination, excluding enclosing
7802`<...>` if present, with backslash-escapes in effect as described
7803above.  The link's title consists of the link title, excluding its
7804enclosing delimiters, with backslash-escapes in effect as described
7805above.
7806
7807Here is a simple inline link:
7808
7809```````````````````````````````` example
7810[link](/uri "title")
7811.
7812<p><a href="/uri" title="title">link</a></p>
7813````````````````````````````````
7814
7815
7816The title may be omitted:
7817
7818```````````````````````````````` example
7819[link](/uri)
7820.
7821<p><a href="/uri">link</a></p>
7822````````````````````````````````
7823
7824
7825Both the title and the destination may be omitted:
7826
7827```````````````````````````````` example
7828[link]()
7829.
7830<p><a href="">link</a></p>
7831````````````````````````````````
7832
7833
7834```````````````````````````````` example
7835[link](<>)
7836.
7837<p><a href="">link</a></p>
7838````````````````````````````````
7839
7840The destination can only contain spaces if it is
7841enclosed in pointy brackets:
7842
7843```````````````````````````````` example
7844[link](/my uri)
7845.
7846<p>[link](/my uri)</p>
7847````````````````````````````````
7848
7849```````````````````````````````` example
7850[link](</my uri>)
7851.
7852<p><a href="/my%20uri">link</a></p>
7853````````````````````````````````
7854
7855The destination cannot contain line breaks,
7856even if enclosed in pointy brackets:
7857
7858```````````````````````````````` example
7859[link](foo
7860bar)
7861.
7862<p>[link](foo
7863bar)</p>
7864````````````````````````````````
7865
7866```````````````````````````````` example
7867[link](<foo
7868bar>)
7869.
7870<p>[link](<foo
7871bar>)</p>
7872````````````````````````````````
7873
7874The destination can contain `)` if it is enclosed
7875in pointy brackets:
7876
7877```````````````````````````````` example
7878[a](<b)c>)
7879.
7880<p><a href="b)c">a</a></p>
7881````````````````````````````````
7882
7883Pointy brackets that enclose link destinations must be unescaped:
7884
7885```````````````````````````````` example
7886[link](<foo\>)
7887.
7888<p>[link](&lt;foo&gt;)</p>
7889````````````````````````````````
7890
7891These are not links, because the opening pointy bracket
7892is not matched properly:
7893
7894```````````````````````````````` example
7895[a](<b)c
7896[a](<b)c>
7897[a](<b>c)
7898.
7899<p>[a](&lt;b)c
7900[a](&lt;b)c&gt;
7901[a](<b>c)</p>
7902````````````````````````````````
7903
7904Parentheses inside the link destination may be escaped:
7905
7906```````````````````````````````` example
7907[link](\(foo\))
7908.
7909<p><a href="(foo)">link</a></p>
7910````````````````````````````````
7911
7912Any number of parentheses are allowed without escaping, as long as they are
7913balanced:
7914
7915```````````````````````````````` example
7916[link](foo(and(bar)))
7917.
7918<p><a href="foo(and(bar))">link</a></p>
7919````````````````````````````````
7920
7921However, if you have unbalanced parentheses, you need to escape or use the
7922`<...>` form:
7923
7924```````````````````````````````` example
7925[link](foo\(and\(bar\))
7926.
7927<p><a href="foo(and(bar)">link</a></p>
7928````````````````````````````````
7929
7930
7931```````````````````````````````` example
7932[link](<foo(and(bar)>)
7933.
7934<p><a href="foo(and(bar)">link</a></p>
7935````````````````````````````````
7936
7937
7938Parentheses and other symbols can also be escaped, as usual
7939in Markdown:
7940
7941```````````````````````````````` example
7942[link](foo\)\:)
7943.
7944<p><a href="foo):">link</a></p>
7945````````````````````````````````
7946
7947
7948A link can contain fragment identifiers and queries:
7949
7950```````````````````````````````` example
7951[link](#fragment)
7952
7953[link](http://example.com#fragment)
7954
7955[link](http://example.com?foo=3#frag)
7956.
7957<p><a href="#fragment">link</a></p>
7958<p><a href="http://example.com#fragment">link</a></p>
7959<p><a href="http://example.com?foo=3#frag">link</a></p>
7960````````````````````````````````
7961
7962
7963Note that a backslash before a non-escapable character is
7964just a backslash:
7965
7966```````````````````````````````` example
7967[link](foo\bar)
7968.
7969<p><a href="foo%5Cbar">link</a></p>
7970````````````````````````````````
7971
7972
7973URL-escaping should be left alone inside the destination, as all
7974URL-escaped characters are also valid URL characters. Entity and
7975numerical character references in the destination will be parsed
7976into the corresponding Unicode code points, as usual.  These may
7977be optionally URL-escaped when written as HTML, but this spec
7978does not enforce any particular policy for rendering URLs in
7979HTML or other formats.  Renderers may make different decisions
7980about how to escape or normalize URLs in the output.
7981
7982```````````````````````````````` example
7983[link](foo%20b&auml;)
7984.
7985<p><a href="foo%20b%C3%A4">link</a></p>
7986````````````````````````````````
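
Since no escaping policy is mandated, the following Python sketch shows
just one possible policy that happens to reproduce the `href` values in
the examples of this section.  The `SAFE` set and the name `encode_href`
are assumptions made for illustration; the input is the destination
after backslash escapes and entity references have already been
resolved.

```python
from urllib.parse import quote

# Characters passed through unchanged.  This set is an assumption chosen to
# match the examples; "%" is kept so existing URL-escapes are left alone.
SAFE = "-_.+!*'(),%#@?=;:/&$~"


def encode_href(destination: str) -> str:
    """Percent-encode everything outside SAFE (non-ASCII is encoded as UTF-8)."""
    return quote(destination, safe=SAFE)
```

With this policy a literal backslash becomes `%5C`, a double quote
becomes `%22`, and `ä` becomes `%C3%A4`, while an existing `%20` is not
encoded a second time.  A renderer is free to choose differently.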
7987
7988
7989Note that, because titles can often be parsed as destinations,
7990if you try to omit the destination and keep the title, you'll
7991get unexpected results:
7992
7993```````````````````````````````` example
7994[link]("title")
7995.
7996<p><a href="%22title%22">link</a></p>
7997````````````````````````````````
7998
7999
8000Titles may be in single quotes, double quotes, or parentheses:
8001
8002```````````````````````````````` example
8003[link](/url "title")
8004[link](/url 'title')
8005[link](/url (title))
8006.
8007<p><a href="/url" title="title">link</a>
8008<a href="/url" title="title">link</a>
8009<a href="/url" title="title">link</a></p>
8010````````````````````````````````
8011
8012
8013Backslash escapes and entity and numeric character references
8014may be used in titles:
8015
8016```````````````````````````````` example
8017[link](/url "title \"&quot;")
8018.
8019<p><a href="/url" title="title &quot;&quot;">link</a></p>
8020````````````````````````````````
8021
8022
Titles must be separated from the link destination by [whitespace].
Other [Unicode whitespace], such as a non-breaking space, does not work.
8025
8026```````````````````````````````` example
8027[link](/url "title")
8028.
8029<p><a href="/url%C2%A0%22title%22">link</a></p>
8030````````````````````````````````
8031
8032
8033Nested balanced quotes are not allowed without escaping:
8034
8035```````````````````````````````` example
8036[link](/url "title "and" title")
8037.
8038<p>[link](/url &quot;title &quot;and&quot; title&quot;)</p>
8039````````````````````````````````
8040
8041
8042But it is easy to work around this by using a different quote type:
8043
8044```````````````````````````````` example
8045[link](/url 'title "and" title')
8046.
8047<p><a href="/url" title="title &quot;and&quot; title">link</a></p>
8048````````````````````````````````
8049
8050
8051(Note:  `Markdown.pl` did allow double quotes inside a double-quoted
8052title, and its test suite included a test demonstrating this.
8053But it is hard to see a good rationale for the extra complexity this
8054brings, since there are already many ways---backslash escaping,
8055entity and numeric character references, or using a different
8056quote type for the enclosing title---to write titles containing
8057double quotes.  `Markdown.pl`'s handling of titles has a number
8058of other strange features.  For example, it allows single-quoted
8059titles in inline links, but not reference links.  And, in
8060reference links but not inline links, it allows a title to begin
8061with `"` and end with `)`.  `Markdown.pl` 1.0.1 even allows
8062titles with no closing quotation mark, though 1.0.2b8 does not.
8063It seems preferable to adopt a simple, rational rule that works
8064the same way in inline links and link reference definitions.)
8065
8066[Whitespace] is allowed around the destination and title:
8067
8068```````````````````````````````` example
8069[link](   /uri
8070  "title"  )
8071.
8072<p><a href="/uri" title="title">link</a></p>
8073````````````````````````````````
8074
8075
8076But it is not allowed between the link text and the
8077following parenthesis:
8078
8079```````````````````````````````` example
8080[link] (/uri)
8081.
8082<p>[link] (/uri)</p>
8083````````````````````````````````
8084
8085
8086The link text may contain balanced brackets, but not unbalanced ones,
8087unless they are escaped:
8088
8089```````````````````````````````` example
8090[link [foo [bar]]](/uri)
8091.
8092<p><a href="/uri">link [foo [bar]]</a></p>
8093````````````````````````````````
8094
8095
8096```````````````````````````````` example
8097[link] bar](/uri)
8098.
8099<p>[link] bar](/uri)</p>
8100````````````````````````````````
8101
8102
8103```````````````````````````````` example
8104[link [bar](/uri)
8105.
8106<p>[link <a href="/uri">bar</a></p>
8107````````````````````````````````
8108
8109
8110```````````````````````````````` example
8111[link \[bar](/uri)
8112.
8113<p><a href="/uri">link [bar</a></p>
8114````````````````````````````````
8115
8116
8117The link text may contain inline content:
8118
8119```````````````````````````````` example
8120[link *foo **bar** `#`*](/uri)
8121.
8122<p><a href="/uri">link <em>foo <strong>bar</strong> <code>#</code></em></a></p>
8123````````````````````````````````
8124
8125
8126```````````````````````````````` example
8127[![moon](moon.jpg)](/uri)
8128.
8129<p><a href="/uri"><img src="moon.jpg" alt="moon" /></a></p>
8130````````````````````````````````
8131
8132
8133However, links may not contain other links, at any level of nesting.
8134
8135```````````````````````````````` example
8136[foo [bar](/uri)](/uri)
8137.
8138<p>[foo <a href="/uri">bar</a>](/uri)</p>
8139````````````````````````````````
8140
8141
8142```````````````````````````````` example
8143[foo *[bar [baz](/uri)](/uri)*](/uri)
8144.
8145<p>[foo <em>[bar <a href="/uri">baz</a>](/uri)</em>](/uri)</p>
8146````````````````````````````````
8147
8148
8149```````````````````````````````` example
8150![[[foo](uri1)](uri2)](uri3)
8151.
8152<p><img src="uri3" alt="[foo](uri2)" /></p>
8153````````````````````````````````
8154
8155
8156These cases illustrate the precedence of link text grouping over
8157emphasis grouping:
8158
8159```````````````````````````````` example
8160*[foo*](/uri)
8161.
8162<p>*<a href="/uri">foo*</a></p>
8163````````````````````````````````
8164
8165
8166```````````````````````````````` example
8167[foo *bar](baz*)
8168.
8169<p><a href="baz*">foo *bar</a></p>
8170````````````````````````````````
8171
8172
8173Note that brackets that *aren't* part of links do not take
8174precedence:
8175
8176```````````````````````````````` example
8177*foo [bar* baz]
8178.
8179<p><em>foo [bar</em> baz]</p>
8180````````````````````````````````
8181
8182
8183These cases illustrate the precedence of HTML tags, code spans,
8184and autolinks over link grouping:
8185
8186```````````````````````````````` example
8187[foo <bar attr="](baz)">
8188.
8189<p>[foo <bar attr="](baz)"></p>
8190````````````````````````````````
8191
8192
8193```````````````````````````````` example
8194[foo`](/uri)`
8195.
8196<p>[foo<code>](/uri)</code></p>
8197````````````````````````````````
8198
8199
8200```````````````````````````````` example
8201[foo<http://example.com/?search=](uri)>
8202.
8203<p>[foo<a href="http://example.com/?search=%5D(uri)">http://example.com/?search=](uri)</a></p>
8204````````````````````````````````
8205
8206
8207There are three kinds of [reference link](@)s:
8208[full](#full-reference-link), [collapsed](#collapsed-reference-link),
8209and [shortcut](#shortcut-reference-link).
8210
8211A [full reference link](@)
8212consists of a [link text] immediately followed by a [link label]
8213that [matches] a [link reference definition] elsewhere in the document.
8214
8215A [link label](@)  begins with a left bracket (`[`) and ends
8216with the first right bracket (`]`) that is not backslash-escaped.
8217Between these brackets there must be at least one [non-whitespace character].
8218Unescaped square bracket characters are not allowed inside the
8219opening and closing square brackets of [link labels].  A link
8220label can have at most 999 characters inside the square
8221brackets.
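
The constraints above can be sketched as a simple check.  This is a
hedged Python illustration, not taken from any implementation:
`is_link_label` is a hypothetical name, and the candidate string is
assumed to include its enclosing brackets.

```python
import re

WHITESPACE = " \t\n\x0b\x0c\r"  # the characters this spec calls whitespace


def is_link_label(s: str) -> bool:
    """Check a bracketed candidate string against the link label rules."""
    if len(s) < 2 or s[0] != "[" or s[-1] != "]":
        return False
    inner = s[1:-1]
    if len(inner) > 999:
        return False
    if all(c in WHITESPACE for c in inner):
        return False  # needs at least one non-whitespace character
    # Remove backslash pairs so that only unescaped brackets stay visible.
    unescaped = re.sub(r"\\.", "", inner, flags=re.DOTALL)
    if unescaped.endswith("\\"):
        return False  # the final "]" was itself backslash-escaped
    return "[" not in unescaped and "]" not in unescaped
```

For instance, `[ref\[]` is accepted because its inner bracket is
escaped, whereas `[ref[]` and `[]` are rejected.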
8222
One label [matches](@)
another if and only if their normalized forms are equal.  To normalize a
label, strip off the opening and closing brackets,
perform the *Unicode case fold*, strip leading and trailing
[whitespace] and collapse consecutive internal
[whitespace] to a single space.  If there are multiple
matching link reference definitions, the one that comes first in the
document is used.  (It is desirable in such cases to emit a warning.)
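
The normalization steps can be sketched in a few lines of Python.  This
is an illustration under stated assumptions: the function names are
hypothetical, the labels are passed with their brackets, and Python's
`str.casefold` is used as the Unicode case fold.

```python
import re

WS = " \t\n\x0b\x0c\r"  # the characters this spec calls whitespace


def normalize_label(label: str) -> str:
    inner = label[1:-1]          # strip the opening and closing brackets
    folded = inner.casefold()    # Unicode case fold
    stripped = folded.strip(WS)  # strip leading and trailing whitespace
    return re.sub("[" + WS + "]+", " ", stripped)  # collapse internal runs


def labels_match(a: str, b: str) -> bool:
    return normalize_label(a) == normalize_label(b)
```

Because the comparison is made on strings rather than on parsed
inlines, a label containing a backslash escape such as `foo\!` does not
match `foo!`.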
8231
8232The contents of the first link label are parsed as inlines, which are
8233used as the link's text.  The link's URI and title are provided by the
8234matching [link reference definition].
8235
8236Here is a simple example:
8237
8238```````````````````````````````` example
8239[foo][bar]
8240
8241[bar]: /url "title"
8242.
8243<p><a href="/url" title="title">foo</a></p>
8244````````````````````````````````
8245
8246
8247The rules for the [link text] are the same as with
8248[inline links].  Thus:
8249
8250The link text may contain balanced brackets, but not unbalanced ones,
8251unless they are escaped:
8252
8253```````````````````````````````` example
8254[link [foo [bar]]][ref]
8255
8256[ref]: /uri
8257.
8258<p><a href="/uri">link [foo [bar]]</a></p>
8259````````````````````````````````
8260
8261
8262```````````````````````````````` example
8263[link \[bar][ref]
8264
8265[ref]: /uri
8266.
8267<p><a href="/uri">link [bar</a></p>
8268````````````````````````````````
8269
8270
8271The link text may contain inline content:
8272
8273```````````````````````````````` example
8274[link *foo **bar** `#`*][ref]
8275
8276[ref]: /uri
8277.
8278<p><a href="/uri">link <em>foo <strong>bar</strong> <code>#</code></em></a></p>
8279````````````````````````````````
8280
8281
8282```````````````````````````````` example
8283[![moon](moon.jpg)][ref]
8284
8285[ref]: /uri
8286.
8287<p><a href="/uri"><img src="moon.jpg" alt="moon" /></a></p>
8288````````````````````````````````
8289
8290
8291However, links may not contain other links, at any level of nesting.
8292
8293```````````````````````````````` example
8294[foo [bar](/uri)][ref]
8295
8296[ref]: /uri
8297.
8298<p>[foo <a href="/uri">bar</a>]<a href="/uri">ref</a></p>
8299````````````````````````````````
8300
8301
8302```````````````````````````````` example
8303[foo *bar [baz][ref]*][ref]
8304
8305[ref]: /uri
8306.
8307<p>[foo <em>bar <a href="/uri">baz</a></em>]<a href="/uri">ref</a></p>
8308````````````````````````````````
8309
8310
8311(In the examples above, we have two [shortcut reference links]
8312instead of one [full reference link].)
8313
8314The following cases illustrate the precedence of link text grouping over
8315emphasis grouping:
8316
8317```````````````````````````````` example
8318*[foo*][ref]
8319
8320[ref]: /uri
8321.
8322<p>*<a href="/uri">foo*</a></p>
8323````````````````````````````````
8324
8325
8326```````````````````````````````` example
8327[foo *bar][ref]
8328
8329[ref]: /uri
8330.
8331<p><a href="/uri">foo *bar</a></p>
8332````````````````````````````````
8333
8334
8335These cases illustrate the precedence of HTML tags, code spans,
8336and autolinks over link grouping:
8337
8338```````````````````````````````` example
8339[foo <bar attr="][ref]">
8340
8341[ref]: /uri
8342.
8343<p>[foo <bar attr="][ref]"></p>
8344````````````````````````````````
8345
8346
8347```````````````````````````````` example
8348[foo`][ref]`
8349
8350[ref]: /uri
8351.
8352<p>[foo<code>][ref]</code></p>
8353````````````````````````````````
8354
8355
8356```````````````````````````````` example
8357[foo<http://example.com/?search=][ref]>
8358
8359[ref]: /uri
8360.
8361<p>[foo<a href="http://example.com/?search=%5D%5Bref%5D">http://example.com/?search=][ref]</a></p>
8362````````````````````````````````
8363
8364
8365Matching is case-insensitive:
8366
8367```````````````````````````````` example
8368[foo][BaR]
8369
8370[bar]: /url "title"
8371.
8372<p><a href="/url" title="title">foo</a></p>
8373````````````````````````````````
8374
8375
8376Unicode case fold is used:
8377
8378```````````````````````````````` example
8379[Толпой][Толпой] is a Russian word.
8380
8381[ТОЛПОЙ]: /url
8382.
8383<p><a href="/url">Толпой</a> is a Russian word.</p>
8384````````````````````````````````
8385
8386
8387Consecutive internal [whitespace] is treated as one space for
8388purposes of determining matching:
8389
8390```````````````````````````````` example
8391[Foo
8392  bar]: /url
8393
8394[Baz][Foo bar]
8395.
8396<p><a href="/url">Baz</a></p>
8397````````````````````````````````
8398
8399
8400No [whitespace] is allowed between the [link text] and the
8401[link label]:
8402
8403```````````````````````````````` example
8404[foo] [bar]
8405
8406[bar]: /url "title"
8407.
8408<p>[foo] <a href="/url" title="title">bar</a></p>
8409````````````````````````````````
8410
8411
8412```````````````````````````````` example
8413[foo]
8414[bar]
8415
8416[bar]: /url "title"
8417.
8418<p>[foo]
8419<a href="/url" title="title">bar</a></p>
8420````````````````````````````````
8421
8422
8423This is a departure from John Gruber's original Markdown syntax
8424description, which explicitly allows whitespace between the link
8425text and the link label.  It brings reference links in line with
8426[inline links], which (according to both original Markdown and
8427this spec) cannot have whitespace after the link text.  More
8428importantly, it prevents inadvertent capture of consecutive
[shortcut reference links]. If whitespace were allowed between the
link text and the link label, then in the following we would have
a single reference link, not two shortcut reference links, as
intended:
8433
8434``` markdown
8435[foo]
8436[bar]
8437
8438[foo]: /url1
8439[bar]: /url2
8440```
8441
8442(Note that [shortcut reference links] were introduced by Gruber
8443himself in a beta version of `Markdown.pl`, but never included
8444in the official syntax description.  Without shortcut reference
8445links, it is harmless to allow space between the link text and
8446link label; but once shortcut references are introduced, it is
8447too dangerous to allow this, as it frequently leads to
8448unintended results.)
8449
8450When there are multiple matching [link reference definitions],
8451the first is used:
8452
8453```````````````````````````````` example
8454[foo]: /url1
8455
8456[foo]: /url2
8457
8458[bar][foo]
8459.
8460<p><a href="/url1">bar</a></p>
8461````````````````````````````````
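
One way to obtain this behavior is to build the definition table in
document order and never overwrite an existing entry.  The Python sketch
below is illustrative only: `collect_definitions` is a hypothetical
name, and the labels are assumed to be normalized already.

```python
def collect_definitions(definitions):
    """Build a label table from (normalized_label, destination) pairs.

    The pairs are expected in document order.  Returns the table and a
    list of labels that were defined more than once.
    """
    table = {}
    duplicates = []
    for label, destination in definitions:
        if label in table:
            duplicates.append(label)  # candidate for a warning
        else:
            table[label] = destination  # the first definition wins
    return table, duplicates
```

The `duplicates` list gives a caller somewhere to hang the warning that
this spec describes as desirable.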
8462
8463
8464Note that matching is performed on normalized strings, not parsed
8465inline content.  So the following does not match, even though the
8466labels define equivalent inline content:
8467
8468```````````````````````````````` example
8469[bar][foo\!]
8470
8471[foo!]: /url
8472.
8473<p>[bar][foo!]</p>
8474````````````````````````````````
8475
8476
8477[Link labels] cannot contain brackets, unless they are
8478backslash-escaped:
8479
8480```````````````````````````````` example
8481[foo][ref[]
8482
8483[ref[]: /uri
8484.
8485<p>[foo][ref[]</p>
8486<p>[ref[]: /uri</p>
8487````````````````````````````````
8488
8489
8490```````````````````````````````` example
8491[foo][ref[bar]]
8492
8493[ref[bar]]: /uri
8494.
8495<p>[foo][ref[bar]]</p>
8496<p>[ref[bar]]: /uri</p>
8497````````````````````````````````
8498
8499
8500```````````````````````````````` example
8501[[[foo]]]
8502
8503[[[foo]]]: /url
8504.
8505<p>[[[foo]]]</p>
8506<p>[[[foo]]]: /url</p>
8507````````````````````````````````
8508
8509
8510```````````````````````````````` example
8511[foo][ref\[]
8512
8513[ref\[]: /uri
8514.
8515<p><a href="/uri">foo</a></p>
8516````````````````````````````````
8517
8518
8519Note that in this example `]` is not backslash-escaped:
8520
8521```````````````````````````````` example
8522[bar\\]: /uri
8523
8524[bar\\]
8525.
8526<p><a href="/uri">bar\</a></p>
8527````````````````````````````````
8528
8529
8530A [link label] must contain at least one [non-whitespace character]:
8531
8532```````````````````````````````` example
8533[]
8534
8535[]: /uri
8536.
8537<p>[]</p>
8538<p>[]: /uri</p>
8539````````````````````````````````
8540
8541
8542```````````````````````````````` example
8543[
8544 ]
8545
8546[
8547 ]: /uri
8548.
8549<p>[
8550]</p>
8551<p>[
8552]: /uri</p>
8553````````````````````````````````
8554
8555
8556A [collapsed reference link](@)
8557consists of a [link label] that [matches] a
8558[link reference definition] elsewhere in the
8559document, followed by the string `[]`.
8560The contents of the first link label are parsed as inlines,
8561which are used as the link's text.  The link's URI and title are
provided by the matching link reference definition.  Thus,
8563`[foo][]` is equivalent to `[foo][foo]`.
8564
8565```````````````````````````````` example
8566[foo][]
8567
8568[foo]: /url "title"
8569.
8570<p><a href="/url" title="title">foo</a></p>
8571````````````````````````````````
8572
8573
8574```````````````````````````````` example
8575[*foo* bar][]
8576
8577[*foo* bar]: /url "title"
8578.
8579<p><a href="/url" title="title"><em>foo</em> bar</a></p>
8580````````````````````````````````
8581
8582
8583The link labels are case-insensitive:
8584
8585```````````````````````````````` example
8586[Foo][]
8587
8588[foo]: /url "title"
8589.
8590<p><a href="/url" title="title">Foo</a></p>
8591````````````````````````````````
8592
8593
8594
8595As with full reference links, [whitespace] is not
8596allowed between the two sets of brackets:
8597
8598```````````````````````````````` example
8599[foo]
8600[]
8601
8602[foo]: /url "title"
8603.
8604<p><a href="/url" title="title">foo</a>
8605[]</p>
8606````````````````````````````````
8607
8608
8609A [shortcut reference link](@)
8610consists of a [link label] that [matches] a
8611[link reference definition] elsewhere in the
8612document and is not followed by `[]` or a link label.
8613The contents of the first link label are parsed as inlines,
8614which are used as the link's text.  The link's URI and title
8615are provided by the matching link reference definition.
8616Thus, `[foo]` is equivalent to `[foo][]`.
8617
8618```````````````````````````````` example
8619[foo]
8620
8621[foo]: /url "title"
8622.
8623<p><a href="/url" title="title">foo</a></p>
8624````````````````````````````````
8625
8626
8627```````````````````````````````` example
8628[*foo* bar]
8629
8630[*foo* bar]: /url "title"
8631.
8632<p><a href="/url" title="title"><em>foo</em> bar</a></p>
8633````````````````````````````````
8634
8635
8636```````````````````````````````` example
8637[[*foo* bar]]
8638
8639[*foo* bar]: /url "title"
8640.
8641<p>[<a href="/url" title="title"><em>foo</em> bar</a>]</p>
8642````````````````````````````````
8643
8644
8645```````````````````````````````` example
8646[[bar [foo]
8647
8648[foo]: /url
8649.
8650<p>[[bar <a href="/url">foo</a></p>
8651````````````````````````````````
8652
8653
8654The link labels are case-insensitive:
8655
8656```````````````````````````````` example
8657[Foo]
8658
8659[foo]: /url "title"
8660.
8661<p><a href="/url" title="title">Foo</a></p>
8662````````````````````````````````
8663
8664
8665A space after the link text should be preserved:
8666
8667```````````````````````````````` example
8668[foo] bar
8669
8670[foo]: /url
8671.
8672<p><a href="/url">foo</a> bar</p>
8673````````````````````````````````
8674
8675
8676If you just want bracketed text, you can backslash-escape the
8677opening bracket to avoid links:
8678
8679```````````````````````````````` example
8680\[foo]
8681
8682[foo]: /url "title"
8683.
8684<p>[foo]</p>
8685````````````````````````````````
8686
8687
8688Note that this is a link, because a link label ends with the first
8689following closing bracket:
8690
8691```````````````````````````````` example
8692[foo*]: /url
8693
8694*[foo*]
8695.
8696<p>*<a href="/url">foo*</a></p>
8697````````````````````````````````
8698
8699
Full and collapsed references take precedence over shortcut
references:
8702
8703```````````````````````````````` example
8704[foo][bar]
8705
8706[foo]: /url1
8707[bar]: /url2
8708.
8709<p><a href="/url2">foo</a></p>
8710````````````````````````````````
8711
8712```````````````````````````````` example
8713[foo][]
8714
8715[foo]: /url1
8716.
8717<p><a href="/url1">foo</a></p>
8718````````````````````````````````
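
The choice of which label selects the definition can be sketched as
follows.  This Python fragment is a simplified illustration: the name
`resolve_reference` and its argument convention are assumptions, labels
are passed without brackets, and normalization is reduced to a case
fold.

```python
def resolve_reference(first_label, second_label, definitions):
    """Return the destination for a reference link, or None if there is none.

    second_label is None for a shortcut reference (`[foo]`), "" for a
    collapsed reference (`[foo][]`), and non-empty for a full reference
    (`[foo][bar]`).  `definitions` maps case-folded labels to destinations.
    """
    # Only a full reference looks up the second label; there is no
    # fallback to the first label when that lookup fails.
    key = second_label if second_label else first_label
    return definitions.get(key.casefold())
```

With `foo` and `bar` both defined, `[foo][bar]` resolves through `bar`,
while `[foo][]` and `[foo]` resolve through `foo`.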
8719
8720Inline links also take precedence:
8721
8722```````````````````````````````` example
8723[foo]()
8724
8725[foo]: /url1
8726.
8727<p><a href="">foo</a></p>
8728````````````````````````````````
8729
8730```````````````````````````````` example
8731[foo](not a link)
8732
8733[foo]: /url1
8734.
8735<p><a href="/url1">foo</a>(not a link)</p>
8736````````````````````````````````
8737
8738In the following case `[bar][baz]` is parsed as a reference,
8739`[foo]` as normal text:
8740
8741```````````````````````````````` example
8742[foo][bar][baz]
8743
8744[baz]: /url
8745.
8746<p>[foo]<a href="/url">bar</a></p>
8747````````````````````````````````
8748
8749
8750Here, though, `[foo][bar]` is parsed as a reference, since
8751`[bar]` is defined:
8752
8753```````````````````````````````` example
8754[foo][bar][baz]
8755
8756[baz]: /url1
8757[bar]: /url2
8758.
8759<p><a href="/url2">foo</a><a href="/url1">baz</a></p>
8760````````````````````````````````
8761
8762
8763Here `[foo]` is not parsed as a shortcut reference, because it
8764is followed by a link label (even though `[bar]` is not defined):
8765
8766```````````````````````````````` example
8767[foo][bar][baz]
8768
8769[baz]: /url1
8770[foo]: /url2
8771.
8772<p>[foo]<a href="/url1">bar</a></p>
8773````````````````````````````````
8774
8775
8776
8777## Images
8778
8779Syntax for images is like the syntax for links, with one
8780difference. Instead of [link text], we have an
8781[image description](@).  The rules for this are the
8782same as for [link text], except that (a) an
8783image description starts with `![` rather than `[`, and
8784(b) an image description may contain links.
8785An image description has inline elements
8786as its contents.  When an image is rendered to HTML,
8787this is standardly used as the image's `alt` attribute.
8788
8789```````````````````````````````` example
8790![foo](/url "title")
8791.
8792<p><img src="/url" alt="foo" title="title" /></p>
8793````````````````````````````````
8794
8795
8796```````````````````````````````` example
8797![foo *bar*]
8798
8799[foo *bar*]: train.jpg "train & tracks"
8800.
8801<p><img src="train.jpg" alt="foo bar" title="train &amp; tracks" /></p>
8802````````````````````````````````
8803
8804
8805```````````````````````````````` example
8806![foo ![bar](/url)](/url2)
8807.
8808<p><img src="/url2" alt="foo bar" /></p>
8809````````````````````````````````
8810
8811
8812```````````````````````````````` example
8813![foo [bar](/url)](/url2)
8814.
8815<p><img src="/url2" alt="foo bar" /></p>
8816````````````````````````````````
8817
8818
8819Though this spec is concerned with parsing, not rendering, it is
8820recommended that in rendering to HTML, only the plain string content
8821of the [image description] be used.  Note that in
8822the above example, the alt attribute's value is `foo bar`, not `foo
8823[bar](/url)` or `foo <a href="/url">bar</a>`.  Only the plain string
8824content is rendered, without formatting.
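
A renderer might derive the plain string content by walking the parsed
inlines and keeping only literal text.  The Python sketch below assumes
a hypothetical tree representation in which a node is either a string
or a `(kind, children)` pair; real implementations use their own node
types.

```python
def plain_text(nodes):
    """Concatenate the literal text of an inline tree, dropping formatting."""
    parts = []
    for node in nodes:
        if isinstance(node, str):
            parts.append(node)
        else:
            _kind, children = node  # e.g. ("emph", [...]) or ("link", [...])
            parts.append(plain_text(children))
    return "".join(parts)
```

Under this representation, the description `foo [bar](/url)` is the
list `["foo ", ("link", ["bar"])]`, whose plain text is `foo bar`.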
8825
8826```````````````````````````````` example
8827![foo *bar*][]
8828
8829[foo *bar*]: train.jpg "train & tracks"
8830.
8831<p><img src="train.jpg" alt="foo bar" title="train &amp; tracks" /></p>
8832````````````````````````````````
8833
8834
8835```````````````````````````````` example
8836![foo *bar*][foobar]
8837
8838[FOOBAR]: train.jpg "train & tracks"
8839.
8840<p><img src="train.jpg" alt="foo bar" title="train &amp; tracks" /></p>
8841````````````````````````````````
8842
8843
8844```````````````````````````````` example
8845![foo](train.jpg)
8846.
8847<p><img src="train.jpg" alt="foo" /></p>
8848````````````````````````````````
8849
8850
8851```````````````````````````````` example
8852My ![foo bar](/path/to/train.jpg  "title"   )
8853.
8854<p>My <img src="/path/to/train.jpg" alt="foo bar" title="title" /></p>
8855````````````````````````````````
8856
8857
8858```````````````````````````````` example
8859![foo](<url>)
8860.
8861<p><img src="url" alt="foo" /></p>
8862````````````````````````````````
8863
8864
8865```````````````````````````````` example
8866![](/url)
8867.
8868<p><img src="/url" alt="" /></p>
8869````````````````````````````````
8870
8871
8872Reference-style:
8873
8874```````````````````````````````` example
8875![foo][bar]
8876
8877[bar]: /url
8878.
8879<p><img src="/url" alt="foo" /></p>
8880````````````````````````````````
8881
8882
8883```````````````````````````````` example
8884![foo][bar]
8885
8886[BAR]: /url
8887.
8888<p><img src="/url" alt="foo" /></p>
8889````````````````````````````````
8890
8891
8892Collapsed:
8893
8894```````````````````````````````` example
8895![foo][]
8896
8897[foo]: /url "title"
8898.
8899<p><img src="/url" alt="foo" title="title" /></p>
8900````````````````````````````````
8901
8902
8903```````````````````````````````` example
8904![*foo* bar][]
8905
8906[*foo* bar]: /url "title"
8907.
8908<p><img src="/url" alt="foo bar" title="title" /></p>
8909````````````````````````````````
8910
8911
8912The labels are case-insensitive:
8913
8914```````````````````````````````` example
8915![Foo][]
8916
8917[foo]: /url "title"
8918.
8919<p><img src="/url" alt="Foo" title="title" /></p>
8920````````````````````````````````
8921
8922
8923As with reference links, [whitespace] is not allowed
8924between the two sets of brackets:
8925
8926```````````````````````````````` example
8927![foo]
8928[]
8929
8930[foo]: /url "title"
8931.
8932<p><img src="/url" alt="foo" title="title" />
8933[]</p>
8934````````````````````````````````
8935
8936
8937Shortcut:
8938
8939```````````````````````````````` example
8940![foo]
8941
8942[foo]: /url "title"
8943.
8944<p><img src="/url" alt="foo" title="title" /></p>
8945````````````````````````````````
8946
8947
8948```````````````````````````````` example
8949![*foo* bar]
8950
8951[*foo* bar]: /url "title"
8952.
8953<p><img src="/url" alt="foo bar" title="title" /></p>
8954````````````````````````````````
8955
8956
8957Note that link labels cannot contain unescaped brackets:
8958
8959```````````````````````````````` example
8960![[foo]]
8961
8962[[foo]]: /url "title"
8963.
8964<p>![[foo]]</p>
8965<p>[[foo]]: /url &quot;title&quot;</p>
8966````````````````````````````````
8967
8968
8969The link labels are case-insensitive:
8970
8971```````````````````````````````` example
8972![Foo]
8973
8974[foo]: /url "title"
8975.
8976<p><img src="/url" alt="Foo" title="title" /></p>
8977````````````````````````````````
8978
8979
8980If you just want a literal `!` followed by bracketed text, you can
8981backslash-escape the opening `[`:
8982
8983```````````````````````````````` example
8984!\[foo]
8985
8986[foo]: /url "title"
8987.
8988<p>![foo]</p>
8989````````````````````````````````
8990
8991
8992If you want a link after a literal `!`, backslash-escape the
8993`!`:
8994
8995```````````````````````````````` example
8996\![foo]
8997
8998[foo]: /url "title"
8999.
9000<p>!<a href="/url" title="title">foo</a></p>
9001````````````````````````````````
9002
9003
9004## Autolinks
9005
9006[Autolink](@)s are absolute URIs and email addresses inside
9007`<` and `>`. They are parsed as links, with the URL or email address
9008as the link label.
9009
9010A [URI autolink](@) consists of `<`, followed by an
9011[absolute URI] followed by `>`.  It is parsed as
9012a link to the URI, with the URI as the link's label.
9013
9014An [absolute URI](@),
9015for these purposes, consists of a [scheme] followed by a colon (`:`)
9016followed by zero or more characters other than ASCII
9017[whitespace] and control characters, `<`, and `>`.  If
9018the URI includes these characters, they must be percent-encoded
9019(e.g. `%20` for a space).
9020
9021For purposes of this spec, a [scheme](@) is any sequence
9022of 2--32 characters beginning with an ASCII letter and followed
9023by any combination of ASCII letters, digits, or the symbols plus
9024("+"), period ("."), or hyphen ("-").
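
The two definitions above translate fairly directly into a regular
expression.  The following Python sketch is illustrative: the names are
hypothetical, and "control characters" is assumed to mean the ASCII
control characters (U+0000 to U+001F and U+007F).

```python
import re

# scheme: an ASCII letter, then 1-31 letters, digits, "+", "." or "-"
SCHEME = r"[A-Za-z][A-Za-z0-9+.\-]{1,31}"
# body: anything except ASCII whitespace, ASCII control characters, "<", ">"
URI_AUTOLINK = re.compile(r"<(" + SCHEME + r":[^\x00-\x20\x7f<>]*)>")


def uri_autolink(text: str):
    """Return the URI inside `<...>`, or None if `text` is not a URI autolink."""
    match = URI_AUTOLINK.fullmatch(text)
    return match.group(1) if match else None
```

Note that the pattern accepts strings such as `<a+b+c:d>` that are not
valid URIs, and rejects `<m:abc>` because its scheme has only one
character.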
9025
9026Here are some valid autolinks:
9027
9028```````````````````````````````` example
9029<http://foo.bar.baz>
9030.
9031<p><a href="http://foo.bar.baz">http://foo.bar.baz</a></p>
9032````````````````````````````````
9033
9034
9035```````````````````````````````` example
9036<http://foo.bar.baz/test?q=hello&id=22&boolean>
9037.
9038<p><a href="http://foo.bar.baz/test?q=hello&amp;id=22&amp;boolean">http://foo.bar.baz/test?q=hello&amp;id=22&amp;boolean</a></p>
9039````````````````````````````````
9040
9041
9042```````````````````````````````` example
9043<irc://foo.bar:2233/baz>
9044.
9045<p><a href="irc://foo.bar:2233/baz">irc://foo.bar:2233/baz</a></p>
9046````````````````````````````````
9047
9048
9049Uppercase is also fine:
9050
9051```````````````````````````````` example
9052<MAILTO:FOO@BAR.BAZ>
9053.
9054<p><a href="MAILTO:FOO@BAR.BAZ">MAILTO:FOO@BAR.BAZ</a></p>
9055````````````````````````````````
9056
9057
9058Note that many strings that count as [absolute URIs] for
9059purposes of this spec are not valid URIs, because their
9060schemes are not registered or because of other problems
9061with their syntax:
9062
9063```````````````````````````````` example
9064<a+b+c:d>
9065.
9066<p><a href="a+b+c:d">a+b+c:d</a></p>
9067````````````````````````````````
9068
9069
9070```````````````````````````````` example
9071<made-up-scheme://foo,bar>
9072.
9073<p><a href="made-up-scheme://foo,bar">made-up-scheme://foo,bar</a></p>
9074````````````````````````````````
9075
9076
9077```````````````````````````````` example
9078<http://../>
9079.
9080<p><a href="http://../">http://../</a></p>
9081````````````````````````````````
9082
9083
9084```````````````````````````````` example
9085<localhost:5001/foo>
9086.
9087<p><a href="localhost:5001/foo">localhost:5001/foo</a></p>
9088````````````````````````````````
9089
9090
9091Spaces are not allowed in autolinks:
9092
9093```````````````````````````````` example
9094<http://foo.bar/baz bim>
9095.
9096<p>&lt;http://foo.bar/baz bim&gt;</p>
9097````````````````````````````````
9098
9099
9100Backslash-escapes do not work inside autolinks:
9101
9102```````````````````````````````` example
9103<http://example.com/\[\>
9104.
9105<p><a href="http://example.com/%5C%5B%5C">http://example.com/\[\</a></p>
9106````````````````````````````````
9107
9108
9109An [email autolink](@)
9110consists of `<`, followed by an [email address],
9111followed by `>`.  The link's label is the email address,
9112and the URL is `mailto:` followed by the email address.
9113
9114An [email address](@),
9115for these purposes, is anything that matches
9116the [non-normative regex from the HTML5
9117spec](https://html.spec.whatwg.org/multipage/forms.html#e-mail-state-(type=email)):
9118
9119    /^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?
9120    (?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/
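
For illustration, the same pattern can be used from Python as shown
below.  The function name is hypothetical; the `^` and `$` anchors are
dropped because `fullmatch` already requires the whole string to match.

```python
import re

EMAIL = re.compile(
    r"[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+"
    r"@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?"
    r"(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*"
)


def email_autolink_href(text: str):
    """Return the `mailto:` URL for `<address>`, or None if it is not one."""
    if len(text) >= 2 and text[0] == "<" and text[-1] == ">":
        address = text[1:-1]
        if EMAIL.fullmatch(address):
            return "mailto:" + address
    return None
```

A backslash is not in the permitted character set, so a candidate such
as `<foo\+@bar.example.com>` is not treated as an email autolink.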
9121
9122Examples of email autolinks:
9123
9124```````````````````````````````` example
9125<foo@bar.example.com>
9126.
9127<p><a href="mailto:foo@bar.example.com">foo@bar.example.com</a></p>
9128````````````````````````````````
9129
9130
9131```````````````````````````````` example
9132<foo+special@Bar.baz-bar0.com>
9133.
9134<p><a href="mailto:foo+special@Bar.baz-bar0.com">foo+special@Bar.baz-bar0.com</a></p>
9135````````````````````````````````
9136
9137
9138Backslash-escapes do not work inside email autolinks:
9139
9140```````````````````````````````` example
9141<foo\+@bar.example.com>
9142.
9143<p>&lt;foo+@bar.example.com&gt;</p>
9144````````````````````````````````
9145
9146
9147These are not autolinks:
9148
9149```````````````````````````````` example
9150<>
9151.
9152<p>&lt;&gt;</p>
9153````````````````````````````````
9154
9155
9156```````````````````````````````` example
9157< http://foo.bar >
9158.
9159<p>&lt; http://foo.bar &gt;</p>
9160````````````````````````````````
9161
9162
9163```````````````````````````````` example
9164<m:abc>
9165.
9166<p>&lt;m:abc&gt;</p>
9167````````````````````````````````
9168
9169
9170```````````````````````````````` example
9171<foo.bar.baz>
9172.
9173<p>&lt;foo.bar.baz&gt;</p>
9174````````````````````````````````
9175
9176
9177```````````````````````````````` example
9178http://example.com
9179.
9180<p>http://example.com</p>
9181````````````````````````````````
9182
9183
9184```````````````````````````````` example
9185foo@bar.example.com
9186.
9187<p>foo@bar.example.com</p>
9188````````````````````````````````
9189
9190<div class="extension">
9191
9192## Autolinks (extension)
9193
9194GFM enables the `autolink` extension, where autolinks will be recognised in a
9195greater number of conditions.
9196
[Autolink]s can also be constructed without requiring the use of `<` and `>`
9198to delimit them, although they will be recognized under a smaller set of
9199circumstances.  All such recognized autolinks can only come at the beginning of
9200a line, after whitespace, or any of the delimiting characters `*`, `_`, `~`,
9201and `(`.
9202
9203An [extended www autolink](@) will be recognized
9204when the text `www.` is found followed by a [valid domain].
9205A [valid domain](@) consists of segments
9206of alphanumeric characters, underscores (`_`) and hyphens (`-`)
9207separated by periods (`.`).
9208There must be at least one period,
9209and no underscores may be present in the last two segments of the domain.
9210
9211The scheme `http` will be inserted automatically:
9212
9213```````````````````````````````` example autolink
9214www.commonmark.org
9215.
9216<p><a href="http://www.commonmark.org">www.commonmark.org</a></p>
9217````````````````````````````````
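
The [valid domain] rules can be sketched as a small predicate. This is an illustration only: the name `is_valid_domain` is hypothetical, "alphanumeric" is assumed to mean ASCII letters and digits, and the function expects a candidate string that has already been isolated from the surrounding text:

```python
import re

# One domain segment: alphanumerics, underscores and hyphens (ASCII assumed).
_SEGMENT = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_domain(domain: str) -> bool:
    """Check the valid-domain rules for an already-isolated candidate string."""
    segments = domain.split(".")
    if len(segments) < 2:  # there must be at least one period
        return False
    if not all(_SEGMENT.fullmatch(segment) for segment in segments):
        return False
    # No underscores may be present in the last two segments.
    return not any("_" in segment for segment in segments[-2:])


print(is_valid_domain("www.commonmark.org"))
print(is_valid_domain("example.under_score.com"))
```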
9218
9219After a [valid domain], zero or more non-space non-`<` characters may follow:
9220
9221```````````````````````````````` example autolink
9222Visit www.commonmark.org/help for more information.
9223.
9224<p>Visit <a href="http://www.commonmark.org/help">www.commonmark.org/help</a> for more information.</p>
9225````````````````````````````````
9226
9227We then apply [extended autolink path validation](@) as follows:
9228
9229Trailing punctuation (specifically, `?`, `!`, `.`, `,`, `:`, `*`, `_`, and `~`)
9230will not be considered part of the autolink, though they may be included in the
9231interior of the link:
9232
9233```````````````````````````````` example autolink
9234Visit www.commonmark.org.
9235
9236Visit www.commonmark.org/a.b.
9237.
9238<p>Visit <a href="http://www.commonmark.org">www.commonmark.org</a>.</p>
9239<p>Visit <a href="http://www.commonmark.org/a.b">www.commonmark.org/a.b</a>.</p>
9240````````````````````````````````
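
A minimal sketch of the trailing-punctuation step follows. The function name is hypothetical, and the sketch assumes that a run of several trailing punctuation characters is removed as a whole:

```python
TRAILING_PUNCTUATION = "?!.,:*_~"


def split_trailing_punctuation(link: str) -> tuple[str, str]:
    """Return (autolink, leftover), moving trailing punctuation out of the link."""
    trimmed = link.rstrip(TRAILING_PUNCTUATION)
    return trimmed, link[len(trimmed):]


print(split_trailing_punctuation("www.commonmark.org/a.b."))
```

Only the end of the candidate is inspected, so the period inside `a.b` stays part of the link.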
9241
9242When an autolink ends in `)`, we scan the entire autolink for the total number
9243of parentheses.  If there is a greater number of closing parentheses than
9244opening ones, we don't consider the unmatched trailing parentheses part of the
autolink, in order to facilitate including an autolink inside parentheses:
9246
9247```````````````````````````````` example autolink
9248www.google.com/search?q=Markup+(business)
9249
9250www.google.com/search?q=Markup+(business)))
9251
9252(www.google.com/search?q=Markup+(business))
9253
9254(www.google.com/search?q=Markup+(business)
9255.
9256<p><a href="http://www.google.com/search?q=Markup+(business)">www.google.com/search?q=Markup+(business)</a></p>
9257<p><a href="http://www.google.com/search?q=Markup+(business)">www.google.com/search?q=Markup+(business)</a>))</p>
9258<p>(<a href="http://www.google.com/search?q=Markup+(business)">www.google.com/search?q=Markup+(business)</a>)</p>
9259<p>(<a href="http://www.google.com/search?q=Markup+(business)">www.google.com/search?q=Markup+(business)</a></p>
9260````````````````````````````````
9261
This check is only done when the link ends in a closing parenthesis `)`, so if
9263the only parentheses are in the interior of the autolink, no special rules are
9264applied:
9265
9266```````````````````````````````` example autolink
9267www.google.com/search?q=(business))+ok
9268.
9269<p><a href="http://www.google.com/search?q=(business))+ok">www.google.com/search?q=(business))+ok</a></p>
9270````````````````````````````````
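
The parenthesis check can be sketched as follows. The function name is hypothetical; the input is the candidate autolink text only, without any `(` that precedes it in the surrounding paragraph:

```python
def trim_unmatched_parens(link: str) -> str:
    """Drop trailing `)` characters that have no matching `(` in the link."""
    # The count covers the entire link, but the check only runs while
    # the link still ends in `)`.
    while link.endswith(")") and link.count(")") > link.count("("):
        link = link[:-1]
    return link


print(trim_unmatched_parens("www.google.com/search?q=Markup+(business)))"))
print(trim_unmatched_parens("www.google.com/search?q=(business))+ok"))
```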
9271
If an autolink ends in a semicolon (`;`), we check whether it appears to
resemble an [entity reference][entity references], that is, whether the
preceding text is `&` followed by one or more alphanumeric characters.  If so,
it is excluded from the autolink:
9276
9277```````````````````````````````` example autolink
9278www.google.com/search?q=commonmark&hl=en
9279
9280www.google.com/search?q=commonmark&hl;
9281.
9282<p><a href="http://www.google.com/search?q=commonmark&amp;hl=en">www.google.com/search?q=commonmark&amp;hl=en</a></p>
9283<p><a href="http://www.google.com/search?q=commonmark">www.google.com/search?q=commonmark</a>&amp;hl;</p>
9284````````````````````````````````
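
The semicolon rule can be sketched with a single regular expression. The names below are hypothetical, and "alphanumeric" is assumed to mean ASCII letters and digits:

```python
import re

# `&`, one or more alphanumerics, and a final `;` at the very end of the link.
_ENTITY_LIKE_SUFFIX = re.compile(r"&[A-Za-z0-9]+;\Z")


def trim_entity_like_suffix(link: str) -> str:
    """Remove a trailing run that looks like an entity reference."""
    return _ENTITY_LIKE_SUFFIX.sub("", link)


print(trim_entity_like_suffix("www.google.com/search?q=commonmark&hl;"))
```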
9285
9286`<` immediately ends an autolink.
9287
9288```````````````````````````````` example autolink
9289www.commonmark.org/he<lp
9290.
9291<p><a href="http://www.commonmark.org/he">www.commonmark.org/he</a>&lt;lp</p>
9292````````````````````````````````
9293
An [extended url autolink](@) will be recognised when one of the schemes
`http://`, `https://`, or `ftp://` is found, followed by a [valid domain], then
zero or more non-space non-`<` characters according to
[extended autolink path validation]:
9298
9299```````````````````````````````` example autolink
9300http://commonmark.org
9301
9302(Visit https://encrypted.google.com/search?q=Markup+(business))
9303
9304Anonymous FTP is available at ftp://foo.bar.baz.
9305.
9306<p><a href="http://commonmark.org">http://commonmark.org</a></p>
9307<p>(Visit <a href="https://encrypted.google.com/search?q=Markup+(business)">https://encrypted.google.com/search?q=Markup+(business)</a>)</p>
9308<p>Anonymous FTP is available at <a href="ftp://foo.bar.baz">ftp://foo.bar.baz</a>.</p>
9309````````````````````````````````
9310
9311
9312An [extended email autolink](@) will be recognised when an email address is
9313recognised within any text node.  Email addresses are recognised according to
9314the following rules:
9315
* One or more characters which are alphanumeric, or `.`, `-`, `_`, or `+`.
9317* An `@` symbol.
9318* One or more characters which are alphanumeric, or `-` or `_`,
9319  separated by periods (`.`).
9320  There must be at least one period.
9321  The last character must not be one of `-` or `_`.
9322
9323The scheme `mailto:` will automatically be added to the generated link:
9324
9325```````````````````````````````` example autolink
9326foo@bar.baz
9327.
9328<p><a href="mailto:foo@bar.baz">foo@bar.baz</a></p>
9329````````````````````````````````
9330
9331`+` can occur before the `@`, but not after.
9332
9333```````````````````````````````` example autolink
9334hello@mail+xyz.example isn't valid, but hello+xyz@mail.example is.
9335.
9336<p>hello@mail+xyz.example isn't valid, but <a href="mailto:hello+xyz@mail.example">hello+xyz@mail.example</a> is.</p>
9337````````````````````````````````
9338
9339`.`, `-`, and `_` can occur on both sides of the `@`, but only `.` may occur at
9340the end of the email address, in which case it will not be considered part of
9341the address:
9342
9343```````````````````````````````` example autolink
9344a.b-c_d@a.b
9345
9346a.b-c_d@a.b.
9347
9348a.b-c_d@a.b-
9349
9350a.b-c_d@a.b_
9351.
9352<p><a href="mailto:a.b-c_d@a.b">a.b-c_d@a.b</a></p>
9353<p><a href="mailto:a.b-c_d@a.b">a.b-c_d@a.b</a>.</p>
9354<p>a.b-c_d@a.b-</p>
9355<p>a.b-c_d@a.b_</p>
9356````````````````````````````````
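
The email rules above can be approximated with one regular expression plus a check on the final character. This is a sketch under stated assumptions: the names are hypothetical, "alphanumeric" is taken to mean ASCII letters and digits, and the function only finds candidate addresses in a string rather than building link nodes:

```python
import re

# Local part, `@`, then at least two period-separated domain labels.
_EMAIL = re.compile(r"[A-Za-z0-9._+-]+@[A-Za-z0-9_-]+(?:\.[A-Za-z0-9_-]+)+")


def find_extended_emails(text: str) -> list[str]:
    """Return the email addresses that would be autolinked in `text`."""
    found = []
    for match in _EMAIL.finditer(text):
        address = match.group()
        # The last character must not be `-` or `_`.
        if address[-1] not in "-_":
            found.append(address)
    return found


print(find_extended_emails("a.b-c_d@a.b."))
print(find_extended_emails("a.b-c_d@a.b-"))
```

A trailing `.` is never matched by the pattern because every period must be followed by another label, so it is left outside the address.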
9357
9358</div>
9359
9360## Raw HTML
9361
9362Text between `<` and `>` that looks like an HTML tag is parsed as a
9363raw HTML tag and will be rendered in HTML without escaping.
9364Tag and attribute names are not limited to current HTML tags,
9365so custom tags (and even, say, DocBook tags) may be used.
9366
9367Here is the grammar for tags:
9368
9369A [tag name](@) consists of an ASCII letter
9370followed by zero or more ASCII letters, digits, or
9371hyphens (`-`).
9372
9373An [attribute](@) consists of [whitespace],
9374an [attribute name], and an optional
9375[attribute value specification].
9376
9377An [attribute name](@)
9378consists of an ASCII letter, `_`, or `:`, followed by zero or more ASCII
9379letters, digits, `_`, `.`, `:`, or `-`.  (Note:  This is the XML
9380specification restricted to ASCII.  HTML5 is laxer.)
9381
9382An [attribute value specification](@)
9383consists of optional [whitespace],
9384a `=` character, optional [whitespace], and an [attribute
9385value].
9386
9387An [attribute value](@)
9388consists of an [unquoted attribute value],
9389a [single-quoted attribute value], or a [double-quoted attribute value].
9390
9391An [unquoted attribute value](@)
9392is a nonempty string of characters not
9393including [whitespace], `"`, `'`, `=`, `<`, `>`, or `` ` ``.
9394
9395A [single-quoted attribute value](@)
9396consists of `'`, zero or more
9397characters not including `'`, and a final `'`.
9398
9399A [double-quoted attribute value](@)
9400consists of `"`, zero or more
9401characters not including `"`, and a final `"`.
9402
9403An [open tag](@) consists of a `<` character, a [tag name],
9404zero or more [attributes], optional [whitespace], an optional `/`
9405character, and a `>` character.
9406
9407A [closing tag](@) consists of the string `</`, a
9408[tag name], optional [whitespace], and the character `>`.
9409
An [HTML comment](@) consists of `<!-->`, `<!--->`, or `<!--`, a string of
9411characters not including the string `-->`, and `-->` (see the
9412[HTML spec](https://html.spec.whatwg.org/multipage/parsing.html#markup-declaration-open-state)).
9413
9414A [processing instruction](@)
9415consists of the string `<?`, a string
9416of characters not including the string `?>`, and the string
9417`?>`.
9418
9419A [declaration](@) consists of the
9420string `<!`, a name consisting of one or more uppercase ASCII letters,
9421[whitespace], a string of characters not including the
9422character `>`, and the character `>`.
9423
9424A [CDATA section](@) consists of
9425the string `<![CDATA[`, a string of characters not including the string
9426`]]>`, and the string `]]>`.
9427
9428An [HTML tag](@) consists of an [open tag], a [closing tag],
9429an [HTML comment], a [processing instruction], a [declaration],
9430or a [CDATA section].
9431
9432Here are some simple open tags:
9433
9434```````````````````````````````` example
9435<a><bab><c2c>
9436.
9437<p><a><bab><c2c></p>
9438````````````````````````````````
9439
9440
9441Empty elements:
9442
9443```````````````````````````````` example
9444<a/><b2/>
9445.
9446<p><a/><b2/></p>
9447````````````````````````````````
9448
9449
9450[Whitespace] is allowed:
9451
9452```````````````````````````````` example
9453<a  /><b2
9454data="foo" >
9455.
9456<p><a  /><b2
9457data="foo" ></p>
9458````````````````````````````````
9459
9460
9461With attributes:
9462
9463```````````````````````````````` example
9464<a foo="bar" bam = 'baz <em>"</em>'
9465_boolean zoop:33=zoop:33 />
9466.
9467<p><a foo="bar" bam = 'baz <em>"</em>'
9468_boolean zoop:33=zoop:33 /></p>
9469````````````````````````````````
9470
9471
9472Custom tag names can be used:
9473
9474```````````````````````````````` example
9475Foo <responsive-image src="foo.jpg" />
9476.
9477<p>Foo <responsive-image src="foo.jpg" /></p>
9478````````````````````````````````
9479
9480
9481Illegal tag names, not parsed as HTML:
9482
9483```````````````````````````````` example
9484<33> <__>
9485.
9486<p>&lt;33&gt; &lt;__&gt;</p>
9487````````````````````````````````
9488
9489
9490Illegal attribute names:
9491
9492```````````````````````````````` example
9493<a h*#ref="hi">
9494.
9495<p>&lt;a h*#ref=&quot;hi&quot;&gt;</p>
9496````````````````````````````````
9497
9498
9499Illegal attribute values:
9500
9501```````````````````````````````` example
9502<a href="hi'> <a href=hi'>
9503.
9504<p>&lt;a href=&quot;hi'&gt; &lt;a href=hi'&gt;</p>
9505````````````````````````````````
9506
9507
9508Illegal [whitespace]:
9509
9510```````````````````````````````` example
9511< a><
9512foo><bar/ >
9513<foo bar=baz
9514bim!bop />
9515.
9516<p>&lt; a&gt;&lt;
9517foo&gt;&lt;bar/ &gt;
9518&lt;foo bar=baz
9519bim!bop /&gt;</p>
9520````````````````````````````````
9521
9522
9523Missing [whitespace]:
9524
9525```````````````````````````````` example
9526<a href='bar'title=title>
9527.
9528<p>&lt;a href='bar'title=title&gt;</p>
9529````````````````````````````````
9530
9531
9532Closing tags:
9533
9534```````````````````````````````` example
9535</a></foo >
9536.
9537<p></a></foo ></p>
9538````````````````````````````````
9539
9540
9541Illegal attributes in closing tag:
9542
9543```````````````````````````````` example
9544</a href="foo">
9545.
9546<p>&lt;/a href=&quot;foo&quot;&gt;</p>
9547````````````````````````````````
9548
9549
9550Comments:
9551
9552```````````````````````````````` example
9553foo <!-- this is a --
9554comment - with hyphens -->
9555.
9556<p>foo <!-- this is a --
9557comment - with hyphens --></p>
9558````````````````````````````````
9559
9560```````````````````````````````` example
9561foo <!--> foo -->
9562
9563foo <!---> foo -->
9564.
9565<p>foo <!--> foo --&gt;</p>
9566<p>foo <!---> foo --&gt;</p>
9567````````````````````````````````
9568
9569
9570Processing instructions:
9571
9572```````````````````````````````` example
9573foo <?php echo $a; ?>
9574.
9575<p>foo <?php echo $a; ?></p>
9576````````````````````````````````
9577
9578
9579Declarations:
9580
9581```````````````````````````````` example
9582foo <!ELEMENT br EMPTY>
9583.
9584<p>foo <!ELEMENT br EMPTY></p>
9585````````````````````````````````
9586
9587
9588CDATA sections:
9589
9590```````````````````````````````` example
9591foo <![CDATA[>&<]]>
9592.
9593<p>foo <![CDATA[>&<]]></p>
9594````````````````````````````````
9595
9596
9597Entity and numeric character references are preserved in HTML
9598attributes:
9599
9600```````````````````````````````` example
9601foo <a href="&ouml;">
9602.
9603<p>foo <a href="&ouml;"></p>
9604````````````````````````````````
9605
9606
9607Backslash escapes do not work in HTML attributes:
9608
9609```````````````````````````````` example
9610foo <a href="\*">
9611.
9612<p>foo <a href="\*"></p>
9613````````````````````````````````
9614
9615
9616```````````````````````````````` example
9617<a href="\"">
9618.
9619<p>&lt;a href=&quot;&quot;&quot;&gt;</p>
9620````````````````````````````````
9621
9622
9623<div class="extension">
9624
9625## Disallowed Raw HTML (extension)
9626
9627GFM enables the `tagfilter` extension, where the following HTML tags will be
9628filtered when rendering HTML output:
9629
9630* `<title>`
9631* `<textarea>`
9632* `<style>`
9633* `<xmp>`
9634* `<iframe>`
9635* `<noembed>`
9636* `<noframes>`
9637* `<script>`
9638* `<plaintext>`
9639
9640Filtering is done by replacing the leading `<` with the entity `&lt;`.  These
9641tags are chosen in particular as they change how HTML is interpreted in a way
9642unique to them (i.e. nested HTML is interpreted differently), and this is
usually undesirable in the context of other rendered Markdown content.
9644
9645All other HTML tags are left untouched.
9646
9647```````````````````````````````` example tagfilter
9648<strong> <title> <style> <em>
9649
9650<blockquote>
9651  <xmp> is disallowed.  <XMP> is also disallowed.
9652</blockquote>
9653.
9654<p><strong> &lt;title> &lt;style> <em></p>
9655<blockquote>
9656  &lt;xmp> is disallowed.  &lt;XMP> is also disallowed.
9657</blockquote>
9658````````````````````````````````
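
A minimal sketch of such a filter, applied to a string of raw HTML, might look like the following. The names are hypothetical. Two details are assumptions of the sketch rather than statements of the spec: closing tags such as `</script>` are filtered as well, and a tag name counts only when it is followed by whitespace, `/`, `>`, or the end of the input:

```python
import re

DISALLOWED_TAGS = ("title", "textarea", "style", "xmp", "iframe",
                   "noembed", "noframes", "script", "plaintext")

# A `<` that begins a disallowed tag name, matched case-insensitively.
_DISALLOWED = re.compile(
    r"<(?=/?(?:" + "|".join(DISALLOWED_TAGS) + r")(?:[\s/>]|$))",
    re.IGNORECASE,
)


def tagfilter(html: str) -> str:
    """Replace the leading `<` of every disallowed tag with `&lt;`."""
    return _DISALLOWED.sub("&lt;", html)


print(tagfilter("<strong> <title> <style> <em>"))
```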
9659
9660</div>
9661
9662## Hard line breaks
9663
9664A line break (not in a code span or HTML tag) that is preceded
9665by two or more spaces and does not occur at the end of a block
9666is parsed as a [hard line break](@) (rendered
9667in HTML as a `<br />` tag):
9668
9669```````````````````````````````` example
9670foo
9671baz
9672.
9673<p>foo<br />
9674baz</p>
9675````````````````````````````````
9676
9677
9678For a more visible alternative, a backslash before the
9679[line ending] may be used instead of two spaces:
9680
9681```````````````````````````````` example
9682foo\
9683baz
9684.
9685<p>foo<br />
9686baz</p>
9687````````````````````````````````
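
The two syntaxes can be sketched as a classifier over the line endings of a paragraph's raw text. The function name is hypothetical, and the sketch deliberately ignores code spans and HTML tags. It also assumes that an even number of trailing backslashes means the last backslash is itself escaped:

```python
def classify_line_endings(text: str) -> list[str]:
    """Label each interior line ending of a paragraph as 'hard' or 'soft'."""
    lines = text.split("\n")
    kinds = []
    # The final line is skipped: a break at the end of a block is never hard.
    for line in lines[:-1]:
        trailing_backslashes = len(line) - len(line.rstrip("\\"))
        if line.endswith("  "):
            kinds.append("hard")  # two or more spaces before the line ending
        elif trailing_backslashes % 2 == 1:
            kinds.append("hard")  # an unescaped backslash before the line ending
        else:
            kinds.append("soft")
    return kinds


print(classify_line_endings("foo  \nbaz"))
print(classify_line_endings("foo\\\nbaz"))
```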
9688
9689
9690More than two spaces can be used:
9691
9692```````````````````````````````` example
9693foo
9694baz
9695.
9696<p>foo<br />
9697baz</p>
9698````````````````````````````````
9699
9700
9701Leading spaces at the beginning of the next line are ignored:
9702
9703```````````````````````````````` example
9704foo
9705     bar
9706.
9707<p>foo<br />
9708bar</p>
9709````````````````````````````````
9710
9711
9712```````````````````````````````` example
9713foo\
9714     bar
9715.
9716<p>foo<br />
9717bar</p>
9718````````````````````````````````
9719
9720
9721Line breaks can occur inside emphasis, links, and other constructs
9722that allow inline content:
9723
9724```````````````````````````````` example
9725*foo
9726bar*
9727.
9728<p><em>foo<br />
9729bar</em></p>
9730````````````````````````````````
9731
9732
9733```````````````````````````````` example
9734*foo\
9735bar*
9736.
9737<p><em>foo<br />
9738bar</em></p>
9739````````````````````````````````
9740
9741
9742Line breaks do not occur inside code spans
9743
9744```````````````````````````````` example
9745`code
9746span`
9747.
9748<p><code>code   span</code></p>
9749````````````````````````````````
9750
9751
9752```````````````````````````````` example
9753`code\
9754span`
9755.
9756<p><code>code\ span</code></p>
9757````````````````````````````````
9758
9759
9760or HTML tags:
9761
9762```````````````````````````````` example
9763<a href="foo
9764bar">
9765.
9766<p><a href="foo
9767bar"></p>
9768````````````````````````````````
9769
9770
9771```````````````````````````````` example
9772<a href="foo\
9773bar">
9774.
9775<p><a href="foo\
9776bar"></p>
9777````````````````````````````````
9778
9779
9780Hard line breaks are for separating inline content within a block.
9781Neither syntax for hard line breaks works at the end of a paragraph or
9782other block element:
9783
9784```````````````````````````````` example
9785foo\
9786.
9787<p>foo\</p>
9788````````````````````````````````
9789
9790
9791```````````````````````````````` example
9792foo
9793.
9794<p>foo</p>
9795````````````````````````````````
9796
9797
9798```````````````````````````````` example
9799### foo\
9800.
9801<h3>foo\</h3>
9802````````````````````````````````
9803
9804
9805```````````````````````````````` example
9806### foo
9807.
9808<h3>foo</h3>
9809````````````````````````````````
9810
9811
9812## Soft line breaks
9813
9814A regular line break (not in a code span or HTML tag) that is not
9815preceded by two or more spaces or a backslash is parsed as a
9816[softbreak](@).  (A softbreak may be rendered in HTML either as a
9817[line ending] or as a space. The result will be the same in
9818browsers. In the examples here, a [line ending] will be used.)
9819
9820```````````````````````````````` example
9821foo
9822baz
9823.
9824<p>foo
9825baz</p>
9826````````````````````````````````
9827
9828
9829Spaces at the end of the line and beginning of the next line are
9830removed:
9831
9832```````````````````````````````` example
9833foo
9834 baz
9835.
9836<p>foo
9837baz</p>
9838````````````````````````````````
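
A renderer's handling of softbreaks can be sketched in a few lines. The function name and the `as_space` option are assumptions of this sketch, and the input is assumed to contain only soft line breaks:

```python
def render_soft_breaks(text: str, as_space: bool = False) -> str:
    """Join paragraph lines, rendering each softbreak as a line ending or a space."""
    # Spaces at the end of a line and at the beginning of the next are removed.
    lines = [line.strip(" ") for line in text.split("\n")]
    return (" " if as_space else "\n").join(lines)


print(repr(render_soft_breaks("foo \n baz")))
print(repr(render_soft_breaks("foo \n baz", as_space=True)))
```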
9839
9840
9841A conforming parser may render a soft line break in HTML either as a
9842line break or as a space.
9843
9844A renderer may also provide an option to render soft line breaks
9845as hard line breaks.
9846
9847## Textual content
9848
9849Any characters not given an interpretation by the above rules will
9850be parsed as plain textual content.
9851
9852```````````````````````````````` example
9853hello $.;'there
9854.
9855<p>hello $.;'there</p>
9856````````````````````````````````
9857
9858
9859```````````````````````````````` example
9860Foo χρῆν
9861.
9862<p>Foo χρῆν</p>
9863````````````````````````````````
9864
9865
9866Internal spaces are preserved verbatim:
9867
9868```````````````````````````````` example
9869Multiple     spaces
9870.
9871<p>Multiple     spaces</p>
9872````````````````````````````````
9873
9874
9875<!-- END TESTS -->
9876
9877# Appendix: A parsing strategy
9878
9879In this appendix we describe some features of the parsing strategy
9880used in the CommonMark reference implementations.
9881
9882## Overview
9883
9884Parsing has two phases:
9885
1. In the first phase, lines of input are consumed and the block
structure of the document---its division into paragraphs, block quotes,
list items, and so on---is constructed.  Text is assigned to these
blocks but not parsed. Link reference definitions are parsed and a
map of links is constructed.

2. In the second phase, the raw text contents of paragraphs and headings
are parsed into sequences of Markdown inline elements (strings,
code spans, links, emphasis, and so on), using the map of link
references constructed in phase 1.
9896
9897At each point in processing, the document is represented as a tree of
9898**blocks**.  The root of the tree is a `document` block.  The `document`
9899may have any number of other blocks as **children**.  These children
9900may, in turn, have other blocks as children.  The last child of a block
9901is normally considered **open**, meaning that subsequent lines of input
9902can alter its contents.  (Blocks that are not open are **closed**.)
9903Here, for example, is a possible document tree, with the open blocks
9904marked by arrows:
9905
9906``` tree
9907-> document
9908  -> block_quote
9909       paragraph
9910         "Lorem ipsum dolor\nsit amet."
9911    -> list (type=bullet tight=true bullet_char=-)
9912         list_item
9913           paragraph
9914             "Qui *quodsi iracundia*"
9915      -> list_item
9916        -> paragraph
9917             "aliquando id"
9918```
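
As an illustration, the tree above can be modelled with a small data class. The class, field and function names are assumptions of this sketch. The open blocks are found by starting at the root and following last children for as long as they are open:

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    kind: str
    children: list["Block"] = field(default_factory=list)
    is_open: bool = True


def open_path(root: Block) -> list[str]:
    """Kinds of the open blocks, from the root down through open last children."""
    path, block = [], root
    while block is not None and block.is_open:
        path.append(block.kind)
        block = block.children[-1] if block.children else None
    return path


# The example tree, with closed blocks marked explicitly.
doc = Block("document", [
    Block("block_quote", [
        Block("paragraph", is_open=False),
        Block("list", [
            Block("list_item", [Block("paragraph", is_open=False)], is_open=False),
            Block("list_item", [Block("paragraph")]),
        ]),
    ]),
])
print(open_path(doc))
```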
9919
9920## Phase 1: block structure
9921
9922Each line that is processed has an effect on this tree.  The line is
9923analyzed and, depending on its contents, the document may be altered
9924in one or more of the following ways:
9925
1. One or more open blocks may be closed.
2. One or more new blocks may be created as children of the
   last open block.
3. Text may be added to the last (deepest) open block remaining
   on the tree.
9931
9932Once a line has been incorporated into the tree in this way,
9933it can be discarded, so input can be read in a stream.
9934
9935For each line, we follow this procedure:
9936
1. First we iterate through the open blocks, starting with the
root document, and descending through last children down to the last
open block.  Each block imposes a condition that the line must satisfy
if the block is to remain open.  For example, a block quote requires a
`>` character.  A paragraph requires a non-blank line.
In this phase we may match all or just some of the open
blocks.  But we cannot close unmatched blocks yet, because we may have a
[lazy continuation line].

2.  Next, after consuming the continuation markers for existing
blocks, we look for new block starts (e.g. `>` for a block quote).
If we encounter a new block start, we close any blocks unmatched
in step 1 before creating the new block as a child of the last
matched block.

3.  Finally, we look at the remainder of the line (after block
markers like `>`, list markers, and indentation have been consumed).
This is text that can be incorporated into the last open
block (a paragraph, code block, heading, or raw HTML).
9956
9957Setext headings are formed when we see a line of a paragraph
9958that is a [setext heading underline].
9959
9960Reference link definitions are detected when a paragraph is closed;
9961the accumulated text lines are parsed to see if they begin with
9962one or more reference link definitions.  Any remainder becomes a
9963normal paragraph.
9964
9965We can see how this works by considering how the tree above is
9966generated by four lines of Markdown:
9967
9968``` markdown
9969> Lorem ipsum dolor
9970sit amet.
9971> - Qui *quodsi iracundia*
9972> - aliquando id
9973```
9974
9975At the outset, our document model is just
9976
9977``` tree
9978-> document
9979```
9980
9981The first line of our text,
9982
9983``` markdown
9984> Lorem ipsum dolor
9985```
9986
9987causes a `block_quote` block to be created as a child of our
9988open `document` block, and a `paragraph` block as a child of
9989the `block_quote`.  Then the text is added to the last open
9990block, the `paragraph`:
9991
9992``` tree
9993-> document
9994  -> block_quote
9995    -> paragraph
9996         "Lorem ipsum dolor"
9997```
9998
9999The next line,
10000
10001``` markdown
10002sit amet.
10003```
10004
10005is a "lazy continuation" of the open `paragraph`, so it gets added
10006to the paragraph's text:
10007
10008``` tree
10009-> document
10010  -> block_quote
10011    -> paragraph
10012         "Lorem ipsum dolor\nsit amet."
10013```
10014
10015The third line,
10016
10017``` markdown
10018> - Qui *quodsi iracundia*
10019```
10020
10021causes the `paragraph` block to be closed, and a new `list` block
10022opened as a child of the `block_quote`.  A `list_item` is also
10023added as a child of the `list`, and a `paragraph` as a child of
10024the `list_item`.  The text is then added to the new `paragraph`:
10025
10026``` tree
10027-> document
10028  -> block_quote
10029       paragraph
10030         "Lorem ipsum dolor\nsit amet."
10031    -> list (type=bullet tight=true bullet_char=-)
10032      -> list_item
10033        -> paragraph
10034             "Qui *quodsi iracundia*"
10035```
10036
10037The fourth line,
10038
10039``` markdown
10040> - aliquando id
10041```
10042
10043causes the `list_item` (and its child the `paragraph`) to be closed,
and a new `list_item` opened up as a child of the `list`.  A `paragraph`
10045is added as a child of the new `list_item`, to contain the text.
10046We thus obtain the final tree:
10047
10048``` tree
10049-> document
10050  -> block_quote
10051       paragraph
10052         "Lorem ipsum dolor\nsit amet."
10053    -> list (type=bullet tight=true bullet_char=-)
10054         list_item
10055           paragraph
10056             "Qui *quodsi iracundia*"
10057      -> list_item
10058        -> paragraph
10059             "aliquando id"
10060```
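
The walk-through above can be replayed with a deliberately small sketch of the three-step procedure. It is not the reference implementation, and all names in it are assumptions. It knows only block quotes, `-` bullet items and paragraphs, and it ignores headings, code blocks, nested indentation, blank lines inside list items, and tight/loose lists:

```python
import re
from dataclasses import dataclass, field


@dataclass
class Block:
    kind: str
    parent: "Block | None" = field(default=None, repr=False)
    children: list = field(default_factory=list)
    lines: list = field(default_factory=list)  # raw text (paragraphs only)
    is_open: bool = True


def open_child(block):
    """The last child of `block` if it is still open, else None."""
    last = block.children[-1] if block.children else None
    return last if last is not None and last.is_open else None


def close_unmatched(container):
    """Close every open block below `container`."""
    block = open_child(container)
    while block is not None:
        deeper = open_child(block)
        block.is_open = False
        block = deeper


def add_child(parent, kind):
    # A list holds only list items: anything else closes the list first.
    while parent.kind == "list" and kind != "list_item":
        parent.is_open = False
        parent = parent.parent
    if kind == "list_item" and parent.kind != "list":
        parent = add_child(parent, "list")
    child = Block(kind, parent=parent)
    parent.children.append(child)
    return child


def feed(doc, line):
    container, rest = doc, line
    # Step 1: match open containers against the line.
    while (child := open_child(container)) is not None:
        if child.kind == "block_quote":
            marker = re.match(r" {0,3}> ?", rest)
            if marker is None:
                break
            rest = rest[marker.end():]
        elif child.kind == "list_item":
            if not rest.startswith("  "):  # content offset of a `- ` marker
                break
            rest = rest[2:]
        elif child.kind == "paragraph":
            break  # leaf block, dealt with in step 3
        container = child  # a `list` imposes no condition of its own
    # Step 2: look for new block starts, closing unmatched blocks first.
    started = False
    while (marker := re.match(r" {0,3}> ?|- ", rest)) is not None:
        close_unmatched(container)
        kind = "list_item" if marker.group() == "- " else "block_quote"
        container = add_child(container, kind)
        rest = rest[marker.end():]
        started = True
    # Step 3: the remainder is text for the deepest open block.
    tip = container
    while (child := open_child(tip)) is not None:
        tip = child
    if not rest.strip():
        close_unmatched(container)  # a blank line ends an open paragraph
    elif not started and tip.kind == "paragraph":
        tip.lines.append(rest)  # continuation, possibly a lazy one
    else:
        close_unmatched(container)
        add_child(container, "paragraph").lines.append(rest)


def outline(block, depth=0):
    """List the tree as indented rows, marking open blocks with `->`."""
    rows = ["  " * depth + ("-> " if block.is_open else "   ") + block.kind]
    if block.lines:
        rows.append("  " * depth + "     " + repr("\n".join(block.lines)))
    for child in block.children:
        rows.extend(outline(child, depth + 1))
    return rows


doc = Block("document")
for line in ["> Lorem ipsum dolor", "sit amet.",
             "> - Qui *quodsi iracundia*", "> - aliquando id"]:
    feed(doc, line)
print("\n".join(outline(doc)))
```

The second input line fails the block quote's condition in step 1, but because no new block starts and the deepest open block is a paragraph, it is appended as a lazy continuation instead of closing the block quote.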
10061
10062## Phase 2: inline structure
10063
10064Once all of the input has been parsed, all open blocks are closed.
10065
10066We then "walk the tree," visiting every node, and parse raw
10067string contents of paragraphs and headings as inlines.  At this
10068point we have seen all the link reference definitions, so we can
10069resolve reference links as we go.
10070
10071``` tree
10072document
10073  block_quote
10074    paragraph
10075      str "Lorem ipsum dolor"
10076      softbreak
10077      str "sit amet."
10078    list (type=bullet tight=true bullet_char=-)
10079      list_item
10080        paragraph
10081          str "Qui "
10082          emph
10083            str "quodsi iracundia"
10084      list_item
10085        paragraph
10086          str "aliquando id"
10087```
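
The tree walk itself is simple. The sketch below uses plain dictionaries for blocks and a toy inline parser that recognises only line endings, turning them into `softbreak` nodes. Emphasis, links and reference resolution are omitted, and every name here is an assumption of the sketch:

```python
def parse_inlines(raw: str) -> list[tuple]:
    """Toy inline parser: only line endings are recognised, as softbreaks."""
    inlines = []
    for index, line in enumerate(raw.split("\n")):
        if index:
            inlines.append(("softbreak",))
        inlines.append(("str", line))
    return inlines


def walk(block: dict) -> None:
    """Visit every node, replacing raw paragraph and heading text with inlines."""
    if block["type"] in ("paragraph", "heading"):
        block["inlines"] = parse_inlines(block.pop("raw"))
    for child in block.get("children", []):
        walk(child)


tree = {"type": "document", "children": [
    {"type": "block_quote", "children": [
        {"type": "paragraph", "raw": "Lorem ipsum dolor\nsit amet."},
    ]},
]}
walk(tree)
print(tree["children"][0]["children"][0]["inlines"])
```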
10088
10089Notice how the [line ending] in the first paragraph has
10090been parsed as a `softbreak`, and the asterisks in the first list item
10091have become an `emph`.
10092
10093### An algorithm for parsing nested emphasis and links
10094
10095By far the trickiest part of inline parsing is handling emphasis,
10096strong emphasis, links, and images.  This is done using the following
10097algorithm.
10098
10099When we're parsing inlines and we hit either
10100
10101- a run of `*` or `_` characters, or
10102- a `[` or `![`
10103
10104we insert a text node with these symbols as its literal content, and we
10105add a pointer to this text node to the [delimiter stack](@).
10106
10107The [delimiter stack] is a doubly linked list.  Each
10108element contains a pointer to a text node, plus information about
10109
10110- the type of delimiter (`[`, `![`, `*`, `_`)
10111- the number of delimiters,
10112- whether the delimiter is "active" (all are active to start), and
10113- whether the delimiter is a potential opener, a potential closer,
10114  or both (which depends on what sort of characters precede
10115  and follow the delimiters).
10116
10117When we hit a `]` character, we call the *look for link or image*
10118procedure (see below).
10119
10120When we hit the end of the input, we call the *process emphasis*
10121procedure (see below), with `stack_bottom` = NULL.
10122
10123#### *look for link or image*
10124
10125Starting at the top of the delimiter stack, we look backwards
10126through the stack for an opening `[` or `![` delimiter.
10127
10128- If we don't find one, we return a literal text node `]`.
10129
10130- If we do find one, but it's not *active*, we remove the inactive
10131  delimiter from the stack, and return a literal text node `]`.
10132
10133- If we find one and it's active, then we parse ahead to see if
  we have an inline link/image, reference link/image, collapsed reference
10135  link/image, or shortcut reference link/image.
10136
10137  + If we don't, then we remove the opening delimiter from the
10138    delimiter stack and return a literal text node `]`.
10139
10140  + If we do, then
10141
10142    * We return a link or image node whose children are the inlines
10143      after the text node pointed to by the opening delimiter.
10144
10145    * We run *process emphasis* on these inlines, with the `[` opener
10146      as `stack_bottom`.
10147
10148    * We remove the opening delimiter.
10149
10150    * If we have a link (and not an image), we also set all
10151      `[` delimiters before the opening delimiter to *inactive*.  (This
10152      will prevent us from getting links within links.)
10153
10154#### *process emphasis*
10155
10156Parameter `stack_bottom` sets a lower bound to how far we
10157descend in the [delimiter stack].  If it is NULL, we can
10158go all the way to the bottom.  Otherwise, we stop before
10159visiting `stack_bottom`.
10160
10161Let `current_position` point to the element on the [delimiter stack]
10162just above `stack_bottom` (or the first element if `stack_bottom`
10163is NULL).
10164
10165We keep track of the `openers_bottom` for each delimiter
10166type (`*`, `_`) and each length of the closing delimiter run
10167(modulo 3).  Initialize this to `stack_bottom`.
10168
10169Then we repeat the following until we run out of potential
10170closers:
10171
10172- Move `current_position` forward in the delimiter stack (if needed)
10173  until we find the first potential closer with delimiter `*` or `_`.
10174  (This will be the potential closer closest
10175  to the beginning of the input -- the first one in parse order.)
10176
10177- Now, look back in the stack (staying above `stack_bottom` and
10178  the `openers_bottom` for this delimiter type) for the
10179  first matching potential opener ("matching" means same delimiter).
10180
10181- If one is found:
10182
10183  + Figure out whether we have emphasis or strong emphasis:
10184    if both closer and opener spans have length >= 2, we have
10185    strong, otherwise regular.
10186
10187  + Insert an emph or strong emph node accordingly, after
10188    the text node corresponding to the opener.
10189
10190  + Remove any delimiters between the opener and closer from
10191    the delimiter stack.
10192
10193  + Remove 1 (for regular emph) or 2 (for strong emph) delimiters
10194    from the opening and closing text nodes.  If they become empty
10195    as a result, remove them and remove the corresponding element
10196    of the delimiter stack.  If the closing node is removed, reset
10197    `current_position` to the next element in the stack.
10198
10199- If none is found:
10200
10201  + Set `openers_bottom` to the element before `current_position`.
10202    (We know that there are no openers for this kind of closer up to and
10203    including this point, so this puts a lower bound on future searches.)
10204
10205  + If the closer at `current_position` is not a potential opener,
10206    remove it from the delimiter stack (since we know it can't
10207    be a closer either).
10208
10209  + Advance `current_position` to the next element in the stack.
10210
10211After we're done, we remove all delimiters above `stack_bottom` from the
10212delimiter stack.
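
The core loop of *process emphasis* can be sketched as follows. This is a simplified illustration, not the reference implementation. It uses Python lists where the text describes a doubly linked list, always runs with `stack_bottom` = NULL, omits the `openers_bottom` bookkeeping, and expects the caller to say whether each delimiter run is a potential opener or closer. All class and function names are assumptions:

```python
class Text:
    def __init__(self, literal):
        self.literal = literal


class Emph:
    def __init__(self, tag, children):
        self.tag, self.children = tag, children


class Delim:
    """One delimiter-stack entry; `node` is the Text node holding the run."""

    def __init__(self, node, can_open, can_close):
        self.node, self.can_open, self.can_close = node, can_open, can_close

    @property
    def char(self):
        return self.node.literal[0]

    @property
    def count(self):
        return len(self.node.literal)


def process_emphasis(nodes, stack):
    pos = 0  # current_position
    while pos < len(stack):
        closer = stack[pos]
        if not closer.can_close:
            pos += 1
            continue
        # Look back for the nearest potential opener with the same delimiter.
        k = pos - 1
        while k >= 0 and not (stack[k].can_open and stack[k].char == closer.char):
            k -= 1
        if k < 0:
            if closer.can_open:
                pos += 1
            else:
                del stack[pos]  # neither opener nor usable closer: drop it
            continue
        opener = stack[k]
        width = 2 if opener.count >= 2 and closer.count >= 2 else 1
        first = nodes.index(opener.node) + 1
        last = nodes.index(closer.node)
        tag = "strong" if width == 2 else "em"
        nodes[first:last] = [Emph(tag, nodes[first:last])]
        del stack[k + 1:pos]  # delimiters in between are now inside the node
        pos = k + 1
        opener.node.literal = opener.node.literal[width:]
        closer.node.literal = closer.node.literal[width:]
        if not opener.node.literal:
            nodes.remove(opener.node)
            del stack[k]
            pos -= 1
        if not closer.node.literal:
            nodes.remove(closer.node)
            del stack[pos]  # `pos` now refers to the next element


def build(*items):
    """Items are plain strings or (run, can_open, can_close) tuples."""
    nodes, stack = [], []
    for item in items:
        if isinstance(item, str):
            nodes.append(Text(item))
        else:
            node = Text(item[0])
            nodes.append(node)
            stack.append(Delim(node, item[1], item[2]))
    return nodes, stack


def render(nodes):
    parts = []
    for node in nodes:
        if isinstance(node, Emph):
            parts.append(f"<{node.tag}>{render(node.children)}</{node.tag}>")
        else:
            parts.append(node.literal)
    return "".join(parts)


# `*foo **bar** baz*`, with opener and closer flags supplied by hand.
nodes, stack = build(("*", True, False), "foo ", ("**", True, False), "bar",
                     ("**", False, True), " baz", ("*", False, True))
process_emphasis(nodes, stack)
print(render(nodes))
```

Because leftover delimiters stay on the stack after a match, a run such as `***` is consumed in two passes: two characters for the strong emphasis, then one for the regular emphasis.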
10213