---
title: GitHub Flavored Markdown Spec
version: 0.29
date: '2019-04-06'
license: '[CC-BY-SA 4.0](http://creativecommons.org/licenses/by-sa/4.0/)'
...

# Introduction

## What is GitHub Flavored Markdown?

GitHub Flavored Markdown, often shortened as GFM, is the dialect of Markdown
that is currently supported for user content on GitHub.com and GitHub
Enterprise.

This formal specification, based on the CommonMark Spec, defines the syntax and
semantics of this dialect.

GFM is a strict superset of CommonMark. All the features which are supported in
GitHub user content and that are not specified on the original CommonMark Spec
are hence known as **extensions**, and highlighted as such.

While GFM supports a wide range of inputs, it's worth noting that GitHub.com
and GitHub Enterprise perform additional post-processing and sanitization after
GFM is converted to HTML to ensure security and consistency of the website.

## What is Markdown?

Markdown is a plain text format for writing structured documents,
based on conventions for indicating formatting in email
and usenet posts. It was developed by John Gruber (with
help from Aaron Swartz) and released in 2004 in the form of a
[syntax description](http://daringfireball.net/projects/markdown/syntax)
and a Perl script (`Markdown.pl`) for converting Markdown to
HTML. In the next decade, dozens of implementations were
developed in many languages. Some extended the original
Markdown syntax with conventions for footnotes, tables, and
other document elements. Some allowed Markdown documents to be
rendered in formats other than HTML. Websites like Reddit,
StackOverflow, and GitHub had millions of people using Markdown.
And Markdown started to be used beyond the web, to author books,
articles, slide shows, letters, and lecture notes.

What distinguishes Markdown from many other lightweight markup
syntaxes, which are often easier to write, is its readability.
As Gruber writes:

> The overriding design goal for Markdown's formatting syntax is
> to make it as readable as possible. The idea is that a
> Markdown-formatted document should be publishable as-is, as
> plain text, without looking like it's been marked up with tags
> or formatting instructions.
> (<http://daringfireball.net/projects/markdown/>)

The point can be illustrated by comparing a sample of
[AsciiDoc](http://www.methods.co.nz/asciidoc/) with
an equivalent sample of Markdown. Here is a sample of
AsciiDoc from the AsciiDoc manual:

```
1. List item one.
+
List item one continued with a second paragraph followed by an
Indented block.
+
.................
$ ls *.sh
$ mv *.sh ~/tmp
.................
+
List item continued with a third paragraph.

2. List item two continued with an open block.
+
--
This paragraph is part of the preceding list item.

a. This list is nested and does not require explicit item
continuation.
+
This paragraph is part of the preceding list item.

b. List item b.

This paragraph belongs to item two of the outer list.
--
```

And here is the equivalent in Markdown:
```
1.  List item one.

    List item one continued with a second paragraph followed by an
    Indented block.

        $ ls *.sh
        $ mv *.sh ~/tmp

    List item continued with a third paragraph.

2.  List item two continued with an open block.

    This paragraph is part of the preceding list item.

    1. This list is nested and does not require explicit item continuation.

       This paragraph is part of the preceding list item.

    2. List item b.

    This paragraph belongs to item two of the outer list.
```

The AsciiDoc version is, arguably, easier to write. You don't need
to worry about indentation.
But the Markdown version is much easier
to read. The nesting of list items is apparent to the eye in the
source, not just in the processed document.

## Why is a spec needed?

John Gruber's [canonical description of Markdown's
syntax](http://daringfireball.net/projects/markdown/syntax)
does not specify the syntax unambiguously. Here are some examples of
questions it does not answer:

1.  How much indentation is needed for a sublist? The spec says that
    continuation paragraphs need to be indented four spaces, but is
    not fully explicit about sublists. It is natural to think that
    they, too, must be indented four spaces, but `Markdown.pl` does
    not require that. This is hardly a "corner case," and divergences
    between implementations on this issue often lead to surprises for
    users in real documents. (See [this comment by John
    Gruber](https://web.archive.org/web/20170611172104/http://article.gmane.org/gmane.text.markdown.general/1997).)

2.  Is a blank line needed before a block quote or heading?
    Most implementations do not require the blank line. However,
    this can lead to unexpected results in hard-wrapped text, and
    also to ambiguities in parsing (note that some implementations
    put the heading inside the blockquote, while others do not).
    (John Gruber has also spoken [in favor of requiring the blank
    lines](https://web.archive.org/web/20170611172104/http://article.gmane.org/gmane.text.markdown.general/2146).)

3.  Is a blank line needed before an indented code block?
    (`Markdown.pl` requires it, but this is not mentioned in the
    documentation, and some implementations do not require it.)

    ``` markdown
    paragraph
        code?
    ```

4.  What is the exact rule for determining when list items get
    wrapped in `<p>` tags? Can a list be partially "loose" and partially
    "tight"? What should we do with a list like this?

    ``` markdown
    1. one

    2. two
    3. three
    ```

    Or this?

    ``` markdown
    1.  one
        - a

        - b
    2.  two
    ```

    (There are some relevant comments by John Gruber
    [here](https://web.archive.org/web/20170611172104/http://article.gmane.org/gmane.text.markdown.general/2554).)

5.  Can list markers be indented? Can ordered list markers be right-aligned?

    ``` markdown
     8. item 1
     9. item 2
    10. item 2a
    ```

6.  Is this one list with a thematic break in its second item,
    or two lists separated by a thematic break?

    ``` markdown
    * a
    * * * * *
    * b
    ```

7.  When list markers change from numbers to bullets, do we have
    two lists or one? (The Markdown syntax description suggests two,
    but the perl scripts and many other implementations produce one.)

    ``` markdown
    1. fee
    2. fie
    -  foe
    -  fum
    ```

8.  What are the precedence rules for the markers of inline structure?
    For example, is the following a valid link, or does the code span
    take precedence?

    ``` markdown
    [a backtick (`)](/url) and [another backtick (`)](/url).
    ```

9.  What are the precedence rules for markers of emphasis and strong
    emphasis? For example, how should the following be parsed?

    ``` markdown
    *foo *bar* baz*
    ```

10. What are the precedence rules between block-level and inline-level
    structure? For example, how should the following be parsed?

    ``` markdown
    - `a long code span can contain a hyphen like this
      - and it can screw things up`
    ```

11. Can list items include section headings? (`Markdown.pl` does not
    allow this, but does allow blockquotes to include headings.)

    ``` markdown
    - # Heading
    ```

12. Can list items be empty?

    ``` markdown
    * a
    *
    * b
    ```

13. Can link references be defined inside block quotes or list items?

    ``` markdown
    > Blockquote [foo].
    >
    > [foo]: /url
    ```

14. If there are multiple definitions for the same reference, which takes
    precedence?

    ``` markdown
    [foo]: /url1
    [foo]: /url2

    [foo][]
    ```

In the absence of a spec, early implementers consulted `Markdown.pl`
to resolve these ambiguities. But `Markdown.pl` was quite buggy, and
gave manifestly bad results in many cases, so it was not a
satisfactory replacement for a spec.

Because there is no unambiguous spec, implementations have diverged
considerably. As a result, users are often surprised to find that
a document that renders one way on one system (say, a GitHub wiki)
renders differently on another (say, converting to docbook using
pandoc). To make matters worse, because nothing in Markdown counts
as a "syntax error," the divergence often isn't discovered right away.

## About this document

This document attempts to specify Markdown syntax unambiguously.
It contains many examples with side-by-side Markdown and
HTML. These are intended to double as conformance tests. An
accompanying script `spec_tests.py` can be used to run the tests
against any Markdown program:

    python test/spec_tests.py --spec spec.txt --program PROGRAM

Since this document describes how Markdown is to be parsed into
an abstract syntax tree, it would have made sense to use an abstract
representation of the syntax tree instead of HTML. But HTML is capable
of representing the structural distinctions we need to make, and the
choice of HTML for the tests makes it possible to run the tests against
an implementation without writing an abstract syntax tree renderer.

This document is generated from a text file, `spec.txt`, written
in Markdown with a small extension for the side-by-side tests.
The script `tools/makespec.py` can be used to convert `spec.txt` into
HTML or CommonMark (which can then be converted into other formats).
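
As a rough illustration of how the side-by-side tests could be pulled out of
`spec.txt`, here is a minimal Python sketch. It is not the code of
`spec_tests.py`; the function name `extract_examples` is hypothetical, and the
sketch assumes the layout used by the examples in this document: a fence of 32
backticks followed by ` example`, the Markdown source, a line containing only
`.`, the expected HTML, and a closing fence of 32 backticks. It also assumes
that `→` stands for a tab in both halves of an example.

```python
# Assumed fence: 32 backticks, as used by the examples in this document.
FENCE = "`" * 32


def extract_examples(spec_text):
    """Return a list of (markdown, html) pairs found in spec_text."""
    examples = []
    state = "text"  # one of "text", "markdown", "html"
    markdown_lines, html_lines = [], []
    for line in spec_text.split("\n"):
        if state == "text":
            # Look for the opening fence of an example.
            if line.rstrip() == FENCE + " example":
                state = "markdown"
                markdown_lines, html_lines = [], []
        elif line.rstrip() == FENCE:
            # Closing fence: store the pair, restoring literal tabs.
            markdown = "".join(l + "\n" for l in markdown_lines)
            html = "".join(l + "\n" for l in html_lines)
            examples.append((markdown.replace("→", "\t"),
                             html.replace("→", "\t")))
            state = "text"
        elif state == "markdown" and line == ".":
            # A lone "." separates the Markdown from the expected HTML.
            state = "html"
        elif state == "markdown":
            markdown_lines.append(line)
        else:
            html_lines.append(line)
    return examples
```

Each returned pair could then be fed to the program under test and the output
compared with the expected HTML; how that comparison normalizes HTML is left
out of this sketch.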

In the examples, the `→` character is used to represent tabs.

# Preliminaries

## Characters and lines

Any sequence of [characters] is a valid CommonMark
document.

A [character](@) is a Unicode code point. Although some
code points (for example, combining accents) do not correspond to
characters in an intuitive sense, all code points count as characters
for purposes of this spec.

This spec does not specify an encoding; it thinks of lines as composed
of [characters] rather than bytes. A conforming parser may be limited
to a certain encoding.

A [line](@) is a sequence of zero or more [characters]
other than newline (`U+000A`) or carriage return (`U+000D`),
followed by a [line ending] or by the end of file.

A [line ending](@) is a newline (`U+000A`), a carriage return
(`U+000D`) not followed by a newline, or a carriage return and a
following newline.

A line containing no characters, or a line containing only spaces
(`U+0020`) or tabs (`U+0009`), is called a [blank line](@).

The following definitions of character classes will be used in this spec:

A [whitespace character](@) is a space
(`U+0020`), tab (`U+0009`), newline (`U+000A`), line tabulation (`U+000B`),
form feed (`U+000C`), or carriage return (`U+000D`).

[Whitespace](@) is a sequence of one or more [whitespace
characters].

A [Unicode whitespace character](@) is
any code point in the Unicode `Zs` general category, or a tab (`U+0009`),
carriage return (`U+000D`), newline (`U+000A`), or form feed
(`U+000C`).

[Unicode whitespace](@) is a sequence of one
or more [Unicode whitespace characters].

A [space](@) is `U+0020`.

A [non-whitespace character](@) is any character
that is not a [whitespace character].
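
The line and whitespace definitions above can be sketched in Python as follows.
This is only an illustration under stated assumptions: the function names are
hypothetical, the input is assumed to be an already decoded string, and the
sketch chooses not to report an extra empty line when the text ends with a
line ending, a detail the definitions leave open.

```python
import unicodedata

# Whitespace characters as listed above: space, tab, newline,
# line tabulation, form feed, carriage return.
WHITESPACE_CHARS = frozenset(" \t\n\x0b\x0c\r")


def split_lines(text):
    """Split text into lines; a line ending is LF, CR, or CR followed by LF."""
    lines, current, i = [], [], 0
    while i < len(text):
        ch = text[i]
        if ch in "\r\n":
            lines.append("".join(current))
            current = []
            # Treat CR immediately followed by LF as a single line ending.
            if ch == "\r" and i + 1 < len(text) and text[i + 1] == "\n":
                i += 1
        else:
            current.append(ch)
        i += 1
    if current:
        # Final line terminated by the end of file rather than a line ending.
        lines.append("".join(current))
    return lines


def is_blank_line(line):
    """True for a line with no characters or only spaces and tabs."""
    return all(ch in " \t" for ch in line)


def is_whitespace_char(ch):
    return ch in WHITESPACE_CHARS


def is_unicode_whitespace_char(ch):
    # Zs general category, plus tab, carriage return, newline, form feed.
    return unicodedata.category(ch) == "Zs" or ch in ("\t", "\r", "\n", "\x0c")
```

Note that, read literally, the two classes differ: line tabulation (`U+000B`)
is a whitespace character but not a Unicode whitespace character, while a
no-break space (`U+00A0`, category `Zs`) is the reverse.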

An [ASCII punctuation character](@)
is `!`, `"`, `#`, `$`, `%`, `&`, `'`, `(`, `)`,
`*`, `+`, `,`, `-`, `.`, `/` (U+0021–2F),
`:`, `;`, `<`, `=`, `>`, `?`, `@` (U+003A–0040),
`[`, `\`, `]`, `^`, `_`, `` ` `` (U+005B–0060),
`{`, `|`, `}`, or `~` (U+007B–007E).

A [punctuation character](@) is an [ASCII
punctuation character] or anything in
the general Unicode categories `Pc`, `Pd`, `Pe`, `Pf`, `Pi`, `Po`, or `Ps`.

## Tabs

Tabs in lines are not expanded to [spaces]. However,
in contexts where whitespace helps to define block structure,
tabs behave as if they were replaced by spaces with a tab stop
of 4 characters.

Thus, for example, a tab can be used instead of four spaces
in an indented code block. (Note, however, that internal
tabs are passed through as literal tabs, not expanded to
spaces.)

```````````````````````````````` example
→foo→baz→→bim
.
<pre><code>foo→baz→→bim
</code></pre>
````````````````````````````````

```````````````````````````````` example
  →foo→baz→→bim
.
<pre><code>foo→baz→→bim
</code></pre>
````````````````````````````````

```````````````````````````````` example
    a→a
    ὐ→a
.
<pre><code>a→a
ὐ→a
</code></pre>
````````````````````````````````

In the following example, a continuation paragraph of a list
item is indented with a tab; this has exactly the same effect
as indentation with four spaces would:

```````````````````````````````` example
  - foo

→bar
.
<ul>
<li>
<p>foo</p>
<p>bar</p>
</li>
</ul>
````````````````````````````````

```````````````````````````````` example
- foo

→→bar
.
<ul>
<li>
<p>foo</p>
<pre><code>  bar
</code></pre>
</li>
</ul>
````````````````````````````````

Normally the `>` that begins a block quote may be followed
optionally by a space, which is not considered part of the
content. In the following case `>` is followed by a tab,
which is treated as if it were expanded into three spaces.
Since one of these spaces is considered part of the
delimiter, `foo` is considered to be indented six spaces
inside the block quote context, so we get an indented
code block starting with two spaces.

```````````````````````````````` example
>→→foo
.
<blockquote>
<pre><code>  foo
</code></pre>
</blockquote>
````````````````````````````````

```````````````````````````````` example
-→→foo
.
<ul>
<li>
<pre><code>  foo
</code></pre>
</li>
</ul>
````````````````````````````````


```````````````````````````````` example
    foo
→bar
.
<pre><code>foo
bar
</code></pre>
````````````````````````````````

```````````````````````````````` example
 - foo
   - bar
→ - baz
.
<ul>
<li>foo
<ul>
<li>bar
<ul>
<li>baz</li>
</ul>
</li>
</ul>
</li>
</ul>
````````````````````````````````

```````````````````````````````` example
#→Foo
.
<h1>Foo</h1>
````````````````````````````````

```````````````````````````````` example
*→*→*→
.
<hr />
````````````````````````````````


## Insecure characters

For security reasons, the Unicode character `U+0000` must be replaced
with the REPLACEMENT CHARACTER (`U+FFFD`).

# Blocks and inlines

We can think of a document as a sequence of
[blocks](@)---structural elements like paragraphs, block
quotations, lists, headings, rules, and code blocks.
Some blocks (like
block quotes and list items) contain other blocks; others (like
headings and paragraphs) contain [inline](@) content---text,
links, emphasized text, images, code spans, and so on.

## Precedence

Indicators of block structure always take precedence over indicators
of inline structure. So, for example, the following is a list with
two items, not a list with one item containing a code span:

```````````````````````````````` example
- `one
- two`
.
<ul>
<li>`one</li>
<li>two`</li>
</ul>
````````````````````````````````


This means that parsing can proceed in two steps: first, the block
structure of the document can be discerned; second, text lines inside
paragraphs, headings, and other block constructs can be parsed for inline
structure. The second step requires information about link reference
definitions that will be available only at the end of the first
step. Note that the first step requires processing lines in sequence,
but the second can be parallelized, since the inline parsing of
one block element does not affect the inline parsing of any other.

## Container blocks and leaf blocks

We can divide blocks into two types:
[container blocks](@),
which can contain other blocks, and [leaf blocks](@),
which cannot.

# Leaf blocks

This section describes the different kinds of leaf block that make up a
Markdown document.

## Thematic breaks

A line consisting of 0-3 spaces of indentation, followed by a sequence
of three or more matching `-`, `_`, or `*` characters, each followed
optionally by any number of spaces or tabs, forms a
[thematic break](@).

```````````````````````````````` example
***
---
___
.
<hr />
<hr />
<hr />
````````````````````````````````


Wrong characters:

```````````````````````````````` example
+++
.
<p>+++</p>
````````````````````````````````


```````````````````````````````` example
===
.
<p>===</p>
````````````````````````````````


Not enough characters:

```````````````````````````````` example
--
**
__
.
<p>--
**
__</p>
````````````````````````````````


One to three spaces indent are allowed:

```````````````````````````````` example
 ***
  ***
   ***
.
<hr />
<hr />
<hr />
````````````````````````````````


Four spaces is too many:

```````````````````````````````` example
    ***
.
<pre><code>***
</code></pre>
````````````````````````````````


```````````````````````````````` example
Foo
    ***
.
<p>Foo
***</p>
````````````````````````````````


More than three characters may be used:

```````````````````````````````` example
_____________________________________
.
<hr />
````````````````````````````````


Spaces are allowed between the characters:

```````````````````````````````` example
 - - -
.
<hr />
````````````````````````````````


```````````````````````````````` example
 **  * ** * ** * **
.
<hr />
````````````````````````````````


```````````````````````````````` example
-     -      -      -
.
<hr />
````````````````````````````````


Spaces are allowed at the end:

```````````````````````````````` example
- - - -    
.
<hr />
````````````````````````````````


However, no other characters may occur in the line:

```````````````````````````````` example
_ _ _ _ a

a------

---a---
.
<p>_ _ _ _ a</p>
<p>a------</p>
<p>---a---</p>
````````````````````````````````


It is required that all of the [non-whitespace characters] be the same.
So, this is not a thematic break:

```````````````````````````````` example
 *-*
.
<p><em>-</em></p>
````````````````````````````````


Thematic breaks do not need blank lines before or after:

```````````````````````````````` example
- foo
***
- bar
.
<ul>
<li>foo</li>
</ul>
<hr />
<ul>
<li>bar</li>
</ul>
````````````````````````````````


Thematic breaks can interrupt a paragraph:

```````````````````````````````` example
Foo
***
bar
.
<p>Foo</p>
<hr />
<p>bar</p>
````````````````````````````````


If a line of dashes that meets the above conditions for being a
thematic break could also be interpreted as the underline of a [setext
heading], the interpretation as a
[setext heading] takes precedence. Thus, for example,
this is a setext heading, not a paragraph followed by a thematic break:

```````````````````````````````` example
Foo
---
bar
.
<h2>Foo</h2>
<p>bar</p>
````````````````````````````````


When both a thematic break and a list item are possible
interpretations of a line, the thematic break takes precedence:

```````````````````````````````` example
* Foo
* * *
* Bar
.
<ul>
<li>Foo</li>
</ul>
<hr />
<ul>
<li>Bar</li>
</ul>
````````````````````````````````


If you want a thematic break in a list item, use a different bullet:

```````````````````````````````` example
- Foo
- * * *
.
<ul>
<li>Foo</li>
<li>
<hr />
</li>
</ul>
````````````````````````````````


## ATX headings

An [ATX heading](@)
consists of a string of characters, parsed as inline content, between an
opening sequence of 1--6 unescaped `#` characters and an optional
closing sequence of any number of unescaped `#` characters.
The opening sequence of `#` characters must be followed by a
[space] or by the end of line.
The optional closing sequence of `#`s must be
preceded by a [space] and may be followed by spaces only. The opening
`#` character may be indented 0-3 spaces. The raw contents of the
heading are stripped of leading and trailing spaces before being parsed
as inline content. The heading level is equal to the number of `#`
characters in the opening sequence.

Simple headings:

```````````````````````````````` example
# foo
## foo
### foo
#### foo
##### foo
###### foo
.
<h1>foo</h1>
<h2>foo</h2>
<h3>foo</h3>
<h4>foo</h4>
<h5>foo</h5>
<h6>foo</h6>
````````````````````````````````


More than six `#` characters is not a heading:

```````````````````````````````` example
####### foo
.
<p>####### foo</p>
````````````````````````````````


At least one space is required between the `#` characters and the
heading's contents, unless the heading is empty. Note that many
implementations currently do not require the space. However, the
space was required by the
[original ATX implementation](http://www.aaronsw.com/2002/atx/atx.py),
and it helps prevent things like the following from being parsed as
headings:

```````````````````````````````` example
#5 bolt

#hashtag
.
<p>#5 bolt</p>
<p>#hashtag</p>
````````````````````````````````


This is not a heading, because the first `#` is escaped:

```````````````````````````````` example
\## foo
.
<p>## foo</p>
````````````````````````````````


Contents are parsed as inlines:

```````````````````````````````` example
# foo *bar* \*baz\*
.
<h1>foo <em>bar</em> *baz*</h1>
````````````````````````````````


Leading and trailing [whitespace] is ignored in parsing inline content:

```````````````````````````````` example
#                  foo                     
.
<h1>foo</h1>
````````````````````````````````


One to three spaces indentation are allowed:

```````````````````````````````` example
 ### foo
  ## foo
   # foo
.
<h3>foo</h3>
<h2>foo</h2>
<h1>foo</h1>
````````````````````````````````


Four spaces are too much:

```````````````````````````````` example
    # foo
.
<pre><code># foo
</code></pre>
````````````````````````````````


```````````````````````````````` example
foo
    # bar
.
<p>foo
# bar</p>
````````````````````````````````


A closing sequence of `#` characters is optional:

```````````````````````````````` example
## foo ##
  ###   bar    ###
.
<h2>foo</h2>
<h3>bar</h3>
````````````````````````````````


It need not be the same length as the opening sequence:

```````````````````````````````` example
# foo ##################################
##### foo ##
.
<h1>foo</h1>
<h5>foo</h5>
````````````````````````````````


Spaces are allowed after the closing sequence:

```````````````````````````````` example
### foo ###     
.
<h3>foo</h3>
````````````````````````````````


A sequence of `#` characters with anything but [spaces] following it
is not a closing sequence, but counts as part of the contents of the
heading:

```````````````````````````````` example
### foo ### b
.
<h3>foo ### b</h3>
````````````````````````````````


The closing sequence must be preceded by a space:

```````````````````````````````` example
# foo#
.
<h1>foo#</h1>
````````````````````````````````


Backslash-escaped `#` characters do not count as part
of the closing sequence:

```````````````````````````````` example
### foo \###
## foo #\##
# foo \#
.
<h3>foo ###</h3>
<h2>foo ###</h2>
<h1>foo #</h1>
````````````````````````````````


ATX headings need not be separated from surrounding content by blank
lines, and they can interrupt paragraphs:

```````````````````````````````` example
****
## foo
****
.
<hr />
<h2>foo</h2>
<hr />
````````````````````````````````


```````````````````````````````` example
Foo bar
# baz
Bar foo
.
<p>Foo bar</p>
<h1>baz</h1>
<p>Bar foo</p>
````````````````````````````````


ATX headings can be empty:

```````````````````````````````` example
## 
#
### ###
.
<h2></h2>
<h1></h1>
<h3></h3>
````````````````````````````````


## Setext headings

A [setext heading](@) consists of one or more
lines of text, each containing at least one [non-whitespace
character], with no more than 3 spaces indentation, followed by
a [setext heading underline]. The lines of text must be such
that, were they not followed by the setext heading underline,
they would be interpreted as a paragraph: they cannot be
interpretable as a [code fence], [ATX heading][ATX headings],
[block quote][block quotes], [thematic break][thematic breaks],
[list item][list items], or [HTML block][HTML blocks].

A [setext heading underline](@) is a sequence of
`=` characters or a sequence of `-` characters, with no more than 3
spaces of indentation and any number of trailing spaces or tabs.

The heading is a level 1 heading if `=` characters are used in
the [setext heading underline], and a level 2 heading if `-`
characters are used. The contents of the heading are the result
of parsing the preceding lines of text as CommonMark inline
content.

In general, a setext heading need not be preceded or followed by a
blank line.
However, it cannot interrupt a paragraph, so when a
setext heading comes after a paragraph, a blank line is needed between
them.

Simple examples:

```````````````````````````````` example
Foo *bar*
=========

Foo *bar*
---------
.
<h1>Foo <em>bar</em></h1>
<h2>Foo <em>bar</em></h2>
````````````````````````````````


The content of the header may span more than one line:

```````````````````````````````` example
Foo *bar
baz*
====
.
<h1>Foo <em>bar
baz</em></h1>
````````````````````````````````

The contents are the result of parsing the heading's raw
content as inlines. The heading's raw content is formed by
concatenating the lines and removing initial and final
[whitespace].

```````````````````````````````` example
  Foo *bar
baz*→
====
.
<h1>Foo <em>bar
baz</em></h1>
````````````````````````````````


The underlining can be any length:

```````````````````````````````` example
Foo
-------------------------

Foo
=
.
<h2>Foo</h2>
<h1>Foo</h1>
````````````````````````````````


The heading content can be indented up to three spaces, and need
not line up with the underlining:

```````````````````````````````` example
   Foo
---

  Foo
-----

  Foo
  ===
.
<h2>Foo</h2>
<h2>Foo</h2>
<h1>Foo</h1>
````````````````````````````````


Four spaces indent is too much:

```````````````````````````````` example
    Foo
    ---

    Foo
---
.
<pre><code>Foo
---

Foo
</code></pre>
<hr />
````````````````````````````````


The setext heading underline can be indented up to three spaces, and
may have trailing spaces:

```````````````````````````````` example
Foo
   ----      
.
<h2>Foo</h2>
````````````````````````````````


Four spaces is too much:

```````````````````````````````` example
Foo
    ---
.
<p>Foo
---</p>
````````````````````````````````


The setext heading underline cannot contain internal spaces:

```````````````````````````````` example
Foo
= =

Foo
--- -
.
<p>Foo
= =</p>
<p>Foo</p>
<hr />
````````````````````````````````


Trailing spaces in the content line do not cause a line break:

```````````````````````````````` example
Foo  
-----
.
<h2>Foo</h2>
````````````````````````````````


Nor does a backslash at the end:

```````````````````````````````` example
Foo\
----
.
<h2>Foo\</h2>
````````````````````````````````


Since indicators of block structure take precedence over
indicators of inline structure, the following are setext headings:

```````````````````````````````` example
`Foo
----
`

<a title="a lot
---
of dashes"/>
.
<h2>`Foo</h2>
<p>`</p>
<h2>&lt;a title=&quot;a lot</h2>
<p>of dashes&quot;/&gt;</p>
````````````````````````````````


The setext heading underline cannot be a [lazy continuation
line] in a list item or block quote:

```````````````````````````````` example
> Foo
---
.
<blockquote>
<p>Foo</p>
</blockquote>
<hr />
````````````````````````````````


```````````````````````````````` example
> foo
bar
===
.
<blockquote>
<p>foo
bar
===</p>
</blockquote>
````````````````````````````````


```````````````````````````````` example
- Foo
---
.
<ul>
<li>Foo</li>
</ul>
<hr />
````````````````````````````````


A blank line is needed between a paragraph and a following
setext heading, since otherwise the paragraph becomes part
of the heading's content:

```````````````````````````````` example
Foo
Bar
---
.
<h2>Foo
Bar</h2>
````````````````````````````````


But in general a blank line is not required before or after
setext headings:

```````````````````````````````` example
---
Foo
---
Bar
---
Baz
.
<hr />
<h2>Foo</h2>
<h2>Bar</h2>
<p>Baz</p>
````````````````````````````````


Setext headings cannot be empty:

```````````````````````````````` example

====
.
<p>====</p>
````````````````````````````````


Setext heading text lines must not be interpretable as block
constructs other than paragraphs. So, the line of dashes
in these examples gets interpreted as a thematic break:

```````````````````````````````` example
---
---
.
<hr />
<hr />
````````````````````````````````


```````````````````````````````` example
- foo
-----
.
<ul>
<li>foo</li>
</ul>
<hr />
````````````````````````````````


```````````````````````````````` example
    foo
---
.
<pre><code>foo
</code></pre>
<hr />
````````````````````````````````


```````````````````````````````` example
> foo
-----
.
<blockquote>
<p>foo</p>
</blockquote>
<hr />
````````````````````````````````


If you want a heading with `> foo` as its literal text, you can
use backslash escapes:

```````````````````````````````` example
\> foo
------
.
<h2>&gt; foo</h2>
````````````````````````````````


**Compatibility note:** Most existing Markdown implementations
do not allow the text of setext headings to span multiple lines.
But there is no consensus about how to interpret

``` markdown
Foo
bar
---
baz
```

One can find four different interpretations:

1. paragraph "Foo", heading "bar", paragraph "baz"
2. paragraph "Foo bar", thematic break, paragraph "baz"
3. paragraph "Foo bar --- baz"
4. heading "Foo bar", paragraph "baz"

We find interpretation 4 most natural, and interpretation 4
increases the expressive power of CommonMark, by allowing
multiline headings. Authors who want interpretation 1 can
put a blank line after the first paragraph:

```````````````````````````````` example
Foo

bar
---
baz
.
<p>Foo</p>
<h2>bar</h2>
<p>baz</p>
````````````````````````````````


Authors who want interpretation 2 can put blank lines around
the thematic break,

```````````````````````````````` example
Foo
bar

---

baz
.
<p>Foo
bar</p>
<hr />
<p>baz</p>
````````````````````````````````


or use a thematic break that cannot count as a [setext heading
underline], such as

```````````````````````````````` example
Foo
bar
* * *
baz
.
<p>Foo
bar</p>
<hr />
<p>baz</p>
````````````````````````````````


Authors who want interpretation 3 can use backslash escapes:

```````````````````````````````` example
Foo
bar
\---
baz
.
<p>Foo
bar
---
baz</p>
````````````````````````````````


## Indented code blocks

An [indented code block](@) is composed of one or more
[indented chunks] separated by blank lines.
1410An [indented chunk](@) is a sequence of non-blank lines, 1411each indented four or more spaces. The contents of the code block are 1412the literal contents of the lines, including trailing 1413[line endings], minus four spaces of indentation. 1414An indented code block has no [info string]. 1415 1416An indented code block cannot interrupt a paragraph, so there must be 1417a blank line between a paragraph and a following indented code block. 1418(A blank line is not needed, however, between a code block and a following 1419paragraph.) 1420 1421```````````````````````````````` example 1422 a simple 1423 indented code block 1424. 1425<pre><code>a simple 1426 indented code block 1427</code></pre> 1428```````````````````````````````` 1429 1430 1431If there is any ambiguity between an interpretation of indentation 1432as a code block and as indicating that material belongs to a [list 1433item][list items], the list item interpretation takes precedence: 1434 1435```````````````````````````````` example 1436 - foo 1437 1438 bar 1439. 1440<ul> 1441<li> 1442<p>foo</p> 1443<p>bar</p> 1444</li> 1445</ul> 1446```````````````````````````````` 1447 1448 1449```````````````````````````````` example 14501. foo 1451 1452 - bar 1453. 1454<ol> 1455<li> 1456<p>foo</p> 1457<ul> 1458<li>bar</li> 1459</ul> 1460</li> 1461</ol> 1462```````````````````````````````` 1463 1464 1465 1466The contents of a code block are literal text, and do not get parsed 1467as Markdown: 1468 1469```````````````````````````````` example 1470 <a/> 1471 *hi* 1472 1473 - one 1474. 1475<pre><code><a/> 1476*hi* 1477 1478- one 1479</code></pre> 1480```````````````````````````````` 1481 1482 1483Here we have three chunks separated by blank lines: 1484 1485```````````````````````````````` example 1486 chunk1 1487 1488 chunk2 1489 1490 1491 1492 chunk3 1493. 
1494<pre><code>chunk1 1495 1496chunk2 1497 1498 1499 1500chunk3 1501</code></pre> 1502```````````````````````````````` 1503 1504 1505Any initial spaces beyond four will be included in the content, even 1506in interior blank lines: 1507 1508```````````````````````````````` example 1509 chunk1 1510 1511 chunk2 1512. 1513<pre><code>chunk1 1514 1515 chunk2 1516</code></pre> 1517```````````````````````````````` 1518 1519 1520An indented code block cannot interrupt a paragraph. (This 1521allows hanging indents and the like.) 1522 1523```````````````````````````````` example 1524Foo 1525 bar 1526 1527. 1528<p>Foo 1529bar</p> 1530```````````````````````````````` 1531 1532 1533However, any non-blank line with fewer than four leading spaces ends 1534the code block immediately. So a paragraph may occur immediately 1535after indented code: 1536 1537```````````````````````````````` example 1538 foo 1539bar 1540. 1541<pre><code>foo 1542</code></pre> 1543<p>bar</p> 1544```````````````````````````````` 1545 1546 1547And indented code can occur immediately before and after other kinds of 1548blocks: 1549 1550```````````````````````````````` example 1551# Heading 1552 foo 1553Heading 1554------ 1555 foo 1556---- 1557. 1558<h1>Heading</h1> 1559<pre><code>foo 1560</code></pre> 1561<h2>Heading</h2> 1562<pre><code>foo 1563</code></pre> 1564<hr /> 1565```````````````````````````````` 1566 1567 1568The first line can be indented more than four spaces: 1569 1570```````````````````````````````` example 1571 foo 1572 bar 1573. 1574<pre><code> foo 1575bar 1576</code></pre> 1577```````````````````````````````` 1578 1579 1580Blank lines preceding or following an indented code block 1581are not included in it: 1582 1583```````````````````````````````` example 1584 1585 1586 foo 1587 1588 1589. 
1590<pre><code>foo 1591</code></pre> 1592```````````````````````````````` 1593 1594 1595Trailing spaces are included in the code block's content: 1596 1597```````````````````````````````` example 1598 foo 1599. 1600<pre><code>foo 1601</code></pre> 1602```````````````````````````````` 1603 1604 1605 1606## Fenced code blocks 1607 1608A [code fence](@) is a sequence 1609of at least three consecutive backtick characters (`` ` ``) or 1610tildes (`~`). (Tildes and backticks cannot be mixed.) 1611A [fenced code block](@) 1612begins with a code fence, indented no more than three spaces. 1613 1614The line with the opening code fence may optionally contain some text 1615following the code fence; this is trimmed of leading and trailing 1616whitespace and called the [info string](@). If the [info string] comes 1617after a backtick fence, it may not contain any backtick 1618characters. (The reason for this restriction is that otherwise 1619some inline code would be incorrectly interpreted as the 1620beginning of a fenced code block.) 1621 1622The content of the code block consists of all subsequent lines, until 1623a closing [code fence] of the same type as the code block 1624began with (backticks or tildes), and with at least as many backticks 1625or tildes as the opening code fence. If the leading code fence is 1626indented N spaces, then up to N spaces of indentation are removed from 1627each line of the content (if present). (If a content line is not 1628indented, it is preserved unchanged. If it is indented less than N 1629spaces, all of the indentation is removed.) 1630 1631The closing code fence may be indented up to three spaces, and may be 1632followed only by spaces, which are ignored. If the end of the 1633containing block (or document) is reached and no closing code fence 1634has been found, the code block contains all of the lines after the 1635opening code fence until the end of the containing block (or 1636document). 
(An alternative spec would require backtracking in the 1637event that a closing code fence is not found. But this makes parsing 1638much less efficient, and there seems to be no real downside to the 1639behavior described here.) 1640 1641A fenced code block may interrupt a paragraph, and does not require 1642a blank line either before or after. 1643 1644The content of a code fence is treated as literal text, not parsed 1645as inlines. The first word of the [info string] is typically used to 1646specify the language of the code sample, and rendered in the `class` 1647attribute of the `code` tag. However, this spec does not mandate any 1648particular treatment of the [info string]. 1649 1650Here is a simple example with backticks: 1651 1652```````````````````````````````` example 1653``` 1654< 1655 > 1656``` 1657. 1658<pre><code>< 1659 > 1660</code></pre> 1661```````````````````````````````` 1662 1663 1664With tildes: 1665 1666```````````````````````````````` example 1667~~~ 1668< 1669 > 1670~~~ 1671. 1672<pre><code>< 1673 > 1674</code></pre> 1675```````````````````````````````` 1676 1677Fewer than three backticks is not enough: 1678 1679```````````````````````````````` example 1680`` 1681foo 1682`` 1683. 1684<p><code>foo</code></p> 1685```````````````````````````````` 1686 1687The closing code fence must use the same character as the opening 1688fence: 1689 1690```````````````````````````````` example 1691``` 1692aaa 1693~~~ 1694``` 1695. 1696<pre><code>aaa 1697~~~ 1698</code></pre> 1699```````````````````````````````` 1700 1701 1702```````````````````````````````` example 1703~~~ 1704aaa 1705``` 1706~~~ 1707. 1708<pre><code>aaa 1709``` 1710</code></pre> 1711```````````````````````````````` 1712 1713 1714The closing code fence must be at least as long as the opening fence: 1715 1716```````````````````````````````` example 1717```` 1718aaa 1719``` 1720`````` 1721. 
1722<pre><code>aaa 1723``` 1724</code></pre> 1725```````````````````````````````` 1726 1727 1728```````````````````````````````` example 1729~~~~ 1730aaa 1731~~~ 1732~~~~ 1733. 1734<pre><code>aaa 1735~~~ 1736</code></pre> 1737```````````````````````````````` 1738 1739 1740Unclosed code blocks are closed by the end of the document 1741(or the enclosing [block quote][block quotes] or [list item][list items]): 1742 1743```````````````````````````````` example 1744``` 1745. 1746<pre><code></code></pre> 1747```````````````````````````````` 1748 1749 1750```````````````````````````````` example 1751````` 1752 1753``` 1754aaa 1755. 1756<pre><code> 1757``` 1758aaa 1759</code></pre> 1760```````````````````````````````` 1761 1762 1763```````````````````````````````` example 1764> ``` 1765> aaa 1766 1767bbb 1768. 1769<blockquote> 1770<pre><code>aaa 1771</code></pre> 1772</blockquote> 1773<p>bbb</p> 1774```````````````````````````````` 1775 1776 1777A code block can have all empty lines as its content: 1778 1779```````````````````````````````` example 1780``` 1781 1782 1783``` 1784. 1785<pre><code> 1786 1787</code></pre> 1788```````````````````````````````` 1789 1790 1791A code block can be empty: 1792 1793```````````````````````````````` example 1794``` 1795``` 1796. 1797<pre><code></code></pre> 1798```````````````````````````````` 1799 1800 1801Fences can be indented. If the opening fence is indented, 1802content lines will have equivalent opening indentation removed, 1803if present: 1804 1805```````````````````````````````` example 1806 ``` 1807 aaa 1808aaa 1809``` 1810. 1811<pre><code>aaa 1812aaa 1813</code></pre> 1814```````````````````````````````` 1815 1816 1817```````````````````````````````` example 1818 ``` 1819aaa 1820 aaa 1821aaa 1822 ``` 1823. 1824<pre><code>aaa 1825aaa 1826aaa 1827</code></pre> 1828```````````````````````````````` 1829 1830 1831```````````````````````````````` example 1832 ``` 1833 aaa 1834 aaa 1835 aaa 1836 ``` 1837. 
1838<pre><code>aaa 1839 aaa 1840aaa 1841</code></pre> 1842```````````````````````````````` 1843 1844 1845Four spaces indentation produces an indented code block: 1846 1847```````````````````````````````` example 1848 ``` 1849 aaa 1850 ``` 1851. 1852<pre><code>``` 1853aaa 1854``` 1855</code></pre> 1856```````````````````````````````` 1857 1858 1859Closing fences may be indented by 0-3 spaces, and their indentation 1860need not match that of the opening fence: 1861 1862```````````````````````````````` example 1863``` 1864aaa 1865 ``` 1866. 1867<pre><code>aaa 1868</code></pre> 1869```````````````````````````````` 1870 1871 1872```````````````````````````````` example 1873 ``` 1874aaa 1875 ``` 1876. 1877<pre><code>aaa 1878</code></pre> 1879```````````````````````````````` 1880 1881 1882This is not a closing fence, because it is indented 4 spaces: 1883 1884```````````````````````````````` example 1885``` 1886aaa 1887 ``` 1888. 1889<pre><code>aaa 1890 ``` 1891</code></pre> 1892```````````````````````````````` 1893 1894 1895 1896Code fences (opening and closing) cannot contain internal spaces: 1897 1898```````````````````````````````` example 1899``` ``` 1900aaa 1901. 1902<p><code> </code> 1903aaa</p> 1904```````````````````````````````` 1905 1906 1907```````````````````````````````` example 1908~~~~~~ 1909aaa 1910~~~ ~~ 1911. 1912<pre><code>aaa 1913~~~ ~~ 1914</code></pre> 1915```````````````````````````````` 1916 1917 1918Fenced code blocks can interrupt paragraphs, and can be followed 1919directly by paragraphs, without a blank line between: 1920 1921```````````````````````````````` example 1922foo 1923``` 1924bar 1925``` 1926baz 1927. 1928<p>foo</p> 1929<pre><code>bar 1930</code></pre> 1931<p>baz</p> 1932```````````````````````````````` 1933 1934 1935Other blocks can also occur before and after fenced code blocks 1936without an intervening blank line: 1937 1938```````````````````````````````` example 1939foo 1940--- 1941~~~ 1942bar 1943~~~ 1944# baz 1945. 
1946<h2>foo</h2> 1947<pre><code>bar 1948</code></pre> 1949<h1>baz</h1> 1950```````````````````````````````` 1951 1952 1953An [info string] can be provided after the opening code fence. 1954Although this spec doesn't mandate any particular treatment of 1955the info string, the first word is typically used to specify 1956the language of the code block. In HTML output, the language is 1957normally indicated by adding a class to the `code` element consisting 1958of `language-` followed by the language name. 1959 1960```````````````````````````````` example 1961```ruby 1962def foo(x) 1963 return 3 1964end 1965``` 1966. 1967<pre><code class="language-ruby">def foo(x) 1968 return 3 1969end 1970</code></pre> 1971```````````````````````````````` 1972 1973 1974```````````````````````````````` example 1975~~~~ ruby startline=3 $%@#$ 1976def foo(x) 1977 return 3 1978end 1979~~~~~~~ 1980. 1981<pre><code class="language-ruby">def foo(x) 1982 return 3 1983end 1984</code></pre> 1985```````````````````````````````` 1986 1987 1988```````````````````````````````` example 1989````; 1990```` 1991. 1992<pre><code class="language-;"></code></pre> 1993```````````````````````````````` 1994 1995 1996[Info strings] for backtick code blocks cannot contain backticks: 1997 1998```````````````````````````````` example 1999``` aa ``` 2000foo 2001. 2002<p><code>aa</code> 2003foo</p> 2004```````````````````````````````` 2005 2006 2007[Info strings] for tilde code blocks can contain backticks and tildes: 2008 2009```````````````````````````````` example 2010~~~ aa ``` ~~~ 2011foo 2012~~~ 2013. 2014<pre><code class="language-aa">foo 2015</code></pre> 2016```````````````````````````````` 2017 2018 2019Closing code fences cannot have [info strings]: 2020 2021```````````````````````````````` example 2022``` 2023``` aaa 2024``` 2025. 
<pre><code>``` aaa
</code></pre>
````````````````````````````````



## HTML blocks

An [HTML block](@) is a group of lines that is treated
as raw HTML (and will not be escaped in HTML output).

There are seven kinds of [HTML block], which can be defined by their
start and end conditions. The block begins with a line that meets a
[start condition](@) (after up to three spaces of optional indentation).
It ends with the first subsequent line that meets a matching [end
condition](@), or the last line of the document, or the last line of
the [container block](#container-blocks) containing the current HTML
block, if no line is encountered that meets the [end condition]. If
the first line meets both the [start condition] and the [end
condition], the block will contain just that line.

1. **Start condition:** line begins with the string `<script`,
`<pre`, or `<style` (case-insensitive), followed by whitespace,
the string `>`, or the end of the line.\
**End condition:** line contains an end tag
`</script>`, `</pre>`, or `</style>` (case-insensitive; it
need not match the start tag).

2. **Start condition:** line begins with the string `<!--`.\
**End condition:** line contains the string `-->`.

3. **Start condition:** line begins with the string `<?`.\
**End condition:** line contains the string `?>`.

4. **Start condition:** line begins with the string `<!`
followed by an uppercase ASCII letter.\
**End condition:** line contains the character `>`.

5. **Start condition:** line begins with the string
`<![CDATA[`.\
**End condition:** line contains the string `]]>`.

6. **Start condition:** line begins with the string `<` or `</`
followed by one of the strings (case-insensitive) `address`,
`article`, `aside`, `base`, `basefont`, `blockquote`, `body`,
`caption`, `center`, `col`, `colgroup`, `dd`, `details`, `dialog`,
`dir`, `div`, `dl`, `dt`, `fieldset`, `figcaption`, `figure`,
`footer`, `form`, `frame`, `frameset`,
`h1`, `h2`, `h3`, `h4`, `h5`, `h6`, `head`, `header`, `hr`,
`html`, `iframe`, `legend`, `li`, `link`, `main`, `menu`, `menuitem`,
`nav`, `noframes`, `ol`, `optgroup`, `option`, `p`, `param`,
`section`, `summary`, `table`, `tbody`, `td`,
`tfoot`, `th`, `thead`, `title`, `tr`, `track`, `ul`, followed
by [whitespace], the end of the line, the string `>`, or
the string `/>`.\
**End condition:** line is followed by a [blank line].

7. **Start condition:** line begins with a complete [open tag]
(with any [tag name] other than `script`,
`style`, or `pre`) or a complete [closing tag],
followed only by [whitespace] or the end of the line.\
**End condition:** line is followed by a [blank line].

HTML blocks continue until they are closed by their appropriate
[end condition], or the last line of the document or other [container
block](#container-blocks). This means any HTML **within an HTML
block** that might otherwise be recognised as a start condition will
be ignored by the parser and passed through as-is, without changing
the parser's state.

For instance, `<pre>` within an HTML block started by `<table>` will not affect
the parser state; as the HTML block was started by start condition 6, it
will end at any blank line. This can be surprising:

```````````````````````````````` example
<table><tr><td>
<pre>
**Hello**,

_world_.
</pre>
</td></tr></table>
.
<table><tr><td>
<pre>
**Hello**,
<p><em>world</em>.
2113</pre></p> 2114</td></tr></table> 2115```````````````````````````````` 2116 2117In this case, the HTML block is terminated by the newline — the `**Hello**` 2118text remains verbatim — and regular parsing resumes, with a paragraph, 2119emphasised `world` and inline and block HTML following. 2120 2121All types of [HTML blocks] except type 7 may interrupt 2122a paragraph. Blocks of type 7 may not interrupt a paragraph. 2123(This restriction is intended to prevent unwanted interpretation 2124of long tags inside a wrapped paragraph as starting HTML blocks.) 2125 2126Some simple examples follow. Here are some basic HTML blocks 2127of type 6: 2128 2129```````````````````````````````` example 2130<table> 2131 <tr> 2132 <td> 2133 hi 2134 </td> 2135 </tr> 2136</table> 2137 2138okay. 2139. 2140<table> 2141 <tr> 2142 <td> 2143 hi 2144 </td> 2145 </tr> 2146</table> 2147<p>okay.</p> 2148```````````````````````````````` 2149 2150 2151```````````````````````````````` example 2152 <div> 2153 *hello* 2154 <foo><a> 2155. 2156 <div> 2157 *hello* 2158 <foo><a> 2159```````````````````````````````` 2160 2161 2162A block can also start with a closing tag: 2163 2164```````````````````````````````` example 2165</div> 2166*foo* 2167. 2168</div> 2169*foo* 2170```````````````````````````````` 2171 2172 2173Here we have two HTML blocks with a Markdown paragraph between them: 2174 2175```````````````````````````````` example 2176<DIV CLASS="foo"> 2177 2178*Markdown* 2179 2180</DIV> 2181. 2182<DIV CLASS="foo"> 2183<p><em>Markdown</em></p> 2184</DIV> 2185```````````````````````````````` 2186 2187 2188The tag on the first line can be partial, as long 2189as it is split where there would be whitespace: 2190 2191```````````````````````````````` example 2192<div id="foo" 2193 class="bar"> 2194</div> 2195. 2196<div id="foo" 2197 class="bar"> 2198</div> 2199```````````````````````````````` 2200 2201 2202```````````````````````````````` example 2203<div id="foo" class="bar 2204 baz"> 2205</div> 2206. 
2207<div id="foo" class="bar 2208 baz"> 2209</div> 2210```````````````````````````````` 2211 2212 2213An open tag need not be closed: 2214```````````````````````````````` example 2215<div> 2216*foo* 2217 2218*bar* 2219. 2220<div> 2221*foo* 2222<p><em>bar</em></p> 2223```````````````````````````````` 2224 2225 2226 2227A partial tag need not even be completed (garbage 2228in, garbage out): 2229 2230```````````````````````````````` example 2231<div id="foo" 2232*hi* 2233. 2234<div id="foo" 2235*hi* 2236```````````````````````````````` 2237 2238 2239```````````````````````````````` example 2240<div class 2241foo 2242. 2243<div class 2244foo 2245```````````````````````````````` 2246 2247 2248The initial tag doesn't even need to be a valid 2249tag, as long as it starts like one: 2250 2251```````````````````````````````` example 2252<div *???-&&&-<--- 2253*foo* 2254. 2255<div *???-&&&-<--- 2256*foo* 2257```````````````````````````````` 2258 2259 2260In type 6 blocks, the initial tag need not be on a line by 2261itself: 2262 2263```````````````````````````````` example 2264<div><a href="bar">*foo*</a></div> 2265. 2266<div><a href="bar">*foo*</a></div> 2267```````````````````````````````` 2268 2269 2270```````````````````````````````` example 2271<table><tr><td> 2272foo 2273</td></tr></table> 2274. 2275<table><tr><td> 2276foo 2277</td></tr></table> 2278```````````````````````````````` 2279 2280 2281Everything until the next blank line or end of document 2282gets included in the HTML block. So, in the following 2283example, what looks like a Markdown code block 2284is actually part of the HTML block, which continues until a blank 2285line or the end of the document is reached: 2286 2287```````````````````````````````` example 2288<div></div> 2289``` c 2290int x = 33; 2291``` 2292. 
2293<div></div> 2294``` c 2295int x = 33; 2296``` 2297```````````````````````````````` 2298 2299 2300To start an [HTML block] with a tag that is *not* in the 2301list of block-level tags in (6), you must put the tag by 2302itself on the first line (and it must be complete): 2303 2304```````````````````````````````` example 2305<a href="foo"> 2306*bar* 2307</a> 2308. 2309<a href="foo"> 2310*bar* 2311</a> 2312```````````````````````````````` 2313 2314 2315In type 7 blocks, the [tag name] can be anything: 2316 2317```````````````````````````````` example 2318<Warning> 2319*bar* 2320</Warning> 2321. 2322<Warning> 2323*bar* 2324</Warning> 2325```````````````````````````````` 2326 2327 2328```````````````````````````````` example 2329<i class="foo"> 2330*bar* 2331</i> 2332. 2333<i class="foo"> 2334*bar* 2335</i> 2336```````````````````````````````` 2337 2338 2339```````````````````````````````` example 2340</ins> 2341*bar* 2342. 2343</ins> 2344*bar* 2345```````````````````````````````` 2346 2347 2348These rules are designed to allow us to work with tags that 2349can function as either block-level or inline-level tags. 2350The `<del>` tag is a nice example. We can surround content with 2351`<del>` tags in three different ways. In this case, we get a raw 2352HTML block, because the `<del>` tag is on a line by itself: 2353 2354```````````````````````````````` example 2355<del> 2356*foo* 2357</del> 2358. 2359<del> 2360*foo* 2361</del> 2362```````````````````````````````` 2363 2364 2365In this case, we get a raw HTML block that just includes 2366the `<del>` tag (because it ends with the following blank 2367line). So the contents get interpreted as CommonMark: 2368 2369```````````````````````````````` example 2370<del> 2371 2372*foo* 2373 2374</del> 2375. 2376<del> 2377<p><em>foo</em></p> 2378</del> 2379```````````````````````````````` 2380 2381 2382Finally, in this case, the `<del>` tags are interpreted 2383as [raw HTML] *inside* the CommonMark paragraph. 
(Because 2384the tag is not on a line by itself, we get inline HTML 2385rather than an [HTML block].) 2386 2387```````````````````````````````` example 2388<del>*foo*</del> 2389. 2390<p><del><em>foo</em></del></p> 2391```````````````````````````````` 2392 2393 2394HTML tags designed to contain literal content 2395(`script`, `style`, `pre`), comments, processing instructions, 2396and declarations are treated somewhat differently. 2397Instead of ending at the first blank line, these blocks 2398end at the first line containing a corresponding end tag. 2399As a result, these blocks can contain blank lines: 2400 2401A pre tag (type 1): 2402 2403```````````````````````````````` example 2404<pre language="haskell"><code> 2405import Text.HTML.TagSoup 2406 2407main :: IO () 2408main = print $ parseTags tags 2409</code></pre> 2410okay 2411. 2412<pre language="haskell"><code> 2413import Text.HTML.TagSoup 2414 2415main :: IO () 2416main = print $ parseTags tags 2417</code></pre> 2418<p>okay</p> 2419```````````````````````````````` 2420 2421 2422A script tag (type 1): 2423 2424```````````````````````````````` example 2425<script type="text/javascript"> 2426// JavaScript example 2427 2428document.getElementById("demo").innerHTML = "Hello JavaScript!"; 2429</script> 2430okay 2431. 2432<script type="text/javascript"> 2433// JavaScript example 2434 2435document.getElementById("demo").innerHTML = "Hello JavaScript!"; 2436</script> 2437<p>okay</p> 2438```````````````````````````````` 2439 2440 2441A style tag (type 1): 2442 2443```````````````````````````````` example 2444<style 2445 type="text/css"> 2446h1 {color:red;} 2447 2448p {color:blue;} 2449</style> 2450okay 2451. 
2452<style 2453 type="text/css"> 2454h1 {color:red;} 2455 2456p {color:blue;} 2457</style> 2458<p>okay</p> 2459```````````````````````````````` 2460 2461 2462If there is no matching end tag, the block will end at the 2463end of the document (or the enclosing [block quote][block quotes] 2464or [list item][list items]): 2465 2466```````````````````````````````` example 2467<style 2468 type="text/css"> 2469 2470foo 2471. 2472<style 2473 type="text/css"> 2474 2475foo 2476```````````````````````````````` 2477 2478 2479```````````````````````````````` example 2480> <div> 2481> foo 2482 2483bar 2484. 2485<blockquote> 2486<div> 2487foo 2488</blockquote> 2489<p>bar</p> 2490```````````````````````````````` 2491 2492 2493```````````````````````````````` example 2494- <div> 2495- foo 2496. 2497<ul> 2498<li> 2499<div> 2500</li> 2501<li>foo</li> 2502</ul> 2503```````````````````````````````` 2504 2505 2506The end tag can occur on the same line as the start tag: 2507 2508```````````````````````````````` example 2509<style>p{color:red;}</style> 2510*foo* 2511. 2512<style>p{color:red;}</style> 2513<p><em>foo</em></p> 2514```````````````````````````````` 2515 2516 2517```````````````````````````````` example 2518<!-- foo -->*bar* 2519*baz* 2520. 2521<!-- foo -->*bar* 2522<p><em>baz</em></p> 2523```````````````````````````````` 2524 2525 2526Note that anything on the last line after the 2527end tag will be included in the [HTML block]: 2528 2529```````````````````````````````` example 2530<script> 2531foo 2532</script>1. *bar* 2533. 2534<script> 2535foo 2536</script>1. *bar* 2537```````````````````````````````` 2538 2539 2540A comment (type 2): 2541 2542```````````````````````````````` example 2543<!-- Foo 2544 2545bar 2546 baz --> 2547okay 2548. 
2549<!-- Foo 2550 2551bar 2552 baz --> 2553<p>okay</p> 2554```````````````````````````````` 2555 2556 2557 2558A processing instruction (type 3): 2559 2560```````````````````````````````` example 2561<?php 2562 2563 echo '>'; 2564 2565?> 2566okay 2567. 2568<?php 2569 2570 echo '>'; 2571 2572?> 2573<p>okay</p> 2574```````````````````````````````` 2575 2576 2577A declaration (type 4): 2578 2579```````````````````````````````` example 2580<!DOCTYPE html> 2581. 2582<!DOCTYPE html> 2583```````````````````````````````` 2584 2585 2586CDATA (type 5): 2587 2588```````````````````````````````` example 2589<![CDATA[ 2590function matchwo(a,b) 2591{ 2592 if (a < b && a < 0) then { 2593 return 1; 2594 2595 } else { 2596 2597 return 0; 2598 } 2599} 2600]]> 2601okay 2602. 2603<![CDATA[ 2604function matchwo(a,b) 2605{ 2606 if (a < b && a < 0) then { 2607 return 1; 2608 2609 } else { 2610 2611 return 0; 2612 } 2613} 2614]]> 2615<p>okay</p> 2616```````````````````````````````` 2617 2618 2619The opening tag can be indented 1-3 spaces, but not 4: 2620 2621```````````````````````````````` example 2622 <!-- foo --> 2623 2624 <!-- foo --> 2625. 2626 <!-- foo --> 2627<pre><code><!-- foo --> 2628</code></pre> 2629```````````````````````````````` 2630 2631 2632```````````````````````````````` example 2633 <div> 2634 2635 <div> 2636. 2637 <div> 2638<pre><code><div> 2639</code></pre> 2640```````````````````````````````` 2641 2642 2643An HTML block of types 1--6 can interrupt a paragraph, and need not be 2644preceded by a blank line. 2645 2646```````````````````````````````` example 2647Foo 2648<div> 2649bar 2650</div> 2651. 2652<p>Foo</p> 2653<div> 2654bar 2655</div> 2656```````````````````````````````` 2657 2658 2659However, a following blank line is needed, except at the end of 2660a document, and except for blocks of types 1--5, [above][HTML 2661block]: 2662 2663```````````````````````````````` example 2664<div> 2665bar 2666</div> 2667*foo* 2668. 
2669<div> 2670bar 2671</div> 2672*foo* 2673```````````````````````````````` 2674 2675 2676HTML blocks of type 7 cannot interrupt a paragraph: 2677 2678```````````````````````````````` example 2679Foo 2680<a href="bar"> 2681baz 2682. 2683<p>Foo 2684<a href="bar"> 2685baz</p> 2686```````````````````````````````` 2687 2688 2689This rule differs from John Gruber's original Markdown syntax 2690specification, which says: 2691 2692> The only restrictions are that block-level HTML elements — 2693> e.g. `<div>`, `<table>`, `<pre>`, `<p>`, etc. — must be separated from 2694> surrounding content by blank lines, and the start and end tags of the 2695> block should not be indented with tabs or spaces. 2696 2697In some ways Gruber's rule is more restrictive than the one given 2698here: 2699 2700- It requires that an HTML block be preceded by a blank line. 2701- It does not allow the start tag to be indented. 2702- It requires a matching end tag, which it also does not allow to 2703 be indented. 2704 2705Most Markdown implementations (including some of Gruber's own) do not 2706respect all of these restrictions. 2707 2708There is one respect, however, in which Gruber's rule is more liberal 2709than the one given here, since it allows blank lines to occur inside 2710an HTML block. There are two reasons for disallowing them here. 2711First, it removes the need to parse balanced tags, which is 2712expensive and can require backtracking from the end of the document 2713if no matching end tag is found. Second, it provides a very simple 2714and flexible way of including Markdown content inside HTML tags: 2715simply separate the Markdown from the HTML using blank lines: 2716 2717Compare: 2718 2719```````````````````````````````` example 2720<div> 2721 2722*Emphasized* text. 2723 2724</div> 2725. 2726<div> 2727<p><em>Emphasized</em> text.</p> 2728</div> 2729```````````````````````````````` 2730 2731 2732```````````````````````````````` example 2733<div> 2734*Emphasized* text. 
2735</div> 2736. 2737<div> 2738*Emphasized* text. 2739</div> 2740```````````````````````````````` 2741 2742 2743Some Markdown implementations have adopted a convention of 2744interpreting content inside tags as text if the open tag has 2745the attribute `markdown=1`. The rule given above seems a simpler and 2746more elegant way of achieving the same expressive power, which is also 2747much simpler to parse. 2748 2749The main potential drawback is that one can no longer paste HTML 2750blocks into Markdown documents with 100% reliability. However, 2751*in most cases* this will work fine, because the blank lines in 2752HTML are usually followed by HTML block tags. For example: 2753 2754```````````````````````````````` example 2755<table> 2756 2757<tr> 2758 2759<td> 2760Hi 2761</td> 2762 2763</tr> 2764 2765</table> 2766. 2767<table> 2768<tr> 2769<td> 2770Hi 2771</td> 2772</tr> 2773</table> 2774```````````````````````````````` 2775 2776 2777There are problems, however, if the inner tags are indented 2778*and* separated by blank lines, as then they will be interpreted as 2779an indented code block: 2780 2781```````````````````````````````` example 2782<table> 2783 2784 <tr> 2785 2786 <td> 2787 Hi 2788 </td> 2789 2790 </tr> 2791 2792</table> 2793. 2794<table> 2795 <tr> 2796<pre><code><td> 2797 Hi 2798</td> 2799</code></pre> 2800 </tr> 2801</table> 2802```````````````````````````````` 2803 2804 2805Fortunately, blank lines are usually not necessary and can be 2806deleted. The exception is inside `<pre>` tags, but as described 2807[above][HTML blocks], raw HTML blocks starting with `<pre>` 2808*can* contain blank lines.
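
The blank-line end condition discussed in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the specification: the names `BLOCK_TAGS`, `TYPE6_START`, and `split_html_block` are hypothetical, only a handful of the type 6 tag names are listed, and start conditions 1--5 and 7 are ignored.

```python
import re

# Assumption: a small subset of the type 6 tag names, for illustration only.
BLOCK_TAGS = ("div", "table", "tr", "td")

# Type 6 start condition (simplified): up to three spaces of indentation,
# then "<" or "</", a block tag name, then whitespace, ">", "/>", or end of line.
TYPE6_START = re.compile(
    r"^ {0,3}</?(?:%s)(?:\s|/?>|$)" % "|".join(BLOCK_TAGS), re.IGNORECASE
)


def split_html_block(lines):
    """Split lines into (html_block, rest) for a block opened by lines[0].

    Only the type 6 end condition is modelled: the HTML block runs up to,
    but does not include, the first blank line.
    """
    if not lines or not TYPE6_START.match(lines[0]):
        return [], list(lines)
    for i, line in enumerate(lines):
        if line.strip() == "":
            return list(lines[:i]), list(lines[i:])
    return list(lines), []


html, rest = split_html_block(["<div>", "", "*Emphasized* text.", "", "</div>"])
html_tight, rest_tight = split_html_block(["<div>", "*Emphasized* text.", "</div>"])
```

In the first call only `<div>` ends up in the HTML block, so the lines after the blank line remain available for ordinary Markdown parsing. In the second call there is no blank line, so all three lines stay in the HTML block and the emphasis markers are passed through untouched.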
2809 2810## Link reference definitions 2811 2812A [link reference definition](@) 2813consists of a [link label], indented up to three spaces, followed 2814by a colon (`:`), optional [whitespace] (including up to one 2815[line ending]), a [link destination], 2816optional [whitespace] (including up to one 2817[line ending]), and an optional [link 2818title], which if it is present must be separated 2819from the [link destination] by [whitespace]. 2820No further [non-whitespace characters] may occur on the line. 2821 2822A [link reference definition] 2823does not correspond to a structural element of a document. Instead, it 2824defines a label which can be used in [reference links] 2825and reference-style [images] elsewhere in the document. [Link 2826reference definitions] can come either before or after the links that use 2827them. 2828 2829```````````````````````````````` example 2830[foo]: /url "title" 2831 2832[foo] 2833. 2834<p><a href="/url" title="title">foo</a></p> 2835```````````````````````````````` 2836 2837 2838```````````````````````````````` example 2839 [foo]: 2840 /url 2841 'the title' 2842 2843[foo] 2844. 2845<p><a href="/url" title="the title">foo</a></p> 2846```````````````````````````````` 2847 2848 2849```````````````````````````````` example 2850[Foo*bar\]]:my_(url) 'title (with parens)' 2851 2852[Foo*bar\]] 2853. 2854<p><a href="my_(url)" title="title (with parens)">Foo*bar]</a></p> 2855```````````````````````````````` 2856 2857 2858```````````````````````````````` example 2859[Foo bar]: 2860<my url> 2861'title' 2862 2863[Foo bar] 2864. 2865<p><a href="my%20url" title="title">Foo bar</a></p> 2866```````````````````````````````` 2867 2868 2869The title may extend over multiple lines: 2870 2871```````````````````````````````` example 2872[foo]: /url ' 2873title 2874line1 2875line2 2876' 2877 2878[foo] 2879. 
2880<p><a href="/url" title=" 2881title 2882line1 2883line2 2884">foo</a></p> 2885```````````````````````````````` 2886 2887 2888However, it may not contain a [blank line]: 2889 2890```````````````````````````````` example 2891[foo]: /url 'title 2892 2893with blank line' 2894 2895[foo] 2896. 2897<p>[foo]: /url 'title</p> 2898<p>with blank line'</p> 2899<p>[foo]</p> 2900```````````````````````````````` 2901 2902 2903The title may be omitted: 2904 2905```````````````````````````````` example 2906[foo]: 2907/url 2908 2909[foo] 2910. 2911<p><a href="/url">foo</a></p> 2912```````````````````````````````` 2913 2914 2915The link destination may not be omitted: 2916 2917```````````````````````````````` example 2918[foo]: 2919 2920[foo] 2921. 2922<p>[foo]:</p> 2923<p>[foo]</p> 2924```````````````````````````````` 2925 2926 However, an empty link destination may be specified using 2927 angle brackets: 2928 2929```````````````````````````````` example 2930[foo]: <> 2931 2932[foo] 2933. 2934<p><a href="">foo</a></p> 2935```````````````````````````````` 2936 2937The title must be separated from the link destination by 2938whitespace: 2939 2940```````````````````````````````` example 2941[foo]: <bar>(baz) 2942 2943[foo] 2944. 2945<p>[foo]: <bar>(baz)</p> 2946<p>[foo]</p> 2947```````````````````````````````` 2948 2949 2950Both title and destination can contain backslash escapes 2951and literal backslashes: 2952 2953```````````````````````````````` example 2954[foo]: /url\bar\*baz "foo\"bar\baz" 2955 2956[foo] 2957. 2958<p><a href="/url%5Cbar*baz" title="foo"bar\baz">foo</a></p> 2959```````````````````````````````` 2960 2961 2962A link can come before its corresponding definition: 2963 2964```````````````````````````````` example 2965[foo] 2966 2967[foo]: url 2968. 
2969<p><a href="url">foo</a></p> 2970```````````````````````````````` 2971 2972 2973If there are several matching definitions, the first one takes 2974precedence: 2975 2976```````````````````````````````` example 2977[foo] 2978 2979[foo]: first 2980[foo]: second 2981. 2982<p><a href="first">foo</a></p> 2983```````````````````````````````` 2984 2985 2986As noted in the section on [Links], matching of labels is 2987case-insensitive (see [matches]). 2988 2989```````````````````````````````` example 2990[FOO]: /url 2991 2992[Foo] 2993. 2994<p><a href="/url">Foo</a></p> 2995```````````````````````````````` 2996 2997 2998```````````````````````````````` example 2999[ΑΓΩ]: /φου 3000 3001[αγω] 3002. 3003<p><a href="/%CF%86%CE%BF%CF%85">αγω</a></p> 3004```````````````````````````````` 3005 3006 3007Here is a link reference definition with no corresponding link. 3008It contributes nothing to the document. 3009 3010```````````````````````````````` example 3011[foo]: /url 3012. 3013```````````````````````````````` 3014 3015 3016Here is another one: 3017 3018```````````````````````````````` example 3019[ 3020foo 3021]: /url 3022bar 3023. 3024<p>bar</p> 3025```````````````````````````````` 3026 3027 3028This is not a link reference definition, because there are 3029[non-whitespace characters] after the title: 3030 3031```````````````````````````````` example 3032[foo]: /url "title" ok 3033. 3034<p>[foo]: /url "title" ok</p> 3035```````````````````````````````` 3036 3037 3038This is a link reference definition, but it has no title: 3039 3040```````````````````````````````` example 3041[foo]: /url 3042"title" ok 3043. 3044<p>"title" ok</p> 3045```````````````````````````````` 3046 3047 3048This is not a link reference definition, because it is indented 3049four spaces: 3050 3051```````````````````````````````` example 3052 [foo]: /url "title" 3053 3054[foo] 3055. 
3056<pre><code>[foo]: /url "title" 3057</code></pre> 3058<p>[foo]</p> 3059```````````````````````````````` 3060 3061 3062This is not a link reference definition, because it occurs inside 3063a code block: 3064 3065```````````````````````````````` example 3066``` 3067[foo]: /url 3068``` 3069 3070[foo] 3071. 3072<pre><code>[foo]: /url 3073</code></pre> 3074<p>[foo]</p> 3075```````````````````````````````` 3076 3077 3078A [link reference definition] cannot interrupt a paragraph. 3079 3080```````````````````````````````` example 3081Foo 3082[bar]: /baz 3083 3084[bar] 3085. 3086<p>Foo 3087[bar]: /baz</p> 3088<p>[bar]</p> 3089```````````````````````````````` 3090 3091 3092However, it can directly follow other block elements, such as headings 3093and thematic breaks, and it need not be followed by a blank line. 3094 3095```````````````````````````````` example 3096# [Foo] 3097[foo]: /url 3098> bar 3099. 3100<h1><a href="/url">Foo</a></h1> 3101<blockquote> 3102<p>bar</p> 3103</blockquote> 3104```````````````````````````````` 3105 3106```````````````````````````````` example 3107[foo]: /url 3108bar 3109=== 3110[foo] 3111. 3112<h1>bar</h1> 3113<p><a href="/url">foo</a></p> 3114```````````````````````````````` 3115 3116```````````````````````````````` example 3117[foo]: /url 3118=== 3119[foo] 3120. 3121<p>=== 3122<a href="/url">foo</a></p> 3123```````````````````````````````` 3124 3125 3126Several [link reference definitions] 3127can occur one after another, without intervening blank lines. 3128 3129```````````````````````````````` example 3130[foo]: /foo-url "foo" 3131[bar]: /bar-url 3132 "bar" 3133[baz]: /baz-url 3134 3135[foo], 3136[bar], 3137[baz] 3138. 3139<p><a href="/foo-url" title="foo">foo</a>, 3140<a href="/bar-url" title="bar">bar</a>, 3141<a href="/baz-url">baz</a></p> 3142```````````````````````````````` 3143 3144 3145[Link reference definitions] can occur 3146inside block containers, like lists and block quotations. 
They 3147affect the entire document, not just the container in which they 3148are defined: 3149 3150```````````````````````````````` example 3151[foo] 3152 3153> [foo]: /url 3154. 3155<p><a href="/url">foo</a></p> 3156<blockquote> 3157</blockquote> 3158```````````````````````````````` 3159 3160 3161Whether something is a [link reference definition] is 3162independent of whether the link reference it defines is 3163used in the document. Thus, for example, the following 3164document contains just a link reference definition, and 3165no visible content: 3166 3167```````````````````````````````` example 3168[foo]: /url 3169. 3170```````````````````````````````` 3171 3172 3173## Paragraphs 3174 3175A sequence of non-blank lines that cannot be interpreted as other 3176kinds of blocks forms a [paragraph](@). 3177The contents of the paragraph are the result of parsing the 3178paragraph's raw content as inlines. The paragraph's raw content 3179is formed by concatenating the lines and removing initial and final 3180[whitespace]. 3181 3182A simple example with two paragraphs: 3183 3184```````````````````````````````` example 3185aaa 3186 3187bbb 3188. 3189<p>aaa</p> 3190<p>bbb</p> 3191```````````````````````````````` 3192 3193 3194Paragraphs can contain multiple lines, but no blank lines: 3195 3196```````````````````````````````` example 3197aaa 3198bbb 3199 3200ccc 3201ddd 3202. 3203<p>aaa 3204bbb</p> 3205<p>ccc 3206ddd</p> 3207```````````````````````````````` 3208 3209 3210Multiple blank lines between paragraphs have no effect: 3211 3212```````````````````````````````` example 3213aaa 3214 3215 3216bbb 3217. 3218<p>aaa</p> 3219<p>bbb</p> 3220```````````````````````````````` 3221 3222 3223Leading spaces are skipped: 3224 3225```````````````````````````````` example 3226 aaa 3227 bbb 3228. 3229<p>aaa 3230bbb</p> 3231```````````````````````````````` 3232 3233 3234Lines after the first may be indented any amount, since indented 3235code blocks cannot interrupt paragraphs.
3236 3237```````````````````````````````` example 3238aaa 3239 bbb 3240 ccc 3241. 3242<p>aaa 3243bbb 3244ccc</p> 3245```````````````````````````````` 3246 3247 3248However, the first line may be indented at most three spaces, 3249or an indented code block will be triggered: 3250 3251```````````````````````````````` example 3252 aaa 3253bbb 3254. 3255<p>aaa 3256bbb</p> 3257```````````````````````````````` 3258 3259 3260```````````````````````````````` example 3261 aaa 3262bbb 3263. 3264<pre><code>aaa 3265</code></pre> 3266<p>bbb</p> 3267```````````````````````````````` 3268 3269 3270Final spaces are stripped before inline parsing, so a paragraph 3271that ends with two or more spaces will not end with a [hard line 3272break]: 3273 3274```````````````````````````````` example 3275aaa 3276bbb 3277. 3278<p>aaa<br /> 3279bbb</p> 3280```````````````````````````````` 3281 3282 3283## Blank lines 3284 3285[Blank lines] between block-level elements are ignored, 3286except for the role they play in determining whether a [list] 3287is [tight] or [loose]. 3288 3289Blank lines at the beginning and end of the document are also ignored. 3290 3291```````````````````````````````` example 3292 3293 3294aaa 3295 3296 3297# aaa 3298 3299 3300. 3301<p>aaa</p> 3302<h1>aaa</h1> 3303```````````````````````````````` 3304 3305<div class="extension"> 3306 3307## Tables (extension) 3308 3309GFM enables the `table` extension, where an additional leaf block type is 3310available. 3311 3312A [table](@) is an arrangement of data with rows and columns, consisting of a 3313single header row, a [delimiter row] separating the header from the data, and 3314zero or more data rows. 3315 3316Each row consists of cells containing arbitrary text, in which [inlines] are 3317parsed, separated by pipes (`|`). A leading and trailing pipe is also 3318recommended for clarity of reading, and if there's otherwise parsing ambiguity. 3319Spaces between pipes and cell content are trimmed. 
Block-level elements cannot 3320be inserted in a table. 3321 3322The [delimiter row](@) consists of cells whose only content is hyphens (`-`), 3323and optionally, a leading or trailing colon (`:`), or both, to indicate left, 3324right, or center alignment respectively. 3325 3326```````````````````````````````` example table 3327| foo | bar | 3328| --- | --- | 3329| baz | bim | 3330. 3331<table> 3332<thead> 3333<tr> 3334<th>foo</th> 3335<th>bar</th> 3336</tr> 3337</thead> 3338<tbody> 3339<tr> 3340<td>baz</td> 3341<td>bim</td> 3342</tr> 3343</tbody> 3344</table> 3345```````````````````````````````` 3346 3347Cells in one column don't need to match length, though it's easier to read if 3348they do. Likewise, use of leading and trailing pipes may be inconsistent: 3349 3350```````````````````````````````` example table 3351| abc | defghi | 3352:-: | -----------: 3353bar | baz 3354. 3355<table> 3356<thead> 3357<tr> 3358<th align="center">abc</th> 3359<th align="right">defghi</th> 3360</tr> 3361</thead> 3362<tbody> 3363<tr> 3364<td align="center">bar</td> 3365<td align="right">baz</td> 3366</tr> 3367</tbody> 3368</table> 3369```````````````````````````````` 3370 3371Include a pipe in a cell's content by escaping it, including inside other 3372inline spans: 3373 3374```````````````````````````````` example table 3375| f\|oo | 3376| ------ | 3377| b `\|` az | 3378| b **\|** im | 3379. 3380<table> 3381<thead> 3382<tr> 3383<th>f|oo</th> 3384</tr> 3385</thead> 3386<tbody> 3387<tr> 3388<td>b <code>|</code> az</td> 3389</tr> 3390<tr> 3391<td>b <strong>|</strong> im</td> 3392</tr> 3393</tbody> 3394</table> 3395```````````````````````````````` 3396 3397The table is broken at the first empty line, or at the beginning of another 3398block-level structure: 3399 3400```````````````````````````````` example table 3401| abc | def | 3402| --- | --- | 3403| bar | baz | 3404> bar 3405.
3406<table> 3407<thead> 3408<tr> 3409<th>abc</th> 3410<th>def</th> 3411</tr> 3412</thead> 3413<tbody> 3414<tr> 3415<td>bar</td> 3416<td>baz</td> 3417</tr> 3418</tbody> 3419</table> 3420<blockquote> 3421<p>bar</p> 3422</blockquote> 3423```````````````````````````````` 3424 3425```````````````````````````````` example table 3426| abc | def | 3427| --- | --- | 3428| bar | baz | 3429bar 3430 3431bar 3432. 3433<table> 3434<thead> 3435<tr> 3436<th>abc</th> 3437<th>def</th> 3438</tr> 3439</thead> 3440<tbody> 3441<tr> 3442<td>bar</td> 3443<td>baz</td> 3444</tr> 3445<tr> 3446<td>bar</td> 3447<td></td> 3448</tr> 3449</tbody> 3450</table> 3451<p>bar</p> 3452```````````````````````````````` 3453 3454The header row must match the [delimiter row] in the number of cells. If not, 3455a table will not be recognized: 3456 3457```````````````````````````````` example table 3458| abc | def | 3459| --- | 3460| bar | 3461. 3462<p>| abc | def | 3463| --- | 3464| bar |</p> 3465```````````````````````````````` 3466 3467The remainder of the table's rows may vary in the number of cells. If a row 3468has fewer cells than the header row, empty cells are inserted. If it has 3469more, the excess cells are ignored: 3470 3471```````````````````````````````` example table 3472| abc | def | 3473| --- | --- | 3474| bar | 3475| bar | baz | boo | 3476. 3477<table> 3478<thead> 3479<tr> 3480<th>abc</th> 3481<th>def</th> 3482</tr> 3483</thead> 3484<tbody> 3485<tr> 3486<td>bar</td> 3487<td></td> 3488</tr> 3489<tr> 3490<td>bar</td> 3491<td>baz</td> 3492</tr> 3493</tbody> 3494</table> 3495```````````````````````````````` 3496 3497If there are no rows in the body, no `<tbody>` is generated in HTML output: 3498 3499```````````````````````````````` example table 3500| abc | def | 3501| --- | --- | 3502.
3503<table> 3504<thead> 3505<tr> 3506<th>abc</th> 3507<th>def</th> 3508</tr> 3509</thead> 3510</table> 3511```````````````````````````````` 3512 3513</div> 3514 3515# Container blocks 3516 3517A [container block](#container-blocks) is a block that has other 3518blocks as its contents. There are two basic kinds of container blocks: 3519[block quotes] and [list items]. 3520[Lists] are meta-containers for [list items]. 3521 3522We define the syntax for container blocks recursively. The general 3523form of the definition is: 3524 3525> If X is a sequence of blocks, then the result of 3526> transforming X in such-and-such a way is a container of type Y 3527> with these blocks as its content. 3528 3529So, we explain what counts as a block quote or list item by explaining 3530how these can be *generated* from their contents. This should suffice 3531to define the syntax, although it does not give a recipe for *parsing* 3532these constructions. (A recipe is provided below in the section entitled 3533[A parsing strategy](#appendix-a-parsing-strategy).) 3534 3535## Block quotes 3536 3537A [block quote marker](@) 3538consists of 0-3 spaces of initial indent, plus (a) the character `>` together 3539with a following space, or (b) a single character `>` not followed by a space. 3540 3541The following rules define [block quotes]: 3542 35431. **Basic case.** If a string of lines *Ls* constitute a sequence 3544 of blocks *Bs*, then the result of prepending a [block quote 3545 marker] to the beginning of each line in *Ls* 3546 is a [block quote](#block-quotes) containing *Bs*. 3547 35482. **Laziness.** If a string of lines *Ls* constitute a [block 3549 quote](#block-quotes) with contents *Bs*, then the result of deleting 3550 the initial [block quote marker] from one or 3551 more lines in which the next [non-whitespace character] after the [block 3552 quote marker] is [paragraph continuation 3553 text] is a block quote with *Bs* as its content. 
3554 [Paragraph continuation text](@) is text 3555 that will be parsed as part of the content of a paragraph, but does 3556 not occur at the beginning of the paragraph. 3557 35583. **Consecutiveness.** A document cannot contain two [block 3559 quotes] in a row unless there is a [blank line] between them. 3560 3561Nothing else counts as a [block quote](#block-quotes). 3562 3563Here is a simple example: 3564 3565```````````````````````````````` example 3566> # Foo 3567> bar 3568> baz 3569. 3570<blockquote> 3571<h1>Foo</h1> 3572<p>bar 3573baz</p> 3574</blockquote> 3575```````````````````````````````` 3576 3577 3578The spaces after the `>` characters can be omitted: 3579 3580```````````````````````````````` example 3581># Foo 3582>bar 3583> baz 3584. 3585<blockquote> 3586<h1>Foo</h1> 3587<p>bar 3588baz</p> 3589</blockquote> 3590```````````````````````````````` 3591 3592 3593The `>` characters can be indented 1-3 spaces: 3594 3595```````````````````````````````` example 3596 > # Foo 3597 > bar 3598 > baz 3599. 3600<blockquote> 3601<h1>Foo</h1> 3602<p>bar 3603baz</p> 3604</blockquote> 3605```````````````````````````````` 3606 3607 3608Four spaces gives us a code block: 3609 3610```````````````````````````````` example 3611 > # Foo 3612 > bar 3613 > baz 3614. 3615<pre><code>> # Foo 3616> bar 3617> baz 3618</code></pre> 3619```````````````````````````````` 3620 3621 3622The Laziness clause allows us to omit the `>` before 3623[paragraph continuation text]: 3624 3625```````````````````````````````` example 3626> # Foo 3627> bar 3628baz 3629. 3630<blockquote> 3631<h1>Foo</h1> 3632<p>bar 3633baz</p> 3634</blockquote> 3635```````````````````````````````` 3636 3637 3638A block quote can contain some lazy and some non-lazy 3639continuation lines: 3640 3641```````````````````````````````` example 3642> bar 3643baz 3644> foo 3645. 
3646<blockquote> 3647<p>bar 3648baz 3649foo</p> 3650</blockquote> 3651```````````````````````````````` 3652 3653 3654Laziness only applies to lines that would have been continuations of 3655paragraphs had they been prepended with [block quote markers]. 3656For example, the `> ` cannot be omitted in the second line of 3657 3658``` markdown 3659> foo 3660> --- 3661``` 3662 3663without changing the meaning: 3664 3665```````````````````````````````` example 3666> foo 3667--- 3668. 3669<blockquote> 3670<p>foo</p> 3671</blockquote> 3672<hr /> 3673```````````````````````````````` 3674 3675 3676Similarly, if we omit the `> ` in the second line of 3677 3678``` markdown 3679> - foo 3680> - bar 3681``` 3682 3683then the block quote ends after the first line: 3684 3685```````````````````````````````` example 3686> - foo 3687- bar 3688. 3689<blockquote> 3690<ul> 3691<li>foo</li> 3692</ul> 3693</blockquote> 3694<ul> 3695<li>bar</li> 3696</ul> 3697```````````````````````````````` 3698 3699 3700For the same reason, we can't omit the `> ` in front of 3701subsequent lines of an indented or fenced code block: 3702 3703```````````````````````````````` example 3704> foo 3705 bar 3706. 3707<blockquote> 3708<pre><code>foo 3709</code></pre> 3710</blockquote> 3711<pre><code>bar 3712</code></pre> 3713```````````````````````````````` 3714 3715 3716```````````````````````````````` example 3717> ``` 3718foo 3719``` 3720. 3721<blockquote> 3722<pre><code></code></pre> 3723</blockquote> 3724<p>foo</p> 3725<pre><code></code></pre> 3726```````````````````````````````` 3727 3728 3729Note that in the following case, we have a [lazy 3730continuation line]: 3731 3732```````````````````````````````` example 3733> foo 3734 - bar 3735. 
3736<blockquote> 3737<p>foo 3738- bar</p> 3739</blockquote> 3740```````````````````````````````` 3741 3742 3743To see why, note that in 3744 3745```markdown 3746> foo 3747> - bar 3748``` 3749 3750the `- bar` is indented too far to start a list, and can't 3751be an indented code block because indented code blocks cannot 3752interrupt paragraphs, so it is [paragraph continuation text]. 3753 3754A block quote can be empty: 3755 3756```````````````````````````````` example 3757> 3758. 3759<blockquote> 3760</blockquote> 3761```````````````````````````````` 3762 3763 3764```````````````````````````````` example 3765> 3766> 3767> 3768. 3769<blockquote> 3770</blockquote> 3771```````````````````````````````` 3772 3773 3774A block quote can have initial or final blank lines: 3775 3776```````````````````````````````` example 3777> 3778> foo 3779> 3780. 3781<blockquote> 3782<p>foo</p> 3783</blockquote> 3784```````````````````````````````` 3785 3786 3787A blank line always separates block quotes: 3788 3789```````````````````````````````` example 3790> foo 3791 3792> bar 3793. 3794<blockquote> 3795<p>foo</p> 3796</blockquote> 3797<blockquote> 3798<p>bar</p> 3799</blockquote> 3800```````````````````````````````` 3801 3802 3803(Most current Markdown implementations, including John Gruber's 3804original `Markdown.pl`, will parse this example as a single block quote 3805with two paragraphs. But it seems better to allow the author to decide 3806whether two block quotes or one are wanted.) 3807 3808Consecutiveness means that if we put these block quotes together, 3809we get a single block quote: 3810 3811```````````````````````````````` example 3812> foo 3813> bar 3814. 3815<blockquote> 3816<p>foo 3817bar</p> 3818</blockquote> 3819```````````````````````````````` 3820 3821 3822To get a block quote with two paragraphs, use: 3823 3824```````````````````````````````` example 3825> foo 3826> 3827> bar 3828. 
3829<blockquote> 3830<p>foo</p> 3831<p>bar</p> 3832</blockquote> 3833```````````````````````````````` 3834 3835 3836Block quotes can interrupt paragraphs: 3837 3838```````````````````````````````` example 3839foo 3840> bar 3841. 3842<p>foo</p> 3843<blockquote> 3844<p>bar</p> 3845</blockquote> 3846```````````````````````````````` 3847 3848 3849In general, blank lines are not needed before or after block 3850quotes: 3851 3852```````````````````````````````` example 3853> aaa 3854*** 3855> bbb 3856. 3857<blockquote> 3858<p>aaa</p> 3859</blockquote> 3860<hr /> 3861<blockquote> 3862<p>bbb</p> 3863</blockquote> 3864```````````````````````````````` 3865 3866 3867However, because of laziness, a blank line is needed between 3868a block quote and a following paragraph: 3869 3870```````````````````````````````` example 3871> bar 3872baz 3873. 3874<blockquote> 3875<p>bar 3876baz</p> 3877</blockquote> 3878```````````````````````````````` 3879 3880 3881```````````````````````````````` example 3882> bar 3883 3884baz 3885. 3886<blockquote> 3887<p>bar</p> 3888</blockquote> 3889<p>baz</p> 3890```````````````````````````````` 3891 3892 3893```````````````````````````````` example 3894> bar 3895> 3896baz 3897. 3898<blockquote> 3899<p>bar</p> 3900</blockquote> 3901<p>baz</p> 3902```````````````````````````````` 3903 3904 3905It is a consequence of the Laziness rule that any number 3906of initial `>`s may be omitted on a continuation line of a 3907nested block quote: 3908 3909```````````````````````````````` example 3910> > > foo 3911bar 3912. 3913<blockquote> 3914<blockquote> 3915<blockquote> 3916<p>foo 3917bar</p> 3918</blockquote> 3919</blockquote> 3920</blockquote> 3921```````````````````````````````` 3922 3923 3924```````````````````````````````` example 3925>>> foo 3926> bar 3927>>baz 3928. 
3929<blockquote> 3930<blockquote> 3931<blockquote> 3932<p>foo 3933bar 3934baz</p> 3935</blockquote> 3936</blockquote> 3937</blockquote> 3938```````````````````````````````` 3939 3940 3941When including an indented code block in a block quote, 3942remember that the [block quote marker] includes 3943both the `>` and a following space. So *five spaces* are needed after 3944the `>`: 3945 3946```````````````````````````````` example 3947> code 3948 3949> not code 3950. 3951<blockquote> 3952<pre><code>code 3953</code></pre> 3954</blockquote> 3955<blockquote> 3956<p>not code</p> 3957</blockquote> 3958```````````````````````````````` 3959 3960 3961 3962## List items 3963 3964A [list marker](@) is a 3965[bullet list marker] or an [ordered list marker]. 3966 3967A [bullet list marker](@) 3968is a `-`, `+`, or `*` character. 3969 3970An [ordered list marker](@) 3971is a sequence of 1--9 arabic digits (`0-9`), followed by either a 3972`.` character or a `)` character. (The reason for the length 3973limit is that with 10 digits we start seeing integer overflows 3974in some browsers.) 3975 3976The following rules define [list items]: 3977 39781. **Basic case.** If a sequence of lines *Ls* constitute a sequence of 3979 blocks *Bs* starting with a [non-whitespace character], and *M* is a 3980 list marker of width *W* followed by 1 ≤ *N* ≤ 4 spaces, then the result 3981 of prepending *M* and the following spaces to the first line of 3982 *Ls*, and indenting subsequent lines of *Ls* by *W + N* spaces, is a 3983 list item with *Bs* as its contents. The type of the list item 3984 (bullet or ordered) is determined by the type of its list marker. 3985 If the list item is ordered, then it is also assigned a start 3986 number, based on the ordered list marker. 3987 3988 Exceptions: 3989 3990 1. 
When the first list item in a [list] interrupts 3991 a paragraph---that is, when it starts on a line that would 3992 otherwise count as [paragraph continuation text]---then (a) 3993 the lines *Ls* must not begin with a blank line, and (b) if 3994 the list item is ordered, the start number must be 1. 3995 2. If any line is a [thematic break][thematic breaks] then 3996 that line is not a list item. 3997 3998For example, let *Ls* be the lines 3999 4000```````````````````````````````` example 4001A paragraph 4002with two lines. 4003 4004 indented code 4005 4006> A block quote. 4007. 4008<p>A paragraph 4009with two lines.</p> 4010<pre><code>indented code 4011</code></pre> 4012<blockquote> 4013<p>A block quote.</p> 4014</blockquote> 4015```````````````````````````````` 4016 4017 4018And let *M* be the marker `1.`, and *N* = 2. Then rule #1 says 4019that the following is an ordered list item with start number 1, 4020and the same contents as *Ls*: 4021 4022```````````````````````````````` example 40231. A paragraph 4024 with two lines. 4025 4026 indented code 4027 4028 > A block quote. 4029. 4030<ol> 4031<li> 4032<p>A paragraph 4033with two lines.</p> 4034<pre><code>indented code 4035</code></pre> 4036<blockquote> 4037<p>A block quote.</p> 4038</blockquote> 4039</li> 4040</ol> 4041```````````````````````````````` 4042 4043 4044The most important thing to notice is that the position of 4045the text after the list marker determines how much indentation 4046is needed in subsequent blocks in the list item. If the list 4047marker takes up two spaces, and there are three spaces between 4048the list marker and the next [non-whitespace character], then blocks 4049must be indented five spaces in order to fall under the list 4050item. 4051 4052Here are some examples showing how far content must be indented to be 4053put under the list item: 4054 4055```````````````````````````````` example 4056- one 4057 4058 two 4059. 
4060<ul> 4061<li>one</li> 4062</ul> 4063<p>two</p> 4064```````````````````````````````` 4065 4066 4067```````````````````````````````` example 4068- one 4069 4070 two 4071. 4072<ul> 4073<li> 4074<p>one</p> 4075<p>two</p> 4076</li> 4077</ul> 4078```````````````````````````````` 4079 4080 4081```````````````````````````````` example 4082 - one 4083 4084 two 4085. 4086<ul> 4087<li>one</li> 4088</ul> 4089<pre><code> two 4090</code></pre> 4091```````````````````````````````` 4092 4093 4094```````````````````````````````` example 4095 - one 4096 4097 two 4098. 4099<ul> 4100<li> 4101<p>one</p> 4102<p>two</p> 4103</li> 4104</ul> 4105```````````````````````````````` 4106 4107 4108It is tempting to think of this in terms of columns: the continuation 4109blocks must be indented at least to the column of the first 4110[non-whitespace character] after the list marker. However, that is not quite right. 4111The spaces after the list marker determine how much relative indentation 4112is needed. Which column this indentation reaches will depend on 4113how the list item is embedded in other constructions, as shown by 4114this example: 4115 4116```````````````````````````````` example 4117 > > 1. one 4118>> 4119>> two 4120. 4121<blockquote> 4122<blockquote> 4123<ol> 4124<li> 4125<p>one</p> 4126<p>two</p> 4127</li> 4128</ol> 4129</blockquote> 4130</blockquote> 4131```````````````````````````````` 4132 4133 4134Here `two` occurs in the same column as the list marker `1.`, 4135but is actually contained in the list item, because there is 4136sufficient indentation after the last containing blockquote marker. 4137 4138The converse is also possible. In the following example, the word `two` 4139occurs far to the right of the initial text of the list item, `one`, but 4140it is not considered part of the list item, because it is not indented 4141far enough past the blockquote marker: 4142 4143```````````````````````````````` example 4144>>- one 4145>> 4146 > > two 4147. 
4148<blockquote> 4149<blockquote> 4150<ul> 4151<li>one</li> 4152</ul> 4153<p>two</p> 4154</blockquote> 4155</blockquote> 4156```````````````````````````````` 4157 4158 4159Note that at least one space is needed between the list marker and 4160any following content, so these are not list items: 4161 4162```````````````````````````````` example 4163-one 4164 41652.two 4166. 4167<p>-one</p> 4168<p>2.two</p> 4169```````````````````````````````` 4170 4171 4172A list item may contain blocks that are separated by more than 4173one blank line. 4174 4175```````````````````````````````` example 4176- foo 4177 4178 4179 bar 4180. 4181<ul> 4182<li> 4183<p>foo</p> 4184<p>bar</p> 4185</li> 4186</ul> 4187```````````````````````````````` 4188 4189 4190A list item may contain any kind of block: 4191 4192```````````````````````````````` example 41931. foo 4194 4195 ``` 4196 bar 4197 ``` 4198 4199 baz 4200 4201 > bam 4202. 4203<ol> 4204<li> 4205<p>foo</p> 4206<pre><code>bar 4207</code></pre> 4208<p>baz</p> 4209<blockquote> 4210<p>bam</p> 4211</blockquote> 4212</li> 4213</ol> 4214```````````````````````````````` 4215 4216 4217A list item that contains an indented code block will preserve 4218empty lines within the code block verbatim. 4219 4220```````````````````````````````` example 4221- Foo 4222 4223 bar 4224 4225 4226 baz 4227. 4228<ul> 4229<li> 4230<p>Foo</p> 4231<pre><code>bar 4232 4233 4234baz 4235</code></pre> 4236</li> 4237</ul> 4238```````````````````````````````` 4239 4240Note that ordered list start numbers must be nine digits or less: 4241 4242```````````````````````````````` example 4243123456789. ok 4244. 4245<ol start="123456789"> 4246<li>ok</li> 4247</ol> 4248```````````````````````````````` 4249 4250 4251```````````````````````````````` example 42521234567890. not ok 4253. 4254<p>1234567890. not ok</p> 4255```````````````````````````````` 4256 4257 4258A start number may begin with 0s: 4259 4260```````````````````````````````` example 42610. ok 4262. 
<ol start="0">
<li>ok</li>
</ol>
````````````````````````````````


```````````````````````````````` example
003. ok
.
<ol start="3">
<li>ok</li>
</ol>
````````````````````````````````


A start number may not be negative:

```````````````````````````````` example
-1. not ok
.
<p>-1. not ok</p>
````````````````````````````````



2.  **Item starting with indented code.** If a sequence of lines *Ls*
    constitute a sequence of blocks *Bs* starting with an indented code
    block, and *M* is a list marker of width *W* followed by
    one space, then the result of prepending *M* and the following
    space to the first line of *Ls*, and indenting subsequent lines of
    *Ls* by *W + 1* spaces, is a list item with *Bs* as its contents.
    If a line is empty, then it need not be indented. The type of the
    list item (bullet or ordered) is determined by the type of its list
    marker. If the list item is ordered, then it is also assigned a
    start number, based on the ordered list marker.

An indented code block will have to be indented four spaces beyond
the edge of the region where text will be included in the list item.
In the following case that is 6 spaces:

```````````````````````````````` example
- foo

      bar
.
<ul>
<li>
<p>foo</p>
<pre><code>bar
</code></pre>
</li>
</ul>
````````````````````````````````


And in this case it is 11 spaces:

```````````````````````````````` example
  10.  foo

           bar
.
<ol start="10">
<li>
<p>foo</p>
<pre><code>bar
</code></pre>
</li>
</ol>
````````````````````````````````


If the *first* block in the list item is an indented code block,
then by rule #2, the contents must be indented *one* space after the
list marker:

```````````````````````````````` example
    indented code

paragraph

    more code
.
<pre><code>indented code
</code></pre>
<p>paragraph</p>
<pre><code>more code
</code></pre>
````````````````````````````````


```````````````````````````````` example
1.     indented code

   paragraph

       more code
.
<ol>
<li>
<pre><code>indented code
</code></pre>
<p>paragraph</p>
<pre><code>more code
</code></pre>
</li>
</ol>
````````````````````````````````


Note that an additional space indent is interpreted as space
inside the code block:

```````````````````````````````` example
1.      indented code

   paragraph

       more code
.
<ol>
<li>
<pre><code> indented code
</code></pre>
<p>paragraph</p>
<pre><code>more code
</code></pre>
</li>
</ol>
````````````````````````````````


Note that rules #1 and #2 only apply to two cases: (a) cases
in which the lines to be included in a list item begin with a
[non-whitespace character], and (b) cases in which
they begin with an indented code
block. In a case like the following, where the first block begins with
a three-space indent, the rules do not allow us to form a list item by
indenting the whole thing and prepending a list marker:

```````````````````````````````` example
   foo

bar
.
<p>foo</p>
<p>bar</p>
````````````````````````````````


```````````````````````````````` example
-    foo

  bar
.
<ul>
<li>foo</li>
</ul>
<p>bar</p>
````````````````````````````````


This is not a significant restriction, because when a block begins
with 1-3 spaces indent, the indentation can always be removed without
a change in interpretation, allowing rule #1 to be applied. So, in
the above case:

```````````````````````````````` example
-  foo

   bar
.
<ul>
<li>
<p>foo</p>
<p>bar</p>
</li>
</ul>
````````````````````````````````


3.  **Item starting with a blank line.** If a sequence of lines *Ls*
    starting with a single [blank line] constitute a (possibly empty)
    sequence of blocks *Bs*, not separated from each other by more than
    one blank line, and *M* is a list marker of width *W*,
    then the result of prepending *M* to the first line of *Ls*, and
    indenting subsequent lines of *Ls* by *W + 1* spaces, is a list
    item with *Bs* as its contents.
    If a line is empty, then it need not be indented. The type of the
    list item (bullet or ordered) is determined by the type of its list
    marker. If the list item is ordered, then it is also assigned a
    start number, based on the ordered list marker.

Here are some list items that start with a blank line but are not empty:

```````````````````````````````` example
-
  foo
-
  ```
  bar
  ```
-
      baz
.
<ul>
<li>foo</li>
<li>
<pre><code>bar
</code></pre>
</li>
<li>
<pre><code>baz
</code></pre>
</li>
</ul>
````````````````````````````````

When the list item starts with a blank line, the number of spaces
following the list marker doesn't change the required indentation:

```````````````````````````````` example
-   
  foo
.
<ul>
<li>foo</li>
</ul>
````````````````````````````````


A list item can begin with at most one blank line.
In the following example, `foo` is not part of the list
item:

```````````````````````````````` example
-

  foo
.
<ul>
<li></li>
</ul>
<p>foo</p>
````````````````````````````````


Here is an empty bullet list item:

```````````````````````````````` example
- foo
-
- bar
.
<ul>
<li>foo</li>
<li></li>
<li>bar</li>
</ul>
````````````````````````````````


It does not matter whether there are spaces following the [list marker]:

```````````````````````````````` example
- foo
-   
- bar
.
<ul>
<li>foo</li>
<li></li>
<li>bar</li>
</ul>
````````````````````````````````


Here is an empty ordered list item:

```````````````````````````````` example
1. foo
2.
3. bar
.
<ol>
<li>foo</li>
<li></li>
<li>bar</li>
</ol>
````````````````````````````````


A list may start or end with an empty list item:

```````````````````````````````` example
*
.
<ul>
<li></li>
</ul>
````````````````````````````````

However, an empty list item cannot interrupt a paragraph:

```````````````````````````````` example
foo
*

foo
1.
.
<p>foo
*</p>
<p>foo
1.</p>
````````````````````````````````


4.  **Indentation.** If a sequence of lines *Ls* constitutes a list item
    according to rule #1, #2, or #3, then the result of indenting each line
    of *Ls* by 1-3 spaces (the same for each line) also constitutes a
    list item with the same contents and attributes. If a line is
    empty, then it need not be indented.

Indented one space:

```````````````````````````````` example
 1.  A paragraph
     with two lines.

         indented code

     > A block quote.
.
<ol>
<li>
<p>A paragraph
with two lines.</p>
<pre><code>indented code
</code></pre>
<blockquote>
<p>A block quote.</p>
</blockquote>
</li>
</ol>
````````````````````````````````


Indented two spaces:

```````````````````````````````` example
  1.  A paragraph
      with two lines.

          indented code

      > A block quote.
.
<ol>
<li>
<p>A paragraph
with two lines.</p>
<pre><code>indented code
</code></pre>
<blockquote>
<p>A block quote.</p>
</blockquote>
</li>
</ol>
````````````````````````````````


Indented three spaces:

```````````````````````````````` example
   1.  A paragraph
       with two lines.

           indented code

       > A block quote.
.
<ol>
<li>
<p>A paragraph
with two lines.</p>
<pre><code>indented code
</code></pre>
<blockquote>
<p>A block quote.</p>
</blockquote>
</li>
</ol>
````````````````````````````````


Four spaces indent gives a code block:

```````````````````````````````` example
    1.  A paragraph
        with two lines.

            indented code

        > A block quote.
.
<pre><code>1.  A paragraph
    with two lines.

        indented code

    &gt; A block quote.
</code></pre>
````````````````````````````````



5.  **Laziness.** If a string of lines *Ls* constitute a [list
    item](#list-items) with contents *Bs*, then the result of deleting
    some or all of the indentation from one or more lines in which the
    next [non-whitespace character] after the indentation is
    [paragraph continuation text] is a
    list item with the same contents and attributes. The unindented
    lines are called
    [lazy continuation line](@)s.

Here is an example with [lazy continuation lines]:

```````````````````````````````` example
  1.  A paragraph
with two lines.

          indented code

      > A block quote.
.
<ol>
<li>
<p>A paragraph
with two lines.</p>
<pre><code>indented code
</code></pre>
<blockquote>
<p>A block quote.</p>
</blockquote>
</li>
</ol>
````````````````````````````````


Indentation can be partially deleted:

```````````````````````````````` example
  1.  A paragraph
    with two lines.
.
<ol>
<li>A paragraph
with two lines.</li>
</ol>
````````````````````````````````


These examples show how laziness can work in nested structures:

```````````````````````````````` example
> 1. > Blockquote
continued here.
.
<blockquote>
<ol>
<li>
<blockquote>
<p>Blockquote
continued here.</p>
</blockquote>
</li>
</ol>
</blockquote>
````````````````````````````````


```````````````````````````````` example
> 1. > Blockquote
> continued here.
.
<blockquote>
<ol>
<li>
<blockquote>
<p>Blockquote
continued here.</p>
</blockquote>
</li>
</ol>
</blockquote>
````````````````````````````````



6.  **That's all.** Nothing that is not counted as a list item by rules
    #1--5 counts as a [list item](#list-items).

The rules for sublists follow from the general rules
[above][List items]. A sublist must be indented the same number
of spaces a paragraph would need to be in order to be included
in the list item.

So, in this case we need two spaces indent:

```````````````````````````````` example
- foo
  - bar
    - baz
      - boo
.
<ul>
<li>foo
<ul>
<li>bar
<ul>
<li>baz
<ul>
<li>boo</li>
</ul>
</li>
</ul>
</li>
</ul>
</li>
</ul>
````````````````````````````````


One is not enough:

```````````````````````````````` example
- foo
 - bar
  - baz
   - boo
.
<ul>
<li>foo</li>
<li>bar</li>
<li>baz</li>
<li>boo</li>
</ul>
````````````````````````````````


Here we need four, because the list marker is wider:

```````````````````````````````` example
10) foo
    - bar
.
<ol start="10">
<li>foo
<ul>
<li>bar</li>
</ul>
</li>
</ol>
````````````````````````````````


Three is not enough:

```````````````````````````````` example
10) foo
   - bar
.
<ol start="10">
<li>foo</li>
</ol>
<ul>
<li>bar</li>
</ul>
````````````````````````````````


A list may be the first block in a list item:

```````````````````````````````` example
- - foo
.
<ul>
<li>
<ul>
<li>foo</li>
</ul>
</li>
</ul>
````````````````````````````````


```````````````````````````````` example
1. - 2. foo
.
<ol>
<li>
<ul>
<li>
<ol start="2">
<li>foo</li>
</ol>
</li>
</ul>
</li>
</ol>
````````````````````````````````


A list item can contain a heading:

```````````````````````````````` example
- # Foo
- Bar
  ---
  baz
.
<ul>
<li>
<h1>Foo</h1>
</li>
<li>
<h2>Bar</h2>
baz</li>
</ul>
````````````````````````````````


### Motivation

John Gruber's Markdown spec says the following about list items:

1. "List markers typically start at the left margin, but may be indented
   by up to three spaces. List markers must be followed by one or more
   spaces or a tab."

2. "To make lists look nice, you can wrap items with hanging indents....
   But if you don't want to, you don't have to."

3. "List items may consist of multiple paragraphs. Each subsequent
   paragraph in a list item must be indented by either 4 spaces or one
   tab."

4. "It looks nice if you indent every line of the subsequent paragraphs,
   but here again, Markdown will allow you to be lazy."

5. "To put a blockquote within a list item, the blockquote's `>`
   delimiters need to be indented."

6. "To put a code block within a list item, the code block needs to be
   indented twice -- 8 spaces or two tabs."

These rules specify that a paragraph under a list item must be indented
four spaces (presumably, from the left margin, rather than the start of
the list marker, but this is not said), and that code under a list item
must be indented eight spaces instead of the usual four. They also say
that a block quote must be indented, but not by how much; however, the
example given has four spaces indentation. Although nothing is said
about other kinds of block-level content, it is certainly reasonable to
infer that *all* block elements under a list item, including other
lists, must be indented four spaces. This principle has been called the
*four-space rule*.

The four-space rule is clear and principled, and if the reference
implementation `Markdown.pl` had followed it, it probably would have
become the standard. However, `Markdown.pl` allowed paragraphs and
sublists to start with only two spaces indentation, at least on the
outer level. Worse, its behavior was inconsistent: a sublist of an
outer-level list needed two spaces indentation, but a sublist of this
sublist needed three spaces.
It is not surprising, then, that different
implementations of Markdown have developed very different rules for
determining what comes under a list item. (Pandoc and python-Markdown,
for example, stuck with Gruber's syntax description and the four-space
rule, while discount, redcarpet, marked, PHP Markdown, and others
followed `Markdown.pl`'s behavior more closely.)

Unfortunately, given the divergences between implementations, there
is no way to give a spec for list items that will be guaranteed not
to break any existing documents. However, the spec given here should
correctly handle lists formatted with either the four-space rule or
the more forgiving `Markdown.pl` behavior, provided they are laid out
in a way that is natural for a human to read.

The strategy here is to let the width and indentation of the list marker
determine the indentation necessary for blocks to fall under the list
item, rather than having a fixed and arbitrary number. The writer can
think of the body of the list item as a unit which gets indented to the
right enough to fit the list marker (and any indentation on the list
marker). (The laziness rule, #5, then allows continuation lines to be
unindented if needed.)

This rule is superior, we claim, to any rule requiring a fixed level of
indentation from the margin. The four-space rule is clear but
unnatural. It is quite unintuitive that

``` markdown
- foo

  bar

  - baz
```

should be parsed as two lists with an intervening paragraph,

``` html
<ul>
<li>foo</li>
</ul>
<p>bar</p>
<ul>
<li>baz</li>
</ul>
```

as the four-space rule demands, rather than a single list,

``` html
<ul>
<li>
<p>foo</p>
<p>bar</p>
<ul>
<li>baz</li>
</ul>
</li>
</ul>
```

The choice of four spaces is arbitrary.
It can be learned, but it is
not likely to be guessed, and it trips up beginners regularly.

Would it help to adopt a two-space rule? The problem is that such
a rule, together with the rule allowing 1--3 spaces indentation of the
initial list marker, allows text that is indented *less than* the
original list marker to be included in the list item. For example,
`Markdown.pl` parses

``` markdown
   - one

  two
```

as a single list item, with `two` a continuation paragraph:

``` html
<ul>
<li>
<p>one</p>
<p>two</p>
</li>
</ul>
```

and similarly

``` markdown
>   - one
>
>  two
```

as

``` html
<blockquote>
<ul>
<li>
<p>one</p>
<p>two</p>
</li>
</ul>
</blockquote>
```

This is extremely unintuitive.

Rather than requiring a fixed indent from the margin, we could require
a fixed indent (say, two spaces, or even one space) from the list marker (which
may itself be indented). This proposal would remove the last anomaly
discussed. Unlike the spec presented above, it would count the following
as a list item with a subparagraph, even though the paragraph `bar`
is not indented as far as the first paragraph `foo`:

``` markdown
 10. foo

   bar
```

Arguably this text does read like a list item with `bar` as a subparagraph,
which may count in favor of the proposal. However, on this proposal indented
code would have to be indented six spaces after the list marker. And this
would break a lot of existing Markdown, which has the pattern:

``` markdown
1.  foo

        indented code
```

where the code is indented eight spaces. The spec above, by contrast, will
parse this text as expected, since the code block's indentation is measured
from the beginning of `foo`.
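
The strategy of measuring indentation from the start of the item's content can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not part of the spec: the helper name `content_offset` and its regular expression are hypothetical, tabs are ignored, and lines that could also be thematic breaks (such as `- - -`) are not handled.

```python
import re

# Hypothetical helper for illustration; not defined by the spec.
# Groups: leading indent (0-3 spaces), list marker, spaces after it, rest.
_MARKER = re.compile(r"^( {0,3})([-+*]|\d{1,9}[.)])( *)(.*)$")


def content_offset(first_line: str):
    """Return the column where the list item's content begins, or None
    if the line does not start with a valid list marker."""
    m = _MARKER.match(first_line)
    if m is None:
        return None
    indent, marker, spaces, rest = m.groups()
    if rest and not spaces:
        # At least one space is needed between marker and content ("-one").
        return None
    if not rest or len(spaces) >= 5:
        # Item starting with a blank line (rule #3) or with indented
        # code (rule #2): content begins one space after the marker.
        return len(indent) + len(marker) + 1
    # Ordinary case (rule #1): marker width plus the 1-4 spaces after it.
    return len(indent) + len(marker) + len(spaces)
```

For `1.  foo` the offset is 4, so code indented eight spaces sits four columns past the content and is read as an indented code block inside the item; for `  10.  foo` the offset is 7, which is why the code in that example needs 11 spaces.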

The one case that needs special treatment is a list item that *starts*
with indented code. How much indentation is required in that case, since
we don't have a "first paragraph" to measure from? Rule #2 simply stipulates
that in such cases, we require one space indentation from the list marker
(and then the normal four spaces for the indented code). This will match the
four-space rule in cases where the list marker plus its initial indentation
takes four spaces (a common case), but diverge in other cases.

<div class="extension">

## Task list items (extension)

GFM enables the `tasklist` extension, where an additional processing step is
performed on [list items].

A [task list item](@) is a [list item][list items] where the first block in it
is a paragraph which begins with a [task list item marker] and at least one
whitespace character before any other content.

A [task list item marker](@) consists of an optional number of spaces, a left
bracket (`[`), either a whitespace character or the letter `x` in either
lowercase or uppercase, and then a right bracket (`]`).

When rendered, the [task list item marker] is replaced with a semantic checkbox element;
in an HTML output, this would be an `<input type="checkbox">` element.

If the character between the brackets is a whitespace character, the checkbox
is unchecked. Otherwise, the checkbox is checked.

This spec does not define how the checkbox elements are interacted with: in practice,
implementors are free to render the checkboxes as disabled or immutable elements,
or they may handle interactions (i.e. checking, unchecking) dynamically in
the final rendered document.

```````````````````````````````` example disabled
- [ ] foo
- [x] bar
.
<ul>
<li><input disabled="" type="checkbox"> foo</li>
<li><input checked="" disabled="" type="checkbox"> bar</li>
</ul>
````````````````````````````````

Task lists can be arbitrarily nested:

```````````````````````````````` example disabled
- [x] foo
  - [ ] bar
  - [x] baz
- [ ] bim
.
<ul>
<li><input checked="" disabled="" type="checkbox"> foo
<ul>
<li><input disabled="" type="checkbox"> bar</li>
<li><input checked="" disabled="" type="checkbox"> baz</li>
</ul>
</li>
<li><input disabled="" type="checkbox"> bim</li>
</ul>
````````````````````````````````

</div>

## Lists

A [list](@) is a sequence of one or more
list items [of the same type]. The list items
may be separated by any number of blank lines.

Two list items are [of the same type](@)
if they begin with a [list marker] of the same type.
Two list markers are of the
same type if (a) they are bullet list markers using the same character
(`-`, `+`, or `*`) or (b) they are ordered list numbers with the same
delimiter (either `.` or `)`).

A list is an [ordered list](@)
if its constituent list items begin with
[ordered list markers], and a
[bullet list](@) if its constituent list
items begin with [bullet list markers].

The [start number](@)
of an [ordered list] is determined by the list number of
its initial list item. The numbers of subsequent list items are
disregarded.

A list is [loose](@) if any of its constituent
list items are separated by blank lines, or if any of its constituent
list items directly contain two block-level elements with a blank line
between them. Otherwise a list is [tight](@).
(The difference in HTML output is that paragraphs in a loose list are
wrapped in `<p>` tags, while paragraphs in a tight list are not.)
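
The definitions of marker type and start number above can be illustrated with a small Python sketch. It is a simplification under stated assumptions, not a parser: it works on bare marker strings that have already been extracted from the list items, and the function names (`marker_type`, `same_type`, `split_into_lists`, `start_number`) are hypothetical.

```python
import re
from itertools import groupby

# Hypothetical helpers for illustration; not defined by the spec.
_BULLET = re.compile(r"^[-+*]$")
_ORDERED = re.compile(r"^(\d{1,9})([.)])$")


def marker_type(marker: str):
    """Classify a list marker as ('bullet', char) or ('ordered', delimiter)."""
    if _BULLET.match(marker):
        return ("bullet", marker)
    m = _ORDERED.match(marker)
    if m:
        return ("ordered", m.group(2))
    raise ValueError(f"not a list marker: {marker!r}")


def same_type(a: str, b: str) -> bool:
    """Two markers are of the same type if they are bullets using the same
    character, or ordered markers using the same delimiter."""
    return marker_type(a) == marker_type(b)


def split_into_lists(markers):
    """Group consecutive markers of the same type; a change of bullet
    character or ordered delimiter starts a new list."""
    return [list(group) for _, group in groupby(markers, key=marker_type)]


def start_number(first_marker: str) -> int:
    """Start number of an ordered list, taken from its first marker only."""
    m = _ORDERED.match(first_marker)
    if m is None:
        raise ValueError(f"not an ordered list marker: {first_marker!r}")
    return int(m.group(1))
```

For instance, `split_into_lists(["-", "-", "+"])` yields two groups, and `start_number("003.")` is 3, since leading zeros are allowed and the numbers of later items are disregarded.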

Changing the bullet or ordered list delimiter starts a new list:

```````````````````````````````` example
- foo
- bar
+ baz
.
<ul>
<li>foo</li>
<li>bar</li>
</ul>
<ul>
<li>baz</li>
</ul>
````````````````````````````````


```````````````````````````````` example
1. foo
2. bar
3) baz
.
<ol>
<li>foo</li>
<li>bar</li>
</ol>
<ol start="3">
<li>baz</li>
</ol>
````````````````````````````````


In CommonMark, a list can interrupt a paragraph. That is,
no blank line is needed to separate a paragraph from a following
list:

```````````````````````````````` example
Foo
- bar
- baz
.
<p>Foo</p>
<ul>
<li>bar</li>
<li>baz</li>
</ul>
````````````````````````````````

`Markdown.pl` does not allow this, through fear of triggering a list
via a numeral in a hard-wrapped line:

``` markdown
The number of windows in my house is
14. The number of doors is 6.
```

Oddly, though, `Markdown.pl` *does* allow a blockquote to
interrupt a paragraph, even though the same considerations might
apply.

In CommonMark, we do allow lists to interrupt paragraphs, for
two reasons. First, it is natural and not uncommon for people
to start lists without blank lines:

``` markdown
I need to buy
- new shoes
- a coat
- a plane ticket
```

Second, we are attracted to a

> [principle of uniformity](@):
> if a chunk of text has a certain
> meaning, it will continue to have the same meaning when put into a
> container block (such as a list item or blockquote).

(Indeed, the spec for [list items] and [block quotes] presupposes
this principle.)
This principle implies that if

``` markdown
  * I need to buy
    - new shoes
    - a coat
    - a plane ticket
```

is a list item containing a paragraph followed by a nested sublist,
as all Markdown implementations agree it is (though the paragraph
may be rendered without `<p>` tags, since the list is "tight"),
then

``` markdown
I need to buy
- new shoes
- a coat
- a plane ticket
```

by itself should be a paragraph followed by a nested sublist.

Since it is well-established Markdown practice to allow lists to
interrupt paragraphs inside list items, the [principle of
uniformity] requires us to allow this outside list items as
well. ([reStructuredText](http://docutils.sourceforge.net/rst.html)
takes a different approach, requiring blank lines before lists
even inside other list items.)

In order to solve the problem of unwanted lists in paragraphs with
hard-wrapped numerals, we allow only lists starting with `1` to
interrupt paragraphs. Thus,

```````````````````````````````` example
The number of windows in my house is
14. The number of doors is 6.
.
<p>The number of windows in my house is
14. The number of doors is 6.</p>
````````````````````````````````

We may still get an unintended result in cases like

```````````````````````````````` example
The number of windows in my house is
1. The number of doors is 6.
.
<p>The number of windows in my house is</p>
<ol>
<li>The number of doors is 6.</li>
</ol>
````````````````````````````````

but this rule should prevent most spurious list captures.

There can be any number of blank lines between items:

```````````````````````````````` example
- foo

- bar


- baz
.
<ul>
<li>
<p>foo</p>
</li>
<li>
<p>bar</p>
</li>
<li>
<p>baz</p>
</li>
</ul>
````````````````````````````````

```````````````````````````````` example
- foo
  - bar
    - baz


      bim
.
<ul>
<li>foo
<ul>
<li>bar
<ul>
<li>
<p>baz</p>
<p>bim</p>
</li>
</ul>
</li>
</ul>
</li>
</ul>
````````````````````````````````


To separate consecutive lists of the same type, or to separate a
list from an indented code block that would otherwise be parsed
as a subparagraph of the final list item, you can insert a blank HTML
comment:

```````````````````````````````` example
- foo
- bar

<!-- -->

- baz
- bim
.
<ul>
<li>foo</li>
<li>bar</li>
</ul>
<!-- -->
<ul>
<li>baz</li>
<li>bim</li>
</ul>
````````````````````````````````


```````````````````````````````` example
-   foo

    notcode

-   foo

<!-- -->

    code
.
<ul>
<li>
<p>foo</p>
<p>notcode</p>
</li>
<li>
<p>foo</p>
</li>
</ul>
<!-- -->
<pre><code>code
</code></pre>
````````````````````````````````


List items need not be indented to the same level. The following
list items will be treated as items at the same list level,
since none is indented enough to belong to the previous list
item:

```````````````````````````````` example
- a
 - b
  - c
   - d
  - e
 - f
- g
.
<ul>
<li>a</li>
<li>b</li>
<li>c</li>
<li>d</li>
<li>e</li>
<li>f</li>
<li>g</li>
</ul>
````````````````````````````````


```````````````````````````````` example
1. a

  2. b

   3. c
.
<ol>
<li>
<p>a</p>
</li>
<li>
<p>b</p>
</li>
<li>
<p>c</p>
</li>
</ol>
````````````````````````````````

Note, however, that list items may not be indented more than
three spaces. Here `- e` is treated as a paragraph continuation
line, because it is indented more than three spaces:

```````````````````````````````` example
- a
 - b
  - c
   - d
    - e
.
<ul>
<li>a</li>
<li>b</li>
<li>c</li>
<li>d
- e</li>
</ul>
````````````````````````````````

And here, `3. c` is treated as an indented code block,
because it is indented four spaces and preceded by a
blank line.

```````````````````````````````` example
1. a

  2. b

    3. c
.
<ol>
<li>
<p>a</p>
</li>
<li>
<p>b</p>
</li>
</ol>
<pre><code>3. c
</code></pre>
````````````````````````````````


This is a loose list, because there is a blank line between
two of the list items:

```````````````````````````````` example
- a
- b

- c
.
<ul>
<li>
<p>a</p>
</li>
<li>
<p>b</p>
</li>
<li>
<p>c</p>
</li>
</ul>
````````````````````````````````


So is this, with an empty second item:

```````````````````````````````` example
* a
*

* c
.
<ul>
<li>
<p>a</p>
</li>
<li></li>
<li>
<p>c</p>
</li>
</ul>
````````````````````````````````


These are loose lists, even though there is no space between the items,
because one of the items directly contains two block-level elements
with a blank line between them:

```````````````````````````````` example
- a
- b

  c
- d
.
<ul>
<li>
<p>a</p>
</li>
<li>
<p>b</p>
<p>c</p>
</li>
<li>
<p>d</p>
</li>
</ul>
````````````````````````````````


```````````````````````````````` example
- a
- b

  [ref]: /url
- d
.
<ul>
<li>
<p>a</p>
</li>
<li>
<p>b</p>
</li>
<li>
<p>d</p>
</li>
</ul>
````````````````````````````````


This is a tight list, because the blank lines are in a code block:

```````````````````````````````` example
- a
- ```
  b


  ```
- c
.
<ul>
<li>a</li>
<li>
<pre><code>b


</code></pre>
</li>
<li>c</li>
</ul>
````````````````````````````````


This is a tight list, because the blank line is between two
paragraphs of a sublist. So the sublist is loose while
the outer list is tight:

```````````````````````````````` example
- a
  - b

    c
- d
.
<ul>
<li>a
<ul>
<li>
<p>b</p>
<p>c</p>
</li>
</ul>
</li>
<li>d</li>
</ul>
````````````````````````````````


This is a tight list, because the blank line is inside the
block quote:

```````````````````````````````` example
* a
  > b
  >
* c
.
<ul>
<li>a
<blockquote>
<p>b</p>
</blockquote>
</li>
<li>c</li>
</ul>
````````````````````````````````


This list is tight, because the consecutive block elements
are not separated by blank lines:

```````````````````````````````` example
- a
  > b
  ```
  c
  ```
- d
.
<ul>
<li>a
<blockquote>
<p>b</p>
</blockquote>
<pre><code>c
</code></pre>
</li>
<li>d</li>
</ul>
````````````````````````````````


A single-paragraph list is tight:

```````````````````````````````` example
- a
.
5683<ul> 5684<li>a</li> 5685</ul> 5686```````````````````````````````` 5687 5688 5689```````````````````````````````` example 5690- a 5691 - b 5692. 5693<ul> 5694<li>a 5695<ul> 5696<li>b</li> 5697</ul> 5698</li> 5699</ul> 5700```````````````````````````````` 5701 5702 5703This list is loose, because of the blank line between the 5704two block elements in the list item: 5705 5706```````````````````````````````` example 57071. ``` 5708 foo 5709 ``` 5710 5711 bar 5712. 5713<ol> 5714<li> 5715<pre><code>foo 5716</code></pre> 5717<p>bar</p> 5718</li> 5719</ol> 5720```````````````````````````````` 5721 5722 5723Here the outer list is loose, the inner list tight: 5724 5725```````````````````````````````` example 5726* foo 5727 * bar 5728 5729 baz 5730. 5731<ul> 5732<li> 5733<p>foo</p> 5734<ul> 5735<li>bar</li> 5736</ul> 5737<p>baz</p> 5738</li> 5739</ul> 5740```````````````````````````````` 5741 5742 5743```````````````````````````````` example 5744- a 5745 - b 5746 - c 5747 5748- d 5749 - e 5750 - f 5751. 5752<ul> 5753<li> 5754<p>a</p> 5755<ul> 5756<li>b</li> 5757<li>c</li> 5758</ul> 5759</li> 5760<li> 5761<p>d</p> 5762<ul> 5763<li>e</li> 5764<li>f</li> 5765</ul> 5766</li> 5767</ul> 5768```````````````````````````````` 5769 5770 5771# Inlines 5772 5773Inlines are parsed sequentially from the beginning of the character 5774stream to the end (left to right, in left-to-right languages). 5775Thus, for example, in 5776 5777```````````````````````````````` example 5778`hi`lo` 5779. 5780<p><code>hi</code>lo`</p> 5781```````````````````````````````` 5782 5783`hi` is parsed as code, leaving the backtick at the end as a literal 5784backtick. 5785 5786 5787## Backslash escapes 5788 5789Any ASCII punctuation character may be backslash-escaped: 5790 5791```````````````````````````````` example 5792\!\"\#\$\%\&\'\(\)\*\+\,\-\.\/\:\;\<\=\>\?\@\[\\\]\^\_\`\{\|\}\~ 5793. 
5794<p>!&quot;#$%&amp;'()*+,-./:;&lt;=&gt;?@[\]^_`{|}~</p> 5795```````````````````````````````` 5796 5797 5798Backslashes before other characters are treated as literal 5799backslashes: 5800 5801```````````````````````````````` example 5802\→\A\a\ \3\φ\« 5803. 5804<p>\→\A\a\ \3\φ\«</p> 5805```````````````````````````````` 5806 5807 5808Escaped characters are treated as regular characters and do 5809not have their usual Markdown meanings: 5810 5811```````````````````````````````` example 5812\*not emphasized* 5813\<br/> not a tag 5814\[not a link](/foo) 5815\`not code` 58161\. not a list 5817\* not a list 5818\# not a heading 5819\[foo]: /url "not a reference" 5820\&ouml; not a character entity 5821. 5822<p>*not emphasized* 5823&lt;br/&gt; not a tag 5824[not a link](/foo) 5825`not code` 58261. not a list 5827* not a list 5828# not a heading 5829[foo]: /url &quot;not a reference&quot; 5830&amp;ouml; not a character entity</p> 5831```````````````````````````````` 5832 5833 5834If a backslash is itself escaped, the following character is not: 5835 5836```````````````````````````````` example 5837\\*emphasis* 5838. 5839<p>\<em>emphasis</em></p> 5840```````````````````````````````` 5841 5842 5843A backslash at the end of the line is a [hard line break]: 5844 5845```````````````````````````````` example 5846foo\ 5847bar 5848. 5849<p>foo<br /> 5850bar</p> 5851```````````````````````````````` 5852 5853 5854Backslash escapes do not work in code blocks, code spans, autolinks, or 5855raw HTML: 5856 5857```````````````````````````````` example 5858`` \[\` `` 5859. 5860<p><code>\[\`</code></p> 5861```````````````````````````````` 5862 5863 5864```````````````````````````````` example 5865 \[\] 5866. 5867<pre><code>\[\] 5868</code></pre> 5869```````````````````````````````` 5870 5871 5872```````````````````````````````` example 5873~~~ 5874\[\] 5875~~~ 5876.
5877<pre><code>\[\] 5878</code></pre> 5879```````````````````````````````` 5880 5881 5882```````````````````````````````` example 5883<http://example.com?find=\*> 5884. 5885<p><a href="http://example.com?find=%5C*">http://example.com?find=\*</a></p> 5886```````````````````````````````` 5887 5888 5889```````````````````````````````` example 5890<a href="/bar\/)"> 5891. 5892<a href="/bar\/)"> 5893```````````````````````````````` 5894 5895 5896But they work in all other contexts, including URLs and link titles, 5897link references, and [info strings] in [fenced code blocks]: 5898 5899```````````````````````````````` example 5900[foo](/bar\* "ti\*tle") 5901. 5902<p><a href="/bar*" title="ti*tle">foo</a></p> 5903```````````````````````````````` 5904 5905 5906```````````````````````````````` example 5907[foo] 5908 5909[foo]: /bar\* "ti\*tle" 5910. 5911<p><a href="/bar*" title="ti*tle">foo</a></p> 5912```````````````````````````````` 5913 5914 5915```````````````````````````````` example 5916``` foo\+bar 5917foo 5918``` 5919. 5920<pre><code class="language-foo+bar">foo 5921</code></pre> 5922```````````````````````````````` 5923 5924 5925 5926## Entity and numeric character references 5927 5928Valid HTML entity references and numeric character references 5929can be used in place of the corresponding Unicode character, 5930with the following exceptions: 5931 5932- Entity and character references are not recognized in code 5933 blocks and code spans. 5934 5935- Entity and character references cannot stand in place of 5936 special characters that define structural elements in 5937 CommonMark. For example, although `&#42;` can be used 5938 in place of a literal `*` character, `&#42;` cannot replace 5939 `*` in emphasis delimiters, bullet list markers, or thematic 5940 breaks. 5941 5942Conforming CommonMark parsers need not store information about 5943whether a particular character was represented in the source 5944using a Unicode character or an entity reference.
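
The text-level substitution described in this section can be sketched in Python. This is only an illustration, not part of the specification or of any reference implementation: the names `resolve_references` and `_from_code_point` are hypothetical, and the standard library table `html.entities.html5` is assumed to be an adequate stand-in for the `entities.json` list. The sketch deliberately ignores context, so it does not model the two exceptions listed above (code blocks and code spans, and structural characters).

```python
import re
from html.entities import html5  # keys include the trailing ";", e.g. "amp;"

# A reference is "&" + name + ";", "&#" + 1-7 decimal digits + ";",
# or "&#x"/"&#X" + 1-6 hexadecimal digits + ";". The semicolon is mandatory.
_REFERENCE = re.compile(
    r"&(?:#[Xx]([0-9A-Fa-f]{1,6})|#([0-9]{1,7})|([A-Za-z][A-Za-z0-9]*));"
)

REPLACEMENT_CHARACTER = "\ufffd"


def _from_code_point(code_point: int) -> str:
    """Map a numeric reference to a character, or to U+FFFD if it is unusable."""
    is_surrogate = 0xD800 <= code_point <= 0xDFFF
    if code_point == 0 or code_point > 0x10FFFF or is_surrogate:
        return REPLACEMENT_CHARACTER
    return chr(code_point)


def resolve_references(text: str) -> str:
    """Replace valid entity and numeric character references in plain text."""

    def replace(match: re.Match) -> str:
        hex_digits, dec_digits, name = match.groups()
        if hex_digits is not None:
            return _from_code_point(int(hex_digits, 16))
        if dec_digits is not None:
            return _from_code_point(int(dec_digits, 10))
        # Unknown names are left untouched, as literal text.
        return html5.get(name + ";", match.group(0))

    return _REFERENCE.sub(replace, text)
```

Looking names up with the semicolon appended means that legacy forms such as `&copy` (no semicolon) are left as literal text, even though the table also lists them.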
5945 5946[Entity references](@) consist of `&` + any of the valid 5947HTML5 entity names + `;`. The 5948document <https://html.spec.whatwg.org/multipage/entities.json> 5949is used as an authoritative source for the valid entity 5950references and their corresponding code points. 5951 5952```````````````````````````````` example 5953&nbsp; &amp; &copy; &AElig; &Dcaron; 5954&frac34; &HilbertSpace; &DifferentialD; 5955&ClockwiseContourIntegral; &ngE; 5956. 5957<p>  &amp; © Æ Ď 5958¾ ℋ ⅆ 5959∲ ≧̸</p> 5960```````````````````````````````` 5961 5962 5963[Decimal numeric character 5964references](@) 5965consist of `&#` + a string of 1--7 arabic digits + `;`. A 5966numeric character reference is parsed as the corresponding 5967Unicode character. Invalid Unicode code points will be replaced by 5968the REPLACEMENT CHARACTER (`U+FFFD`). For security reasons, 5969the code point `U+0000` will also be replaced by `U+FFFD`. 5970 5971```````````````````````````````` example 5972&#35; &#1234; &#992; &#0; 5973. 5974<p># Ӓ Ϡ �</p> 5975```````````````````````````````` 5976 5977 5978[Hexadecimal numeric character 5979references](@) consist of `&#` + 5980either `X` or `x` + a string of 1--6 hexadecimal digits + `;`. 5981They too are parsed as the corresponding Unicode character (this 5982time specified with a hexadecimal numeral instead of decimal). 5983 5984```````````````````````````````` example 5985&#X22; &#XD06; &#xcab; 5986. 5987<p>&quot; ആ ಫ</p> 5988```````````````````````````````` 5989 5990 5991Here are some nonentities: 5992 5993```````````````````````````````` example 5994&nbsp &x; &#; &#x; 5995&#987654321; 5996&#abcdef0; 5997&ThisIsNotDefined; &hi?; 5998. 
5999<p>&amp;nbsp &amp;x; &amp;#; &amp;#x; 6000&amp;#987654321; 6001&amp;#abcdef0; 6002&amp;ThisIsNotDefined; &amp;hi?;</p> 6003```````````````````````````````` 6004 6005 6006Although HTML5 does accept some entity references 6007without a trailing semicolon (such as `&copy`), these are not 6008recognized here, because recognizing them would make the grammar too ambiguous: 6009 6010```````````````````````````````` example 6011&copy 6012.
6013<p>&amp;copy</p> 6014```````````````````````````````` 6015 6016 6017Strings that are not on the list of HTML5 named entities are not 6018recognized as entity references either: 6019 6020```````````````````````````````` example 6021&MadeUpEntity; 6022. 6023<p>&amp;MadeUpEntity;</p> 6024```````````````````````````````` 6025 6026 6027Entity and numeric character references are recognized in any 6028context besides code spans or code blocks, including 6029URLs, [link titles], and [fenced code block][] [info strings]: 6030 6031```````````````````````````````` example 6032<a href="&ouml;&ouml;.html"> 6033. 6034<a href="&ouml;&ouml;.html"> 6035```````````````````````````````` 6036 6037 6038```````````````````````````````` example 6039[foo](/f&ouml;&ouml; "f&ouml;&ouml;") 6040. 6041<p><a href="/f%C3%B6%C3%B6" title="föö">foo</a></p> 6042```````````````````````````````` 6043 6044 6045```````````````````````````````` example 6046[foo] 6047 6048[foo]: /f&ouml;&ouml; "f&ouml;&ouml;" 6049. 6050<p><a href="/f%C3%B6%C3%B6" title="föö">foo</a></p> 6051```````````````````````````````` 6052 6053 6054```````````````````````````````` example 6055``` f&ouml;&ouml; 6056foo 6057``` 6058. 6059<pre><code class="language-föö">foo 6060</code></pre> 6061```````````````````````````````` 6062 6063 6064Entity and numeric character references are treated as literal 6065text in code spans and code blocks: 6066 6067```````````````````````````````` example 6068`f&ouml;&ouml;` 6069. 6070<p><code>f&amp;ouml;&amp;ouml;</code></p> 6071```````````````````````````````` 6072 6073 6074```````````````````````````````` example 6075 f&ouml;f&ouml; 6076. 6077<pre><code>f&amp;ouml;f&amp;ouml; 6078</code></pre> 6079```````````````````````````````` 6080 6081 6082Entity and numeric character references cannot be used 6083in place of symbols indicating structure in CommonMark 6084documents. 6085 6086```````````````````````````````` example 6087&#42;foo&#42; 6088*foo* 6089. 
6090<p>*foo* 6091<em>foo</em></p> 6092```````````````````````````````` 6093 6094```````````````````````````````` example 6095&#42; foo 6096 6097* foo 6098.
6099<p>* foo</p> 6100<ul> 6101<li>foo</li> 6102</ul> 6103```````````````````````````````` 6104 6105```````````````````````````````` example 6106foo&#10;&#10;bar 6107. 6108<p>foo 6109 6110bar</p> 6111```````````````````````````````` 6112 6113```````````````````````````````` example 6114&#9;foo 6115. 6116<p>→foo</p> 6117```````````````````````````````` 6118 6119 6120```````````````````````````````` example 6121[a](url &quot;tit&quot;) 6122. 6123<p>[a](url &quot;tit&quot;)</p> 6124```````````````````````````````` 6125 6126 6127## Code spans 6128 6129A [backtick string](@) 6130is a string of one or more backtick characters (`` ` ``) that is neither 6131preceded nor followed by a backtick. 6132 6133A [code span](@) begins with a backtick string and ends with 6134a backtick string of equal length. The contents of the code span are 6135the characters between the two backtick strings, normalized in the 6136following ways: 6137 6138- First, [line endings] are converted to [spaces]. 6139- If the resulting string both begins *and* ends with a [space] 6140 character, but does not consist entirely of [space] 6141 characters, a single [space] character is removed from the 6142 front and back. This allows you to include code that begins 6143 or ends with backtick characters, which must be separated by 6144 whitespace from the opening or closing backtick strings. 6145 6146This is a simple code span: 6147 6148```````````````````````````````` example 6149`foo` 6150. 6151<p><code>foo</code></p> 6152```````````````````````````````` 6153 6154 6155Here two backticks are used, because the code contains a backtick. 6156This example also illustrates stripping of a single leading and 6157trailing space: 6158 6159```````````````````````````````` example 6160`` foo ` bar `` 6161. 6162<p><code>foo ` bar</code></p> 6163```````````````````````````````` 6164 6165 6166This example shows the motivation for stripping leading and trailing 6167spaces: 6168 6169```````````````````````````````` example 6170` `` ` 6171.
6172<p><code>``</code></p> 6173```````````````````````````````` 6174 6175Note that only *one* space is stripped: 6176 6177```````````````````````````````` example 6178` `` ` 6179. 6180<p><code> `` </code></p> 6181```````````````````````````````` 6182 6183The stripping only happens if the space is on both 6184sides of the string: 6185 6186```````````````````````````````` example 6187` a` 6188. 6189<p><code> a</code></p> 6190```````````````````````````````` 6191 6192Only [spaces], and not [unicode whitespace] in general, are 6193stripped in this way: 6194 6195```````````````````````````````` example 6196` b ` 6197. 6198<p><code> b </code></p> 6199```````````````````````````````` 6200 6201No stripping occurs if the code span contains only spaces: 6202 6203```````````````````````````````` example 6204` ` 6205` ` 6206. 6207<p><code> </code> 6208<code> </code></p> 6209```````````````````````````````` 6210 6211 6212[Line endings] are treated like spaces: 6213 6214```````````````````````````````` example 6215`` 6216foo 6217bar 6218baz 6219`` 6220. 6221<p><code>foo bar baz</code></p> 6222```````````````````````````````` 6223 6224```````````````````````````````` example 6225`` 6226foo 6227`` 6228. 6229<p><code>foo </code></p> 6230```````````````````````````````` 6231 6232 6233Interior spaces are not collapsed: 6234 6235```````````````````````````````` example 6236`foo bar 6237baz` 6238. 6239<p><code>foo bar baz</code></p> 6240```````````````````````````````` 6241 6242Note that browsers will typically collapse consecutive spaces 6243when rendering `<code>` elements, so it is recommended that 6244the following CSS be used: 6245 6246 code{white-space: pre-wrap;} 6247 6248 6249Note that backslash escapes do not work in code spans. All backslashes 6250are treated literally: 6251 6252```````````````````````````````` example 6253`foo\`bar` 6254. 
6255<p><code>foo\</code>bar`</p> 6256```````````````````````````````` 6257 6258 6259Backslash escapes are never needed, because one can always choose a 6260string of *n* backtick characters as delimiters, where the code does 6261not contain any strings of exactly *n* backtick characters. 6262 6263```````````````````````````````` example 6264``foo`bar`` 6265. 6266<p><code>foo`bar</code></p> 6267```````````````````````````````` 6268 6269```````````````````````````````` example 6270` foo `` bar ` 6271. 6272<p><code>foo `` bar</code></p> 6273```````````````````````````````` 6274 6275 6276Code span backticks have higher precedence than any other inline 6277constructs except HTML tags and autolinks. Thus, for example, this is 6278not parsed as emphasized text, since the second `*` is part of a code 6279span: 6280 6281```````````````````````````````` example 6282*foo`*` 6283. 6284<p>*foo<code>*</code></p> 6285```````````````````````````````` 6286 6287 6288And this is not parsed as a link: 6289 6290```````````````````````````````` example 6291[not a `link](/foo`) 6292. 6293<p>[not a <code>link](/foo</code>)</p> 6294```````````````````````````````` 6295 6296 6297Code spans, HTML tags, and autolinks have the same precedence. 6298Thus, this is code: 6299 6300```````````````````````````````` example 6301`<a href="`">` 6302. 6303<p><code><a href="</code>">`</p> 6304```````````````````````````````` 6305 6306 6307But this is an HTML tag: 6308 6309```````````````````````````````` example 6310<a href="`">` 6311. 6312<p><a href="`">`</p> 6313```````````````````````````````` 6314 6315 6316And this is code: 6317 6318```````````````````````````````` example 6319`<http://foo.bar.`baz>` 6320. 6321<p><code><http://foo.bar.</code>baz>`</p> 6322```````````````````````````````` 6323 6324 6325But this is an autolink: 6326 6327```````````````````````````````` example 6328<http://foo.bar.`baz>` 6329. 
6330<p><a href="http://foo.bar.%60baz">http://foo.bar.`baz</a>`</p> 6331```````````````````````````````` 6332 6333 6334When a backtick string is not closed by a matching backtick string, 6335we just have literal backticks: 6336 6337```````````````````````````````` example 6338```foo`` 6339. 6340<p>```foo``</p> 6341```````````````````````````````` 6342 6343 6344```````````````````````````````` example 6345`foo 6346. 6347<p>`foo</p> 6348```````````````````````````````` 6349 6350The following case also illustrates the need for opening and 6351closing backtick strings to be equal in length: 6352 6353```````````````````````````````` example 6354`foo``bar`` 6355. 6356<p>`foo<code>bar</code></p> 6357```````````````````````````````` 6358 6359 6360## Emphasis and strong emphasis 6361 6362John Gruber's original [Markdown syntax 6363description](http://daringfireball.net/projects/markdown/syntax#em) says: 6364 6365> Markdown treats asterisks (`*`) and underscores (`_`) as indicators of 6366> emphasis. Text wrapped with one `*` or `_` will be wrapped with an HTML 6367> `<em>` tag; double `*`'s or `_`'s will be wrapped with an HTML `<strong>` 6368> tag. 6369 6370This is enough for most users, but these rules leave much undecided, 6371especially when it comes to nested emphasis. 
The original 6372`Markdown.pl` test suite makes it clear that triple `***` and 6373`___` delimiters can be used for strong emphasis, and most 6374implementations have also allowed the following patterns: 6375 6376``` markdown 6377***strong emph*** 6378***strong** in emph* 6379***emph* in strong** 6380**in strong *emph*** 6381*in emph **strong*** 6382``` 6383 6384The following patterns are less widely supported, but the intent 6385is clear and they are useful (especially in contexts like bibliography 6386entries): 6387 6388``` markdown 6389*emph *with emph* in it* 6390**strong **with strong** in it** 6391``` 6392 6393Many implementations have also restricted intraword emphasis to 6394the `*` forms, to avoid unwanted emphasis in words containing 6395internal underscores. (It is best practice to put these in code 6396spans, but users often do not.) 6397 6398``` markdown 6399internal emphasis: foo*bar*baz 6400no emphasis: foo_bar_baz 6401``` 6402 6403The rules given below capture all of these patterns, while allowing 6404for efficient parsing strategies that do not backtrack. 6405 6406First, some definitions. A [delimiter run](@) is either 6407a sequence of one or more `*` characters that is not preceded or 6408followed by a non-backslash-escaped `*` character, or a sequence 6409of one or more `_` characters that is not preceded or followed by 6410a non-backslash-escaped `_` character. 6411 6412A [left-flanking delimiter run](@) is 6413a [delimiter run] that is (1) not followed by [Unicode whitespace], 6414and either (2a) not followed by a [punctuation character], or 6415(2b) followed by a [punctuation character] and 6416preceded by [Unicode whitespace] or a [punctuation character]. 6417For purposes of this definition, the beginning and the end of 6418the line count as Unicode whitespace. 
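
The left-flanking test above can be sketched as a small Python predicate. This is an illustrative sketch under stated assumptions, not the reference implementation: the function names are hypothetical, a delimiter run is identified by its `start` and `end` indices within a single line, and the helper predicates assume that Unicode whitespace means the `Zs` category plus tab, line feed, form feed and carriage return, and that a punctuation character is ASCII punctuation or anything in the Unicode `P*` categories listed in the code.

```python
import string
import unicodedata

# Unicode general categories assumed to count as punctuation.
PUNCTUATION_CATEGORIES = {"Pc", "Pd", "Pe", "Pf", "Pi", "Po", "Ps"}


def is_unicode_whitespace(ch):
    """None stands for the beginning or end of the line, which counts as whitespace."""
    return ch is None or ch in "\t\n\f\r" or unicodedata.category(ch) == "Zs"


def is_punctuation(ch):
    """True for ASCII punctuation or a character in an assumed P* category."""
    if ch is None:
        return False
    return ch in string.punctuation or unicodedata.category(ch) in PUNCTUATION_CATEGORIES


def is_left_flanking(line, start, end):
    """Check whether the delimiter run line[start:end] is left-flanking."""
    before = line[start - 1] if start > 0 else None
    after = line[end] if end < len(line) else None
    if is_unicode_whitespace(after):        # violates condition (1)
        return False
    if not is_punctuation(after):           # condition (2a)
        return True
    # Condition (2b): followed by punctuation, so the preceding
    # character must be whitespace or punctuation.
    return is_unicode_whitespace(before) or is_punctuation(before)
```

For instance, `is_left_flanking('a*"foo"*', 1, 2)` is false under these assumptions, because the `*` is followed by punctuation but preceded by an alphanumeric character.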
6419 6420A [right-flanking delimiter run](@) is 6421a [delimiter run] that is (1) not preceded by [Unicode whitespace], 6422and either (2a) not preceded by a [punctuation character], or 6423(2b) preceded by a [punctuation character] and 6424followed by [Unicode whitespace] or a [punctuation character]. 6425For purposes of this definition, the beginning and the end of 6426the line count as Unicode whitespace. 6427 6428Here are some examples of delimiter runs. 6429 6430 - left-flanking but not right-flanking: 6431 6432 ``` 6433 ***abc 6434 _abc 6435 **"abc" 6436 _"abc" 6437 ``` 6438 6439 - right-flanking but not left-flanking: 6440 6441 ``` 6442 abc*** 6443 abc_ 6444 "abc"** 6445 "abc"_ 6446 ``` 6447 6448 - Both left and right-flanking: 6449 6450 ``` 6451 abc***def 6452 "abc"_"def" 6453 ``` 6454 6455 - Neither left nor right-flanking: 6456 6457 ``` 6458 abc *** def 6459 a _ b 6460 ``` 6461 6462(The idea of distinguishing left-flanking and right-flanking 6463delimiter runs based on the character before and the character 6464after comes from Roopesh Chander's 6465[vfmd](http://www.vfmd.org/vfmd-spec/specification/#procedure-for-identifying-emphasis-tags). 6466vfmd uses the terminology "emphasis indicator string" instead of "delimiter 6467run," and its rules for distinguishing left- and right-flanking runs 6468are a bit more complex than the ones given here.) 6469 6470The following rules define emphasis and strong emphasis: 6471 64721. A single `*` character [can open emphasis](@) 6473 iff (if and only if) it is part of a [left-flanking delimiter run]. 6474 64752. A single `_` character [can open emphasis] iff 6476 it is part of a [left-flanking delimiter run] 6477 and either (a) not part of a [right-flanking delimiter run] 6478 or (b) part of a [right-flanking delimiter run] 6479 preceded by punctuation. 6480 64813. A single `*` character [can close emphasis](@) 6482 iff it is part of a [right-flanking delimiter run]. 6483 64844. 
A single `_` character [can close emphasis] iff 6485 it is part of a [right-flanking delimiter run] 6486 and either (a) not part of a [left-flanking delimiter run] 6487 or (b) part of a [left-flanking delimiter run] 6488 followed by punctuation. 6489 64905. A double `**` [can open strong emphasis](@) 6491 iff it is part of a [left-flanking delimiter run]. 6492 64936. A double `__` [can open strong emphasis] iff 6494 it is part of a [left-flanking delimiter run] 6495 and either (a) not part of a [right-flanking delimiter run] 6496 or (b) part of a [right-flanking delimiter run] 6497 preceded by punctuation. 6498 64997. A double `**` [can close strong emphasis](@) 6500 iff it is part of a [right-flanking delimiter run]. 6501 65028. A double `__` [can close strong emphasis] iff 6503 it is part of a [right-flanking delimiter run] 6504 and either (a) not part of a [left-flanking delimiter run] 6505 or (b) part of a [left-flanking delimiter run] 6506 followed by punctuation. 6507 65089. Emphasis begins with a delimiter that [can open emphasis] and ends 6509 with a delimiter that [can close emphasis], and that uses the same 6510 character (`_` or `*`) as the opening delimiter. The 6511 opening and closing delimiters must belong to separate 6512 [delimiter runs]. If one of the delimiters can both 6513 open and close emphasis, then the sum of the lengths of the 6514 delimiter runs containing the opening and closing delimiters 6515 must not be a multiple of 3 unless both lengths are 6516 multiples of 3. 6517 651810. Strong emphasis begins with a delimiter that 6519 [can open strong emphasis] and ends with a delimiter that 6520 [can close strong emphasis], and that uses the same character 6521 (`_` or `*`) as the opening delimiter. The 6522 opening and closing delimiters must belong to separate 6523 [delimiter runs]. 
If one of the delimiters can both open 6524 and close strong emphasis, then the sum of the lengths of 6525 the delimiter runs containing the opening and closing 6526 delimiters must not be a multiple of 3 unless both lengths 6527 are multiples of 3. 6528 652911. A literal `*` character cannot occur at the beginning or end of 6530 `*`-delimited emphasis or `**`-delimited strong emphasis, unless it 6531 is backslash-escaped. 6532 653312. A literal `_` character cannot occur at the beginning or end of 6534 `_`-delimited emphasis or `__`-delimited strong emphasis, unless it 6535 is backslash-escaped. 6536 6537Where rules 1--12 above are compatible with multiple parsings, 6538the following principles resolve ambiguity: 6539 654013. The number of nestings should be minimized. Thus, for example, 6541 an interpretation `<strong>...</strong>` is always preferred to 6542 `<em><em>...</em></em>`. 6543 654414. An interpretation `<em><strong>...</strong></em>` is always 6545 preferred to `<strong><em>...</em></strong>`. 6546 654715. When two potential emphasis or strong emphasis spans overlap, 6548 so that the second begins before the first ends and ends after 6549 the first ends, the first takes precedence. Thus, for example, 6550 `*foo _bar* baz_` is parsed as `<em>foo _bar</em> baz_` rather 6551 than `*foo <em>bar* baz</em>`. 6552 655316. When there are two potential emphasis or strong emphasis spans 6554 with the same closing delimiter, the shorter one (the one that 6555 opens later) takes precedence. Thus, for example, 6556 `**foo **bar baz**` is parsed as `**foo <strong>bar baz</strong>` 6557 rather than `<strong>foo **bar baz</strong>`. 6558 655917. Inline code spans, links, images, and HTML tags group more tightly 6560 than emphasis. So, when there is a choice between an interpretation 6561 that contains one of these elements and one that does not, the 6562 former always wins. 
Thus, for example, `*[foo*](bar)` is 6563 parsed as `*<a href="bar">foo*</a>` rather than as 6564 `<em>[foo</em>](bar)`. 6565 6566These rules can be illustrated through a series of examples. 6567 6568Rule 1: 6569 6570```````````````````````````````` example 6571*foo bar* 6572. 6573<p><em>foo bar</em></p> 6574```````````````````````````````` 6575 6576 6577This is not emphasis, because the opening `*` is followed by 6578whitespace, and hence not part of a [left-flanking delimiter run]: 6579 6580```````````````````````````````` example 6581a * foo bar* 6582. 6583<p>a * foo bar*</p> 6584```````````````````````````````` 6585 6586 6587This is not emphasis, because the opening `*` is preceded 6588by an alphanumeric and followed by punctuation, and hence 6589not part of a [left-flanking delimiter run]: 6590 6591```````````````````````````````` example 6592a*"foo"* 6593. 6594<p>a*"foo"*</p> 6595```````````````````````````````` 6596 6597 6598Unicode nonbreaking spaces count as whitespace, too: 6599 6600```````````````````````````````` example 6601* a * 6602. 6603<p>* a *</p> 6604```````````````````````````````` 6605 6606 6607Intraword emphasis with `*` is permitted: 6608 6609```````````````````````````````` example 6610foo*bar* 6611. 6612<p>foo<em>bar</em></p> 6613```````````````````````````````` 6614 6615 6616```````````````````````````````` example 66175*6*78 6618. 6619<p>5<em>6</em>78</p> 6620```````````````````````````````` 6621 6622 6623Rule 2: 6624 6625```````````````````````````````` example 6626_foo bar_ 6627. 6628<p><em>foo bar</em></p> 6629```````````````````````````````` 6630 6631 6632This is not emphasis, because the opening `_` is followed by 6633whitespace: 6634 6635```````````````````````````````` example 6636_ foo bar_ 6637. 
6638<p>_ foo bar_</p> 6639```````````````````````````````` 6640 6641 6642This is not emphasis, because the opening `_` is preceded 6643by an alphanumeric and followed by punctuation: 6644 6645```````````````````````````````` example 6646a_"foo"_ 6647. 6648<p>a_"foo"_</p> 6649```````````````````````````````` 6650 6651 6652Emphasis with `_` is not allowed inside words: 6653 6654```````````````````````````````` example 6655foo_bar_ 6656. 6657<p>foo_bar_</p> 6658```````````````````````````````` 6659 6660 6661```````````````````````````````` example 66625_6_78 6663. 6664<p>5_6_78</p> 6665```````````````````````````````` 6666 6667 6668```````````````````````````````` example 6669пристаням_стремятся_ 6670. 6671<p>пристаням_стремятся_</p> 6672```````````````````````````````` 6673 6674 6675Here `_` does not generate emphasis, because the first delimiter run 6676is right-flanking and the second left-flanking: 6677 6678```````````````````````````````` example 6679aa_"bb"_cc 6680. 6681<p>aa_"bb"_cc</p> 6682```````````````````````````````` 6683 6684 6685This is emphasis, even though the opening delimiter is 6686both left- and right-flanking, because it is preceded by 6687punctuation: 6688 6689```````````````````````````````` example 6690foo-_(bar)_ 6691. 6692<p>foo-<em>(bar)</em></p> 6693```````````````````````````````` 6694 6695 6696Rule 3: 6697 6698This is not emphasis, because the closing delimiter does 6699not match the opening delimiter: 6700 6701```````````````````````````````` example 6702_foo* 6703. 6704<p>_foo*</p> 6705```````````````````````````````` 6706 6707 6708This is not emphasis, because the closing `*` is preceded by 6709whitespace: 6710 6711```````````````````````````````` example 6712*foo bar * 6713. 6714<p>*foo bar *</p> 6715```````````````````````````````` 6716 6717 6718A newline also counts as whitespace: 6719 6720```````````````````````````````` example 6721*foo bar 6722* 6723. 
6724<p>*foo bar 6725*</p> 6726```````````````````````````````` 6727 6728 6729This is not emphasis, because the second `*` is 6730preceded by punctuation and followed by an alphanumeric 6731(hence it is not part of a [right-flanking delimiter run]): 6732 6733```````````````````````````````` example 6734*(*foo) 6735. 6736<p>*(*foo)</p> 6737```````````````````````````````` 6738 6739 6740The point of this restriction is more easily appreciated 6741with this example: 6742 6743```````````````````````````````` example 6744*(*foo*)* 6745. 6746<p><em>(<em>foo</em>)</em></p> 6747```````````````````````````````` 6748 6749 6750Intraword emphasis with `*` is allowed: 6751 6752```````````````````````````````` example 6753*foo*bar 6754. 6755<p><em>foo</em>bar</p> 6756```````````````````````````````` 6757 6758 6759 6760Rule 4: 6761 6762This is not emphasis, because the closing `_` is preceded by 6763whitespace: 6764 6765```````````````````````````````` example 6766_foo bar _ 6767. 6768<p>_foo bar _</p> 6769```````````````````````````````` 6770 6771 6772This is not emphasis, because the second `_` is 6773preceded by punctuation and followed by an alphanumeric: 6774 6775```````````````````````````````` example 6776_(_foo) 6777. 6778<p>_(_foo)</p> 6779```````````````````````````````` 6780 6781 6782This is emphasis within emphasis: 6783 6784```````````````````````````````` example 6785_(_foo_)_ 6786. 6787<p><em>(<em>foo</em>)</em></p> 6788```````````````````````````````` 6789 6790 6791Intraword emphasis is disallowed for `_`: 6792 6793```````````````````````````````` example 6794_foo_bar 6795. 6796<p>_foo_bar</p> 6797```````````````````````````````` 6798 6799 6800```````````````````````````````` example 6801_пристаням_стремятся 6802. 6803<p>_пристаням_стремятся</p> 6804```````````````````````````````` 6805 6806 6807```````````````````````````````` example 6808_foo_bar_baz_ 6809.
6810<p><em>foo_bar_baz</em></p> 6811```````````````````````````````` 6812 6813 6814This is emphasis, even though the closing delimiter is 6815both left- and right-flanking, because it is followed by 6816punctuation: 6817 6818```````````````````````````````` example 6819_(bar)_. 6820. 6821<p><em>(bar)</em>.</p> 6822```````````````````````````````` 6823 6824 6825Rule 5: 6826 6827```````````````````````````````` example 6828**foo bar** 6829. 6830<p><strong>foo bar</strong></p> 6831```````````````````````````````` 6832 6833 6834This is not strong emphasis, because the opening delimiter is 6835followed by whitespace: 6836 6837```````````````````````````````` example 6838** foo bar** 6839. 6840<p>** foo bar**</p> 6841```````````````````````````````` 6842 6843 6844This is not strong emphasis, because the opening `**` is preceded 6845by an alphanumeric and followed by punctuation, and hence 6846not part of a [left-flanking delimiter run]: 6847 6848```````````````````````````````` example 6849a**"foo"** 6850. 6851<p>a**"foo"**</p> 6852```````````````````````````````` 6853 6854 6855Intraword strong emphasis with `**` is permitted: 6856 6857```````````````````````````````` example 6858foo**bar** 6859. 6860<p>foo<strong>bar</strong></p> 6861```````````````````````````````` 6862 6863 6864Rule 6: 6865 6866```````````````````````````````` example 6867__foo bar__ 6868. 6869<p><strong>foo bar</strong></p> 6870```````````````````````````````` 6871 6872 6873This is not strong emphasis, because the opening delimiter is 6874followed by whitespace: 6875 6876```````````````````````````````` example 6877__ foo bar__ 6878. 6879<p>__ foo bar__</p> 6880```````````````````````````````` 6881 6882 6883A newline counts as whitespace: 6884```````````````````````````````` example 6885__ 6886foo bar__ 6887. 
6888<p>__ 6889foo bar__</p> 6890```````````````````````````````` 6891 6892 6893This is not strong emphasis, because the opening `__` is preceded 6894by an alphanumeric and followed by punctuation: 6895 6896```````````````````````````````` example 6897a__"foo"__ 6898. 6899<p>a__"foo"__</p> 6900```````````````````````````````` 6901 6902 6903Intraword strong emphasis is forbidden with `__`: 6904 6905```````````````````````````````` example 6906foo__bar__ 6907. 6908<p>foo__bar__</p> 6909```````````````````````````````` 6910 6911 6912```````````````````````````````` example 69135__6__78 6914. 6915<p>5__6__78</p> 6916```````````````````````````````` 6917 6918 6919```````````````````````````````` example 6920пристаням__стремятся__ 6921. 6922<p>пристаням__стремятся__</p> 6923```````````````````````````````` 6924 6925 6926```````````````````````````````` example 6927__foo, __bar__, baz__ 6928. 6929<p><strong>foo, bar, baz</strong></p> 6930```````````````````````````````` 6931 6932 6933This is strong emphasis, even though the opening delimiter is 6934both left- and right-flanking, because it is preceded by 6935punctuation: 6936 6937```````````````````````````````` example 6938foo-__(bar)__ 6939. 6940<p>foo-<strong>(bar)</strong></p> 6941```````````````````````````````` 6942 6943 6944 6945Rule 7: 6946 6947This is not strong emphasis, because the closing delimiter is preceded 6948by whitespace: 6949 6950```````````````````````````````` example 6951**foo bar ** 6952. 6953<p>**foo bar **</p> 6954```````````````````````````````` 6955 6956 6957(Nor can it be interpreted as an emphasized `*foo bar *`, because of 6958Rule 11.) 6959 6960This is not strong emphasis, because the second `**` is 6961preceded by punctuation and followed by an alphanumeric: 6962 6963```````````````````````````````` example 6964**(**foo) 6965. 
<p>**(**foo)</p>
````````````````````````````````


The point of this restriction is more easily appreciated
with these examples:

```````````````````````````````` example
*(**foo**)*
.
<p><em>(<strong>foo</strong>)</em></p>
````````````````````````````````


```````````````````````````````` example
**Gomphocarpus (*Gomphocarpus physocarpus*, syn.
*Asclepias physocarpa*)**
.
<p><strong>Gomphocarpus (<em>Gomphocarpus physocarpus</em>, syn.
<em>Asclepias physocarpa</em>)</strong></p>
````````````````````````````````


```````````````````````````````` example
**foo "*bar*" foo**
.
<p><strong>foo "<em>bar</em>" foo</strong></p>
````````````````````````````````


Intraword emphasis:

```````````````````````````````` example
**foo**bar
.
<p><strong>foo</strong>bar</p>
````````````````````````````````


Rule 8:

This is not strong emphasis, because the closing delimiter is
preceded by whitespace:

```````````````````````````````` example
__foo bar __
.
<p>__foo bar __</p>
````````````````````````````````


This is not strong emphasis, because the second `__` is
preceded by punctuation and followed by an alphanumeric:

```````````````````````````````` example
__(__foo)
.
<p>__(__foo)</p>
````````````````````````````````


The point of this restriction is more easily appreciated
with this example:

```````````````````````````````` example
_(__foo__)_
.
<p><em>(<strong>foo</strong>)</em></p>
````````````````````````````````


Intraword strong emphasis is forbidden with `__`:

```````````````````````````````` example
__foo__bar
.
<p>__foo__bar</p>
````````````````````````````````


```````````````````````````````` example
__пристаням__стремятся
.
<p>__пристаням__стремятся</p>
````````````````````````````````


```````````````````````````````` example
__foo__bar__baz__
.
<p><strong>foo__bar__baz</strong></p>
````````````````````````````````


This is strong emphasis, even though the closing delimiter is
both left- and right-flanking, because it is followed by
punctuation:

```````````````````````````````` example
__(bar)__.
.
<p><strong>(bar)</strong>.</p>
````````````````````````````````


Rule 9:

Any nonempty sequence of inline elements can be the contents of an
emphasized span.

```````````````````````````````` example
*foo [bar](/url)*
.
<p><em>foo <a href="/url">bar</a></em></p>
````````````````````````````````


```````````````````````````````` example
*foo
bar*
.
<p><em>foo
bar</em></p>
````````````````````````````````


In particular, emphasis and strong emphasis can be nested
inside emphasis:

```````````````````````````````` example
_foo __bar__ baz_
.
<p><em>foo <strong>bar</strong> baz</em></p>
````````````````````````````````


```````````````````````````````` example
_foo _bar_ baz_
.
<p><em>foo <em>bar</em> baz</em></p>
````````````````````````````````


```````````````````````````````` example
__foo_ bar_
.
<p><em><em>foo</em> bar</em></p>
````````````````````````````````


```````````````````````````````` example
*foo *bar**
.
<p><em>foo <em>bar</em></em></p>
````````````````````````````````


```````````````````````````````` example
*foo **bar** baz*
.
<p><em>foo <strong>bar</strong> baz</em></p>
````````````````````````````````

```````````````````````````````` example
*foo**bar**baz*
.
<p><em>foo<strong>bar</strong>baz</em></p>
````````````````````````````````

Note that in the preceding case, the interpretation

``` markdown
<p><em>foo</em><em>bar<em></em>baz</em></p>
```


is precluded by the condition that a delimiter that
can both open and close (like the `*` after `foo`)
cannot form emphasis if the sum of the lengths of
the delimiter runs containing the opening and
closing delimiters is a multiple of 3 unless
both lengths are multiples of 3.


For the same reason, we don't get two consecutive
emphasis sections in this example:

```````````````````````````````` example
*foo**bar*
.
<p><em>foo**bar</em></p>
````````````````````````````````


The same condition ensures that the following
cases are all strong emphasis nested inside
emphasis, even when the interior spaces are
omitted:


```````````````````````````````` example
***foo** bar*
.
<p><em><strong>foo</strong> bar</em></p>
````````````````````````````````


```````````````````````````````` example
*foo **bar***
.
<p><em>foo <strong>bar</strong></em></p>
````````````````````````````````


```````````````````````````````` example
*foo**bar***
.
<p><em>foo<strong>bar</strong></em></p>
````````````````````````````````


When the lengths of the interior closing and opening
delimiter runs are *both* multiples of 3, though,
they can match to create emphasis:

```````````````````````````````` example
foo***bar***baz
.
<p>foo<em><strong>bar</strong></em>baz</p>
````````````````````````````````

```````````````````````````````` example
foo******bar*********baz
.
<p>foo<strong>bar</strong>***baz</p>
````````````````````````````````


Indefinite levels of nesting are possible:

```````````````````````````````` example
*foo **bar *baz* bim** bop*
.
<p><em>foo <strong>bar <em>baz</em> bim</strong> bop</em></p>
````````````````````````````````


```````````````````````````````` example
*foo [*bar*](/url)*
.
<p><em>foo <a href="/url"><em>bar</em></a></em></p>
````````````````````````````````


There can be no empty emphasis or strong emphasis:

```````````````````````````````` example
** is not an empty emphasis
.
<p>** is not an empty emphasis</p>
````````````````````````````````


```````````````````````````````` example
**** is not an empty strong emphasis
.
<p>**** is not an empty strong emphasis</p>
````````````````````````````````



Rule 10:

Any nonempty sequence of inline elements can be the contents of a
strongly emphasized span.

```````````````````````````````` example
**foo [bar](/url)**
.
<p><strong>foo <a href="/url">bar</a></strong></p>
````````````````````````````````


```````````````````````````````` example
**foo
bar**
.
<p><strong>foo
bar</strong></p>
````````````````````````````````


In particular, emphasis and strong emphasis can be nested
inside strong emphasis:

```````````````````````````````` example
__foo _bar_ baz__
.
<p><strong>foo <em>bar</em> baz</strong></p>
````````````````````````````````


```````````````````````````````` example
__foo __bar__ baz__
.
<p><strong>foo bar baz</strong></p>
````````````````````````````````


```````````````````````````````` example
____foo__ bar__
.
<p><strong>foo bar</strong></p>
````````````````````````````````


```````````````````````````````` example
**foo **bar****
.
<p><strong>foo bar</strong></p>
````````````````````````````````


```````````````````````````````` example
**foo *bar* baz**
.
<p><strong>foo <em>bar</em> baz</strong></p>
````````````````````````````````


```````````````````````````````` example
**foo*bar*baz**
.
<p><strong>foo<em>bar</em>baz</strong></p>
````````````````````````````````


```````````````````````````````` example
***foo* bar**
.
<p><strong><em>foo</em> bar</strong></p>
````````````````````````````````


```````````````````````````````` example
**foo *bar***
.
<p><strong>foo <em>bar</em></strong></p>
````````````````````````````````


Indefinite levels of nesting are possible:

```````````````````````````````` example
**foo *bar **baz**
bim* bop**
.
<p><strong>foo <em>bar <strong>baz</strong>
bim</em> bop</strong></p>
````````````````````````````````


```````````````````````````````` example
**foo [*bar*](/url)**
.
<p><strong>foo <a href="/url"><em>bar</em></a></strong></p>
````````````````````````````````


There can be no empty emphasis or strong emphasis:

```````````````````````````````` example
__ is not an empty emphasis
.
<p>__ is not an empty emphasis</p>
````````````````````````````````


```````````````````````````````` example
____ is not an empty strong emphasis
.
<p>____ is not an empty strong emphasis</p>
````````````````````````````````



Rule 11:

```````````````````````````````` example
foo ***
.
<p>foo ***</p>
````````````````````````````````


```````````````````````````````` example
foo *\**
.
<p>foo <em>*</em></p>
````````````````````````````````


```````````````````````````````` example
foo *_*
.
<p>foo <em>_</em></p>
````````````````````````````````


```````````````````````````````` example
foo *****
.
<p>foo *****</p>
````````````````````````````````


```````````````````````````````` example
foo **\***
.
<p>foo <strong>*</strong></p>
````````````````````````````````


```````````````````````````````` example
foo **_**
.
<p>foo <strong>_</strong></p>
````````````````````````````````


Note that when delimiters do not match evenly, Rule 11 determines
that the excess literal `*` characters will appear outside of the
emphasis, rather than inside it:

```````````````````````````````` example
**foo*
.
<p>*<em>foo</em></p>
````````````````````````````````


```````````````````````````````` example
*foo**
.
<p><em>foo</em>*</p>
````````````````````````````````


```````````````````````````````` example
***foo**
.
<p>*<strong>foo</strong></p>
````````````````````````````````


```````````````````````````````` example
****foo*
.
<p>***<em>foo</em></p>
````````````````````````````````


```````````````````````````````` example
**foo***
.
<p><strong>foo</strong>*</p>
````````````````````````````````


```````````````````````````````` example
*foo****
.
<p><em>foo</em>***</p>
````````````````````````````````



Rule 12:

```````````````````````````````` example
foo ___
.
<p>foo ___</p>
````````````````````````````````


```````````````````````````````` example
foo _\__
.
<p>foo <em>_</em></p>
````````````````````````````````


```````````````````````````````` example
foo _*_
.
<p>foo <em>*</em></p>
````````````````````````````````


```````````````````````````````` example
foo _____
.
<p>foo _____</p>
````````````````````````````````


```````````````````````````````` example
foo __\___
.
<p>foo <strong>_</strong></p>
````````````````````````````````


```````````````````````````````` example
foo __*__
.
<p>foo <strong>*</strong></p>
````````````````````````````````


```````````````````````````````` example
__foo_
.
<p>_<em>foo</em></p>
````````````````````````````````


Note that when delimiters do not match evenly, Rule 12 determines
that the excess literal `_` characters will appear outside of the
emphasis, rather than inside it:

```````````````````````````````` example
_foo__
.
<p><em>foo</em>_</p>
````````````````````````````````


```````````````````````````````` example
___foo__
.
<p>_<strong>foo</strong></p>
````````````````````````````````


```````````````````````````````` example
____foo_
.
<p>___<em>foo</em></p>
````````````````````````````````


```````````````````````````````` example
__foo___
.
<p><strong>foo</strong>_</p>
````````````````````````````````


```````````````````````````````` example
_foo____
.
<p><em>foo</em>___</p>
````````````````````````````````


Rule 13 implies that if you want emphasis nested directly inside
emphasis, you must use different delimiters:

```````````````````````````````` example
**foo**
.
<p><strong>foo</strong></p>
````````````````````````````````


```````````````````````````````` example
*_foo_*
.
<p><em><em>foo</em></em></p>
````````````````````````````````


```````````````````````````````` example
__foo__
.
<p><strong>foo</strong></p>
````````````````````````````````


```````````````````````````````` example
_*foo*_
.
<p><em><em>foo</em></em></p>
````````````````````````````````


However, strong emphasis within strong emphasis is possible without
switching delimiters:

```````````````````````````````` example
****foo****
.
<p><strong>foo</strong></p>
````````````````````````````````


```````````````````````````````` example
____foo____
.
<p><strong>foo</strong></p>
````````````````````````````````



Rule 13 can be applied to arbitrarily long sequences of
delimiters:

```````````````````````````````` example
******foo******
.
<p><strong>foo</strong></p>
````````````````````````````````


Rule 14:

```````````````````````````````` example
***foo***
.
<p><em><strong>foo</strong></em></p>
````````````````````````````````


```````````````````````````````` example
_____foo_____
.
<p><em><strong>foo</strong></em></p>
````````````````````````````````


Rule 15:

```````````````````````````````` example
*foo _bar* baz_
.
<p><em>foo _bar</em> baz_</p>
````````````````````````````````


```````````````````````````````` example
*foo __bar *baz bim__ bam*
.
<p><em>foo <strong>bar *baz bim</strong> bam</em></p>
````````````````````````````````


Rule 16:

```````````````````````````````` example
**foo **bar baz**
.
<p>**foo <strong>bar baz</strong></p>
````````````````````````````````


```````````````````````````````` example
*foo *bar baz*
.
<p>*foo <em>bar baz</em></p>
````````````````````````````````


Rule 17:

```````````````````````````````` example
*[bar*](/url)
.
<p>*<a href="/url">bar*</a></p>
````````````````````````````````


```````````````````````````````` example
_foo [bar_](/url)
.
<p>_foo <a href="/url">bar_</a></p>
````````````````````````````````


```````````````````````````````` example
*<img src="foo" title="*"/>
.
<p>*<img src="foo" title="*"/></p>
````````````````````````````````


```````````````````````````````` example
**<a href="**">
.
<p>**<a href="**"></p>
````````````````````````````````


```````````````````````````````` example
__<a href="__">
.
<p>__<a href="__"></p>
````````````````````````````````


```````````````````````````````` example
*a `*`*
.
<p><em>a <code>*</code></em></p>
````````````````````````````````


```````````````````````````````` example
_a `_`_
.
<p><em>a <code>_</code></em></p>
````````````````````````````````


```````````````````````````````` example
**a<http://foo.bar/?q=**>
.
<p>**a<a href="http://foo.bar/?q=**">http://foo.bar/?q=**</a></p>
````````````````````````````````


```````````````````````````````` example
__a<http://foo.bar/?q=__>
.
<p>__a<a href="http://foo.bar/?q=__">http://foo.bar/?q=__</a></p>
````````````````````````````````


<div class="extension">

## Strikethrough (extension)

GFM enables the `strikethrough` extension, where an additional emphasis type is
available.

Strikethrough text is any text wrapped in two tildes (`~`).

```````````````````````````````` example strikethrough
~~Hi~~ Hello, world!
.
<p><del>Hi</del> Hello, world!</p>
````````````````````````````````

As with regular emphasis delimiters, a new paragraph will cause strikethrough
parsing to cease:

```````````````````````````````` example strikethrough
This ~~has a

new paragraph~~.
.
<p>This ~~has a</p>
<p>new paragraph~~.</p>
````````````````````````````````

</div>

## Links

A link contains [link text] (the visible text), a [link destination]
(the URI that is the link destination), and optionally a [link title].
There are two basic kinds of links in Markdown. In [inline links] the
destination and title are given immediately after the link text. In
[reference links] the destination and title are defined elsewhere in
the document.

A [link text](@) consists of a sequence of zero or more
inline elements enclosed by square brackets (`[` and `]`). The
following rules apply:

- Links may not contain other links, at any level of nesting. If
  multiple otherwise valid link definitions appear nested inside each
  other, the inner-most definition is used.

- Brackets are allowed in the [link text] only if (a) they
  are backslash-escaped or (b) they appear as a matched pair of brackets,
  with an open bracket `[`, a sequence of zero or more inlines, and
  a close bracket `]`.

- Backtick [code spans], [autolinks], and raw [HTML tags] bind more tightly
  than the brackets in link text. Thus, for example,
  `` [foo`]` `` could not be a link text, since the second `]`
  is part of a code span.

- The brackets in link text bind more tightly than markers for
  [emphasis and strong emphasis]. Thus, for example, `*[foo*](url)` is a link.
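
The bracket rule for link text can be illustrated with a small check. The
following is only a rough sketch in Python, not part of this specification or
of any reference implementation: the function name `brackets_balanced` is
hypothetical, and the sketch deliberately ignores code spans, autolinks, and
raw HTML tags, which bind more tightly than brackets and would have to be
handled first by a real parser.

```python
def brackets_balanced(link_text: str) -> bool:
    """Return True if every unescaped '[' or ']' is part of a matched pair.

    Simplifying assumption: code spans, autolinks, and raw HTML are ignored.
    """
    depth = 0
    i = 0
    while i < len(link_text):
        ch = link_text[i]
        if ch == "\\":
            # A backslash escapes the next character (or is a literal
            # backslash before a non-escapable one); either way the next
            # character cannot act as an unescaped bracket, so skip both.
            i += 2
            continue
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
            if depth < 0:
                return False  # close bracket without a matching open bracket
        i += 1
    return depth == 0  # any bracket still open means the text is unbalanced
```

Under these assumptions, `brackets_balanced("link [foo [bar]]")` and
`brackets_balanced(r"link \[bar")` hold, while `"link] bar"` and `"link [bar"`
are rejected as candidate link texts.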

A [link destination](@) consists of either

- a sequence of zero or more characters between an opening `<` and a
  closing `>` that contains no line breaks or unescaped
  `<` or `>` characters, or

- a nonempty sequence of characters that does not start with
  `<`, does not include ASCII space or control characters, and
  includes parentheses only if (a) they are backslash-escaped or
  (b) they are part of a balanced pair of unescaped parentheses.
  (Implementations may impose limits on parentheses nesting to
  avoid performance issues, but at least three levels of nesting
  should be supported.)

A [link title](@) consists of either

- a sequence of zero or more characters between straight double-quote
  characters (`"`), including a `"` character only if it is
  backslash-escaped, or

- a sequence of zero or more characters between straight single-quote
  characters (`'`), including a `'` character only if it is
  backslash-escaped, or

- a sequence of zero or more characters between matching parentheses
  (`(...)`), including a `(` or `)` character only if it is
  backslash-escaped.

Although [link titles] may span multiple lines, they may not contain
a [blank line].

An [inline link](@) consists of a [link text] followed immediately
by a left parenthesis `(`, optional [whitespace], an optional
[link destination], an optional [link title] separated from the link
destination by [whitespace], optional [whitespace], and a right
parenthesis `)`. The link's text consists of the inlines contained
in the [link text] (excluding the enclosing square brackets).
The link's URI consists of the link destination, excluding enclosing
`<...>` if present, with backslash-escapes in effect as described
above.
The link's title consists of the link title, excluding its
enclosing delimiters, with backslash-escapes in effect as described
above.

Here is a simple inline link:

```````````````````````````````` example
[link](/uri "title")
.
<p><a href="/uri" title="title">link</a></p>
````````````````````````````````


The title may be omitted:

```````````````````````````````` example
[link](/uri)
.
<p><a href="/uri">link</a></p>
````````````````````````````````


Both the title and the destination may be omitted:

```````````````````````````````` example
[link]()
.
<p><a href="">link</a></p>
````````````````````````````````


```````````````````````````````` example
[link](<>)
.
<p><a href="">link</a></p>
````````````````````````````````

The destination can only contain spaces if it is
enclosed in pointy brackets:

```````````````````````````````` example
[link](/my uri)
.
<p>[link](/my uri)</p>
````````````````````````````````

```````````````````````````````` example
[link](</my uri>)
.
<p><a href="/my%20uri">link</a></p>
````````````````````````````````

The destination cannot contain line breaks,
even if enclosed in pointy brackets:

```````````````````````````````` example
[link](foo
bar)
.
<p>[link](foo
bar)</p>
````````````````````````````````

```````````````````````````````` example
[link](<foo
bar>)
.
<p>[link](<foo
bar>)</p>
````````````````````````````````

The destination can contain `)` if it is enclosed
in pointy brackets:

```````````````````````````````` example
[a](<b)c>)
.
<p><a href="b)c">a</a></p>
````````````````````````````````

Pointy brackets that enclose links must be unescaped:

```````````````````````````````` example
[link](<foo\>)
.
<p>[link](&lt;foo&gt;)</p>
````````````````````````````````

These are not links, because the opening pointy bracket
is not matched properly:

```````````````````````````````` example
[a](<b)c
[a](<b)c>
[a](<b>c)
.
<p>[a](&lt;b)c
[a](&lt;b)c&gt;
[a](<b>c)</p>
````````````````````````````````

Parentheses inside the link destination may be escaped:

```````````````````````````````` example
[link](\(foo\))
.
<p><a href="(foo)">link</a></p>
````````````````````````````````

Any number of parentheses are allowed without escaping, as long as they are
balanced:

```````````````````````````````` example
[link](foo(and(bar)))
.
<p><a href="foo(and(bar))">link</a></p>
````````````````````````````````

However, if you have unbalanced parentheses, you need to escape or use the
`<...>` form:

```````````````````````````````` example
[link](foo\(and\(bar\))
.
<p><a href="foo(and(bar)">link</a></p>
````````````````````````````````


```````````````````````````````` example
[link](<foo(and(bar)>)
.
<p><a href="foo(and(bar)">link</a></p>
````````````````````````````````


Parentheses and other symbols can also be escaped, as usual
in Markdown:

```````````````````````````````` example
[link](foo\)\:)
.
<p><a href="foo):">link</a></p>
````````````````````````````````


A link can contain fragment identifiers and queries:

```````````````````````````````` example
[link](#fragment)

[link](http://example.com#fragment)

[link](http://example.com?foo=3#frag)
.
<p><a href="#fragment">link</a></p>
<p><a href="http://example.com#fragment">link</a></p>
<p><a href="http://example.com?foo=3#frag">link</a></p>
````````````````````````````````


Note that a backslash before a non-escapable character is
just a backslash:

```````````````````````````````` example
[link](foo\bar)
.
<p><a href="foo%5Cbar">link</a></p>
````````````````````````````````


URL-escaping should be left alone inside the destination, as all
URL-escaped characters are also valid URL characters. Entity and
numerical character references in the destination will be parsed
into the corresponding Unicode code points, as usual. These may
be optionally URL-escaped when written as HTML, but this spec
does not enforce any particular policy for rendering URLs in
HTML or other formats. Renderers may make different decisions
about how to escape or normalize URLs in the output.

```````````````````````````````` example
[link](foo%20b&auml;)
.
<p><a href="foo%20b%C3%A4">link</a></p>
````````````````````````````````


Note that, because titles can often be parsed as destinations,
if you try to omit the destination and keep the title, you'll
get unexpected results:

```````````````````````````````` example
[link]("title")
.
<p><a href="%22title%22">link</a></p>
````````````````````````````````


Titles may be in single quotes, double quotes, or parentheses:

```````````````````````````````` example
[link](/url "title")
[link](/url 'title')
[link](/url (title))
.
<p><a href="/url" title="title">link</a>
<a href="/url" title="title">link</a>
<a href="/url" title="title">link</a></p>
````````````````````````````````


Backslash escapes and entity and numeric character references
may be used in titles:

```````````````````````````````` example
[link](/url "title \"&quot;")
.
<p><a href="/url" title="title &quot;&quot;">link</a></p>
````````````````````````````````


Titles must be separated from the link destination by [whitespace].
Other [Unicode whitespace] like non-breaking space doesn't work.

```````````````````````````````` example
[link](/url "title")
.
<p><a href="/url%C2%A0%22title%22">link</a></p>
````````````````````````````````


Nested balanced quotes are not allowed without escaping:

```````````````````````````````` example
[link](/url "title "and" title")
.
<p>[link](/url "title "and" title")</p>
````````````````````````````````


But it is easy to work around this by using a different quote type:

```````````````````````````````` example
[link](/url 'title "and" title')
.
<p><a href="/url" title="title &quot;and&quot; title">link</a></p>
````````````````````````````````


(Note: `Markdown.pl` did allow double quotes inside a double-quoted
title, and its test suite included a test demonstrating this.
But it is hard to see a good rationale for the extra complexity this
brings, since there are already many ways---backslash escaping,
entity and numeric character references, or using a different
quote type for the enclosing title---to write titles containing
double quotes. `Markdown.pl`'s handling of titles has a number
of other strange features. For example, it allows single-quoted
titles in inline links, but not reference links. And, in
reference links but not inline links, it allows a title to begin
with `"` and end with `)`.
`Markdown.pl` 1.0.1 even allows
titles with no closing quotation mark, though 1.0.2b8 does not.
It seems preferable to adopt a simple, rational rule that works
the same way in inline links and link reference definitions.)

[Whitespace] is allowed around the destination and title:

```````````````````````````````` example
[link](   /uri
  "title"  )
.
<p><a href="/uri" title="title">link</a></p>
````````````````````````````````


But it is not allowed between the link text and the
following parenthesis:

```````````````````````````````` example
[link] (/uri)
.
<p>[link] (/uri)</p>
````````````````````````````````


The link text may contain balanced brackets, but not unbalanced ones,
unless they are escaped:

```````````````````````````````` example
[link [foo [bar]]](/uri)
.
<p><a href="/uri">link [foo [bar]]</a></p>
````````````````````````````````


```````````````````````````````` example
[link] bar](/uri)
.
<p>[link] bar](/uri)</p>
````````````````````````````````


```````````````````````````````` example
[link [bar](/uri)
.
<p>[link <a href="/uri">bar</a></p>
````````````````````````````````


```````````````````````````````` example
[link \[bar](/uri)
.
<p><a href="/uri">link [bar</a></p>
````````````````````````````````


The link text may contain inline content:

```````````````````````````````` example
[link *foo **bar** `#`*](/uri)
.
<p><a href="/uri">link <em>foo <strong>bar</strong> <code>#</code></em></a></p>
````````````````````````````````


```````````````````````````````` example
[![moon](moon.jpg)](/uri)
.
<p><a href="/uri"><img src="moon.jpg" alt="moon" /></a></p>
````````````````````````````````


However, links may not contain other links, at any level of nesting.

```````````````````````````````` example
[foo [bar](/uri)](/uri)
.
<p>[foo <a href="/uri">bar</a>](/uri)</p>
````````````````````````````````


```````````````````````````````` example
[foo *[bar [baz](/uri)](/uri)*](/uri)
.
<p>[foo <em>[bar <a href="/uri">baz</a>](/uri)</em>](/uri)</p>
````````````````````````````````


```````````````````````````````` example
![[[foo](uri1)](uri2)](uri3)
.
<p><img src="uri3" alt="[foo](uri2)" /></p>
````````````````````````````````


These cases illustrate the precedence of link text grouping over
emphasis grouping:

```````````````````````````````` example
*[foo*](/uri)
.
<p>*<a href="/uri">foo*</a></p>
````````````````````````````````


```````````````````````````````` example
[foo *bar](baz*)
.
<p><a href="baz*">foo *bar</a></p>
````````````````````````````````


Note that brackets that *aren't* part of links do not take
precedence:

```````````````````````````````` example
*foo [bar* baz]
.
<p><em>foo [bar</em> baz]</p>
````````````````````````````````


These cases illustrate the precedence of HTML tags, code spans,
and autolinks over link grouping:

```````````````````````````````` example
[foo <bar attr="](baz)">
.
<p>[foo <bar attr="](baz)"></p>
````````````````````````````````


```````````````````````````````` example
[foo`](/uri)`
.
<p>[foo<code>](/uri)</code></p>
````````````````````````````````


```````````````````````````````` example
[foo<http://example.com/?search=](uri)>
.
8203<p>[foo<a href="http://example.com/?search=%5D(uri)">http://example.com/?search=](uri)</a></p> 8204```````````````````````````````` 8205 8206 8207There are three kinds of [reference link](@)s: 8208[full](#full-reference-link), [collapsed](#collapsed-reference-link), 8209and [shortcut](#shortcut-reference-link). 8210 8211A [full reference link](@) 8212consists of a [link text] immediately followed by a [link label] 8213that [matches] a [link reference definition] elsewhere in the document. 8214 8215A [link label](@) begins with a left bracket (`[`) and ends 8216with the first right bracket (`]`) that is not backslash-escaped. 8217Between these brackets there must be at least one [non-whitespace character]. 8218Unescaped square bracket characters are not allowed inside the 8219opening and closing square brackets of [link labels]. A link 8220label can have at most 999 characters inside the square 8221brackets. 8222 8223One label [matches](@) 8224another just in case their normalized forms are equal. To normalize a 8225label, strip off the opening and closing brackets, 8226perform the *Unicode case fold*, strip leading and trailing 8227[whitespace] and collapse consecutive internal 8228[whitespace] to a single space. If there are multiple 8229matching reference link definitions, the one that comes first in the 8230document is used. (It is desirable in such cases to emit a warning.) 8231 8232The contents of the first link label are parsed as inlines, which are 8233used as the link's text. The link's URI and title are provided by the 8234matching [link reference definition]. 8235 8236Here is a simple example: 8237 8238```````````````````````````````` example 8239[foo][bar] 8240 8241[bar]: /url "title" 8242. 8243<p><a href="/url" title="title">foo</a></p> 8244```````````````````````````````` 8245 8246 8247The rules for the [link text] are the same as with 8248[inline links]. 
 Thus: 8249 8250The link text may contain balanced brackets, but not unbalanced ones, 8251unless they are escaped: 8252 8253```````````````````````````````` example 8254[link [foo [bar]]][ref] 8255 8256[ref]: /uri 8257. 8258<p><a href="/uri">link [foo [bar]]</a></p> 8259```````````````````````````````` 8260 8261 8262```````````````````````````````` example 8263[link \[bar][ref] 8264 8265[ref]: /uri 8266. 8267<p><a href="/uri">link [bar</a></p> 8268```````````````````````````````` 8269 8270 8271The link text may contain inline content: 8272 8273```````````````````````````````` example 8274[link *foo **bar** `#`*][ref] 8275 8276[ref]: /uri 8277. 8278<p><a href="/uri">link <em>foo <strong>bar</strong> <code>#</code></em></a></p> 8279```````````````````````````````` 8280 8281 8282```````````````````````````````` example 8283[![moon](moon.jpg)][ref] 8284 8285[ref]: /uri 8286. 8287<p><a href="/uri"><img src="moon.jpg" alt="moon" /></a></p> 8288```````````````````````````````` 8289 8290 8291However, links may not contain other links, at any level of nesting. 8292 8293```````````````````````````````` example 8294[foo [bar](/uri)][ref] 8295 8296[ref]: /uri 8297. 8298<p>[foo <a href="/uri">bar</a>]<a href="/uri">ref</a></p> 8299```````````````````````````````` 8300 8301 8302```````````````````````````````` example 8303[foo *bar [baz][ref]*][ref] 8304 8305[ref]: /uri 8306. 8307<p>[foo <em>bar <a href="/uri">baz</a></em>]<a href="/uri">ref</a></p> 8308```````````````````````````````` 8309 8310 8311(In the examples above, we have two [shortcut reference links] 8312instead of one [full reference link].) 8313 8314The following cases illustrate the precedence of link text grouping over 8315emphasis grouping: 8316 8317```````````````````````````````` example 8318*[foo*][ref] 8319 8320[ref]: /uri 8321. 8322<p>*<a href="/uri">foo*</a></p> 8323```````````````````````````````` 8324 8325 8326```````````````````````````````` example 8327[foo *bar][ref] 8328 8329[ref]: /uri 8330. 
8331<p><a href="/uri">foo *bar</a></p> 8332```````````````````````````````` 8333 8334 8335These cases illustrate the precedence of HTML tags, code spans, 8336and autolinks over link grouping: 8337 8338```````````````````````````````` example 8339[foo <bar attr="][ref]"> 8340 8341[ref]: /uri 8342. 8343<p>[foo <bar attr="][ref]"></p> 8344```````````````````````````````` 8345 8346 8347```````````````````````````````` example 8348[foo`][ref]` 8349 8350[ref]: /uri 8351. 8352<p>[foo<code>][ref]</code></p> 8353```````````````````````````````` 8354 8355 8356```````````````````````````````` example 8357[foo<http://example.com/?search=][ref]> 8358 8359[ref]: /uri 8360. 8361<p>[foo<a href="http://example.com/?search=%5D%5Bref%5D">http://example.com/?search=][ref]</a></p> 8362```````````````````````````````` 8363 8364 8365Matching is case-insensitive: 8366 8367```````````````````````````````` example 8368[foo][BaR] 8369 8370[bar]: /url "title" 8371. 8372<p><a href="/url" title="title">foo</a></p> 8373```````````````````````````````` 8374 8375 8376Unicode case fold is used: 8377 8378```````````````````````````````` example 8379[Толпой][Толпой] is a Russian word. 8380 8381[ТОЛПОЙ]: /url 8382. 8383<p><a href="/url">Толпой</a> is a Russian word.</p> 8384```````````````````````````````` 8385 8386 8387Consecutive internal [whitespace] is treated as one space for 8388purposes of determining matching: 8389 8390```````````````````````````````` example 8391[Foo 8392 bar]: /url 8393 8394[Baz][Foo bar] 8395. 8396<p><a href="/url">Baz</a></p> 8397```````````````````````````````` 8398 8399 8400No [whitespace] is allowed between the [link text] and the 8401[link label]: 8402 8403```````````````````````````````` example 8404[foo] [bar] 8405 8406[bar]: /url "title" 8407. 8408<p>[foo] <a href="/url" title="title">bar</a></p> 8409```````````````````````````````` 8410 8411 8412```````````````````````````````` example 8413[foo] 8414[bar] 8415 8416[bar]: /url "title" 8417. 
8418<p>[foo] 8419<a href="/url" title="title">bar</a></p> 8420```````````````````````````````` 8421 8422 8423This is a departure from John Gruber's original Markdown syntax 8424description, which explicitly allows whitespace between the link 8425text and the link label. It brings reference links in line with 8426[inline links], which (according to both original Markdown and 8427this spec) cannot have whitespace after the link text. More 8428importantly, it prevents inadvertent capture of consecutive 8429[shortcut reference links]. If whitespace is allowed between the 8430link text and the link label, then in the following we will have 8431a single reference link, not two shortcut reference links, as 8432intended: 8433 8434``` markdown 8435[foo] 8436[bar] 8437 8438[foo]: /url1 8439[bar]: /url2 8440``` 8441 8442(Note that [shortcut reference links] were introduced by Gruber 8443himself in a beta version of `Markdown.pl`, but never included 8444in the official syntax description. Without shortcut reference 8445links, it is harmless to allow space between the link text and 8446link label; but once shortcut references are introduced, it is 8447too dangerous to allow this, as it frequently leads to 8448unintended results.) 8449 8450When there are multiple matching [link reference definitions], 8451the first is used: 8452 8453```````````````````````````````` example 8454[foo]: /url1 8455 8456[foo]: /url2 8457 8458[bar][foo] 8459. 8460<p><a href="/url1">bar</a></p> 8461```````````````````````````````` 8462 8463 8464Note that matching is performed on normalized strings, not parsed 8465inline content. So the following does not match, even though the 8466labels define equivalent inline content: 8467 8468```````````````````````````````` example 8469[bar][foo\!] 8470 8471[foo!]: /url 8472. 
8473<p>[bar][foo!]</p> 8474```````````````````````````````` 8475 8476 8477[Link labels] cannot contain brackets, unless they are 8478backslash-escaped: 8479 8480```````````````````````````````` example 8481[foo][ref[] 8482 8483[ref[]: /uri 8484. 8485<p>[foo][ref[]</p> 8486<p>[ref[]: /uri</p> 8487```````````````````````````````` 8488 8489 8490```````````````````````````````` example 8491[foo][ref[bar]] 8492 8493[ref[bar]]: /uri 8494. 8495<p>[foo][ref[bar]]</p> 8496<p>[ref[bar]]: /uri</p> 8497```````````````````````````````` 8498 8499 8500```````````````````````````````` example 8501[[[foo]]] 8502 8503[[[foo]]]: /url 8504. 8505<p>[[[foo]]]</p> 8506<p>[[[foo]]]: /url</p> 8507```````````````````````````````` 8508 8509 8510```````````````````````````````` example 8511[foo][ref\[] 8512 8513[ref\[]: /uri 8514. 8515<p><a href="/uri">foo</a></p> 8516```````````````````````````````` 8517 8518 8519Note that in this example `]` is not backslash-escaped: 8520 8521```````````````````````````````` example 8522[bar\\]: /uri 8523 8524[bar\\] 8525. 8526<p><a href="/uri">bar\</a></p> 8527```````````````````````````````` 8528 8529 8530A [link label] must contain at least one [non-whitespace character]: 8531 8532```````````````````````````````` example 8533[] 8534 8535[]: /uri 8536. 8537<p>[]</p> 8538<p>[]: /uri</p> 8539```````````````````````````````` 8540 8541 8542```````````````````````````````` example 8543[ 8544 ] 8545 8546[ 8547 ]: /uri 8548. 8549<p>[ 8550]</p> 8551<p>[ 8552]: /uri</p> 8553```````````````````````````````` 8554 8555 8556A [collapsed reference link](@) 8557consists of a [link label] that [matches] a 8558[link reference definition] elsewhere in the 8559document, followed by the string `[]`. 8560The contents of the first link label are parsed as inlines, 8561which are used as the link's text. The link's URI and title are 8562provided by the matching reference link definition. Thus, 8563`[foo][]` is equivalent to `[foo][foo]`. 
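
All three kinds of reference link depend on the label matching rule given earlier: strip the brackets, case fold, trim, and collapse internal whitespace. The following Python sketch illustrates that rule only; it is not taken from any reference implementation. The names `normalize_label` and `labels_match` are hypothetical, and Python's `str.casefold` is assumed to be an adequate stand-in for the *Unicode case fold*.

```python
import re

# Whitespace characters as this spec defines them: space, tab, newline,
# line tabulation, form feed, and carriage return.
_WS = " \t\n\v\f\r"
_WS_RUN = re.compile(r"[ \t\n\v\f\r]+")


def normalize_label(label: str) -> str:
    """Normalize a bracketed link label such as "[Foo  bar]" for matching."""
    inner = label[1:-1]             # strip the opening and closing brackets
    inner = inner.casefold()        # assumption: approximates the Unicode case fold
    inner = inner.strip(_WS)        # strip leading and trailing whitespace
    return _WS_RUN.sub(" ", inner)  # collapse internal whitespace to one space


def labels_match(first: str, second: str) -> bool:
    """Two labels match when their normalized forms are equal."""
    return normalize_label(first) == normalize_label(second)
```

Under these assumptions, `labels_match("[BaR]", "[bar]")` holds, while `labels_match("[foo\\!]", "[foo!]")` does not, because the comparison is made on the normalized source strings rather than on parsed inline content.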
8564 8565```````````````````````````````` example 8566[foo][] 8567 8568[foo]: /url "title" 8569. 8570<p><a href="/url" title="title">foo</a></p> 8571```````````````````````````````` 8572 8573 8574```````````````````````````````` example 8575[*foo* bar][] 8576 8577[*foo* bar]: /url "title" 8578. 8579<p><a href="/url" title="title"><em>foo</em> bar</a></p> 8580```````````````````````````````` 8581 8582 8583The link labels are case-insensitive: 8584 8585```````````````````````````````` example 8586[Foo][] 8587 8588[foo]: /url "title" 8589. 8590<p><a href="/url" title="title">Foo</a></p> 8591```````````````````````````````` 8592 8593 8594 8595As with full reference links, [whitespace] is not 8596allowed between the two sets of brackets: 8597 8598```````````````````````````````` example 8599[foo] 8600[] 8601 8602[foo]: /url "title" 8603. 8604<p><a href="/url" title="title">foo</a> 8605[]</p> 8606```````````````````````````````` 8607 8608 8609A [shortcut reference link](@) 8610consists of a [link label] that [matches] a 8611[link reference definition] elsewhere in the 8612document and is not followed by `[]` or a link label. 8613The contents of the first link label are parsed as inlines, 8614which are used as the link's text. The link's URI and title 8615are provided by the matching link reference definition. 8616Thus, `[foo]` is equivalent to `[foo][]`. 8617 8618```````````````````````````````` example 8619[foo] 8620 8621[foo]: /url "title" 8622. 8623<p><a href="/url" title="title">foo</a></p> 8624```````````````````````````````` 8625 8626 8627```````````````````````````````` example 8628[*foo* bar] 8629 8630[*foo* bar]: /url "title" 8631. 8632<p><a href="/url" title="title"><em>foo</em> bar</a></p> 8633```````````````````````````````` 8634 8635 8636```````````````````````````````` example 8637[[*foo* bar]] 8638 8639[*foo* bar]: /url "title" 8640. 
8641<p>[<a href="/url" title="title"><em>foo</em> bar</a>]</p> 8642```````````````````````````````` 8643 8644 8645```````````````````````````````` example 8646[[bar [foo] 8647 8648[foo]: /url 8649. 8650<p>[[bar <a href="/url">foo</a></p> 8651```````````````````````````````` 8652 8653 8654The link labels are case-insensitive: 8655 8656```````````````````````````````` example 8657[Foo] 8658 8659[foo]: /url "title" 8660. 8661<p><a href="/url" title="title">Foo</a></p> 8662```````````````````````````````` 8663 8664 8665A space after the link text should be preserved: 8666 8667```````````````````````````````` example 8668[foo] bar 8669 8670[foo]: /url 8671. 8672<p><a href="/url">foo</a> bar</p> 8673```````````````````````````````` 8674 8675 8676If you just want bracketed text, you can backslash-escape the 8677opening bracket to avoid links: 8678 8679```````````````````````````````` example 8680\[foo] 8681 8682[foo]: /url "title" 8683. 8684<p>[foo]</p> 8685```````````````````````````````` 8686 8687 8688Note that this is a link, because a link label ends with the first 8689following closing bracket: 8690 8691```````````````````````````````` example 8692[foo*]: /url 8693 8694*[foo*] 8695. 8696<p>*<a href="/url">foo*</a></p> 8697```````````````````````````````` 8698 8699 8700Full and collapsed references take precedence over shortcut 8701references: 8702 8703```````````````````````````````` example 8704[foo][bar] 8705 8706[foo]: /url1 8707[bar]: /url2 8708. 8709<p><a href="/url2">foo</a></p> 8710```````````````````````````````` 8711 8712```````````````````````````````` example 8713[foo][] 8714 8715[foo]: /url1 8716. 8717<p><a href="/url1">foo</a></p> 8718```````````````````````````````` 8719 8720Inline links also take precedence: 8721 8722```````````````````````````````` example 8723[foo]() 8724 8725[foo]: /url1 8726. 
8727<p><a href="">foo</a></p> 8728```````````````````````````````` 8729 8730```````````````````````````````` example 8731[foo](not a link) 8732 8733[foo]: /url1 8734. 8735<p><a href="/url1">foo</a>(not a link)</p> 8736```````````````````````````````` 8737 8738In the following case `[bar][baz]` is parsed as a reference, 8739`[foo]` as normal text: 8740 8741```````````````````````````````` example 8742[foo][bar][baz] 8743 8744[baz]: /url 8745. 8746<p>[foo]<a href="/url">bar</a></p> 8747```````````````````````````````` 8748 8749 8750Here, though, `[foo][bar]` is parsed as a reference, since 8751`[bar]` is defined: 8752 8753```````````````````````````````` example 8754[foo][bar][baz] 8755 8756[baz]: /url1 8757[bar]: /url2 8758. 8759<p><a href="/url2">foo</a><a href="/url1">baz</a></p> 8760```````````````````````````````` 8761 8762 8763Here `[foo]` is not parsed as a shortcut reference, because it 8764is followed by a link label (even though `[bar]` is not defined): 8765 8766```````````````````````````````` example 8767[foo][bar][baz] 8768 8769[baz]: /url1 8770[foo]: /url2 8771. 8772<p>[foo]<a href="/url1">bar</a></p> 8773```````````````````````````````` 8774 8775 8776 8777## Images 8778 8779Syntax for images is like the syntax for links, with one 8780difference. Instead of [link text], we have an 8781[image description](@). The rules for this are the 8782same as for [link text], except that (a) an 8783image description starts with `![` rather than `[`, and 8784(b) an image description may contain links. 8785An image description has inline elements 8786as its contents. When an image is rendered to HTML, 8787this is standardly used as the image's `alt` attribute. 8788 8789```````````````````````````````` example 8790![foo](/url "title") 8791. 8792<p><img src="/url" alt="foo" title="title" /></p> 8793```````````````````````````````` 8794 8795 8796```````````````````````````````` example 8797![foo *bar*] 8798 8799[foo *bar*]: train.jpg "train & tracks" 8800. 8801<p><img src="train.jpg" alt="foo bar" title="train & tracks" /></p> 8802```````````````````````````````` 8803 8804 8805```````````````````````````````` example 8806![foo ![bar](/url)](/url2) 8807. 
8808<p><img src="/url2" alt="foo bar" /></p> 8809```````````````````````````````` 8810 8811 8812```````````````````````````````` example 8813![foo [bar](/url)](/url2) 8814. 8815<p><img src="/url2" alt="foo bar" /></p> 8816```````````````````````````````` 8817 8818 8819Though this spec is concerned with parsing, not rendering, it is 8820recommended that in rendering to HTML, only the plain string content 8821of the [image description] be used. Note that in 8822the above example, the alt attribute's value is `foo bar`, not `foo 8823[bar](/url)` or `foo <a href="/url">bar</a>`. Only the plain string 8824content is rendered, without formatting. 8825 8826```````````````````````````````` example 8827![foo *bar*][] 8828 8829[foo *bar*]: train.jpg "train & tracks" 8830. 8831<p><img src="train.jpg" alt="foo bar" title="train & tracks" /></p> 8832```````````````````````````````` 8833 8834 8835```````````````````````````````` example 8836![foo *bar*][foobar] 8837 8838[FOOBAR]: train.jpg "train & tracks" 8839. 8840<p><img src="train.jpg" alt="foo bar" title="train & tracks" /></p> 8841```````````````````````````````` 8842 8843 8844```````````````````````````````` example 8845![foo](train.jpg) 8846. 8847<p><img src="train.jpg" alt="foo" /></p> 8848```````````````````````````````` 8849 8850 8851```````````````````````````````` example 8852My ![foo bar](/path/to/train.jpg "title" ) 8853. 8854<p>My <img src="/path/to/train.jpg" alt="foo bar" title="title" /></p> 8855```````````````````````````````` 8856 8857 8858```````````````````````````````` example 8859![foo](<url>) 8860. 8861<p><img src="url" alt="foo" /></p> 8862```````````````````````````````` 8863 8864 8865```````````````````````````````` example 8866![](/url) 8867. 8868<p><img src="/url" alt="" /></p> 8869```````````````````````````````` 8870 8871 8872Reference-style: 8873 8874```````````````````````````````` example 8875![foo][bar] 8876 8877[bar]: /url 8878. 
8879<p><img src="/url" alt="foo" /></p> 8880```````````````````````````````` 8881 8882 8883```````````````````````````````` example 8884![foo][bar] 8885 8886[BAR]: /url 8887. 8888<p><img src="/url" alt="foo" /></p> 8889```````````````````````````````` 8890 8891 8892Collapsed: 8893 8894```````````````````````````````` example 8895![foo][] 8896 8897[foo]: /url "title" 8898. 8899<p><img src="/url" alt="foo" title="title" /></p> 8900```````````````````````````````` 8901 8902 8903```````````````````````````````` example 8904![*foo* bar][] 8905 8906[*foo* bar]: /url "title" 8907. 8908<p><img src="/url" alt="foo bar" title="title" /></p> 8909```````````````````````````````` 8910 8911 8912The labels are case-insensitive: 8913 8914```````````````````````````````` example 8915![Foo][] 8916 8917[foo]: /url "title" 8918. 8919<p><img src="/url" alt="Foo" title="title" /></p> 8920```````````````````````````````` 8921 8922 8923As with reference links, [whitespace] is not allowed 8924between the two sets of brackets: 8925 8926```````````````````````````````` example 8927![foo] 8928[] 8929 8930[foo]: /url "title" 8931. 8932<p><img src="/url" alt="foo" title="title" /> 8933[]</p> 8934```````````````````````````````` 8935 8936 8937Shortcut: 8938 8939```````````````````````````````` example 8940![foo] 8941 8942[foo]: /url "title" 8943. 8944<p><img src="/url" alt="foo" title="title" /></p> 8945```````````````````````````````` 8946 8947 8948```````````````````````````````` example 8949![*foo* bar] 8950 8951[*foo* bar]: /url "title" 8952. 8953<p><img src="/url" alt="foo bar" title="title" /></p> 8954```````````````````````````````` 8955 8956 8957Note that link labels cannot contain unescaped brackets: 8958 8959```````````````````````````````` example 8960![[foo]] 8961 8962[[foo]]: /url "title" 8963. 
8964<p>![[foo]]</p> 8965<p>[[foo]]: /url "title"</p> 8966```````````````````````````````` 8967 8968 8969The link labels are case-insensitive: 8970 8971```````````````````````````````` example 8972![Foo] 8973 8974[foo]: /url "title" 8975. 8976<p><img src="/url" alt="Foo" title="title" /></p> 8977```````````````````````````````` 8978 8979 8980If you just want a literal `!` followed by bracketed text, you can 8981backslash-escape the opening `[`: 8982 8983```````````````````````````````` example 8984!\[foo] 8985 8986[foo]: /url "title" 8987. 8988<p>![foo]</p> 8989```````````````````````````````` 8990 8991 8992If you want a link after a literal `!`, backslash-escape the 8993`!`: 8994 8995```````````````````````````````` example 8996\![foo] 8997 8998[foo]: /url "title" 8999. 9000<p>!<a href="/url" title="title">foo</a></p> 9001```````````````````````````````` 9002 9003 9004## Autolinks 9005 9006[Autolink](@)s are absolute URIs and email addresses inside 9007`<` and `>`. They are parsed as links, with the URL or email address 9008as the link label. 9009 9010A [URI autolink](@) consists of `<`, followed by an 9011[absolute URI] followed by `>`. It is parsed as 9012a link to the URI, with the URI as the link's label. 9013 9014An [absolute URI](@), 9015for these purposes, consists of a [scheme] followed by a colon (`:`) 9016followed by zero or more characters other than ASCII 9017[whitespace] and control characters, `<`, and `>`. If 9018the URI includes these characters, they must be percent-encoded 9019(e.g. `%20` for a space). 9020 9021For purposes of this spec, a [scheme](@) is any sequence 9022of 2--32 characters beginning with an ASCII letter and followed 9023by any combination of ASCII letters, digits, or the symbols plus 9024("+"), period ("."), or hyphen ("-"). 9025 9026Here are some valid autolinks: 9027 9028```````````````````````````````` example 9029<http://foo.bar.baz> 9030. 
9031<p><a href="http://foo.bar.baz">http://foo.bar.baz</a></p> 9032```````````````````````````````` 9033 9034 9035```````````````````````````````` example 9036<http://foo.bar.baz/test?q=hello&id=22&boolean> 9037. 9038<p><a href="http://foo.bar.baz/test?q=hello&id=22&boolean">http://foo.bar.baz/test?q=hello&id=22&boolean</a></p> 9039```````````````````````````````` 9040 9041 9042```````````````````````````````` example 9043<irc://foo.bar:2233/baz> 9044. 9045<p><a href="irc://foo.bar:2233/baz">irc://foo.bar:2233/baz</a></p> 9046```````````````````````````````` 9047 9048 9049Uppercase is also fine: 9050 9051```````````````````````````````` example 9052<MAILTO:FOO@BAR.BAZ> 9053. 9054<p><a href="MAILTO:FOO@BAR.BAZ">MAILTO:FOO@BAR.BAZ</a></p> 9055```````````````````````````````` 9056 9057 9058Note that many strings that count as [absolute URIs] for 9059purposes of this spec are not valid URIs, because their 9060schemes are not registered or because of other problems 9061with their syntax: 9062 9063```````````````````````````````` example 9064<a+b+c:d> 9065. 9066<p><a href="a+b+c:d">a+b+c:d</a></p> 9067```````````````````````````````` 9068 9069 9070```````````````````````````````` example 9071<made-up-scheme://foo,bar> 9072. 9073<p><a href="made-up-scheme://foo,bar">made-up-scheme://foo,bar</a></p> 9074```````````````````````````````` 9075 9076 9077```````````````````````````````` example 9078<http://../> 9079. 9080<p><a href="http://../">http://../</a></p> 9081```````````````````````````````` 9082 9083 9084```````````````````````````````` example 9085<localhost:5001/foo> 9086. 9087<p><a href="localhost:5001/foo">localhost:5001/foo</a></p> 9088```````````````````````````````` 9089 9090 9091Spaces are not allowed in autolinks: 9092 9093```````````````````````````````` example 9094<http://foo.bar/baz bim> 9095. 
9096<p><http://foo.bar/baz bim></p> 9097```````````````````````````````` 9098 9099 9100Backslash-escapes do not work inside autolinks: 9101 9102```````````````````````````````` example 9103<http://example.com/\[\> 9104. 9105<p><a href="http://example.com/%5C%5B%5C">http://example.com/\[\</a></p> 9106```````````````````````````````` 9107 9108 9109An [email autolink](@) 9110consists of `<`, followed by an [email address], 9111followed by `>`. The link's label is the email address, 9112and the URL is `mailto:` followed by the email address. 9113 9114An [email address](@), 9115for these purposes, is anything that matches 9116the [non-normative regex from the HTML5 9117spec](https://html.spec.whatwg.org/multipage/forms.html#e-mail-state-(type=email)): 9118 9119 /^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])? 9120 (?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/ 9121 9122Examples of email autolinks: 9123 9124```````````````````````````````` example 9125<foo@bar.example.com> 9126. 9127<p><a href="mailto:foo@bar.example.com">foo@bar.example.com</a></p> 9128```````````````````````````````` 9129 9130 9131```````````````````````````````` example 9132<foo+special@Bar.baz-bar0.com> 9133. 9134<p><a href="mailto:foo+special@Bar.baz-bar0.com">foo+special@Bar.baz-bar0.com</a></p> 9135```````````````````````````````` 9136 9137 9138Backslash-escapes do not work inside email autolinks: 9139 9140```````````````````````````````` example 9141<foo\+@bar.example.com> 9142. 9143<p><foo+@bar.example.com></p> 9144```````````````````````````````` 9145 9146 9147These are not autolinks: 9148 9149```````````````````````````````` example 9150<> 9151. 9152<p><></p> 9153```````````````````````````````` 9154 9155 9156```````````````````````````````` example 9157< http://foo.bar > 9158. 9159<p>< http://foo.bar ></p> 9160```````````````````````````````` 9161 9162 9163```````````````````````````````` example 9164<m:abc> 9165. 
9166<p><m:abc></p> 9167```````````````````````````````` 9168 9169 9170```````````````````````````````` example 9171<foo.bar.baz> 9172. 9173<p><foo.bar.baz></p> 9174```````````````````````````````` 9175 9176 9177```````````````````````````````` example 9178http://example.com 9179. 9180<p>http://example.com</p> 9181```````````````````````````````` 9182 9183 9184```````````````````````````````` example 9185foo@bar.example.com 9186. 9187<p>foo@bar.example.com</p> 9188```````````````````````````````` 9189 9190<div class="extension"> 9191 9192## Autolinks (extension) 9193 9194GFM enables the `autolink` extension, where autolinks will be recognised in a 9195greater number of conditions. 9196 9197[Autolink]s can also be constructed without requiring the use of `<` and `>` 9198to delimit them, although they will be recognized under a smaller set of 9199circumstances. All such recognized autolinks can only come at the beginning of 9200a line, after whitespace, or after any of the delimiting characters `*`, `_`, `~`, 9201and `(`. 9202 9203An [extended www autolink](@) will be recognized 9204when the text `www.` is found followed by a [valid domain]. 9205A [valid domain](@) consists of segments 9206of alphanumeric characters, underscores (`_`) and hyphens (`-`) 9207separated by periods (`.`). 9208There must be at least one period, 9209and no underscores may be present in the last two segments of the domain. 9210 9211The scheme `http` will be inserted automatically: 9212 9213```````````````````````````````` example autolink 9214www.commonmark.org 9215. 9216<p><a href="http://www.commonmark.org">www.commonmark.org</a></p> 9217```````````````````````````````` 9218 9219After a [valid domain], zero or more non-space non-`<` characters may follow: 9220 9221```````````````````````````````` example autolink 9222Visit www.commonmark.org/help for more information. 9223. 
9224<p>Visit <a href="http://www.commonmark.org/help">www.commonmark.org/help</a> for more information.</p> 9225```````````````````````````````` 9226 9227We then apply [extended autolink path validation](@) as follows: 9228 9229Trailing punctuation (specifically, `?`, `!`, `.`, `,`, `:`, `*`, `_`, and `~`) 9230will not be considered part of the autolink, though they may be included in the 9231interior of the link: 9232 9233```````````````````````````````` example autolink 9234Visit www.commonmark.org. 9235 9236Visit www.commonmark.org/a.b. 9237. 9238<p>Visit <a href="http://www.commonmark.org">www.commonmark.org</a>.</p> 9239<p>Visit <a href="http://www.commonmark.org/a.b">www.commonmark.org/a.b</a>.</p> 9240```````````````````````````````` 9241 9242When an autolink ends in `)`, we scan the entire autolink for the total number 9243of parentheses. If there is a greater number of closing parentheses than 9244opening ones, we don't consider the unmatched trailing parentheses part of the 9245autolink, in order to facilitate including an autolink inside a parenthesis: 9246 9247```````````````````````````````` example autolink 9248www.google.com/search?q=Markup+(business) 9249 9250www.google.com/search?q=Markup+(business))) 9251 9252(www.google.com/search?q=Markup+(business)) 9253 9254(www.google.com/search?q=Markup+(business) 9255. 
9256<p><a href="http://www.google.com/search?q=Markup+(business)">www.google.com/search?q=Markup+(business)</a></p> 9257<p><a href="http://www.google.com/search?q=Markup+(business)">www.google.com/search?q=Markup+(business)</a>))</p> 9258<p>(<a href="http://www.google.com/search?q=Markup+(business)">www.google.com/search?q=Markup+(business)</a>)</p> 9259<p>(<a href="http://www.google.com/search?q=Markup+(business)">www.google.com/search?q=Markup+(business)</a></p> 9260```````````````````````````````` 9261 9262This check is only done when the link ends in a closing parenthesis `)`, so if 9263the only parentheses are in the interior of the autolink, no special rules are 9264applied: 9265 9266```````````````````````````````` example autolink 9267www.google.com/search?q=(business))+ok 9268. 9269<p><a href="http://www.google.com/search?q=(business))+ok">www.google.com/search?q=(business))+ok</a></p> 9270```````````````````````````````` 9271 9272If an autolink ends in a semicolon (`;`), we check whether it 9273resembles an [entity reference][entity references], that is, whether the preceding text is `&` 9274followed by one or more alphanumeric characters. If so, it is excluded from 9275the autolink: 9276 9277```````````````````````````````` example autolink 9278www.google.com/search?q=commonmark&hl=en 9279 9280www.google.com/search?q=commonmark&hl; 9281. 9282<p><a href="http://www.google.com/search?q=commonmark&hl=en">www.google.com/search?q=commonmark&hl=en</a></p> 9283<p><a href="http://www.google.com/search?q=commonmark">www.google.com/search?q=commonmark</a>&hl;</p> 9284```````````````````````````````` 9285 9286`<` immediately ends an autolink. 9287 9288```````````````````````````````` example autolink 9289www.commonmark.org/he<lp 9290. 
9291<p><a href="http://www.commonmark.org/he">www.commonmark.org/he</a><lp</p> 9292```````````````````````````````` 9293 9294An [extended url autolink](@) will be recognised when one of the schemes 9295`http://`, `https://`, or `ftp://` is found, followed by a [valid domain], then zero or 9296more non-space non-`<` characters according to 9297[extended autolink path validation]: 9298 9299```````````````````````````````` example autolink 9300http://commonmark.org 9301 9302(Visit https://encrypted.google.com/search?q=Markup+(business)) 9303 9304Anonymous FTP is available at ftp://foo.bar.baz. 9305. 9306<p><a href="http://commonmark.org">http://commonmark.org</a></p> 9307<p>(Visit <a href="https://encrypted.google.com/search?q=Markup+(business)">https://encrypted.google.com/search?q=Markup+(business)</a>)</p> 9308<p>Anonymous FTP is available at <a href="ftp://foo.bar.baz">ftp://foo.bar.baz</a>.</p> 9309```````````````````````````````` 9310 9311 9312An [extended email autolink](@) will be recognised when an email address is 9313recognised within any text node. Email addresses are recognised according to 9314the following rules: 9315 9316* One or more characters which are alphanumeric, or `.`, `-`, `_`, or `+`. 9317* An `@` symbol. 9318* One or more characters which are alphanumeric, or `-` or `_`, 9319 separated by periods (`.`). 9320 There must be at least one period. 9321 The last character must not be one of `-` or `_`. 9322 9323The scheme `mailto:` will automatically be added to the generated link: 9324 9325```````````````````````````````` example autolink 9326foo@bar.baz 9327. 9328<p><a href="mailto:foo@bar.baz">foo@bar.baz</a></p> 9329```````````````````````````````` 9330 9331`+` can occur before the `@`, but not after. 9332 9333```````````````````````````````` example autolink 9334hello@mail+xyz.example isn't valid, but hello+xyz@mail.example is. 9335. 
9336<p>hello@mail+xyz.example isn't valid, but <a href="mailto:hello+xyz@mail.example">hello+xyz@mail.example</a> is.</p> 9337```````````````````````````````` 9338 9339`.`, `-`, and `_` can occur on both sides of the `@`, but only `.` may occur at 9340the end of the email address, in which case it will not be considered part of 9341the address: 9342 9343```````````````````````````````` example autolink 9344a.b-c_d@a.b 9345 9346a.b-c_d@a.b. 9347 9348a.b-c_d@a.b- 9349 9350a.b-c_d@a.b_ 9351. 9352<p><a href="mailto:a.b-c_d@a.b">a.b-c_d@a.b</a></p> 9353<p><a href="mailto:a.b-c_d@a.b">a.b-c_d@a.b</a>.</p> 9354<p>a.b-c_d@a.b-</p> 9355<p>a.b-c_d@a.b_</p> 9356```````````````````````````````` 9357 9358</div> 9359 9360## Raw HTML 9361 9362Text between `<` and `>` that looks like an HTML tag is parsed as a 9363raw HTML tag and will be rendered in HTML without escaping. 9364Tag and attribute names are not limited to current HTML tags, 9365so custom tags (and even, say, DocBook tags) may be used. 9366 9367Here is the grammar for tags: 9368 9369A [tag name](@) consists of an ASCII letter 9370followed by zero or more ASCII letters, digits, or 9371hyphens (`-`). 9372 9373An [attribute](@) consists of [whitespace], 9374an [attribute name], and an optional 9375[attribute value specification]. 9376 9377An [attribute name](@) 9378consists of an ASCII letter, `_`, or `:`, followed by zero or more ASCII 9379letters, digits, `_`, `.`, `:`, or `-`. (Note: This is the XML 9380specification restricted to ASCII. HTML5 is laxer.) 9381 9382An [attribute value specification](@) 9383consists of optional [whitespace], 9384a `=` character, optional [whitespace], and an [attribute 9385value]. 9386 9387An [attribute value](@) 9388consists of an [unquoted attribute value], 9389a [single-quoted attribute value], or a [double-quoted attribute value]. 9390 9391An [unquoted attribute value](@) 9392is a nonempty string of characters not 9393including [whitespace], `"`, `'`, `=`, `<`, `>`, or `` ` ``. 
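
The lexical rules defined so far ([tag name], [attribute name], and [unquoted attribute value]) can be written as regular expressions. The Python sketch below is an illustration only, not part of this spec: the constant and function names are hypothetical, and each pattern is assumed to be applied to a whole candidate string with `fullmatch`.

```python
import re

# A tag name: an ASCII letter followed by ASCII letters, digits, or hyphens.
TAG_NAME = re.compile(r"[A-Za-z][A-Za-z0-9-]*")

# An attribute name: an ASCII letter, `_`, or `:`, followed by ASCII
# letters, digits, `_`, `.`, `:`, or `-`.
ATTRIBUTE_NAME = re.compile(r"[A-Za-z_:][A-Za-z0-9_.:-]*")

# An unquoted attribute value: a nonempty run of characters excluding
# whitespace, `"`, `'`, `=`, `<`, `>`, and the backtick.
UNQUOTED_VALUE = re.compile(r"[^ \t\n\v\f\r\"'=<>`]+")


def is_tag_name(text: str) -> bool:
    """Return True if the whole string is a valid tag name."""
    return TAG_NAME.fullmatch(text) is not None


def is_attribute_name(text: str) -> bool:
    """Return True if the whole string is a valid attribute name."""
    return ATTRIBUTE_NAME.fullmatch(text) is not None


def is_unquoted_value(text: str) -> bool:
    """Return True if the whole string is a valid unquoted attribute value."""
    return UNQUOTED_VALUE.fullmatch(text) is not None
```

With these patterns, `responsive-image` is accepted as a tag name while `33` and `__` are rejected, and `h*#ref` is rejected as an attribute name.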
9394 9395A [single-quoted attribute value](@) 9396consists of `'`, zero or more 9397characters not including `'`, and a final `'`. 9398 9399A [double-quoted attribute value](@) 9400consists of `"`, zero or more 9401characters not including `"`, and a final `"`. 9402 9403An [open tag](@) consists of a `<` character, a [tag name], 9404zero or more [attributes], optional [whitespace], an optional `/` 9405character, and a `>` character. 9406 9407A [closing tag](@) consists of the string `</`, a 9408[tag name], optional [whitespace], and the character `>`. 9409 9410An [HTML comment](@) consists of `<!-->`, `<!--->`, or `<!--`, a string of 9411characters not including the string `-->`, and `-->` (see the 9412[HTML spec](https://html.spec.whatwg.org/multipage/parsing.html#markup-declaration-open-state)). 9413 9414A [processing instruction](@) 9415consists of the string `<?`, a string 9416of characters not including the string `?>`, and the string 9417`?>`. 9418 9419A [declaration](@) consists of the 9420string `<!`, a name consisting of one or more uppercase ASCII letters, 9421[whitespace], a string of characters not including the 9422character `>`, and the character `>`. 9423 9424A [CDATA section](@) consists of 9425the string `<![CDATA[`, a string of characters not including the string 9426`]]>`, and the string `]]>`. 9427 9428An [HTML tag](@) consists of an [open tag], a [closing tag], 9429an [HTML comment], a [processing instruction], a [declaration], 9430or a [CDATA section]. 9431 9432Here are some simple open tags: 9433 9434```````````````````````````````` example 9435<a><bab><c2c> 9436. 9437<p><a><bab><c2c></p> 9438```````````````````````````````` 9439 9440 9441Empty elements: 9442 9443```````````````````````````````` example 9444<a/><b2/> 9445. 9446<p><a/><b2/></p> 9447```````````````````````````````` 9448 9449 9450[Whitespace] is allowed: 9451 9452```````````````````````````````` example 9453<a /><b2 9454data="foo" > 9455. 
9456<p><a /><b2 9457data="foo" ></p> 9458```````````````````````````````` 9459 9460 9461With attributes: 9462 9463```````````````````````````````` example 9464<a foo="bar" bam = 'baz <em>"</em>' 9465_boolean zoop:33=zoop:33 /> 9466. 9467<p><a foo="bar" bam = 'baz <em>"</em>' 9468_boolean zoop:33=zoop:33 /></p> 9469```````````````````````````````` 9470 9471 9472Custom tag names can be used: 9473 9474```````````````````````````````` example 9475Foo <responsive-image src="foo.jpg" /> 9476. 9477<p>Foo <responsive-image src="foo.jpg" /></p> 9478```````````````````````````````` 9479 9480 9481Illegal tag names, not parsed as HTML: 9482 9483```````````````````````````````` example 9484<33> <__> 9485. 9486<p>&lt;33&gt; &lt;__&gt;</p> 9487```````````````````````````````` 9488 9489 9490Illegal attribute names: 9491 9492```````````````````````````````` example 9493<a h*#ref="hi"> 9494. 9495<p>&lt;a h*#ref=&quot;hi&quot;&gt;</p> 9496```````````````````````````````` 9497 9498 9499Illegal attribute values: 9500 9501```````````````````````````````` example 9502<a href="hi'> <a href=hi'> 9503. 9504<p>&lt;a href=&quot;hi'&gt; &lt;a href=hi'&gt;</p> 9505```````````````````````````````` 9506 9507 9508Illegal [whitespace]: 9509 9510```````````````````````````````` example 9511< a>< 9512foo><bar/ > 9513<foo bar=baz 9514bim!bop /> 9515. 9516<p>&lt; a&gt;&lt; 9517foo&gt;&lt;bar/ &gt; 9518&lt;foo bar=baz 9519bim!bop /&gt;</p> 9520```````````````````````````````` 9521 9522 9523Missing [whitespace]: 9524 9525```````````````````````````````` example 9526<a href='bar'title=title> 9527. 9528<p>&lt;a href='bar'title=title&gt;</p> 9529```````````````````````````````` 9530 9531 9532Closing tags: 9533 9534```````````````````````````````` example 9535</a></foo > 9536. 9537<p></a></foo ></p> 9538```````````````````````````````` 9539 9540 9541Illegal attributes in closing tag: 9542 9543```````````````````````````````` example 9544</a href="foo"> 9545.
9546<p>&lt;/a href=&quot;foo&quot;&gt;</p> 9547```````````````````````````````` 9548 9549 9550Comments: 9551 9552```````````````````````````````` example 9553foo <!-- this is a -- 9554comment - with hyphens --> 9555. 9556<p>foo <!-- this is a -- 9557comment - with hyphens --></p> 9558```````````````````````````````` 9559 9560```````````````````````````````` example 9561foo <!--> foo --> 9562 9563foo <!---> foo --> 9564. 9565<p>foo <!--> foo --&gt;</p> 9566<p>foo <!---> foo --&gt;</p> 9567```````````````````````````````` 9568 9569 9570Processing instructions: 9571 9572```````````````````````````````` example 9573foo <?php echo $a; ?> 9574. 9575<p>foo <?php echo $a; ?></p> 9576```````````````````````````````` 9577 9578 9579Declarations: 9580 9581```````````````````````````````` example 9582foo <!ELEMENT br EMPTY> 9583. 9584<p>foo <!ELEMENT br EMPTY></p> 9585```````````````````````````````` 9586 9587 9588CDATA sections: 9589 9590```````````````````````````````` example 9591foo <![CDATA[>&<]]> 9592. 9593<p>foo <![CDATA[>&<]]></p> 9594```````````````````````````````` 9595 9596 9597Entity and numeric character references are preserved in HTML 9598attributes: 9599 9600```````````````````````````````` example 9601foo <a href="&ouml;"> 9602. 9603<p>foo <a href="&ouml;"></p> 9604```````````````````````````````` 9605 9606 9607Backslash escapes do not work in HTML attributes: 9608 9609```````````````````````````````` example 9610foo <a href="\*"> 9611. 9612<p>foo <a href="\*"></p> 9613```````````````````````````````` 9614 9615 9616```````````````````````````````` example 9617<a href="\""> 9618.
9619<p>&lt;a href=&quot;&quot;&quot;&gt;</p> 9620```````````````````````````````` 9621 9622 9623<div class="extension"> 9624 9625## Disallowed Raw HTML (extension) 9626 9627GFM enables the `tagfilter` extension, where the following HTML tags will be 9628filtered when rendering HTML output: 9629 9630* `<title>` 9631* `<textarea>` 9632* `<style>` 9633* `<xmp>` 9634* `<iframe>` 9635* `<noembed>` 9636* `<noframes>` 9637* `<script>` 9638* `<plaintext>` 9639 9640Filtering is done by replacing the leading `<` with the entity `&lt;`. These 9641tags are chosen in particular as they change how HTML is interpreted in a way 9642unique to them (i.e. nested HTML is interpreted differently), and this is 9643usually undesirable in the context of other rendered Markdown content. 9644 9645All other HTML tags are left untouched. 9646 9647```````````````````````````````` example tagfilter 9648<strong> <title> <style> <em> 9649 9650<blockquote> 9651 <xmp> is disallowed. <XMP> is also disallowed. 9652</blockquote> 9653. 9654<p><strong> &lt;title> &lt;style> <em></p> 9655<blockquote> 9656 &lt;xmp> is disallowed. &lt;XMP> is also disallowed. 9657</blockquote> 9658```````````````````````````````` 9659 9660</div> 9661 9662## Hard line breaks 9663 9664A line break (not in a code span or HTML tag) that is preceded 9665by two or more spaces and does not occur at the end of a block 9666is parsed as a [hard line break](@) (rendered 9667in HTML as a `<br />` tag): 9668 9669```````````````````````````````` example 9670foo 9671baz 9672. 9673<p>foo<br /> 9674baz</p> 9675```````````````````````````````` 9676 9677 9678For a more visible alternative, a backslash before the 9679[line ending] may be used instead of two spaces: 9680 9681```````````````````````````````` example 9682foo\ 9683baz 9684. 9685<p>foo<br /> 9686baz</p> 9687```````````````````````````````` 9688 9689 9690More than two spaces can be used: 9691 9692```````````````````````````````` example 9693foo 9694baz 9695.
9696<p>foo<br /> 9697baz</p> 9698```````````````````````````````` 9699 9700 9701Leading spaces at the beginning of the next line are ignored: 9702 9703```````````````````````````````` example 9704foo 9705 bar 9706. 9707<p>foo<br /> 9708bar</p> 9709```````````````````````````````` 9710 9711 9712```````````````````````````````` example 9713foo\ 9714 bar 9715. 9716<p>foo<br /> 9717bar</p> 9718```````````````````````````````` 9719 9720 9721Line breaks can occur inside emphasis, links, and other constructs 9722that allow inline content: 9723 9724```````````````````````````````` example 9725*foo 9726bar* 9727. 9728<p><em>foo<br /> 9729bar</em></p> 9730```````````````````````````````` 9731 9732 9733```````````````````````````````` example 9734*foo\ 9735bar* 9736. 9737<p><em>foo<br /> 9738bar</em></p> 9739```````````````````````````````` 9740 9741 9742Line breaks do not occur inside code spans 9743 9744```````````````````````````````` example 9745`code 9746span` 9747. 9748<p><code>code span</code></p> 9749```````````````````````````````` 9750 9751 9752```````````````````````````````` example 9753`code\ 9754span` 9755. 9756<p><code>code\ span</code></p> 9757```````````````````````````````` 9758 9759 9760or HTML tags: 9761 9762```````````````````````````````` example 9763<a href="foo 9764bar"> 9765. 9766<p><a href="foo 9767bar"></p> 9768```````````````````````````````` 9769 9770 9771```````````````````````````````` example 9772<a href="foo\ 9773bar"> 9774. 9775<p><a href="foo\ 9776bar"></p> 9777```````````````````````````````` 9778 9779 9780Hard line breaks are for separating inline content within a block. 9781Neither syntax for hard line breaks works at the end of a paragraph or 9782other block element: 9783 9784```````````````````````````````` example 9785foo\ 9786. 9787<p>foo\</p> 9788```````````````````````````````` 9789 9790 9791```````````````````````````````` example 9792foo 9793. 
9794<p>foo</p> 9795```````````````````````````````` 9796 9797 9798```````````````````````````````` example 9799### foo\ 9800. 9801<h3>foo\</h3> 9802```````````````````````````````` 9803 9804 9805```````````````````````````````` example 9806### foo 9807. 9808<h3>foo</h3> 9809```````````````````````````````` 9810 9811 9812## Soft line breaks 9813 9814A regular line break (not in a code span or HTML tag) that is not 9815preceded by two or more spaces or a backslash is parsed as a 9816[softbreak](@). (A softbreak may be rendered in HTML either as a 9817[line ending] or as a space. The result will be the same in 9818browsers. In the examples here, a [line ending] will be used.) 9819 9820```````````````````````````````` example 9821foo 9822baz 9823. 9824<p>foo 9825baz</p> 9826```````````````````````````````` 9827 9828 9829Spaces at the end of the line and beginning of the next line are 9830removed: 9831 9832```````````````````````````````` example 9833foo 9834 baz 9835. 9836<p>foo 9837baz</p> 9838```````````````````````````````` 9839 9840 9841A conforming parser may render a soft line break in HTML either as a 9842line break or as a space. 9843 9844A renderer may also provide an option to render soft line breaks 9845as hard line breaks. 9846 9847## Textual content 9848 9849Any characters not given an interpretation by the above rules will 9850be parsed as plain textual content. 9851 9852```````````````````````````````` example 9853hello $.;'there 9854. 9855<p>hello $.;'there</p> 9856```````````````````````````````` 9857 9858 9859```````````````````````````````` example 9860Foo χρῆν 9861. 9862<p>Foo χρῆν</p> 9863```````````````````````````````` 9864 9865 9866Internal spaces are preserved verbatim: 9867 9868```````````````````````````````` example 9869Multiple spaces 9870. 
9871<p>Multiple spaces</p> 9872```````````````````````````````` 9873 9874 9875<!-- END TESTS --> 9876 9877# Appendix: A parsing strategy 9878 9879In this appendix we describe some features of the parsing strategy 9880used in the CommonMark reference implementations. 9881 9882## Overview 9883 9884Parsing has two phases: 9885 98861. In the first phase, lines of input are consumed and the block 9887structure of the document---its division into paragraphs, block quotes, 9888list items, and so on---is constructed. Text is assigned to these 9889blocks but not parsed. Link reference definitions are parsed and a 9890map of links is constructed. 9891 98922. In the second phase, the raw text contents of paragraphs and headings 9893are parsed into sequences of Markdown inline elements (strings, 9894code spans, links, emphasis, and so on), using the map of link 9895references constructed in phase 1. 9896 9897At each point in processing, the document is represented as a tree of 9898**blocks**. The root of the tree is a `document` block. The `document` 9899may have any number of other blocks as **children**. These children 9900may, in turn, have other blocks as children. The last child of a block 9901is normally considered **open**, meaning that subsequent lines of input 9902can alter its contents. (Blocks that are not open are **closed**.) 9903Here, for example, is a possible document tree, with the open blocks 9904marked by arrows: 9905 9906``` tree 9907-> document 9908 -> block_quote 9909 paragraph 9910 "Lorem ipsum dolor\nsit amet." 9911 -> list (type=bullet tight=true bullet_char=-) 9912 list_item 9913 paragraph 9914 "Qui *quodsi iracundia*" 9915 -> list_item 9916 -> paragraph 9917 "aliquando id" 9918``` 9919 9920## Phase 1: block structure 9921 9922Each line that is processed has an effect on this tree. The line is 9923analyzed and, depending on its contents, the document may be altered 9924in one or more of the following ways: 9925 99261. 
One or more open blocks may be closed. 99272. One or more new blocks may be created as children of the 9928 last open block. 99293. Text may be added to the last (deepest) open block remaining 9930 on the tree. 9931 9932Once a line has been incorporated into the tree in this way, 9933it can be discarded, so input can be read in a stream. 9934 9935For each line, we follow this procedure: 9936 99371. First we iterate through the open blocks, starting with the 9938root document, and descending through last children down to the last 9939open block. Each block imposes a condition that the line must satisfy 9940if the block is to remain open. For example, a block quote requires a 9941`>` character. A paragraph requires a non-blank line. 9942In this phase we may match all or just some of the open 9943blocks. But we cannot close unmatched blocks yet, because we may have a 9944[lazy continuation line]. 9945 99462. Next, after consuming the continuation markers for existing 9947blocks, we look for new block starts (e.g. `>` for a block quote). 9948If we encounter a new block start, we close any blocks unmatched 9949in step 1 before creating the new block as a child of the last 9950matched block. 9951 99523. Finally, we look at the remainder of the line (after block 9953markers like `>`, list markers, and indentation have been consumed). 9954This is text that can be incorporated into the last open 9955block (a paragraph, code block, heading, or raw HTML). 9956 9957Setext headings are formed when we see a line of a paragraph 9958that is a [setext heading underline]. 9959 9960Reference link definitions are detected when a paragraph is closed; 9961the accumulated text lines are parsed to see if they begin with 9962one or more reference link definitions. Any remainder becomes a 9963normal paragraph. 9964 9965We can see how this works by considering how the tree above is 9966generated by four lines of Markdown: 9967 9968``` markdown 9969> Lorem ipsum dolor 9970sit amet. 
9971> - Qui *quodsi iracundia* 9972> - aliquando id 9973``` 9974 9975At the outset, our document model is just 9976 9977``` tree 9978-> document 9979``` 9980 9981The first line of our text, 9982 9983``` markdown 9984> Lorem ipsum dolor 9985``` 9986 9987causes a `block_quote` block to be created as a child of our 9988open `document` block, and a `paragraph` block as a child of 9989the `block_quote`. Then the text is added to the last open 9990block, the `paragraph`: 9991 9992``` tree 9993-> document 9994 -> block_quote 9995 -> paragraph 9996 "Lorem ipsum dolor" 9997``` 9998 9999The next line, 10000 10001``` markdown 10002sit amet. 10003``` 10004 10005is a "lazy continuation" of the open `paragraph`, so it gets added 10006to the paragraph's text: 10007 10008``` tree 10009-> document 10010 -> block_quote 10011 -> paragraph 10012 "Lorem ipsum dolor\nsit amet." 10013``` 10014 10015The third line, 10016 10017``` markdown 10018> - Qui *quodsi iracundia* 10019``` 10020 10021causes the `paragraph` block to be closed, and a new `list` block 10022opened as a child of the `block_quote`. A `list_item` is also 10023added as a child of the `list`, and a `paragraph` as a child of 10024the `list_item`. The text is then added to the new `paragraph`: 10025 10026``` tree 10027-> document 10028 -> block_quote 10029 paragraph 10030 "Lorem ipsum dolor\nsit amet." 10031 -> list (type=bullet tight=true bullet_char=-) 10032 -> list_item 10033 -> paragraph 10034 "Qui *quodsi iracundia*" 10035``` 10036 10037The fourth line, 10038 10039``` markdown 10040> - aliquando id 10041``` 10042 10043causes the `list_item` (and its child the `paragraph`) to be closed, 10044and a new `list_item` opened up as child of the `list`. A `paragraph` 10045is added as a child of the new `list_item`, to contain the text. 10046We thus obtain the final tree: 10047 10048``` tree 10049-> document 10050 -> block_quote 10051 paragraph 10052 "Lorem ipsum dolor\nsit amet." 
10053 -> list (type=bullet tight=true bullet_char=-) 10054 list_item 10055 paragraph 10056 "Qui *quodsi iracundia*" 10057 -> list_item 10058 -> paragraph 10059 "aliquando id" 10060``` 10061 10062## Phase 2: inline structure 10063 10064Once all of the input has been parsed, all open blocks are closed. 10065 10066We then "walk the tree," visiting every node, and parse raw 10067string contents of paragraphs and headings as inlines. At this 10068point we have seen all the link reference definitions, so we can 10069resolve reference links as we go. 10070 10071``` tree 10072document 10073 block_quote 10074 paragraph 10075 str "Lorem ipsum dolor" 10076 softbreak 10077 str "sit amet." 10078 list (type=bullet tight=true bullet_char=-) 10079 list_item 10080 paragraph 10081 str "Qui " 10082 emph 10083 str "quodsi iracundia" 10084 list_item 10085 paragraph 10086 str "aliquando id" 10087``` 10088 10089Notice how the [line ending] in the first paragraph has 10090been parsed as a `softbreak`, and the asterisks in the first list item 10091have become an `emph`. 10092 10093### An algorithm for parsing nested emphasis and links 10094 10095By far the trickiest part of inline parsing is handling emphasis, 10096strong emphasis, links, and images. This is done using the following 10097algorithm. 10098 10099When we're parsing inlines and we hit either 10100 10101- a run of `*` or `_` characters, or 10102- a `[` or `![` 10103 10104we insert a text node with these symbols as its literal content, and we 10105add a pointer to this text node to the [delimiter stack](@). 10106 10107The [delimiter stack] is a doubly linked list. Each 10108element contains a pointer to a text node, plus information about 10109 10110- the type of delimiter (`[`, `![`, `*`, `_`), 10111- the number of delimiters, 10112- whether the delimiter is "active" (all are active to start), and 10113- whether the delimiter is a potential opener, a potential closer, 10114 or both (which depends on what sort of characters precede 10115 and follow the delimiters). 10116 10117When we hit a `]` character, we call the *look for link or image* 10118procedure (see below).
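As a rough illustration of the delimiter stack described above, here is a minimal Python sketch of a doubly linked list whose elements carry the listed fields. The class and method names (`TextNode`, `Delimiter`, `DelimiterStack`, `push`, `remove`, `find_opening_bracket`) are hypothetical and are not taken from any reference implementation.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TextNode:
    literal: str                   # the delimiter characters as literal text


@dataclass(eq=False)
class Delimiter:
    node: TextNode                 # pointer to the text node
    kind: str                      # "[", "![", "*" or "_"
    count: int                     # number of delimiters in the run
    can_open: bool                 # potential opener
    can_close: bool                # potential closer
    active: bool = True            # all delimiters start out active
    prev: Optional["Delimiter"] = field(default=None, repr=False)
    next: Optional["Delimiter"] = field(default=None, repr=False)


class DelimiterStack:
    def __init__(self):
        self.top = None            # most recently pushed delimiter

    def push(self, delim):
        delim.prev, delim.next = self.top, None
        if self.top is not None:
            self.top.next = delim
        self.top = delim
        return delim

    def remove(self, delim):
        # Unlink the element from its neighbours in constant time.
        if delim.prev is not None:
            delim.prev.next = delim.next
        if delim.next is not None:
            delim.next.prev = delim.prev
        if delim is self.top:
            self.top = delim.prev
        delim.prev = delim.next = None

    def find_opening_bracket(self):
        # Walk backwards from the top looking for "[" or "![".
        delim = self.top
        while delim is not None and delim.kind not in ("[", "!["):
            delim = delim.prev
        return delim
```

A doubly linked list suits this use because elements are removed from the middle of the stack as well as from the top.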
10119 10120When we hit the end of the input, we call the *process emphasis* 10121procedure (see below), with `stack_bottom` = NULL. 10122 10123#### *look for link or image* 10124 10125Starting at the top of the delimiter stack, we look backwards 10126through the stack for an opening `[` or `![` delimiter. 10127 10128- If we don't find one, we return a literal text node `]`. 10129 10130- If we do find one, but it's not *active*, we remove the inactive 10131 delimiter from the stack, and return a literal text node `]`. 10132 10133- If we find one and it's active, then we parse ahead to see if 10134 we have an inline link/image, reference link/image, compact reference 10135 link/image, or shortcut reference link/image. 10136 10137 + If we don't, then we remove the opening delimiter from the 10138 delimiter stack and return a literal text node `]`. 10139 10140 + If we do, then 10141 10142 * We return a link or image node whose children are the inlines 10143 after the text node pointed to by the opening delimiter. 10144 10145 * We run *process emphasis* on these inlines, with the `[` opener 10146 as `stack_bottom`. 10147 10148 * We remove the opening delimiter. 10149 10150 * If we have a link (and not an image), we also set all 10151 `[` delimiters before the opening delimiter to *inactive*. (This 10152 will prevent us from getting links within links.) 10153 10154#### *process emphasis* 10155 10156Parameter `stack_bottom` sets a lower bound to how far we 10157descend in the [delimiter stack]. If it is NULL, we can 10158go all the way to the bottom. Otherwise, we stop before 10159visiting `stack_bottom`. 10160 10161Let `current_position` point to the element on the [delimiter stack] 10162just above `stack_bottom` (or the first element if `stack_bottom` 10163is NULL). 10164 10165We keep track of the `openers_bottom` for each delimiter 10166type (`*`, `_`) and each length of the closing delimiter run 10167(modulo 3). Initialize this to `stack_bottom`. 
10168 10169Then we repeat the following until we run out of potential 10170closers: 10171 10172- Move `current_position` forward in the delimiter stack (if needed) 10173 until we find the first potential closer with delimiter `*` or `_`. 10174 (This will be the potential closer closest 10175 to the beginning of the input -- the first one in parse order.) 10176 10177- Now, look back in the stack (staying above `stack_bottom` and 10178 the `openers_bottom` for this delimiter type) for the 10179 first matching potential opener ("matching" means same delimiter). 10180 10181- If one is found: 10182 10183 + Figure out whether we have emphasis or strong emphasis: 10184 if both closer and opener spans have length >= 2, we have 10185 strong, otherwise regular. 10186 10187 + Insert an emph or strong emph node accordingly, after 10188 the text node corresponding to the opener. 10189 10190 + Remove any delimiters between the opener and closer from 10191 the delimiter stack. 10192 10193 + Remove 1 (for regular emph) or 2 (for strong emph) delimiters 10194 from the opening and closing text nodes. If they become empty 10195 as a result, remove them and remove the corresponding element 10196 of the delimiter stack. If the closing node is removed, reset 10197 `current_position` to the next element in the stack. 10198 10199- If none is found: 10200 10201 + Set `openers_bottom` to the element before `current_position`. 10202 (We know that there are no openers for this kind of closer up to and 10203 including this point, so this puts a lower bound on future searches.) 10204 10205 + If the closer at `current_position` is not a potential opener, 10206 remove it from the delimiter stack (since we know it can't 10207 be a closer either). 10208 10209 + Advance `current_position` to the next element in the stack. 10210 10211After we're done, we remove all delimiters above `stack_bottom` from the 10212delimiter stack. 10213
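
The *process emphasis* steps above can be sketched in Python as follows. This is a simplified illustration under several assumptions, not the reference implementation: it covers only the case where `stack_bottom` is NULL, it uses a plain Python list in place of the doubly linked list, it expects the caller to supply the opener and closer flags, and it matches openers by delimiter character only, as described in this appendix. All names (`Node`, `Delim`, `build`, `process_emphasis`, `render`) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    kind: str                      # "text", "emph" or "strong"
    literal: str = ""
    children: list = field(default_factory=list)


@dataclass(eq=False)
class Delim:
    node: Node                     # text node holding the delimiter run
    char: str                      # "*" or "_"
    can_open: bool                 # potential opener (decided by the caller)
    can_close: bool                # potential closer (decided by the caller)

    @property
    def count(self):
        return len(self.node.literal)


def build(parts):
    """Plain strings become text nodes; (run, can_open, can_close)
    tuples also get an entry on the delimiter stack."""
    nodes, delims = [], []
    for part in parts:
        if isinstance(part, str):
            nodes.append(Node("text", part))
        else:
            run, can_open, can_close = part
            nodes.append(Node("text", run))
            delims.append(Delim(nodes[-1], run[0], can_open, can_close))
    return nodes, delims


def process_emphasis(nodes, delims):
    openers_bottom = {}            # (char, closer length % 3) -> lower bound
    i = 0                          # current_position
    while i < len(delims):
        closer = delims[i]
        if not closer.can_close:
            i += 1
            continue
        key = (closer.char, closer.count % 3)
        bottom = openers_bottom.get(key)
        # Look back for the nearest potential opener with the same delimiter.
        j = i - 1
        while j >= 0 and delims[j] is not bottom:
            if delims[j].can_open and delims[j].char == closer.char:
                break
            j -= 1
        if j < 0 or delims[j] is bottom:
            # No opener found: record the bound, drop a pure closer.
            openers_bottom[key] = delims[i - 1] if i > 0 else None
            if closer.can_open:
                i += 1
            else:
                del delims[i]
            continue
        opener = delims[j]
        strong = opener.count >= 2 and closer.count >= 2
        width = 2 if strong else 1
        a, b = nodes.index(opener.node), nodes.index(closer.node)
        wrapper = Node("strong" if strong else "emph", children=nodes[a + 1:b])
        nodes[a + 1:b] = [wrapper]
        del delims[j + 1:i]        # drop delimiters between opener and closer
        i = j + 1                  # the closer now directly follows the opener
        opener.node.literal = opener.node.literal[width:]
        closer.node.literal = closer.node.literal[width:]
        if opener.count == 0:
            nodes.remove(opener.node)
            del delims[j]
            i -= 1
        if closer.count == 0:
            nodes.remove(closer.node)
            del delims[i]          # i now names the next element
    delims.clear()                 # remove all remaining delimiters


def render(nodes):
    out = []
    for n in nodes:
        if n.kind == "text":
            out.append(n.literal)
        else:
            tag = "em" if n.kind == "emph" else "strong"
            out.append(f"<{tag}>{render(n.children)}</{tag}>")
    return "".join(out)
```

For example, calling `build([("***", True, False), "foo", ("***", False, True)])`, passing the result to `process_emphasis`, and rendering the nodes yields `<em><strong>foo</strong></em>`: the first pass consumes two delimiters on each side for the strong emphasis, and the second pass consumes the remaining one for the regular emphasis.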