| Name | Date | Size | #Lines | LOC |
|---|---|---|---|---|
| bin/ | 10-Jan-2021 | - | 171 | 126 |
| block/ | 10-Jan-2021 | - | 830 | 527 |
| inline/ | 10-Jan-2021 | - | 514 | 344 |
| tests/ | 10-Jan-2021 | - | 17,175 | 15,573 |
| .scrutinizer.yml | 26-Mar-2018 | 97 | 6 | 5 |
| .travis.yml | 26-Mar-2018 | 951 | 43 | 34 |
| CHANGELOG.md | 26-Mar-2018 | 5.4 KiB | 105 | 76 |
| CONTRIBUTING.md | 26-Mar-2018 | 1.6 KiB | 37 | 22 |
| GithubMarkdown.php | 26-Mar-2018 | 2.7 KiB | 115 | 69 |
| LICENSE | 26-Mar-2018 | 1.1 KiB | 21 | 17 |
| Markdown.php | 26-Mar-2018 | 3.2 KiB | 129 | 73 |
| MarkdownExtra.php | 26-Mar-2018 | 7 KiB | 256 | 184 |
| Parser.php | 26-Mar-2018 | 9.4 KiB | 390 | 201 |
| README.md | 26-Mar-2018 | 21.3 KiB | 522 | 379 |
| composer.json | 26-Mar-2018 | 852 | 43 | 42 |
| phpunit.xml.dist | 26-Mar-2018 | 697 | 28 | 23 |

README.md

A super fast, highly extensible markdown parser for PHP
=======================================================

What is this? <a name="what"></a>
-------------

A set of [PHP][] classes, each representing a [Markdown][] flavor, and a command line tool
for converting markdown files to HTML files.

The implementation focuses on being **fast** (see [benchmark][]) and **extensible**.
Parsing Markdown to HTML is as simple as calling a single method (see [Usage](#usage)), and the implementation
gives the expected result in most cases, even for non-trivial edge cases.

Extending the Markdown language with new elements is as simple as adding a new method to the class that converts the
markdown text to the expected output in HTML. This is possible without dealing with complex and error-prone regular expressions.
It is also possible to hook into the markdown structure and add elements or read meta information using the internal representation
of the Markdown text as an abstract syntax tree (see [Extending the language](#extend)).

Currently the following markdown flavors are supported:

- **Traditional Markdown** according to <http://daringfireball.net/projects/markdown/syntax> ([try it!](http://markdown.cebe.cc/try?flavor=default)).
- **Github flavored Markdown** according to <https://help.github.com/articles/github-flavored-markdown> ([try it!](http://markdown.cebe.cc/try?flavor=gfm)).
- **Markdown Extra** according to <http://michelf.ca/projects/php-markdown/extra/> (not yet fully supported, work in progress, see [#25][]; [try it!](http://markdown.cebe.cc/try?flavor=extra)).
- Any mixed Markdown flavor you like, because of its highly extensible structure (see the documentation below).

Future plans are to support:

- Smarty Pants <http://daringfireball.net/projects/smartypants/>
- ... (Feel free to [suggest](https://github.com/cebe/markdown/issues/new) further additions!)

[PHP]: http://php.net/ "PHP is a popular general-purpose scripting language that is especially suited to web development."
[Markdown]: http://en.wikipedia.org/wiki/Markdown "Markdown on Wikipedia"
[#25]: https://github.com/cebe/markdown/issues/25 "issue #25"
[benchmark]: https://github.com/kzykhys/Markbench#readme "kzykhys/Markbench on github"

### Who is using it?

- It powers the [API-docs and the definitive guide](http://www.yiiframework.com/doc-2.0/) for the [Yii Framework][] [2.0](https://github.com/yiisoft/yii2).

[Yii Framework]: http://www.yiiframework.com/ "The Yii PHP Framework"


Installation <a name="installation"></a>
------------

[PHP 5.4 or higher](http://www.php.net/downloads.php) is required to use it.
It will also run on Facebook's [hhvm](http://hhvm.com/).

The library uses PHPDoc annotations to determine the markdown elements that should be parsed.
So if you are using PHP `opcache`, make sure
[it does not strip comments](http://php.net/manual/en/opcache.configuration.php#ini.opcache.save-comments).

The recommended way to install it is via [composer][] by running:

    composer require cebe/markdown "~1.2.0"

Alternatively you can add the following to the `require` section in your `composer.json` manually:

```json
"cebe/markdown": "~1.2.0"
```

Run `composer update` afterwards.

[composer]: https://getcomposer.org/ "The PHP package manager"

> Note: If you have configured PHP with opcache you need to enable the
> [opcache.save_comments](http://php.net/manual/en/opcache.configuration.php#ini.opcache.save-comments) option because inline element parsing relies on PHPDoc annotations to find declared elements.
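
The opcache requirement can be checked at runtime. The following is a minimal sketch and not part of the library: the helper name `docCommentsAvailable()` is hypothetical, and it only inspects the standard `opcache.enable` and `opcache.save_comments` ini settings.

```php
<?php

/**
 * Hypothetical helper (not part of cebe/markdown).
 * Returns true when PHPDoc comments are kept at runtime, i.e. when
 * opcache is not enabled or opcache.save_comments is switched on.
 */
function docCommentsAvailable()
{
    // ini_get() returns false when the opcache extension is not loaded,
    // which filter_var() maps to boolean false as well.
    $opcacheEnabled = filter_var(ini_get('opcache.enable'), FILTER_VALIDATE_BOOLEAN);
    if (!$opcacheEnabled) {
        return true;
    }
    return filter_var(ini_get('opcache.save_comments'), FILTER_VALIDATE_BOOLEAN);
}

if (!docCommentsAvailable()) {
    // Without doc comments the @marker annotations cannot be read.
    trigger_error('opcache.save_comments is disabled; enable it before using the parser.', E_USER_WARNING);
}
```

Running such a check once at application start is cheaper than debugging missing inline elements later.
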

Usage <a name="usage"></a>
-----

### In your PHP project

To parse your markdown you need only two lines of code. The first one chooses the markdown flavor, which is
one of the following:

- Traditional Markdown: `$parser = new \cebe\markdown\Markdown();`
- Github Flavored Markdown: `$parser = new \cebe\markdown\GithubMarkdown();`
- Markdown Extra: `$parser = new \cebe\markdown\MarkdownExtra();`

The next step is to call the `parse()` method to parse text using the full markdown language,
or the `parseParagraph()` method to parse only inline elements.

Here are some examples:

```php
// traditional markdown and parse full text
$parser = new \cebe\markdown\Markdown();
echo $parser->parse($markdown);

// use github markdown
$parser = new \cebe\markdown\GithubMarkdown();
echo $parser->parse($markdown);

// use markdown extra
$parser = new \cebe\markdown\MarkdownExtra();
echo $parser->parse($markdown);

// parse only inline elements (useful for one-line descriptions)
$parser = new \cebe\markdown\GithubMarkdown();
echo $parser->parseParagraph($markdown);
```

You may optionally set any of the following options on the parser object:

For all Markdown flavors:

- `$parser->html5 = true` to enable HTML5 output instead of HTML4.
- `$parser->keepListStartNumber = true` to keep the numbers of ordered lists as specified in the markdown.
  The default behavior is to always start from 1 and increment by one regardless of the number in markdown.

For GithubMarkdown:

- `$parser->enableNewlines = true` to convert all newlines to `<br/>`-tags. By default only newlines with two preceding spaces are converted to `<br/>`-tags.

It is recommended to use UTF-8 encoding for the input strings. Other encodings may work, but are currently untested.
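
The options above can be combined on a single parser object. The following is a minimal sketch: the helper `applyParserOptions()` is hypothetical and only sets the three properties documented above, and the commented usage assumes the library was installed with Composer so that `vendor/autoload.php` exists next to the script.

```php
<?php

/**
 * Hypothetical helper (not part of cebe/markdown): switches on the
 * documented options on the given parser object and returns it.
 */
function applyParserOptions($parser)
{
    $parser->html5 = true;               // HTML5 output instead of HTML4
    $parser->keepListStartNumber = true; // keep the start number of ordered lists
    $parser->enableNewlines = true;      // GithubMarkdown only: every newline becomes <br/>
    return $parser;
}

// Usage, assuming a Composer install of cebe/markdown:
//
//   require __DIR__ . '/vendor/autoload.php';
//   $parser = applyParserOptions(new \cebe\markdown\GithubMarkdown());
//   echo $parser->parse("3. third item\n4. fourth item");
```

Since the options are plain public properties, they can also be set directly after constructing the parser; the helper only keeps the configuration in one place.
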

### The command line script

You can use it to render this readme:

    bin/markdown README.md > README.html

Using github flavored markdown:

    bin/markdown --flavor=gfm README.md > README.html

or convert the original markdown description to html using the unix pipe:

    curl http://daringfireball.net/projects/markdown/syntax.text | bin/markdown > md.html

Here is the full Help output you will see when running `bin/markdown --help`:

    PHP Markdown to HTML converter
    ------------------------------

    by Carsten Brandt <mail@cebe.cc>

    Usage:
        bin/markdown [--flavor=<flavor>] [--full] [file.md]

        --flavor  specifies the markdown flavor to use. If omitted the original markdown by John Gruber [1] will be used.
                  Available flavors:

                  gfm   - Github flavored markdown [2]
                  extra - Markdown Extra [3]

        --full    ouput a full HTML page with head and body. If not given, only the parsed markdown will be output.

        --help    shows this usage information.

        If no file is specified input will be read from STDIN.

    Examples:

        Render a file with original markdown:

            bin/markdown README.md > README.html

        Render a file using gihtub flavored markdown:

            bin/markdown --flavor=gfm README.md > README.html

        Convert the original markdown description to html using STDIN:

            curl http://daringfireball.net/projects/markdown/syntax.text | bin/markdown > md.html


    [1] http://daringfireball.net/projects/markdown/syntax
    [2] https://help.github.com/articles/github-flavored-markdown
    [3] http://michelf.ca/projects/php-markdown/extra/


Extensions
----------

Here are some extensions to this library:

- [Bogardo/markdown-codepen](https://github.com/Bogardo/markdown-codepen) - shortcode to embed codepens from http://codepen.io/ in markdown.
- [kartik-v/yii2-markdown](https://github.com/kartik-v/yii2-markdown) - Advanced Markdown editing and conversion utilities for Yii Framework 2.0.
- [cebe/markdown-latex](https://github.com/cebe/markdown-latex) - Convert Markdown to LaTeX and PDF
- [softark/creole](https://github.com/softark/creole) - A creole markup parser
- [hyn/frontmatter](https://github.com/hyn/frontmatter) - Frontmatter Metadata Support (JSON, TOML, YAML)
- ... [add yours!](https://github.com/cebe/markdown/edit/master/README.md#L186)


Extending the language <a name="extend"></a>
----------------------

Markdown consists of two types of language elements. I'll call them block and inline elements, similar to what you have in
HTML with `<div>` and `<span>`. Block elements normally spread over several lines and are separated by blank lines.
The most basic block element is a paragraph (`<p>`).
Inline elements are elements that are added inside of block elements, i.e. inside of text.

This markdown parser allows you to extend the markdown language by changing the behavior of existing elements and by adding
new block and inline elements. You do this by extending the parser class and adding or overriding class methods and
properties. The different element types are extended in different ways, as you will see in the following sections.

### Adding block elements

The markdown is parsed line by line to identify each non-empty line as one of the block element types.
To identify a line as the beginning of a block element, the parser calls all protected class methods whose names begin with `identify`.
An identify function returns true if it has identified the block element it is responsible for, or false if not.
In the following example we will implement support for [fenced code blocks][] which are part of the github flavored markdown.

[fenced code blocks]: https://help.github.com/articles/github-flavored-markdown#fenced-code-blocks
                      "Fenced code block feature of github flavored markdown"

```php
<?php

class MyMarkdown extends \cebe\markdown\Markdown
{
    protected function identifyFencedCode($line, $lines, $current)
    {
        // if a line starts with at least 3 backticks it is identified as a fenced code block
        if (strncmp($line, '```', 3) === 0) {
            return true;
        }
        return false;
    }

    // ...
}
```

In the above, `$line` is a string containing the content of the current line and is equal to `$lines[$current]`.
You may use `$lines` and `$current` to check lines other than the current one. In most cases you can ignore these parameters.

Parsing of a block element is done in two steps:

1. **Consuming** all the lines belonging to it. In most cases this is iterating over the lines starting from the identified
   line until a blank line occurs. This step is implemented by a method named `consume{blockName}()` where `{blockName}`
   is the same name as used for the identify function above. The consume method also takes the lines array
   and the number of the current line. It returns two values: an array representing the block element in the abstract syntax tree
   of the markdown document, and the line number to parse next. In the abstract syntax array the first element refers to the name of
   the element; all other array elements can be defined freely.
   In our example we will implement it like this:

   ```php
   protected function consumeFencedCode($lines, $current)
   {
       // create block array
       $block = [
           'fencedCode',
           'content' => [],
       ];
       $line = rtrim($lines[$current]);

       // detect language and fence length (can be more than 3 backticks)
       $fence = substr($line, 0, $pos = strrpos($line, '`') + 1);
       $language = substr($line, $pos);
       if (!empty($language)) {
           $block['language'] = $language;
       }

       // consume all lines until ```
       for ($i = $current + 1, $count = count($lines); $i < $count; $i++) {
           if (rtrim($line = $lines[$i]) !== $fence) {
               $block['content'][] = $line;
           } else {
               // stop consuming when code block is over
               break;
           }
       }
       return [$block, $i];
   }
   ```

2. **Rendering** the element. After all blocks have been consumed, they are rendered using the
   `render{elementName}()` method where `elementName` refers to the name of the element in the abstract syntax tree:

   ```php
   protected function renderFencedCode($block)
   {
       $class = isset($block['language']) ? ' class="language-' . $block['language'] . '"' : '';
       return "<pre><code$class>" . htmlspecialchars(implode("\n", $block['content']) . "\n", ENT_NOQUOTES, 'UTF-8') . '</code></pre>';
   }
   ```

   You may also add code highlighting here. In general it would also be possible to render output in a language other than
   HTML, for example LaTeX.


### Adding inline elements

Adding inline elements is different from block elements as they are parsed using markers in the text.
An inline element is identified by a marker that marks the beginning of an inline element (e.g. `[` will mark a possible
beginning of a link or `` ` `` will mark inline code).

Parsing methods for inline elements are also protected and identified by the prefix `parse`.
Additionally a `@marker` annotation
in PHPDoc is needed to register the parse function for one or multiple markers.
The method will then be called when a marker is found in the text. As an argument it takes the text starting at the position of the marker.
The parser method returns an array containing the element of the abstract syntax tree and the offset of the text it has
parsed from the input markdown. All text up to this offset is removed from the markdown before the parser searches for the next marker.

As an example, we will add support for the [strikethrough][] feature of github flavored markdown:

[strikethrough]: https://help.github.com/articles/github-flavored-markdown#strikethrough "Strikethrough feature of github flavored markdown"

```php
<?php

class MyMarkdown extends \cebe\markdown\Markdown
{
    /**
     * @marker ~~
     */
    protected function parseStrike($markdown)
    {
        // check whether the marker really represents a strikethrough (i.e. there is a closing ~~)
        if (preg_match('/^~~(.+?)~~/', $markdown, $matches)) {
            return [
                // return the parsed tag as an element of the abstract syntax tree and call `parseInline()` to allow
                // other inline markdown elements inside this tag
                ['strike', $this->parseInline($matches[1])],
                // return the offset of the parsed text
                strlen($matches[0])
            ];
        }
        // in case we did not find a closing ~~ we just return the marker and skip 2 characters
        return [['text', '~~'], 2];
    }

    // rendering is the same as for block elements, we turn the abstract syntax array into a string.
    protected function renderStrike($element)
    {
        return '<del>' . $this->renderAbsy($element[1]) . '</del>';
    }
}
```

### Composing your own Markdown flavor

This markdown library is composed of traits so it is very easy to create your own markdown flavor by adding and/or removing
the single feature traits.

Designing your Markdown flavor consists of four steps:

1. Select a base class
2. Select language feature traits
3. Define escapeable characters
4. Optionally add custom rendering behavior

#### Select a base class

If you want to extend from a flavor and only add features you can use one of the existing classes
(`Markdown`, `GithubMarkdown` or `MarkdownExtra`) as your flavor's base class.

If you want to define a subset of the markdown language, i.e. remove some of the features, you have to
extend your class from `Parser`.

#### Select language feature traits

The following shows the trait selection for traditional Markdown.

```php
class MyMarkdown extends Parser
{
    // include block element parsing using traits
    use block\CodeTrait;
    use block\HeadlineTrait;
    use block\HtmlTrait {
        parseInlineHtml as private;
    }
    use block\ListTrait {
        // Check Ul List before headline
        identifyUl as protected identifyBUl;
        consumeUl as protected consumeBUl;
    }
    use block\QuoteTrait;
    use block\RuleTrait {
        // Check Hr before checking lists
        identifyHr as protected identifyAHr;
        consumeHr as protected consumeAHr;
    }
    // include inline element parsing using traits
    use inline\CodeTrait;
    use inline\EmphStrongTrait;
    use inline\LinkTrait;

    /**
     * @var boolean whether to format markup according to HTML5 spec.
     * Defaults to `false` which means that markup is formatted as HTML4.
     */
    public $html5 = false;

    protected function prepare()
    {
        // reset references
        $this->references = [];
    }

    // ...
}
```

In general, just adding the trait with `use` is enough, however in some cases some fine tuning is needed
to get the expected parsing results. Elements are detected in alphabetical order of their identification
function.
This means that if a line starting with `-` could be a list or a horizontal rule, the preference has to be set
by renaming the identification function. This is what is done by renaming `identifyHr` to `identifyAHr`
and `identifyUl` to `identifyBUl`. The consume function always has to have the same name as the identification function
so it has to be renamed too.

There is also a conflict for parsing of the `<` character. This could either be a link/email enclosed in `<` and `>`
or an inline HTML tag. In order to resolve this conflict when adding the `LinkTrait`, we need to hide the `parseInlineHtml`
method of the `HtmlTrait`.

If you use any trait that uses the `$html5` property to adjust its output you also need to define this property.

If you use the link trait it may be useful to implement `prepare()` as shown above to reset references before
parsing to ensure you get a reusable object.

#### Define escapeable characters

Depending on the language features you have chosen there is a different set of characters that can be escaped
using `\`. The following is the set of escapeable characters for traditional markdown; you can copy it to your class
as is.

```php
    /**
     * @var array these are "escapeable" characters. When using one of these prefixed with a
     * backslash, the character will be output without the backslash and is not interpreted
     * as markdown.
     */
    protected $escapeCharacters = [
        '\\', // backslash
        '`', // backtick
        '*', // asterisk
        '_', // underscore
        '{', '}', // curly braces
        '[', ']', // square brackets
        '(', ')', // parentheses
        '#', // hash mark
        '+', // plus sign
        '-', // minus sign (hyphen)
        '.', // dot
        '!', // exclamation mark
        '<', '>',
    ];
```

#### Add custom rendering behavior

Optionally you may also want to adjust rendering behavior by overriding some methods.
You may refer to the `consumeParagraph()` methods of the `Markdown` and `GithubMarkdown` classes for inspiration;
they define different rules for which elements are allowed to interrupt a paragraph.


Acknowledgements <a name="ack"></a>
----------------

I'd like to thank [@erusev][] for creating [Parsedown][] which heavily influenced this work and provided
the idea of the line based parsing approach.

[@erusev]: https://github.com/erusev "Emanuil Rusev"
[Parsedown]: http://parsedown.org/ "The Parsedown PHP Markdown parser"

FAQ <a name="faq"></a>
---

### Why another markdown parser?

While reviewing PHP markdown parsers to choose one to bundle with the [Yii framework 2.0][]
I found that most of the implementations use regex to replace patterns instead
of doing real parsing. This makes extending them with new language elements quite hard
as you have to come up with a complex regex that matches your addition but does not interfere
with other elements. Such additions are very common, as you can see on github, which supports referencing
issues, users and commits in the comments.
A [real parser][] should use context aware methods that walk through the text and
parse the tokens as they find them. The only implementation I have found that uses
this approach is [Parsedown][], which also shows that this approach is [much faster][benchmark]
than the regex way. Parsedown however is an implementation that focuses on speed and implements
its own flavor (mainly github flavored markdown) in one class and at the time of this writing was
not easily extensible.

Given the situation above I decided to start my own implementation using the parsing approach
from Parsedown and making it extensible by creating a class for each markdown flavor; the classes extend each
other in the same way that the markdown languages extend each other.
This allows you to choose between markdown language flavors and also provides a way to compose your
own flavor picking the best things from all.
I chose this approach as it is easier to implement and also more intuitive than
using callbacks to inject functionality into the parser.

[real parser]: http://en.wikipedia.org/wiki/Parsing#Types_of_parser

[Parsedown]: http://parsedown.org/ "The Parsedown PHP Markdown parser"

[Yii framework 2.0]: https://github.com/yiisoft/yii2

### Where do I report bugs or rendering issues?

Just [open an issue][] on github, post your markdown code and describe the problem. You may also attach screenshots of the rendered HTML result to describe your problem.

[open an issue]: https://github.com/cebe/markdown/issues/new

### How can I contribute to this library?

Check the [CONTRIBUTING.md](CONTRIBUTING.md) file for more info.


### Am I free to use this?

This library is open source and licensed under the [MIT License][]. This means that you can do whatever you want
with it as long as you mention my name and include the [license file][license]. Check the [license][] for details.

[MIT License]: http://opensource.org/licenses/MIT

[license]: https://github.com/cebe/markdown/blob/master/LICENSE

Contact
-------

Feel free to contact me using [email](mailto:mail@cebe.cc) or [twitter](https://twitter.com/cebe_cc).