Parsedown



I've been spending all my intellectual free time working on my Kavanot site, so I haven't been doing any independent programming. But that site uses raw HTML, which is a pain to type. So I decided to start using Markdown to make writing easier. After a little trial and error, I decided to use Parsedown with Parsedown Extra.


See the code.


This gives me tables and blockquotes along with simple URLs and <em> and <strong>. But it's not perfect.

(As an aside, tables were a bit of work to figure out. They have to start with | whatever | whatever and the next line has to be the divider, |---|---|, with exactly the same number of cells. Only that number of cells will display, so a body row with three cells under a two-cell header and divider will only produce two cells, losing that third column. Also, there's no way to eliminate the header entirely, but if the header cells are blank, then the empty <thead> will take minimal space.)

Under the Hood

I wanted to add things that would make my life easier, such as adding language attributes (since I go between English and Hebrew text, with a smattering of Greek and even some Hieroglyphics) and easily entering <cite> and <i> elements.

So that meant looking at the source code. There is a tutorial for creating extensions, but it is not based on the most recent version (which as of this writing is 1.8.0-beta-7), so it's incomplete.


Parsedown has only one useful public method, Parsedown::text($text). It works by breaking the text into lines, then calling linesElements($lines), which iterates over each line with lineElements($line) (yes, it's confusing to have the only difference being an 's' in the middle of the name) to parse the lines into an array of 'element's. Each 'element' is itself an array, and it can carry a 'handler' array that names the method to run on the element's content along with the 'argument' to pass to it.

The method elements(array $Elements) then recursively processes the elements to produce a string of markup.
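The recursion can be sketched as follows. This is a simplified stand-alone model, not Parsedown's own code: the functions renderElement and renderElements are hypothetical, and the keys 'name', 'text' and 'elements' are assumptions chosen for the sketch that should be checked against the Parsedown source.

```php
<?php
// Simplified model of an element tree and its recursive rendering.
// Function names and array keys are illustrative, not Parsedown's API.

function renderElement(array $element): string
{
    // Child elements are rendered recursively; plain text is escaped.
    if (isset($element['elements'])) {
        $inner = renderElements($element['elements']);
    } else {
        $inner = htmlspecialchars($element['text'] ?? '', ENT_QUOTES, 'UTF-8');
    }

    // An element without a name contributes only its content.
    if (!isset($element['name'])) {
        return $inner;
    }

    return '<' . $element['name'] . '>' . $inner . '</' . $element['name'] . '>';
}

function renderElements(array $elements): string
{
    // Sibling elements are joined with newlines.
    return implode("\n", array_map('renderElement', $elements));
}

$tree = [
    ['name' => 'blockquote', 'elements' => [
        ['name' => 'p', 'text' => 'Quoted & nested'],
    ]],
    ['name' => 'p', 'text' => 'Plain paragraph'],
];

echo renderElements($tree), "\n";
```

The point of the model is only that markup falls out of walking a nested array, so an extension that wants new output mostly needs to produce the right array.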

The Details: Block level elements

Parsing a line consists of looking at its first character for the marker of a 'block element' (the markers are the keys of the $BlockTypes array), or finding no marker, which means either a <p> or a <pre><code> element, depending on whether the line is indented. Parsedown then creates a method name of 'block'.$blockType (for instance blockQuote), and calls that with the line to be parsed and the current state of the parser, which is called a 'Block' and is an array.

The function returns NULL if it cannot handle the text, returns the original 'Block' array (modified as necessary), or returns a new 'Block' array (in that case, the last 'Block' is processed to produce an array of 'element's).
If the 'Block' is marked 'continuable', then the method 'block'.$blockType.Continue (for instance blockQuoteContinue) is called with the next line. When a 'Block' is processed, the method 'block'.$blockType.Complete (for instance blockQuoteComplete) is called.

If the handling function returns NULL, the next handler in $BlockTypes[$marker] is called, until the 'Block' is handled or the paragraph handler is called.

Block-level handlers generally create 'elements' that have 'handler' 'linesElements', and the continuation handlers append the line to the 'argument', so processing will continue recursively and elements can nest.
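To make the dispatch concrete, here is a minimal stand-alone model of it. It is a sketch under assumptions, not Parsedown's code: the class TinyBlockParser, its parseLine method and the two toy handlers are hypothetical, the $BlockTypes table is a small illustrative subset, and continuation and completion are left out.

```php
<?php
// Simplified model of marker-based block dispatch; not Parsedown's actual code.

class TinyBlockParser
{
    // First character of a line => block types to try, in order.
    protected $BlockTypes = [
        '#' => ['Header'],
        '>' => ['Quote'],
    ];

    public function parseLine(string $line): array
    {
        $marker = $line === '' ? '' : $line[0];

        foreach ($this->BlockTypes[$marker] ?? [] as $blockType) {
            // Build the method name from the type, e.g. 'blockQuote'.
            $Block = $this->{'block' . $blockType}($line);

            if ($Block !== null) {
                $Block['type'] = $blockType;
                return $Block;
            }
            // NULL means "not mine": try the next candidate for this marker.
        }

        // No handler accepted the line, so it becomes a paragraph.
        return ['type' => 'Paragraph', 'element' => ['name' => 'p', 'text' => $line]];
    }

    protected function blockHeader(string $line): ?array
    {
        // This toy handler insists on a space after the '#' characters.
        if (!preg_match('/^(#{1,6}) +(.+)$/', $line, $m)) {
            return null;
        }
        return ['element' => ['name' => 'h' . strlen($m[1]), 'text' => $m[2]]];
    }

    protected function blockQuote(string $line): ?array
    {
        if (!preg_match('/^> ?(.*)$/', $line, $m)) {
            return null;
        }
        // 'continuable' signals that following lines may extend this block.
        return ['continuable' => true, 'element' => ['name' => 'blockquote', 'text' => $m[1]]];
    }
}

$parser = new TinyBlockParser();
print_r($parser->parseLine('## Title'));   // accepted by blockHeader
print_r($parser->parseLine('#hashtag'));   // declined, falls back to a paragraph
```

Because the method name is assembled from a string, adding a block type in this model takes one new entry in $BlockTypes and one new 'block'-prefixed method.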

The Details: inline elements

Once there are no more markers for block elements, each line is scanned for markers for inline elements. For some reason, the program lists these in two places, once as the keys of $InlineTypes and once as the string $inlineMarkerList, where the author could have just built the string from the array keys in the constructor. I would do that for any Parsedown extension.
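A minimal sketch of deriving one list from the other, assuming the marker string should contain exactly the keys of the type table. MarkerListDemo is a hypothetical stand-in class; only the two property names are taken from Parsedown, and the table is a small illustrative subset.

```php
<?php
// Hypothetical stand-in class showing the marker string built from the array keys.

class MarkerListDemo
{
    // Marker character => inline types to try for it (illustrative subset).
    protected $InlineTypes = [
        '*' => ['Emphasis'],
        '_' => ['Emphasis'],
        '`' => ['Code'],
    ];

    protected $inlineMarkerList;

    public function __construct()
    {
        // Derive the scan string from the keys so the two cannot drift apart.
        $this->inlineMarkerList = implode('', array_keys($this->InlineTypes));
    }

    public function markers(): string
    {
        return $this->inlineMarkerList;
    }
}

echo (new MarkerListDemo())->markers(), "\n";
```

In an extension, the same implode() line would presumably go at the end of the subclass constructor, after the extension has added its own entries to $InlineTypes.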

But the handling is similar to that for block elements. For each line, Parsedown scans for any of the characters in $inlineMarkerList, then, for each of the strings for that marker in $InlineTypes, creates a method name 'inline'.$inlineType (for instance inlineEmphasis) and calls that with the string to be parsed (starting from the marker, ending at the newline). The handler decides whether it wants to handle the text. If not, it returns NULL. If so, it returns an array with two values: how many characters it consumed and the 'element' it produced.

Processing then continues with the rest of the line. Any text not handled is left untouched.
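The scanning loop can be modelled in a few lines. Again this is a sketch, not Parsedown's code: TinyInlineParser and its line method are hypothetical, the key names 'extent' and 'element' are assumed here and worth verifying against the source, and only a toy emphasis handler is included.

```php
<?php
// Simplified model of inline scanning; not Parsedown's actual code.

class TinyInlineParser
{
    protected $InlineTypes = ['*' => ['Emphasis']];
    protected $inlineMarkerList = '*';

    public function line(string $text): string
    {
        $markup = '';

        // strpbrk() returns the text from the first marker onward, or false.
        while (($excerpt = strpbrk($text, $this->inlineMarkerList)) !== false) {
            $marker = $excerpt[0];
            $position = strlen($text) - strlen($excerpt);

            foreach ($this->InlineTypes[$marker] as $inlineType) {
                $Inline = $this->{'inline' . $inlineType}($excerpt);

                if ($Inline === null) {
                    continue; // handler declined; try the next type
                }

                // Emit the untouched text before the marker, then the element.
                $element = $Inline['element'];
                $markup .= htmlspecialchars(substr($text, 0, $position));
                $markup .= '<' . $element['name'] . '>'
                    . htmlspecialchars($element['text'])
                    . '</' . $element['name'] . '>';

                // Skip what the handler consumed and rescan the remainder.
                $text = substr($text, $position + $Inline['extent']);
                continue 2;
            }

            // No handler took this marker: pass it through and move on.
            $markup .= htmlspecialchars(substr($text, 0, $position + 1));
            $text = substr($text, $position + 1);
        }

        return $markup . htmlspecialchars($text);
    }

    protected function inlineEmphasis(string $excerpt): ?array
    {
        // Toy rule: '*', a non-space character, anything but '*', then '*'.
        if (!preg_match('/^\*([^*\s][^*]*)\*/', $excerpt, $m)) {
            return null;
        }
        return [
            'extent' => strlen($m[0]),
            'element' => ['name' => 'em', 'text' => $m[1]],
        ];
    }
}

echo (new TinyInlineParser())->line('a *b* c * d'), "\n";
```

The first asterisk pair is turned into an element, while the lone asterisk is declined by the handler and passes through unchanged, which is the "any text not handled is left untouched" behaviour.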

Now I know enough to create some extensions.
