[Proposal] Require XML input instead of Markdown #356

colinodell · 2019-04-01T16:28:37Z

Problem

Parsing Markdown is very difficult to do accurately. Sure, the CommonMark spec does a great job documenting the various edge cases and how to handle them, and this library does a great job implementing that, but we still need to maintain code that performs those (sometimes complex) conversions. What happens if the spec changes in the future? Or what if there are some undocumented edge cases it doesn't currently cover?

Solution

Instead of parsing Markdown input, change this library to only accept a valid XML representation of a CommonMark AST as input. By no longer accepting Markdown input, this would completely eliminate all parsing issues, making it super easy to process and also drastically speed up the conversion process.

We can leverage the spec's DTD for this (https://github.com/commonmark/CommonMark/blob/master/CommonMark.dtd). Because this is an open standard, adoption should be relatively straightforward.

Databases containing Markdown content would need to store the XML version instead. Site admins can perform this conversion by manually feeding their Markdown input into the commonmark.js dingus and copy-pasting the output from the AST tab.

Examples

Instead of:

# Hello World!

We'd change this library to require this input:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE document SYSTEM "CommonMark.dtd">

<document xmlns="http://commonmark.org/xml/1.0">
  <heading level="1">
    <text>Hello World</text>
    <text>!</text>
  </heading>
</document>

Note how the AST is already structured properly for our consumption - we can directly convert that to Node objects.
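To make the "directly convert that to Node objects" step concrete, here is a minimal sketch in Python (not the library's PHP, and not its actual API). The `Node` dataclass and the `to_node` helper are hypothetical names invented for illustration; only the XML itself comes from the example above.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Namespace used by the CommonMark XML format, in ElementTree's "{uri}" form.
NS = "{http://commonmark.org/xml/1.0}"


@dataclass
class Node:
    """Hypothetical AST node; a stand-in for whatever the real library uses."""
    type: str
    attrs: Dict[str, str] = field(default_factory=dict)
    literal: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def to_node(element: ET.Element) -> Node:
    """Recursively map one XML element onto one Node."""
    # Strip the "{namespace}" prefix that ElementTree adds to tag names.
    node = Node(type=element.tag.split("}", 1)[-1], attrs=dict(element.attrib))
    if len(element) == 0:
        # Leaf elements such as <text> carry their content as a literal.
        node.literal = element.text
    for child in element:
        node.children.append(to_node(child))
    return node


XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE document SYSTEM "CommonMark.dtd">

<document xmlns="http://commonmark.org/xml/1.0">
  <heading level="1">
    <text>Hello World</text>
    <text>!</text>
  </heading>
</document>
"""

root = to_node(ET.fromstring(XML))
heading = root.children[0]
print(heading.type, heading.attrs, [c.literal for c in heading.children])
```

Note that this sketch does not validate the input against the DTD; it only walks whatever well-formed XML it is given.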

Here's another example - to get the same output as:

CommonMark is *fun* and **easy**!

We'd simply require the user to provide the following input:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE document SYSTEM "CommonMark.dtd">

<document xmlns="http://commonmark.org/xml/1.0">
  <paragraph>
    <text>CommonMark is </text>
    <emph>
      <text>fun</text>
    </emph>
    <text> and </text>
    <strong>
      <text>easy</text>
    </strong>
    <text>!</text>
  </paragraph>
</document>

Future Considerations

We could also consider dropping the HTML rendering completely and sticking with XML-only output. This would have the same benefit of reducing complexity, since the AST nodes map directly to XML elements.
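As a rough illustration of how AST nodes could map one-to-one onto XML elements, here is a small Python sketch (again, not the library's PHP and not its real renderer). The `Node` dataclass, `to_element`, and `render` are hypothetical names, and the output is unindented and omits the XML declaration and DOCTYPE.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import Dict, List, Optional

NS = "http://commonmark.org/xml/1.0"


@dataclass
class Node:
    """Hypothetical AST node; a stand-in for whatever the real library uses."""
    type: str
    attrs: Dict[str, str] = field(default_factory=dict)
    literal: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def to_element(node: Node) -> ET.Element:
    """Each node becomes exactly one element named after its type."""
    element = ET.Element(node.type, node.attrs)
    element.text = node.literal
    for child in node.children:
        element.append(to_element(child))
    return element


def render(root: Node) -> str:
    """Serialize the tree, declaring the CommonMark namespace on the root."""
    element = to_element(root)
    element.set("xmlns", NS)
    return ET.tostring(element, encoding="unicode")


# The AST for: CommonMark is *fun* and **easy**!
doc = Node("document", children=[
    Node("paragraph", children=[
        Node("text", literal="CommonMark is "),
        Node("emph", children=[Node("text", literal="fun")]),
        Node("text", literal=" and "),
        Node("strong", children=[Node("text", literal="easy")]),
        Node("text", literal="!"),
    ]),
])

out = render(doc)
print(out)
```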

TL;DR:

Instead of converting Markdown -> PHP representation -> HTML we'll just do XML -> PHP representation -> XML because that's super easy for everyone, right?

pmjones · 2019-04-01T17:52:02Z

Why the half-measures? Instead, accept only LISP s-expressions.

colinodell · 2019-04-01T23:24:56Z

Despite this being an April Fools' joke, I will say that two-way XML conversion actually is on our roadmap! Don't worry, this will NEVER fully replace Markdown parsing and HTML rendering like this fake proposal claimed to do - what would be the point? Rather, this would simply be additional functionality for those who may want it :)

rlbaxter · 2019-04-03T13:03:09Z

I don't know what this library is, but Google decided to put this in my Now Feed for some reason. I read the proposal and wanted to stab you. Now that I know this was only a joke, I still want to flip a table or something.

colinodell · 2019-04-03T17:17:47Z

@rlbaxter I've got just the library for you: https://github.com/sgolemon/table-flip

colinodell added the "good first issue" (An easy issue for new collaborators) and "april fools" labels Apr 1, 2019

colinodell closed this as completed Apr 1, 2019
