Markdown Formatter
The formatter renders the AST as markdown, with formatting options that clean up the source and make it consistent. It also comes with an API that lets extensions provide their own formatting options and handle rendering of markdown for custom nodes.
ℹ️ In versions prior to 0.60.0, formatter functionality was implemented in the `flexmark-formatter` module and required an additional dependency.
The formatter module also implements an API for translating markdown documents to another language, described in Translation Helper API.
The `Formatter` class is a renderer that outputs markdown and formats it according to the specified options. Use it in place of `HtmlRenderer` to get formatted markdown. It can also be used to convert indentation from one `ParserEmulationProfile` to another:

    package com.vladsch.flexmark.samples;

    import com.vladsch.flexmark.formatter.Formatter;
    import com.vladsch.flexmark.parser.Parser;
    import com.vladsch.flexmark.profile.pegdown.Extensions;
    import com.vladsch.flexmark.profile.pegdown.PegdownOptionsAdapter;
    import com.vladsch.flexmark.util.data.DataHolder;
    import com.vladsch.flexmark.util.data.MutableDataSet;

    public class PegdownToCommonMark {
        private static final DataHolder OPTIONS = PegdownOptionsAdapter.flexmarkOptions(
                Extensions.ALL
        );

        static final MutableDataSet FORMAT_OPTIONS = new MutableDataSet();
        static {
            // copy extensions from Pegdown compatible to Formatting, but leave the rest default
            FORMAT_OPTIONS.set(Parser.EXTENSIONS, Parser.EXTENSIONS.get(OPTIONS));
        }

        static final Parser PARSER = Parser.builder(OPTIONS).build();
        static final Formatter RENDERER = Formatter.builder(FORMAT_OPTIONS).build();

        // use the PARSER to parse pegdown indentation rules and RENDERER to render CommonMark
    }

The code above converts pegdown 4-space indents to CommonMark list item text column indents. For example, this input:
#Heading
-----
paragraph text
lazy continuation
* list item
> block quote
lazy continuation
~~~info
with uneven indent
with uneven indent
indented code
~~~
with uneven indent
with uneven indent
indented code
1. numbered item 1
1. numbered item 2
1. numbered item 3
- bullet item 1
- bullet item 2
- bullet item 3
1. numbered sub-item 1
1. numbered sub-item 2
1. numbered sub-item 3
~~~info
with uneven indent
with uneven indent
indented code
~~~
with uneven indent
with uneven indent
indented code
Converted to CommonMark indents, ATX heading spaces added, blank lines added, fenced and indented code indents minimized:
# Heading
-----
paragraph text
lazy continuation
* list item
> block quote
> lazy continuation
~~~info
with uneven indent
with uneven indent
indented code
~~~
with uneven indent
with uneven indent
indented code
1. numbered item 1
2. numbered item 2
3. numbered item 3
- bullet item 1
- bullet item 2
- bullet item 3
1. numbered sub-item 1
2. numbered sub-item 2
3. numbered sub-item 3
~~~info
with uneven indent
with uneven indent
indented code
~~~
with uneven indent
with uneven indent
indented code
Get the full sample source PegdownToCommonMark.java.
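Two of the changes visible in the converted sample, the space added after the ATX marker and the renumbered ordered list, can be illustrated without the library. The following is a minimal plain-Java sketch of those two rules only. The class `SampleRulesSketch` and its methods are hypothetical names invented for this illustration and are not part of flexmark-java, which works on the parsed AST rather than on raw lines.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; this is not flexmark-java code.
public class SampleRulesSketch {
    // 1 to 6 '#' characters followed directly by a character that is neither '#' nor whitespace
    private static final Pattern ATX_NO_SPACE = Pattern.compile("^(#{1,6})([^#\\s].*)$");
    // an unindented ordered list item: up to 9 digits, '.' or ')', one space, then the item text
    private static final Pattern ORDERED_ITEM = Pattern.compile("^(\\d{1,9})([.)]) (.*)$");

    // Adds the missing space after an ATX heading marker: "#Heading" becomes "# Heading".
    public static String addSpaceAfterAtxMarker(String line) {
        Matcher m = ATX_NO_SPACE.matcher(line);
        return m.matches() ? m.group(1) + " " + m.group(2) : line;
    }

    // Renumbers each run of consecutive unindented ordered items, starting from the run's first number.
    public static List<String> renumber(List<String> lines) {
        List<String> out = new ArrayList<>();
        int next = -1; // -1 means "not currently inside an ordered list"
        for (String line : lines) {
            Matcher m = ORDERED_ITEM.matcher(line);
            if (m.matches()) {
                if (next < 0) next = Integer.parseInt(m.group(1));
                out.add(next + m.group(2) + " " + m.group(3));
                next++;
            } else {
                out.add(line);
                next = -1; // simplification: any other line ends the current run
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(addSpaceAfterAtxMarker("#Heading"));
        List<String> items = Arrays.asList("1. numbered item 1", "1. numbered item 2", "1. numbered item 3");
        for (String line : renumber(items)) {
            System.out.println(line);
        }
    }
}
```

The sketch deliberately ignores nesting, lazy continuation, and code blocks, which is why a real formatter needs the AST produced by the parser.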
The options listed below are defined in the `Formatter` class. Extensions that handle formatting of their custom nodes can, and do, provide their own formatting options.
- `FORMATTER_EMULATION_PROFILE`: default `Parser.PARSER_EMULATION_PROFILE`, emulation profile to use for formatting. Can be used to change indenting rules from the ones used by the parser.
- `MAX_BLANK_LINES`: default `2`, maximum number of blank lines to keep in the file
- `MAX_TRAILING_BLANK_LINES`: default `1`, maximum number of trailing blank lines in the file
- `SPACE_AFTER_ATX_MARKER`: default `DiscretionaryText.ADD`, handling of the space after the ATX marker
- `SETEXT_HEADER_EQUALIZE_MARKER`: default `true`, when `true` equalizes the setext marker to the header text length
- `ATX_HEADER_TRAILING_MARKER`: default `EqualizeTrailingMarker.AS_IS`, handling of trailing ATX `#` markers:
    - `AS_IS`: do nothing
    - `ADD`: add the same number of `#` as the opening marker
    - `EQUALIZE`: where a trailing marker is present, make it the same number of `#` as the opening marker
    - `REMOVE`: remove
- `THEMATIC_BREAK`: default `(String) null`, string to use for thematic breaks. `null` means leave as is.
- `BLOCK_QUOTE_BLANK_LINES`: default `true`, add blank lines around block quotes
- `BLOCK_QUOTE_MARKERS`: default `BlockQuoteMarker.ADD_COMPACT_WITH_SPACE`
    - `AS_IS`: no change, first line marker is propagated to the full block quote content
    - `ADD_COMPACT`: use `>` and `>>..>>` for nested block quotes
    - `ADD_COMPACT_WITH_SPACE`: use `> ` and `>>..>> ` (compact markers followed by a space) for nested block quotes
    - `ADD_SPACED`: use `> ` and `> > .. > > ` for nested block quotes
- `INDENTED_CODE_MINIMIZE_INDENT`: default `true`, when `true` removes extra indent common to all content lines
- `FENCED_CODE_MINIMIZE_INDENT`: default `true`, when `true` removes extra indent common to all content lines
- `FENCED_CODE_MATCH_CLOSING_MARKER`: default `true`, when `true` the opening marker is used for the closing marker
- `FENCED_CODE_SPACE_BEFORE_INFO`: default `false`, when `true` a space is added between the opening marker and the info string
- `FENCED_CODE_MARKER_LENGTH`: default `3`, minimum code fence marker length
- `FENCED_CODE_MARKER_TYPE`: default `CodeFenceMarker.ANY`
    - `ANY`: no change, whatever is used
    - `BACK_TICK`: change to back ticks
    - `TILDE`: change to `~`
- `LIST_ADD_BLANK_LINE_BEFORE`: default `false`, when `true` adds a blank line before the first list item if it follows a paragraph
- `LIST_RENUMBER_ITEMS`: default `true`, when `true` renumbers the ordered list items
- `LIST_BULLET_MARKER`: default `ListBulletMarker.ANY`
    - `ANY`: no change
    - `DASH`: change all to `-`
    - `ASTERISK`: change all to `*`
    - `PLUS`: change all to `+`
- `LIST_NUMBERED_MARKER`: default `ListNumberedMarker.ANY`
    - `ANY`: no change
    - `DOT`: change all to `.`
    - `PAREN`: change all to `)`
- `LIST_SPACING`: default `ListSpacing.AS_IS`
    - `AS_IS`: no change
    - `LOOSEN`: loose if the list has a loose item
    - `TIGHTEN`: tight if the list has a tight item
    - `LOOSE`: always loose
    - `TIGHT`: always tight
- `REFERENCE_PLACEMENT`: default `ElementPlacement.AS_IS`
    - `AS_IS`: no change
    - `DOCUMENT_TOP`: put all references at the top of the document
    - `GROUP_WITH_FIRST`: group all with the first reference
    - `GROUP_WITH_LAST`: group all with the last reference
    - `DOCUMENT_BOTTOM`: put all references at the bottom of the document
- `REFERENCE_SORT`: default `ElementPlacementSort.AS_IS`, only applies if `REFERENCE_PLACEMENT` is not `AS_IS`
    - `AS_IS`: no change
    - `SORT`: sort in alphabetical order by reference text
    - `SORT_UNUSED_LAST`: sort in alphabetical order by reference text, put unreferenced ones last
- `KEEP_IMAGE_LINKS_AT_START`: default `false`, when `true` image links will always be wrapped to be the first non-space on the line
- `KEEP_EXPLICIT_LINKS_AT_START`: default `false`, when `true` explicit links will always be wrapped to be the first non-space on the line
- `KEEP_HARD_LINE_BREAKS`: default `true`, when `false` hard line breaks are eliminated along with their EOL
- `KEEP_SOFT_LINE_BREAKS`: default `true`, when `false` soft line breaks are eliminated. This allows formatting markdown for processors that treat soft line breaks as hard line breaks.
- `APPEND_TRANSFERRED_REFERENCES`: default `false`, when `true` appends transferred references to the bottom of the document being formatted
- `SKIP_FENCED_CODE`: default `false`, when `true` converts fenced code to indented code in the generated markdown
- `SKIP_CHAR_ESCAPE`: default `false`, when `true` special characters are not escaped
- `RIGHT_MARGIN`: default `0`, since 0.60.0, if greater than 0 then text is wrapped to the given margin
- `APPLY_SPECIAL_LEAD_IN_HANDLERS`: default `true`, since 0.60.0, when `true` escapes special lead-in characters which wrap to the beginning of a line and un-escapes any which wrap away from the beginning of a line. Used to prevent special characters inside a paragraph body from starting a new element when wrapped to the beginning of a line.

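To make the `INDENTED_CODE_MINIMIZE_INDENT` / `FENCED_CODE_MINIMIZE_INDENT` and `MAX_BLANK_LINES` descriptions concrete, here is a minimal plain-Java sketch of the two rules as they are described in the list: removing the indent common to all content lines, and capping runs of blank lines. `OptionBehaviorSketch`, `minimizeIndent` and `limitBlankLines` are hypothetical names for this illustration. They are not flexmark-java APIs, and the library's handling of tabs and other edge cases may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; this is not flexmark-java code.
public class OptionBehaviorSketch {
    // Counts leading space characters (tabs are not expanded in this sketch).
    private static int leadingSpaces(String line) {
        int i = 0;
        while (i < line.length() && line.charAt(i) == ' ') i++;
        return i;
    }

    // Removes the indent shared by all non-blank lines, keeping relative indentation.
    public static List<String> minimizeIndent(List<String> lines) {
        int common = Integer.MAX_VALUE;
        for (String line : lines) {
            if (!line.trim().isEmpty()) common = Math.min(common, leadingSpaces(line));
        }
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            // blank lines become empty; every other line loses exactly the common indent
            out.add(line.trim().isEmpty() ? "" : line.substring(common));
        }
        return out;
    }

    // Keeps at most maxBlankLines consecutive blank lines and drops the rest of each run.
    public static List<String> limitBlankLines(List<String> lines, int maxBlankLines) {
        List<String> out = new ArrayList<>();
        int blankRun = 0;
        for (String line : lines) {
            blankRun = line.trim().isEmpty() ? blankRun + 1 : 0;
            if (blankRun <= maxBlankLines) out.add(line);
        }
        return out;
    }

    public static void main(String[] args) {
        // common indent is 2 spaces, so two spaces are removed from every line
        minimizeIndent(Arrays.asList("    first", "      second", "  third"))
                .forEach(System.out::println);
        // three blank lines between "a" and "b" are reduced to two
        limitBlankLines(Arrays.asList("a", "", "", "", "b"), 2)
                .forEach(System.out::println);
    }
}
```

Because only the shared indent is removed, lines that were indented further than the others stay indented relative to them, which matches the "uneven indent" lines in the converted sample.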
The best source of sample code is the existing extensions that implement custom node formatting, and the formatter module itself:

- flexmark-ext-abbreviation
- flexmark-ext-definition
- flexmark-ext-footnotes
- flexmark-ext-jekyll-front-matter
- flexmark-ext-tables
- flexmark-formatter