Full Changelog: https://github.com/executablebooks/markdown-it-py/compare/v2.2.0...v3.0.0
Also add testing for Python 3.11
A key change is the addition of a new `Token` type, `text_special`, which is used to represent HTML entities and backslash-escaped characters.
This ensures that (core) typographic transformation rules are not incorrectly applied to these texts.
The final core rule is now the new `text_join` rule, which joins adjacent `text`/`text_special` tokens, so no `text_special` tokens should be present in the final token stream.
Any custom typographic rules should be inserted before `text_join`.
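The joining step described above can be pictured with a minimal, standard-library-only sketch. The `Tok` class and this `text_join` function are hypothetical stand-ins written for illustration; they are not the library's `Token` class or its actual core rule.

```python
from dataclasses import dataclass


@dataclass
class Tok:
    """Hypothetical stand-in for a token (not markdown-it-py's Token)."""

    type: str
    content: str = ""


def text_join(tokens: list[Tok]) -> list[Tok]:
    """Convert text_special to text, then merge runs of adjacent text tokens."""
    joined: list[Tok] = []
    for tok in tokens:
        # text_special only exists to shield content from earlier rules
        kind = "text" if tok.type == "text_special" else tok.type
        if kind == "text" and joined and joined[-1].type == "text":
            joined[-1].content += tok.content
        else:
            joined.append(Tok(kind, tok.content))
    return joined


stream = [
    Tok("text", "1 "),
    Tok("text_special", "©"),  # e.g. produced from an HTML entity
    Tok("text", " 2"),
    Tok("softbreak"),
    Tok("text", "3"),
]
result = text_join(stream)
```

Because the special token is only folded into plain text at the very end, a typographic rule that runs earlier can skip it, which is why custom rules belong before the join.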
A new `linkify` rule has also been added to the inline chain, which will linkify full URLs (e.g. `https://example.com`), and fixes collision of emphasis and linkifier (so `http://example.org/foo._bar_-_baz` is now a single link, not emphasized).
Emails and fuzzy links are not affected by this.
- ♻️ Refactor backslash escape logic, add `text_special` #276
- ♻️ Parse entities to `text_special` token #280
- ♻️ Refactor: Add linkifier rule to inline chain for full links #279
- ‼️ Remove `(p)` => `§` replacement in typographer #281
- ‼️ Remove unused `silent` arg in `ParserBlock.tokenize` #284
- 🐛 FIX: numeric character reference passing #272
- 🐛 Fix: tab preventing paragraph continuation in lists #274
- 👌 Improve nested emphasis parsing #273
- 👌 fix possible ReDOS in newline rule #275
- 👌 Improve performance of `skipSpaces`/`skipChars` #271
- 👌 Show text of `text_special` in `tree.pretty` #282
The use of `StateBase.srcCharCode` is deprecated (with backward-compatibility), and all core uses are replaced by `StateBase.src`.
Conversion of source string characters to an integer representing the Unicode character is prevalent in the upstream JavaScript implementation, to improve performance.
However, it is unnecessary in Python and leads to harder-to-read code and performance degradations (during the conversion in the `StateBase` initialisation).
See #270, thanks to @hukkinj1.
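As a rough illustration of the difference, the sketch below scans a run of marker characters in two ways: over pre-computed integer code points (the upstream style) and over the source string directly. The function names are hypothetical and only loosely modelled on the `skipChars` helper mentioned above.

```python
src = "***bold***"

# Upstream style: an extra pass over the whole source to build integer codes.
src_char_codes = tuple(ord(ch) for ch in src)


def skip_chars_codes(pos: int, code: int) -> int:
    """Advance past a run of one character, comparing integer code points."""
    while pos < len(src_char_codes) and src_char_codes[pos] == code:
        pos += 1
    return pos


def skip_chars(pos: int, ch: str) -> int:
    """Advance past a run of one character, comparing strings directly."""
    while pos < len(src) and src[pos] == ch:
        pos += 1
    return pos


end_codes = skip_chars_codes(0, 0x2A)  # 0x2A is the code point of "*"
end_str = skip_chars(0, "*")
```

Both calls stop at the same position; the string version avoids the up-front conversion and the opaque `0x2A` constant.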
For CommonMark, the presence of indented code blocks prevents any other block element from having an indent of 4 or more spaces.
Certain Markdown flavors and derivatives, such as mdx and djot, disable these code blocks though, since it is more common to use code fences and/or arbitrary indenting is desirable.
Previously, disabling code blocks did not remove the indent limitation, since most block elements had the 3 space limitation hard-coded.
This change centralises the logic of applying this limitation (in `StateBlock.is_code_block`), and only applies it when indented code blocks are enabled.
This allows for e.g.
<div>
  <div>
    I can indent as much as I want here.
  </div>
</div>
See #260
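The centralised check can be sketched as a single predicate. This is an assumption-laden illustration: the standalone function and its parameter names (`indent`, `blk_indent`, `code_enabled`) are hypothetical, and the real `StateBlock.is_code_block` method may differ in signature and detail.

```python
def is_code_block(indent: int, blk_indent: int, code_enabled: bool) -> bool:
    """Return True if a line is indented enough to count as indented code.

    indent       -- the line's indentation in columns (hypothetical parameter)
    blk_indent   -- indentation required by the enclosing block (hypothetical)
    code_enabled -- whether the indented code block rule is active
    """
    return code_enabled and (indent - blk_indent) >= 4


# With indented code enabled, 4+ extra columns blocks other block rules.
limited = is_code_block(indent=4, blk_indent=0, code_enabled=True)

# With the rule disabled, any amount of indentation is accepted.
unlimited = is_code_block(indent=8, blk_indent=0, code_enabled=False)
```

Each block rule can then ask this one question instead of hard-coding its own 3 space limit.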
Strict type annotation checking has been applied to the whole code base, ruff is now used for linting, and fuzzing tests have been added to the CI, to integrate with Google OSS-Fuzz testing, thanks to @DavidKorczynski.
- 🔧 MAINTAIN: Make type checking strict #
- 🔧 Add typing of rule functions #283
- 🔧 Move linting from flake8 to ruff #268
- 🧪 CI: Add fuzzing workflow for PRs #262
- 🔧 Add tox env for fuzz testcase run #263
- 🧪 Add OSS-Fuzz set up by @DavidKorczynski in #255
- 🧪 Fix fuzzing test failures #254
- ⬆️ UPGRADE: Allow linkify-it-py v2 by @hukkin in #218
- 🐛 FIX: CVE-2023-26303 by @chrisjsewell in #246
- 🐛 FIX: CLI crash on non-utf8 character by @chrisjsewell in #247
- 📚 DOCS: Update the example by @redstoneleo in #229
- 📚 DOCS: Add section about markdown renderer by @holamgadol in #227
- 🔧 Create SECURITY.md by @chrisjsewell in #248
- 🔧 MAINTAIN: Update mypy's additional dependencies by @hukkin in #217
- Fix typo by @jwilk in #230
- 🔧 Bump GH actions by @chrisjsewell in #244
- 🔧 Update benchmark pkg versions by @chrisjsewell in #245
Thanks to 🎉
- @jwilk made their first contribution in #230
- @holamgadol made their first contribution in #227
- @redstoneleo made their first contribution in #229
Full Changelog: https://github.com/executablebooks/markdown-it-py/compare/v2.1.0...v2.2.0
This release is primarily to replace the `attrs` package dependency with the built-in Python `dataclasses` module.
This should not be a breaking change for most use cases.
- ⬆️ UPGRADE: Drop support for EOL Python 3.6 (#194)
- ♻️ REFACTOR: Move `Rule`/`Delimiter` classes from `attrs` to `dataclass` (#211)
- ♻️ REFACTOR: Move `Token` class from `attrs` to `dataclass` (#211)
- ‼️ Remove deprecated `NestedTokens` and `nest_tokens`
- ✨ NEW: Save ordered list numbering (#192)
- 🐛 FIX: Combination of blockquotes, list and newlines causes `IndexError` (#207)
- 🐛 FIX: Crash when file ends with empty blockquote line.
- ✨ NEW: Add `inline_definitions` option. This option allows for a `definition` token to be inserted into the token stream, at the point where the definition is located in the source text. It is useful for cases where one wishes to capture a "lossless" syntax tree of the parsed Markdown (in conjunction with the `store_labels` option).
- ⬆️ Update: Sync with markdown-it v12.1.0 and CommonMark v0.30
- ♻️ REFACTOR: Port `mdurl` and `punycode` for URL normalisation (thanks to @hukkin!). This port fixes the outstanding CommonMark compliance tests.
- ♻️ REFACTOR: Remove `AttrDict`. This is no longer used in core or mdit-py-plugins; instead, standard dictionaries are used.
- 👌 IMPROVE: Use `__all__` to signal re-exports
⬆️ UPGRADE: `attrs` -> v21 (#165)
This release has no breaking changes (see: https://github.com/python-attrs/attrs/blob/main/CHANGELOG.rst)
The first stable release of markdown-it-py 🎉
See the changes in the beta releases below, thanks to all the contributors in the last year!
- 👌 IMPROVE: Add `RendererProtocol` type, for typing renderers (thanks to @hukkinj1)
- 🔧 MAINTAIN: `None` is no longer allowed as a valid `src` input for `StateBase` subclasses
Move `mdit-py-plugins` out of the core install requirements and into a `plugins` extra.
Synchronised code with the upstream Markdown-It `v12.0.6`:
- 🐛 FIX: Raise HTML blocks priority to resolve conflict with headings
- 🐛 FIX: Newline not rendered in image alt attribute
This is the first beta release of the stable v1.x series.
There are four notable (and breaking) changes:
- The code has been synchronised with the upstream Markdown-It `v12.0.4`. In particular, this update alters the parsing of tables to be consistent with the GFM specification: https://github.github.com/gfm/#tables-extension- A number of parsing performance and validation improvements are also included.
- `Token.attrs` are now stored as dictionaries, rather than a list of lists. This is a departure from upstream Markdown-It, allowed by Python's guarantee of ordered dictionaries (see #142), and is the more natural representation. Note that the `attrGet`, `attrSet`, `attrPush` and `attrJoin` methods remain identical to those upstream, and `Token.as_dict(as_upstream=True)` will convert the token back to a directly comparable dict.
- The use of `AttrDict` has been replaced: For `env`, any Python mutable mapping is now allowed, and so attribute access to keys is not (differing from the Javascript dictionary). For `MarkdownIt.options`, it is now set as an `OptionsDict`, which is a dictionary sub-class, with attribute access only for core MarkdownIt configuration keys.
- Introduction of the `SyntaxTreeNode`. This is a more comprehensive replacement for `nest_tokens` and `NestedTokens` (which are now deprecated). It allows for the `Token` stream to be converted to/from a nested tree structure, with opening/closing tokens collapsed into a single `SyntaxTreeNode` and the intermediate tokens set as children. See the "Creating a syntax tree" documentation for details.
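The collapsing of opening/closing tokens into a tree can be sketched with a simple stack. This is a simplified illustration only: `Tok` and `Node` are hypothetical stand-ins, not the library's `Token` or `SyntaxTreeNode` classes, and naming a collapsed node by stripping the `_open` suffix is an assumption made for the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Tok:
    """Hypothetical flat token: nesting is 1 (open), 0 (leaf) or -1 (close)."""

    type: str
    nesting: int
    content: str = ""


@dataclass
class Node:
    """Hypothetical tree node holding its intermediate tokens as children."""

    type: str
    content: str = ""
    children: list["Node"] = field(default_factory=list)


def to_tree(tokens: list[Tok]) -> Node:
    root = Node("root")
    stack = [root]
    for tok in tokens:
        if tok.nesting == 1:
            # An "x_open"/"x_close" pair collapses into a single node "x".
            node = Node(tok.type.removesuffix("_open"))
            stack[-1].children.append(node)
            stack.append(node)
        elif tok.nesting == -1:
            stack.pop()
        else:
            stack[-1].children.append(Node(tok.type, tok.content))
    return root


tokens = [
    Tok("blockquote_open", 1),
    Tok("paragraph_open", 1),
    Tok("inline", 0, "hi"),
    Tok("paragraph_close", -1),
    Tok("blockquote_close", -1),
]
tree = to_tree(tokens)
```

Here five flat tokens become a three-level tree: a blockquote node containing a paragraph node containing the inline leaf.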
- Fix exception due to empty lines after blockquote+footnote
- Fix linkify link nesting levels
- Fix the use of `Ruler.at` for plugins
- Avoid fenced token mutations during rendering
- Fix CLI version info and correct return of exit codes
This release brings Markdown-It-Py in line with Markdown-It v11.0.1 (2020-09-14), applying two fixes:
Thanks to @hukkinj1!
This release provides some improvements to the code base:
- 🐛 FIX: Do not resolve backslash escapes inside auto-links
- 🐛 FIX: Add content to image tokens
- 👌 IMPROVE: Add more type annotations, thanks to @hukkinj1
🗑 DEPRECATE: Move plugins to `mdit_py_plugins`
Plugins (in `markdown_it.extensions`) have now been moved to executablebooks/mdit-py-plugins.
This will allow for their maintenance to occur on a different cycle to the core code, facilitating the release of a v1.0.0 for this package.
🔧 MAINTAIN: Add mypy type-checking, thanks to @hukkinj1.
✨ NEW: Add linkify, thanks to @tsutsu3.
This extension uses linkify-it-py to identify URL links within text:
`github.com` -> `<a href="http://github.com">github.com</a>`
Important: To use this extension you must install linkify-it-py: `pip install markdown-it-py[linkify]`
It can then be activated by:
from markdown_it import MarkdownIt
md = MarkdownIt().enable("linkify")
md.options["linkify"] = True
✨ NEW: Add smartquotes, thanks to @tsutsu3.
This extension will convert basic quote marks to their opening and closing variants:
- 'single quotes' -> ‘single quotes’
- "double quotes" -> “double quotes”
It can be activated by:
from markdown_it import MarkdownIt
md = MarkdownIt().enable("smartquotes")
md.options["typographer"] = True
✨ NEW: Add markdown-it-task-lists plugin, thanks to @wna-se.
This is a port of the JS markdown-it-task-lists, for building task/todo lists out of markdown lists with items starting with `[ ]` or `[x]`.
For example:
- [ ] An item that needs doing
- [x] An item that is complete
This plugin can be activated by:
from markdown_it import MarkdownIt
from markdown_it.extensions.tasklists import tasklists_plugin
md = MarkdownIt().use(tasklists_plugin)
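To give a feel for what such a plugin has to recognise, here is a small standard-library sketch that detects the checkbox marker at the start of a list item's text. The regular expression and the `parse_task` helper are hypothetical and are not taken from the plugin's source.

```python
import re
from typing import Optional

# A checkbox marker, then whitespace, then the rest of the item text.
TASK_RE = re.compile(r"^\[([ xX])\]\s+(.*)$")


def parse_task(item_text: str) -> Optional[tuple[bool, str]]:
    """Return (checked, label) for a task item, or None for a plain item."""
    match = TASK_RE.match(item_text)
    if match is None:
        return None
    return (match.group(1) != " ", match.group(2))


todo = parse_task("[ ] An item that needs doing")
done = parse_task("[x] An item that is complete")
plain = parse_task("Just a regular list item")
```

A renderer could then emit a checkbox input in front of the label whenever the helper returns a tuple.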
🐛 Various bug fixes, thanks to @hukkinj1:
- Do not copy empty `env` arg in `MarkdownIt.render`
- `_Entities.__contains__` fix return data
- Parsing of unicode ordinals
- Handling of final character in `skipSpacesBack` and `skipCharsBack` methods
- Avoid exception when document ends in heading/blockquote marker
🧪 TESTS: Add CI for Python 3.9 and PyPy3
- ✨ NEW: Add simple typographic replacements, thanks to @tsutsu3: This allows you to add the `typographer` option to the parser, to replace particular text constructs:
  - `(c)`, `(C)` → ©
  - `(tm)`, `(TM)` → ™
  - `(r)`, `(R)` → ®
  - `(p)`, `(P)` → §
  - `+-` → ±
  - `...` → …
  - `?....` → ?..
  - `!....` → !..
  - `????????` → ???
  - `!!!!!` → !!!
  - `,,,` → ,
  - `--` → &ndash
  - `---` → &mdash

  md = MarkdownIt().enable("replacements")
  md.options["typographer"] = True
- 📚 DOCS: Improve documentation for CLI, thanks to @westurner
- 👌 IMPROVE: Use `re.sub()` instead of `re.subn()[0]`, thanks to @hukkinj1
- 🐛 FIX: An exception raised by having multiple blank lines at the end of some files
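Several of the replacements listed above can be approximated with a few `re.sub()` calls. This is a sketch under stated assumptions: the patterns are written for illustration rather than copied from the library, the dash rules are left out, and `(p)` is omitted because that replacement was removed in v3.0.0.

```python
import re

SYMBOLS = {"c": "©", "tm": "™", "r": "®"}
SCOPED_RE = re.compile(r"\((c|tm|r)\)", re.IGNORECASE)


def replace_typography(text: str) -> str:
    """Apply a subset of the typographic replacements (illustrative only)."""
    text = SCOPED_RE.sub(lambda m: SYMBOLS[m.group(1).lower()], text)
    text = text.replace("+-", "±")
    # Collapse runs of dots first, then restore "?.." / "!.." after ? or !
    text = re.sub(r"\.{3,}", "…", text)
    text = re.sub(r"([?!])…", r"\1..", text)
    # Long runs of ? or ! shrink to three; repeated commas shrink to one
    text = re.sub(r"([?!]){4,}", r"\1\1\1", text)
    text = re.sub(r",{2,}", ",", text)
    return text


symbols = replace_typography("(c) (TM) (R) +-")
ellipsis = replace_typography("Wait...")
question = replace_typography("What?....")
```

Note that `re.sub()` returns the new string directly, so there is no need to index into the `(string, count)` tuple that `re.subn()` produces.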
👌 IMPROVE: Add `store_labels` option.
This allows for storage of the original reference label in a link/image token's metadata, which can be useful for renderers.
✨ NEW: Add `anchors_plugin` for headers, which can produce:
<h1 id="title-string">Title String <a class="header-anchor" href="#title-string">¶</a></h1>
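The shape of that output can be reproduced with a short sketch. The `slugify` and `render_heading` helpers below are hypothetical; the plugin's own slug rules and options may differ.

```python
import re
from html import escape


def slugify(title: str) -> str:
    """Lower-case the title, drop punctuation, and hyphenate whitespace."""
    slug = title.strip().lower()
    slug = re.sub(r"[^\w\s-]", "", slug)
    return re.sub(r"\s+", "-", slug)


def render_heading(title: str, level: int = 1, symbol: str = "¶") -> str:
    """Render a heading with an id and a trailing permalink anchor."""
    slug = slugify(title)
    return (
        f'<h{level} id="{slug}">{escape(title)} '
        f'<a class="header-anchor" href="#{slug}">{symbol}</a></h{level}>'
    )


html = render_heading("Title String")
```

The same slug is used for both the `id` attribute and the `href`, which is what makes the permalink point back at its own heading.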
🐛 Fixed an undefined variable in the reference block.
🐛 Fixed an `IndexError` in `container_plugin`, when there is no newline on the closing tag line.
⬆️ UPGRADE: attrs -> v20
This is not breaking, since it only deprecates Python 3.4 (see CHANGELOG.rst)
Added `deflist` and `dollarmath` plugins (see plugins list).
- Added benchmarking tests and CI (see https://executablebooks.github.io/markdown-it-py/dev/bench/)
- Improved performance of computing ordinals (=> 10-15% parsing speed increase). Thanks to @sildar!
- Stopped empty lines at the end of the document, after certain list blocks, raising an exception (#36).
- Allow myst-role to accept names containing digits (0-9).
Added `containers` plugin (see plugins list)
- Plugins and improved contributing section