pegdown is a pure Java library for clean and lightweight Markdown processing based on a parboiled PEG parser.
pegdown is nearly 100% compatible with the original Markdown specification and fully passes the original Markdown test suite.
On top of the standard Markdown feature set, pegdown implements a number of extensions similar to what other popular Markdown processors offer.
Currently pegdown supports the following extensions over standard Markdown:
- SMARTS: Beautifies apostrophes, ellipses ("..." and ". . .") and dashes ("--" and "---")
- QUOTES: Beautifies single quotes, double quotes and double angle quotes (« and »)
- SMARTYPANTS: Convenience extension enabling both SMARTS and QUOTES at once.
- ABBREVIATIONS: Abbreviations in the way of PHP Markdown Extra.
- HARDWRAPS: Alternative handling of newlines, see Github-flavoured-Markdown
- AUTOLINKS: Plain (undelimited) autolinks the way Github-flavoured-Markdown implements them.
- TABLES: Tables similar to MultiMarkdown (which is in turn like the PHP Markdown Extra tables, but with colspan support).
- DEFINITION LISTS: Definition lists in the way of PHP Markdown Extra.
- FENCED CODE BLOCKS: Fenced Code Blocks in the way of PHP Markdown Extra or Github-flavoured-Markdown.
- HTML BLOCK SUPPRESSION: Suppresses the output of HTML blocks.
- INLINE HTML SUPPRESSION: Suppresses the output of inline HTML elements.
- WIKILINKS: Support [[Wiki-style links]] with a customizable URL rendering logic.
Note: pegdown differs from the original Markdown in that it ignores in-word emphasis, as in:
> my_cool_file.txt
> 2*3*4=5
Currently this "extension" cannot be switched off.
You have two options for obtaining pegdown:
- Download the JAR for the latest version from the download page. pegdown 1.2.0 has only one dependency: parboiled for Java, version 1.1.3.
- The pegdown artifact is also available from Maven Central with group id org.pegdown and artifact id pegdown.
Using pegdown is very simple: just create a new instance of a PegDownProcessor and call one of its markdownToHtml methods to convert the given Markdown source to an HTML string. If you'd like to customize the rendering of HTML links (Auto-Links, Explicit-Links, Mail-Links, Reference-Links and/or Wiki-Links), e.g. for adding rel="nofollow" attributes based on some logic, you can supply your own instance of a LinkRenderer with the call to markdownToHtml.
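The "some logic" behind a rel="nofollow" decision can be as simple as a host check. The sketch below is independent of pegdown and uses only the JDK; the class `NofollowPolicy`, its methods, and the idea of a single trusted host are assumptions made for illustration, not part of the pegdown API.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical helper (not part of pegdown): decides which links get rel="nofollow". */
public class NofollowPolicy {
    private final String trustedHost;

    public NofollowPolicy(String trustedHost) {
        this.trustedHost = trustedHost;
    }

    /** True for absolute links that point to a host other than the trusted one. */
    public boolean needsNofollow(String url) {
        try {
            String host = new URI(url).getHost();
            // Relative links have no host and therefore stay on the current site.
            return host != null && !host.equalsIgnoreCase(trustedHost);
        } catch (URISyntaxException e) {
            // Assumption: URLs that cannot be parsed are treated as untrusted.
            return true;
        }
    }

    /** Builds the anchor tag; url and text are assumed to be HTML-escaped already. */
    public String renderLink(String url, String text) {
        String rel = needsNofollow(url) ? " rel=\"nofollow\"" : "";
        return "<a href=\"" + url + "\"" + rel + ">" + text + "</a>";
    }

    public static void main(String[] args) {
        NofollowPolicy policy = new NofollowPolicy("example.org");
        System.out.println(policy.renderLink("http://example.org/docs", "docs"));
        System.out.println(policy.renderLink("http://elsewhere.test/", "elsewhere"));
    }
}
```

In a pegdown integration, a decision like this would live inside the custom LinkRenderer that is passed to markdownToHtml; the API documentation lists the methods that can be overridden.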
You can also use pegdown only for the actual parsing of the Markdown source and do the serialization to the target format (e.g. XML) yourself. To do this, just call the parseMarkdown method of the PegDownProcessor to obtain the root node of the Abstract Syntax Tree for the document.
With a custom Visitor implementation you can do whatever serialization you want. As an example, you might want to take a look at the sources of the ToHtmlSerializer.
Note that the first time you create a PegDownProcessor it can take up to a few hundred milliseconds to prepare the underlying parboiled parser instance. However, once the first processor has been built all further instantiations will be fast. Also, you can reuse an existing PegDownProcessor instance as often as you want, as long as you prevent concurrent accesses, since neither the PegDownProcessor nor the underlying parser is thread-safe.
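One way to rule out concurrent access is to give every thread its own processor. The following is a minimal, library-independent sketch built on `ThreadLocal`; the class name `PerThread` is hypothetical, and `StringBuilder` merely stands in for any object that is not thread-safe.

```java
import java.util.function.Supplier;

/** Hypothetical helper (not part of pegdown): lazily creates one instance per thread. */
public class PerThread<T> {
    private final ThreadLocal<T> instances;

    public PerThread(Supplier<? extends T> factory) {
        // The factory runs at most once per thread, on that thread's first get().
        this.instances = ThreadLocal.withInitial(factory);
    }

    /** Returns the calling thread's own instance, creating it on first use. */
    public T get() {
        return instances.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // StringBuilder stands in for any non-thread-safe object.
        PerThread<StringBuilder> holder = new PerThread<>(StringBuilder::new);
        StringBuilder mine = holder.get();

        StringBuilder[] other = new StringBuilder[1];
        Thread worker = new Thread(() -> other[0] = holder.get());
        worker.start();
        worker.join();

        System.out.println(mine == holder.get()); // same thread: the instance is reused
        System.out.println(mine == other[0]);     // other thread: a separate instance
    }
}
```

With pegdown on the classpath, the holder could presumably be created as `new PerThread<PegDownProcessor>(PegDownProcessor::new)`, with each worker thread calling `get().markdownToHtml(...)` on its own processor. Synchronizing on a single shared processor is the simpler alternative when throughput does not matter.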
See http://sirthias.github.com/pegdown/api for the pegdown API documentation.
Since Markdown has no official grammar and contains a number of ambiguities, the parsing of Markdown source, especially with language extensions enabled, can be "hard" and result, in certain corner cases, in exponential parsing time. To provide somewhat predictable behavior, pegdown therefore supports the specification of a parsing timeout, which you can supply to the PegDownProcessor constructor.
If the parser happens to run longer than the specified timeout period, it terminates itself with an exception, which causes the markdownToHtml method to return null. Your application should then deal with this case accordingly and, for example, inform the user.
The default timeout, if not explicitly specified, is 2 seconds.
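Because a timed-out conversion surfaces as a null return value, callers need an explicit check before using the result. The sketch below shows one way to centralize that check. It is library-independent and takes the conversion as a `Function<String, String>`; the names `SafeRenderer` and `renderOrFallback` are made up for illustration.

```java
import java.util.function.Function;

/** Hypothetical helper (not part of pegdown) for converters that return null on failure. */
public class SafeRenderer {

    /**
     * Applies the converter and substitutes fallbackHtml when the result is null,
     * which is how a parsing timeout is reported to the caller.
     */
    public static String renderOrFallback(Function<String, String> converter,
                                          String markdown,
                                          String fallbackHtml) {
        String html = converter.apply(markdown);
        return html != null ? html : fallbackHtml;
    }

    public static void main(String[] args) {
        // Stand-in converters; a real one would delegate to a PegDownProcessor.
        Function<String, String> working = md -> "<p>" + md + "</p>";
        Function<String, String> timedOut = md -> null;

        String notice = "<p>This document took too long to render.</p>";
        System.out.println(renderOrFallback(working, "hello", notice));
        System.out.println(renderOrFallback(timedOut, "hello", notice));
    }
}
```

With pegdown, the converter would be something like `md -> processor.markdownToHtml(md)`, where `processor` is a PegDownProcessor created with the desired timeout.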
The excellent idea-markdown plugin for IntelliJ IDEA, RubyMine, PhpStorm, WebStorm, PyCharm and appCode uses pegdown as its underlying parsing engine. The plugin gives you proper syntax highlighting for Markdown source and shows you exactly how pegdown will parse your texts.
A large part of the underlying PEG grammar was developed by John MacFarlane and made available with his tool peg-markdown.
pegdown is licensed under Apache License 2.0.
Feedback and contributions to the project, no matter what kind, are always very welcome. However, patches can only be accepted from their original author. Along with any patches, please state that the patch is your original work and that you license the work to the pegdown project under the project’s open source license.