An HTML to Markdown converter written in JavaScript.
The API is as follows:
toMarkdown(stringOfHTML, options);
Download the compiled script located at dist/to-markdown.js.
<script src="PATH/TO/to-markdown.js"></script>
<script>toMarkdown('<h1>Hello world!</h1>')</script>
Or with Bower:
$ bower install to-markdown
<script src="PATH/TO/bower_components/to-markdown/dist/to-markdown.js"></script>
<script>toMarkdown('<h1>Hello world!</h1>')</script>
Install the to-markdown module:
$ npm install to-markdown
Then use it as follows:
var toMarkdown = require('to-markdown');
toMarkdown('<h1>Hello world!</h1>');
(Note: as of v1, it is no longer necessary to call .toMarkdown on the required module.)
to-markdown can be extended by passing in an array of converters to the options object:
toMarkdown(stringOfHTML, { converters: [converter1, converter2, …] });
A converter object consists of a filter and a replacement. This example from the source replaces code elements:
{
  filter: 'code',
  replacement: function(content) {
    return '`' + content + '`';
  }
}
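As a further illustration, here is a minimal sketch of a custom converter. The name boldConverter and its choice of elements are assumptions made for this example and are not taken from the to-markdown source. The replacement function is called directly, so the snippet runs without the library; the commented-out last line shows how the converter would be passed through the options object described above.

```javascript
// Illustrative converter (an assumption, not taken from the to-markdown source):
// renders <b> and <strong> elements as Markdown bold text.
var boldConverter = {
  filter: ['b', 'strong'],
  replacement: function (content) {
    return '**' + content + '**';
  }
};

// The replacement function can be exercised directly with the inner content.
var bold = boldConverter.replacement('Hello world!');
console.log(bold); // **Hello world!**

// With the library loaded, the converter would be passed in via the options:
// toMarkdown('<b>Hello world!</b>', { converters: [boldConverter] });
```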
The filter property determines whether or not an element should be replaced. DOM nodes can be selected simply by filtering by tag name, with strings or with arrays of strings:
filter: 'p' will select p elements
filter: ['em', 'i'] will select em or i elements
Alternatively, the filter can be a function that returns a boolean depending on whether a given node should be replaced. The function is passed a DOM node as its only argument. For example, the following will match any span element with an italic font style:
filter: function (node) {
  return node.nodeName === 'SPAN' && /italic/i.test(node.style.fontStyle);
}
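The three filter forms can be compared side by side in a small sketch. The helper matchesFilter below is hypothetical and is not part of to-markdown's API; it only illustrates one way that string, array, and function filters could be evaluated. Plain objects stand in for DOM nodes, on the assumption that only nodeName and style.fontStyle are needed.

```javascript
// Hypothetical helper, not part of to-markdown: decides whether a node
// matches a filter given as a string, an array of strings, or a function.
function matchesFilter(node, filter) {
  var tagName = node.nodeName.toLowerCase();
  if (typeof filter === 'string') {
    return filter === tagName;
  }
  if (Array.isArray(filter)) {
    return filter.indexOf(tagName) !== -1;
  }
  if (typeof filter === 'function') {
    return Boolean(filter(node));
  }
  throw new TypeError('filter must be a string, an array, or a function');
}

// Plain objects standing in for DOM nodes (an assumption for this sketch).
var emNode = { nodeName: 'EM', style: { fontStyle: '' } };
var italicSpan = { nodeName: 'SPAN', style: { fontStyle: 'italic' } };

// The function filter from the example above.
function isItalicSpan(node) {
  return node.nodeName === 'SPAN' && /italic/i.test(node.style.fontStyle);
}

console.log(matchesFilter(emNode, 'p'));              // false
console.log(matchesFilter(emNode, ['em', 'i']));      // true
console.log(matchesFilter(italicSpan, isItalicSpan)); // true
console.log(matchesFilter(emNode, isItalicSpan));     // false
```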
The replacement function determines how an element should be converted. It should return the Markdown string for a given node. The function is passed the node's content, as well as the node itself (used in more complex conversions). It is called in the context of toMarkdown, and therefore has access to the methods detailed below.
The following converter replaces heading elements (h1-h6):
{
  filter: ['h1', 'h2', 'h3', 'h4', 'h5', 'h6'],
  replacement: function(innerHTML, node) {
    var hLevel = node.tagName.charAt(1);
    var hPrefix = '';
    for(var i = 0; i < hLevel; i++) {
      hPrefix += '#';
    }
    return '\n' + hPrefix + ' ' + innerHTML + '\n\n';
  }
}
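To see what this converter produces, its replacement function can be called directly. The sketch below is a lightly restated variant of the converter above: it parses the heading level as a number and builds the prefix without a loop. The names headingConverter and fakeH2 are assumptions for this example, and the plain object stands in for a DOM node, since only tagName is read.

```javascript
// Variant of the heading converter above, restated so it can run on its own.
var headingConverter = {
  filter: ['h1', 'h2', 'h3', 'h4', 'h5', 'h6'],
  replacement: function (innerHTML, node) {
    // tagName is e.g. 'H2'; its second character is the heading level.
    var hLevel = parseInt(node.tagName.charAt(1), 10);
    // Joining an array of (hLevel + 1) empty slots yields hLevel '#' characters.
    var hPrefix = new Array(hLevel + 1).join('#');
    return '\n' + hPrefix + ' ' + innerHTML + '\n\n';
  }
};

// Stand-in for an <h2> DOM node (an assumption: only tagName is needed here).
var fakeH2 = { tagName: 'H2' };
var output = headingConverter.replacement('Hello world!', fakeH2);

console.log(JSON.stringify(output)); // "\n## Hello world!\n\n"
```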
to-markdown has beta support for GitHub Flavored Markdown (GFM). Set the gfm option to true:
toMarkdown('<del>Hello world!</del>', { gfm: true });
The following methods can be called on the toMarkdown object.
Returns true/false depending on whether the element is block level.
Returns true/false depending on whether the element is void.
Returns the string with leading and trailing whitespace removed.
Returns the content of the node along with the element itself.
First make sure you have node.js/npm installed, then:
$ npm install --dev
$ bower install --dev
Automatically browserify the module when source files change by running:
$ npm start
To run the tests in the browser, open test/index.html.
To run in node.js:
$ npm test
Thanks to all contributors. Also, thanks to Alex Cornejo for advice and inspiration for the breadth-first search algorithm.
to-markdown is copyright © 2011-15 Dom Christie and released under the MIT license.