Simple JavaScript parser for docutils XML documents.
This package uses sax-js to parse docutils XML documents and turn them into plain JavaScript objects. This can be useful for working with documentation generated by tools like Sphinx.
const docutils = require("docutils");
const document = docutils.parse(`
<document source=".../hello.rst">
<section ids="hello-world" names="hello,\\ world!">
<title>Hello, world!</title>
</section>
</document>
`);
console.log(document.children[0].children[0]);
// Output: { tag: 'title', attributes: {}, children: [ 'Hello, world!' ] }
You can install docutils with your npm client of choice.
$ npm install docutils
Parses the input string and returns a hierarchy of plain JavaScript objects. The function throws an error if the input string isn't valid XML.
Here's what the function would've returned in the previous example:
{
  tag: 'document',
  attributes: {
    source: '.../hello.rst'
  },
  children: [
    {
      tag: 'section',
      attributes: {
        ids: 'hello-world',
        names: 'hello, world!'
      },
      children: [
        {
          tag: 'title',
          attributes: {},
          children: [
            'Hello, world!'
          ]
        }
      ]
    }
  ]
}
Elements are turned into plain JavaScript objects with a specific structure:
- The tag property is the name of the element
- The attributes property is an object mapping each attribute name to its value
- The children property is an array that can contain strings and other elements
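Because every element has the same three properties, the tree can be walked with a short recursive function. The sketch below is only an illustration: findAll and textContent are hypothetical helpers that are not part of this package, and the sample tree is copied by hand from the example output so the snippet runs on its own.

```javascript
// Sample tree copied from the example output of docutils.parse().
const tree = {
  tag: "document",
  attributes: { source: ".../hello.rst" },
  children: [
    {
      tag: "section",
      attributes: { ids: "hello-world", names: "hello, world!" },
      children: [
        { tag: "title", attributes: {}, children: ["Hello, world!"] },
      ],
    },
  ],
};

// Hypothetical helper (not provided by the package): collect every
// element with the given tag name, depth-first, in document order.
function findAll(element, tag, found = []) {
  if (element.tag === tag) found.push(element);
  for (const child of element.children) {
    // Children are either strings (text) or nested elements.
    if (typeof child !== "string") findAll(child, tag, found);
  }
  return found;
}

// Hypothetical helper: concatenate all the text found below an element.
function textContent(element) {
  return element.children
    .map((child) => (typeof child === "string" ? child : textContent(child)))
    .join("");
}

const titles = findAll(tree, "title").map(textContent);
console.log(titles);
// Output: [ 'Hello, world!' ]
```

The typeof check is what separates text from nested elements, since the children array can hold both.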
Keep in mind that you might need to catch parsing errors where appropriate:
try {
  docutils.parse("invalid document");
} catch (err) {
  console.log(err);
  // Error: Start tag expected, '<' not found
}
The second argument of the docutils.parse() function is an optional array of plugins. Plugins are functions that take an instance of docutils.DocumentParser as a parameter.
const titleToUpperCase = (parser) => {
  parser.on("element:title", (element) => {
    element.children[0] = element.children[0].toUpperCase();
  });
};
const document = docutils.parse(string, [titleToUpperCase]);
console.log(document.children[0].children[0]);
// Output: { tag: 'title', attributes: {}, children: [ 'HELLO, WORLD!' ] }
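Plugins don't have to modify elements; they can also gather information while the document is parsed. Here is a minimal sketch of a plugin that records the ids attribute of every section. The name collectSectionIds is hypothetical, and a bare Node.js EventEmitter stands in for docutils.DocumentParser so the snippet runs without the package installed; the event is emitted by hand to simulate what the parser would do.

```javascript
const { EventEmitter } = require("events");

// Hypothetical plugin factory: returns a plugin that appends the `ids`
// attribute of each parsed section element to the given array.
const collectSectionIds = (ids) => (parser) => {
  parser.on("element:section", (element) => {
    ids.push(element.attributes.ids);
  });
};

// Stand-in for docutils.DocumentParser (assumption: only the
// EventEmitter interface matters to the plugin).
const parser = new EventEmitter();
const sectionIds = [];
collectSectionIds(sectionIds)(parser);

// Simulate the event the parser would emit after parsing a section.
parser.emit("element:section", {
  tag: "section",
  attributes: { ids: "hello-world", names: "hello, world!" },
  children: [],
});

console.log(sectionIds);
// Output: [ 'hello-world' ]
```

With the real package, the same plugin would presumably be passed in the plugins array, as in docutils.parse(string, [collectSectionIds(sectionIds)]).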
It's probably a good idea to always use the docutils.parse() function directly, but it's also possible to instantiate the parser manually.
const parser = new docutils.DocumentParser();
const document = parser.parse(string);
Most of the time, you'll only interact with the parser through plugins. The docutils.DocumentParser class inherits from the Node.js EventEmitter and lets you hook into various stages of the parsing process.
Event | Arguments | Description |
---|---|---|
document:start | | Emitted before parsing a document |
document:end | document | Emitted after parsing a document |
element | element | Emitted after parsing an element |
element:TAG_NAME | element | Emitted after parsing a TAG_NAME element |
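The events in the table can be combined in a single plugin. The following sketch tallies elements by tag name and reports the totals once the document is done. The countTags name and its report callback are hypothetical, and a plain EventEmitter again stands in for docutils.DocumentParser; the sequence of events emitted below is written by hand for illustration and is an assumption about what the real parser would emit for a small document.

```javascript
const { EventEmitter } = require("events");

// Hypothetical plugin factory: counts parsed elements per tag name and
// hands the totals to `report` when the document has been parsed.
const countTags = (report) => (parser) => {
  let counts = {};
  parser.on("document:start", () => {
    counts = {}; // start from scratch for every document
  });
  parser.on("element", (element) => {
    counts[element.tag] = (counts[element.tag] || 0) + 1;
  });
  parser.on("document:end", () => {
    report(counts);
  });
};

// Stand-in for docutils.DocumentParser (assumption, see above).
const parser = new EventEmitter();
let result = null;
countTags((counts) => { result = counts; })(parser);

// Hand-written event sequence simulating a tiny document.
const title = { tag: "title", attributes: {}, children: ["Hello, world!"] };
const section = { tag: "section", attributes: {}, children: [title] };
const doc = { tag: "document", attributes: {}, children: [section] };

parser.emit("document:start");
parser.emit("element", title);
parser.emit("element", section);
parser.emit("document:end", doc);

console.log(result);
// Output: { title: 1, section: 1 }
```

Resetting the counter on document:start keeps the totals separate if the same parser instance is reused for several documents.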
Contributions are welcome. This project uses Jest for testing.
$ npm test
The code follows the JavaScript Standard Style guide.
$ npm run lint
License - MIT