vberlier/docutils

docutils

Simple JavaScript parser for docutils XML documents.

This package uses sax-js to parse docutils XML documents and turn them into plain JavaScript objects. This is useful when working with documentation generated by tools like Sphinx.

const docutils = require("docutils");

const document = docutils.parse(`
  <document source=".../hello.rst">
    <section ids="hello-world" names="hello,\\ world!">
      <title>Hello, world!</title>
    </section>
  </document>
`);

console.log(document.children[0].children[0]);
// Output: { tag: 'title', attributes: {}, children: [ 'Hello, world!' ] }

Installation

You can install docutils with your npm client of choice.

$ npm install docutils

Usage

docutils.parse(string, plugins = [])

Parse the input string and return a hierarchy of plain JavaScript objects. The function throws an error if the input string isn't valid XML.

For the previous example, the function returns:

{
  tag: 'document',
  attributes: {
    source: '.../hello.rst'
  },
  children: [
    {
      tag: 'section',
      attributes: {
        ids: 'hello-world',
        names: 'hello,\\ world!'
      },
      children: [
        {
          tag: 'title',
          attributes: {},
          children: [
            'Hello, world!'
          ]
        }
      ]
    }
  ]
}

Each element becomes a plain JavaScript object with three properties:

  • The tag property is the name of the element
  • The attributes property is an object mapping each attribute name to its value
  • The children property is an array that can contain strings and other elements
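Because the result is made of plain objects and strings, ordinary recursion is enough to query it. The following is a minimal sketch, not part of the package: the helper names collectText and findAll are hypothetical, and the tree literal is hand-written to match the structure described above so the snippet runs without the package installed.

```javascript
// Recursively gather the text content of a parsed node.
// Assumes the { tag, attributes, children } shape described above.
function collectText(node) {
  if (typeof node === "string") return node;
  return node.children.map(collectText).join("");
}

// Collect every element with the given tag name, depth-first.
function findAll(node, tag, found = []) {
  if (typeof node === "string") return found;
  if (node.tag === tag) found.push(node);
  for (const child of node.children) findAll(child, tag, found);
  return found;
}

// Hand-written tree mirroring the output of the first example.
const tree = {
  tag: "document",
  attributes: { source: ".../hello.rst" },
  children: [
    {
      tag: "section",
      attributes: { ids: "hello-world" },
      children: [
        { tag: "title", attributes: {}, children: ["Hello, world!"] }
      ]
    }
  ]
};

const titles = findAll(tree, "title").map(collectText);
console.log(titles);
// Output: [ 'Hello, world!' ]
```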

Because docutils.parse() throws on invalid input, wrap the call in try/catch whenever the input may be malformed:

try {
  docutils.parse("invalid document");
} catch (err) {
  console.log(err);
  // Error: Start tag expected, '<' not found
}
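If you would rather handle failures as values than as exceptions, a small wrapper can do the catching in one place. This is only a sketch: tryParse is a hypothetical helper, not part of the package, and strictParse is a stand-in parser used so the snippet runs on its own. In real code you would pass docutils.parse instead.

```javascript
// Hypothetical helper: run a parse function and report failure as a value
// instead of letting the exception propagate.
function tryParse(parse, input) {
  try {
    return { ok: true, document: parse(input) };
  } catch (error) {
    return { ok: false, error };
  }
}

// Stand-in for docutils.parse (an assumption made for this sketch only).
const strictParse = (input) => {
  if (!input.trim().startsWith("<")) {
    throw new Error("not an XML document");
  }
  return { tag: "document", attributes: {}, children: [] };
};

const result = tryParse(strictParse, "invalid document");

if (result.ok) {
  console.log(result.document);
} else {
  console.log(`Could not parse: ${result.error.message}`);
}
```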

Plugins

The second argument of the docutils.parse() function is an optional array of plugins. A plugin is a function that receives an instance of docutils.DocumentParser as its parameter and typically registers event listeners on it.

const titleToUpperCase = (parser) => {
  parser.on("element:title", (element) => {
    element.children[0] = element.children[0].toUpperCase();
  });
};

const document = docutils.parse(string, [titleToUpperCase]);

console.log(document.children[0].children[0]);
// Output: { tag: 'title', attributes: {}, children: [ 'HELLO, WORLD!' ] }
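Plugins can also gather information instead of modifying elements. The sketch below is an illustration under assumptions: collectSectionIds is a hypothetical plugin factory, and a bare Node.js EventEmitter stands in for the parser so the snippet runs without the package installed. With the real package you would pass collectSectionIds(ids) in the plugins array.

```javascript
const { EventEmitter } = require("events");

// Hypothetical plugin factory: the returned plugin records the ids
// attribute of every section element into the given array.
const collectSectionIds = (ids) => (parser) => {
  parser.on("element:section", (element) => {
    ids.push(element.attributes.ids);
  });
};

// Stand-in for a DocumentParser instance (assumption for this sketch).
const fakeParser = new EventEmitter();
const ids = [];
collectSectionIds(ids)(fakeParser);

// Simulate the parser finishing a section element.
fakeParser.emit("element:section", {
  tag: "section",
  attributes: { ids: "hello-world" },
  children: []
});

console.log(ids);
// Output: [ 'hello-world' ]
```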

docutils.DocumentParser({ plugins = [] } = {})

In most cases you should call the docutils.parse() function directly, but you can also instantiate the parser manually.

const parser = new docutils.DocumentParser();
const document = parser.parse(string);

Most of the time, you'll only interact with the parser through plugins. The docutils.DocumentParser class inherits from the Node.js EventEmitter and lets you hook into various stages of the parsing process.

Event             Arguments   Description
document:start    (none)      Emitted before parsing a document
document:end      document    Emitted after parsing a document
element           element     Emitted after parsing an element
element:TAG_NAME  element     Emitted after parsing a TAG_NAME element
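As an illustration of how these events can be combined, the sketch below tallies elements by tag name, resetting the tally when a new document starts. The names tagCounter and counts are hypothetical, and a bare EventEmitter stands in for the parser. The emit calls only simulate the events listed above and are not a claim about the exact order in which the real parser fires them.

```javascript
const { EventEmitter } = require("events");

// Hypothetical plugin factory: counts parsed elements per tag name.
const tagCounter = (counts) => (parser) => {
  // Clear any previous tally when a new document begins.
  parser.on("document:start", () => {
    for (const key of Object.keys(counts)) delete counts[key];
  });
  // The generic "element" event fires for every element, whatever its tag.
  parser.on("element", (element) => {
    counts[element.tag] = (counts[element.tag] || 0) + 1;
  });
};

// Stand-in for a DocumentParser instance (assumption for this sketch).
const emitter = new EventEmitter();
const counts = { stale: 99 };
tagCounter(counts)(emitter);

// Simulated event sequence for a small document.
emitter.emit("document:start");
emitter.emit("element", { tag: "title", attributes: {}, children: [] });
emitter.emit("element", { tag: "title", attributes: {}, children: [] });
emitter.emit("element", { tag: "section", attributes: {}, children: [] });

console.log(counts);
// Output: { title: 2, section: 1 }
```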

Contributing

Contributions are welcome. This project uses jest for testing.

$ npm test

The code follows the JavaScript Standard Style guide.

$ npm run lint

License - MIT
