cyderize/xml-rs

xml-rs, an XML library for Rust

xml-rs is an XML library for the Rust programming language. It is heavily inspired by the Java stream-based XML API (StAX).

This library currently contains a pull parser, much like the StAX event reader. It provides an iterator API, so you can use the features of Rust's existing iterator library with it.
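To illustrate what an iterator API buys you, here is a minimal sketch that filters a stream of events with ordinary iterator adapters. The `Event` enum and the `element_names` function are hypothetical stand-ins, not the types xml-rs exports, and the code is written in current stable Rust rather than the older dialect used in the parsing example of this README.

```rust
// Toy event type for illustration only; xml-rs defines its own, richer events.
#[derive(Debug, Clone, PartialEq)]
enum Event {
    StartElement(String),
    EndElement(String),
    Characters(String),
}

/// Collects the names of all opened elements using standard iterator adapters.
fn element_names<I: Iterator<Item = Event>>(events: I) -> Vec<String> {
    events
        .filter_map(|e| match e {
            Event::StartElement(name) => Some(name),
            _ => None,
        })
        .collect()
}

fn main() {
    // A hand-written event stream standing in for what a pull parser would produce.
    let events = vec![
        Event::StartElement("a".to_string()),
        Event::Characters("hi".to_string()),
        Event::StartElement("b".to_string()),
        Event::EndElement("b".to_string()),
        Event::EndElement("a".to_string()),
    ];
    println!("{:?}", element_names(events.into_iter()));
}
```

Because the function only asks for an `Iterator`, the same adapter chain would work over any event source, including a real parser.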

This parser is mostly full-featured; however, there are some limitations:

  • no encodings other than UTF-8 are supported yet, because no stream-based encoding library is currently available; when (or if) one becomes available, I'll try to make use of it;
  • DTD validation is not supported, and <!DOCTYPE> declarations are completely ignored; consequently custom entities are not supported either, and internal DTD declarations are likely to cause parsing errors;
  • attribute value normalization is not performed, and end-of-line characters are not normalized either.

Other than that, the parser tries to be mostly XML 1.0 compliant.
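Since end-of-line characters are not normalized by the parser, a caller who needs that behaviour could apply it to the raw input first. The sketch below follows the end-of-line rule of XML 1.0 (section 2.11): every "\r\n" pair and every lone "\r" becomes a single "\n". The function name `normalize_line_endings` is hypothetical and not part of xml-rs, and the code is current stable Rust.

```rust
/// Replaces "\r\n" and lone "\r" with "\n", the translation XML 1.0 asks a
/// processor to perform on its input before parsing.
fn normalize_line_endings(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars().peekable();
    while let Some(c) = chars.next() {
        if c == '\r' {
            // Swallow the '\n' of a "\r\n" pair so the pair yields one '\n'.
            if chars.peek() == Some(&'\n') {
                chars.next();
            }
            out.push('\n');
        } else {
            out.push(c);
        }
    }
    out
}

fn main() {
    let raw = "<a>\r\n  text\r</a>\n";
    print!("{}", normalize_line_endings(raw));
}
```

This should run on the document text before it reaches the parser. Applying it to already-parsed character data would not be equivalent, because carriage returns written as character references are meant to survive normalization.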

What is planned (highest priority first):

  1. XML emitter, that is, an analog of StAX event writer, including pretty printing;
  2. parsing into a DOM tree and its serialization back to XML text;
  3. SAX-like callback-based parser (fairly easy to implement over pull parser);
  4. some kind of test infrastructure;
  5. more convenience features, like filtering over produced events;
  6. missing features required by XML standard (e.g. aforementioned normalization);
  7. DTD validation;
  8. (let's dream a bit) XML Schema validation.
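As a rough idea of why a SAX-like layer is fairly easy to build over a pull parser, the sketch below drives a callback trait from any iterator of events. Everything here (`Event`, `Handler`, `drive`, `DepthTracker`) is hypothetical and only illustrates the shape such a layer could take; it is not a planned xml-rs API, and it is written in current stable Rust.

```rust
// Toy event type for illustration only.
enum Event {
    StartElement(String),
    EndElement(String),
    Characters(String),
}

/// Callback interface; every method has an empty default so that
/// implementors override only what they care about.
trait Handler {
    fn start_element(&mut self, _name: &str) {}
    fn end_element(&mut self, _name: &str) {}
    fn characters(&mut self, _text: &str) {}
}

/// Pulls events one by one and pushes each to the matching callback.
fn drive<I, H>(events: I, handler: &mut H)
where
    I: Iterator<Item = Event>,
    H: Handler,
{
    for event in events {
        match event {
            Event::StartElement(name) => handler.start_element(&name),
            Event::EndElement(name) => handler.end_element(&name),
            Event::Characters(text) => handler.characters(&text),
        }
    }
}

/// Example handler that records the deepest nesting level seen.
struct DepthTracker {
    depth: usize,
    max_depth: usize,
}

impl Handler for DepthTracker {
    fn start_element(&mut self, _name: &str) {
        self.depth += 1;
        self.max_depth = self.max_depth.max(self.depth);
    }
    fn end_element(&mut self, _name: &str) {
        self.depth -= 1;
    }
}

fn main() {
    let events = vec![
        Event::StartElement("a".to_string()),
        Event::StartElement("b".to_string()),
        Event::Characters("text".to_string()),
        Event::EndElement("b".to_string()),
        Event::EndElement("a".to_string()),
    ];
    let mut tracker = DepthTracker { depth: 0, max_depth: 0 };
    drive(events.into_iter(), &mut tracker);
    println!("maximum depth: {}", tracker.max_depth);
}
```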

Hopefully the XML emitter will be implemented soon. This will allow easy stream processing, for example, transformation of large XML documents.

Building and using

xml-rs uses Cargo, so just add a dependency section in your project's manifest:

[dependencies.xml-rs]
version = "*"

Parsing

xml::reader::EventReader requires a Buffer to read from. When a proper stream-based encoding library becomes available, the reader will likely switch to whatever character stream structure that library provides, but for now it is a Buffer. There are also several static methods that create a parser from a string or a byte vector.
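For readers on current stable Rust, the closest analogue of the Buffer trait is `std::io::BufRead`. The sketch below shows how a function that is generic over a buffered source accepts both a string and a byte vector; this is, presumably, the kind of convenience the static methods provide. The `first_line` function is a hypothetical stand-in for a parser constructor and is not part of xml-rs.

```rust
use std::io::{BufRead, Cursor};

/// Reads the first line from any buffered source. A real parser would keep
/// the source and pull from it lazily; this only shows the generic bound.
fn first_line<R: BufRead>(mut source: R) -> std::io::Result<String> {
    let mut line = String::new();
    source.read_line(&mut line)?;
    Ok(line.trim_end().to_string())
}

fn main() -> std::io::Result<()> {
    // A byte slice borrowed from a string already implements BufRead.
    let from_str = first_line("<a/>\n<b/>".as_bytes())?;
    // An owned byte vector can be wrapped in a Cursor to get the same trait.
    let from_vec = first_line(Cursor::new(b"<c/>\n<d/>".to_vec()))?;
    println!("{} {}", from_str, from_vec);
    Ok(())
}
```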

EventReader usage is very straightforward. Just provide a Buffer and then create an iterator over events:

extern crate xml;

use std::io::{File, BufferedReader};

use xml::reader::EventReader;
use xml::reader::events::*;

// Builds an indentation prefix of four spaces per nesting level.
fn indent(size: uint) -> String {
    let mut result = String::with_capacity(size*4);
    for _ in range(0, size) {
        result.push_str("    ");
    }
    result
}

fn main() {
    let file = File::open(&Path::new("file.xml")).unwrap();
    let reader = BufferedReader::new(file);

    let mut parser = EventReader::new(reader);
    let mut depth = 0;
    for e in parser.events() {
        match e {
            // Only the element name is needed here; ignore the other fields.
            StartElement { name, .. } => {
                println!("{}/{}", indent(depth), name);
                depth += 1;
            }
            EndElement(name) => {
                depth -= 1;
                println!("{}/{}", indent(depth), name);
            }
            // Errors arrive as ordinary events; report the error and stop.
            Error(e) => {
                println!("Error: {}", e);
                break;
            }
            // Skip every other kind of event.
            _ => {}
        }
    }
}

events() should be called only once, because every iterator it returns uses the same underlying parser (TODO: make the iterator consuming). Document parsing can end normally or with an error. Regardless of the exact cause, parsing stops and the iterator terminates normally.
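The reason for calling events() only once can be seen in a small model of a borrowing iterator. In the sketch below, `Parser`, `Events` and `Event` are hypothetical toy types written in current stable Rust, not the xml-rs implementation. It is also an assumption of this sketch that the end-of-document event is yielded before the iterator stops.

```rust
use std::collections::VecDeque;

#[derive(Debug, PartialEq)]
enum Event {
    StartElement(String),
    EndElement(String),
    EndDocument,
}

/// Toy parser: a queue of pre-made events plus a "finished" flag.
struct Parser {
    pending: VecDeque<Event>,
    finished: bool,
}

/// Iterator that borrows the parser instead of owning it.
struct Events<'a> {
    parser: &'a mut Parser,
}

impl Parser {
    fn new(events: Vec<Event>) -> Parser {
        Parser { pending: events.into(), finished: false }
    }

    fn events(&mut self) -> Events<'_> {
        Events { parser: self }
    }
}

impl<'a> Iterator for Events<'a> {
    type Item = Event;

    fn next(&mut self) -> Option<Event> {
        if self.parser.finished {
            return None;
        }
        // Running out of queued events stands in for reaching end of input.
        let event = self.parser.pending.pop_front().unwrap_or(Event::EndDocument);
        if event == Event::EndDocument {
            self.parser.finished = true;
        }
        Some(event)
    }
}

fn main() {
    let mut parser = Parser::new(vec![
        Event::StartElement("a".to_string()),
        Event::EndElement("a".to_string()),
    ]);
    let first: Vec<Event> = parser.events().take(1).collect();
    // A second iterator does not restart; it shares the parser's state.
    let rest: Vec<Event> = parser.events().collect();
    println!("{:?} then {:?}", first, rest);
}
```

A consuming iterator, as the TODO suggests, would take the parser by value and make a second call impossible at compile time.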

You can also have finer control over when to pull the next event from the parser using its own next() method:

match parser.next() {
    ...
}

Upon reaching the end of the document or encountering an error, the parser remembers that last event and returns it from every subsequent next() call.
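This "sticky" terminal event can be modelled as follows. The wrapper caches the first end-of-document or error event it sees and keeps returning it. `StickyParser`, `next_event` and the toy `Event` enum are hypothetical names used for illustration in current stable Rust; they only mimic the behaviour described above.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Event {
    StartElement(String),
    EndDocument,
    Error(String),
}

/// Wraps an event source and remembers the first terminal event.
struct StickyParser<I> {
    inner: I,
    last: Option<Event>,
}

impl<I: Iterator<Item = Event>> StickyParser<I> {
    fn new(inner: I) -> Self {
        StickyParser { inner, last: None }
    }

    /// Returns the next event, or the remembered terminal event once
    /// parsing has finished or failed.
    fn next_event(&mut self) -> Event {
        if let Some(event) = &self.last {
            return event.clone();
        }
        // An exhausted source is treated as a normal end of document.
        let event = self.inner.next().unwrap_or(Event::EndDocument);
        if matches!(event, Event::EndDocument | Event::Error(_)) {
            self.last = Some(event.clone());
        }
        event
    }
}

fn main() {
    let source = vec![
        Event::StartElement("a".to_string()),
        Event::Error("unexpected token".to_string()),
        Event::StartElement("never reached".to_string()),
    ];
    let mut parser = StickyParser::new(source.into_iter());
    for _ in 0..4 {
        println!("{:?}", parser.next_event());
    }
}
```

With this behaviour a caller can poll in a loop without tracking termination separately: once the terminal event appears, it keeps appearing.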

It is also possible to tweak the parsing process a little using the xml::reader::ParserConfig structure. See its documentation for more information and examples.

Other things

No performance tests or measurements have been done. The implementation is rather naive, and no specific optimizations have been made. Hopefully the library is fast enough to process documents of common size.

License

This library is licensed under the MIT license. Feel free to report any issues you find on the GitHub issue tracker: http://github.com/netvl/xml-rs/issues.


Copyright (C) Vladimir Matveev, 2014
