Format documents
juchiast authored and Geal committed Oct 6, 2018
1 parent 3615cf6 commit 5d69046
Showing 2 changed files with 18 additions and 37 deletions.
4 changes: 1 addition & 3 deletions doc/error_management.md
@@ -1,5 +1,3 @@
% Error management

# Error management

Parser combinators are useful tools to build parsers, but they are notoriously bad at error reporting. This happens because a tree of parsers acts as a single parser, and the only error you get will come from the root parser.
@@ -46,7 +44,7 @@ fn main() {

It will print, along with the result and the parser, a hexdump of the input buffer passed to the parser.

```ignore
```
Error(Position(0, [101, 102, 103, 104, 105, 106, 107, 108])) at l.5 by " tag ! ( "abcd" ) "
00000000 65 66 67 68 69 6a 6b 6c efghijkl
```
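
To make the hexdump line above easier to read, here is a minimal sketch of how such a line can be produced. It uses only the standard library; `hexdump` is a hypothetical helper written for illustration, not nom's implementation, and unlike a full hexdump it does not pad short lines.

```rust
/// Hypothetical helper (not part of nom): formats a buffer as hexdump
/// lines of up to 16 bytes each.
fn hexdump(data: &[u8]) -> String {
    let mut out = String::new();
    for (index, chunk) in data.chunks(16).enumerate() {
        // Offset of the first byte of this line, as 8 hex digits.
        out.push_str(&format!("{:08x}", index * 16));
        // Each byte as two lowercase hex digits.
        for byte in chunk {
            out.push_str(&format!(" {:02x}", byte));
        }
        out.push(' ');
        // Printable ASCII is shown as-is, anything else as '.'.
        for &byte in chunk {
            let shown = if byte.is_ascii_graphic() || byte == b' ' {
                byte as char
            } else {
                '.'
            };
            out.push(shown);
        }
        out.push('\n');
    }
    out
}

fn main() {
    // Same bytes as in the error above: [101, 102, ..., 108] is "efghijkl".
    print!("{}", hexdump(b"efghijkl"));
}
```

The decimal values in the `Position` error (101 to 108) and the hex column (65 to 6c) describe the same eight bytes, which is why the right-hand column reads `efghijkl`.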
51 changes: 17 additions & 34 deletions doc/making_a_new_parser_from_scratch.md
@@ -1,4 +1,4 @@
% Making a new parser from scratch
# Making a new parser from scratch

Writing a parser is a very fun, interactive process, but sometimes a daunting task. How do you test it? How do you spot ambiguities in specifications?

@@ -20,20 +15,15 @@ Usually, you can separate the parsing functions in their own module, so you coul


```rust
# #[macro_use] extern crate nom;
# use nom::IResult;
# fn main() {
fn take_wrapper(input: &[u8], i: u8) -> IResult<&[u8],&[u8]> {
take!(input, i * 10)
}

// will make a parser taking 20 bytes
named!(parser, apply!(take_wrapper, 2));
# }

```

```ignore
```rust
#[macro_use]
extern crate nom;
pub mod parser;
@@ -42,9 +37,7 @@ pub mod parser;
And use the methods and structure from `parser` there. The `src/parser.rs` would then import nom functions and structures if needed:

```rust
# #[macro_use] extern crate nom;
use nom::{be_u16, be_u32};
# fn main() {}
```

# Writing a first parser
@@ -56,9 +49,10 @@ error type. This enum can either be `Ok((i,o))` containing the remaining input
and the output value, or, on the `Err` side, an error or an indication that more
data is needed.

```ignore
```rust
pub type IResult<I, O, E = u32> = Result<(I, O), Err<I, E>>;
#[derive(Debug,PartialEq,Eq,CLone,Copy)]

#[derive(Debug, PartialEq, Eq, CLone, Copy)]
pub enum Needed {
Unknown,
Size(u32)
@@ -71,18 +65,18 @@ pub enum Err<I, E = u32> {
Failure(Context<I, E>),
}

#[derive(Debug,PartialEq,Eq,Clone)]
#[derive(Debug, PartialEq, Eq, Clone)]
pub enum Err<P,E=u32>{
Code(ErrorKind<E>),
Node(ErrorKind<E>, Box<Err<P,E>>),
Node(ErrorKind<E>, Box<Err<P, E>>),
Position(ErrorKind<E>, P),
NodePosition(ErrorKind<E>, P, Box<Err<P,E>>)
NodePosition(ErrorKind<E>, P, Box<Err<P, E>>)
}

#[derive(Debug,PartialEq,Eq)]
#[derive(Debug, PartialEq, Eq)]
pub enum IResult<I,O,E=u32> {
Done(I,O),
Error(Err<I,E>),
Done(I, O),
Error(Err<I, E>),
Incomplete(Needed)
}

@@ -99,8 +93,6 @@ nom uses this type everywhere. Every combination of parsers will pattern match o
nom provides a macro for function definition, called `named!`:

```rust
# #[macro_use] extern crate nom;
# fn main() {}
named!(my_function(&[u8]) -> &[u8], tag!("abcd"));

named!(my_function2<&[u8], &[u8]>, tag!("abcd"));
@@ -111,9 +103,6 @@ named!(my_function3, tag!("abcd"));
But you could as easily define the function yourself like this:

```rust
# #[macro_use] extern crate nom;
# fn main() {}
# use nom::IResult;
fn my_function(input: &[u8]) -> IResult<&[u8], &[u8]> {
tag!(input, "abcd")
}
@@ -122,16 +111,12 @@ fn my_function(input: &[u8]) -> IResult<&[u8], &[u8]> {
Note that we pass the input to the first parser in the manual definition, while we do not when we use `named!`. This is a macro trick specific to nom: every parser takes the input as first parameter, and the macros take care of giving the remaining input to the next parser. As an example, take a simple parser like the following one, which recognizes the word "hello" then takes the next 5 bytes:

```rust
# #[macro_use] extern crate nom;
# fn main() {}
named!(prefixed, preceded!(tag!("hello"), take!(5)));
```
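
The input threading described above can also be written out by hand. The following is a minimal sketch in plain Rust, without nom: `ParseResult`, `tag` and `take` are simplified, hypothetical stand-ins for nom's `IResult`, `tag!` and `take!`, and they report short input as a plain error instead of nom's "more data needed" case.

```rust
/// Simplified stand-in for nom's result type: on success, the remaining
/// input and the parsed output; on failure, a plain message.
type ParseResult<'a> = Result<(&'a [u8], &'a [u8]), &'static str>;

/// Hypothetical equivalent of `tag!`: succeeds if `input` starts with `expected`.
fn tag<'a>(input: &'a [u8], expected: &[u8]) -> ParseResult<'a> {
    if input.starts_with(expected) {
        Ok((&input[expected.len()..], &input[..expected.len()]))
    } else {
        Err("tag mismatch")
    }
}

/// Hypothetical equivalent of `take!`: returns the next `count` bytes.
fn take(input: &[u8], count: usize) -> ParseResult<'_> {
    if input.len() >= count {
        Ok((&input[count..], &input[..count]))
    } else {
        Err("not enough data")
    }
}

/// Recognizes "hello", discards it, then takes the next 5 bytes.
fn prefixed(input: &[u8]) -> ParseResult<'_> {
    // The first parser receives the original input...
    let (rest, _hello) = tag(input, b"hello")?;
    // ...and the second one receives whatever the first left over.
    take(rest, 5)
}

fn main() {
    println!("{:?}", prefixed(b"helloworld!"));
}
```

Each parser takes the input as its first argument and hands the remainder to the next one; this is the bookkeeping that `named!` and `preceded!` hide in the `prefixed` definition.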

Once the macros have expanded, this would correspond to:

```ignore
# #[macro_use] extern crate nom;
# fn main() {}
```rust
fn prefixed(i: &[u8]) -> ::nom::IResult<&[u8], &[u8]> {
{
match {
@@ -204,7 +189,7 @@ Regular expression related macros are in [src/regexp.rs](https://github.com/Geal

Once you have a parser function, a good trick is to test it on a lot of the samples you gathered, and integrate this into your unit tests. To that end, put all of the test files in a folder like `assets` and refer to test files like this:

```ignore
```rust
#[test]
fn header_test() {
let data = include_bytes!("../assets/axolotl-piano.gif");
@@ -217,7 +202,7 @@ The `include_bytes!` macro (provided by Rust's standard library) will integrate

If your parser handles textual data, you can just use a lot of strings directly in the test, like this:

```ignore
```rust
#[test]
fn factor_test() {
assert_eq!(factor(&b"3"[..]), Ok((&b""[..], 3)));
@@ -233,7 +218,7 @@ The more samples and test cases you get, the more you can experiment with your p

While Rust macros are really useful to get a simpler syntax, they can sometimes give cryptic errors. As an example, `named!(manytag, many0!(take!(5)));` would result in the following error:

```ignore
```
<nom macros>:6:38: 6:41 error: mismatched types:
expected `&[u8]`,
found `collections::vec::Vec<&[u8]>`
@@ -255,15 +240,15 @@ There are a few tools you can use to debug how code is generated.

The `trace_macros` feature shows how macros are applied. To use it, add `#![feature(trace_macros)]` at the top of your file (you need Rust nightly for this), then apply it like this:

```ignore
```rust
trace_macros!(true);
named!(manytag< Vec<&[u8]> >, many0!(take!(5)));
trace_macros!(false);
```

It will result in the following output during compilation:

```ignore
```rust
named! { manytag , many0 ! ( take ! ( 5 ) ) }
many0! { i , take ! ( 5 ) }
take! { input , 5 }
@@ -276,8 +261,6 @@ rustc can show how code is expanded with the option `--pretty=expanded`. If you
It will print the `manytag` function like this:

```rust
# #[macro_use] extern crate nom;
# fn main() {}
fn manytag(i: &[u8]) -> ::nom::IResult<&[u8], Vec<&[u8]>> {
let mut res = Vec::new();
let mut input = i;
