Crate nom
nom, eating data byte by byte
nom is a parser combinator library with a focus on safe parsing, streaming patterns, and zero copy wherever possible.
The code is available on GitHub.
There are a few guides with more details about the design of nom, how to write parsers, or the error management system.
If you are upgrading to nom 2.0, please read the migration document.
See also the FAQ.
Example
    #[macro_use]
    extern crate nom;

    use nom::{IResult,digit};

    // Parser definition

    use std::str;
    use std::str::FromStr;

    // We parse any expr surrounded by parens, ignoring all whitespaces around those
    named!(parens<i64>, ws!(delimited!( tag!("("), expr, tag!(")") )) );

    // We transform an integer string into a i64, ignoring surrounding whitespaces
    // We look for a digit suite, and try to convert it.
    // If either str::from_utf8 or FromStr::from_str fail,
    // we fallback to the parens parser defined above
    named!(factor<i64>, alt!(
        map_res!(
          map_res!(
            ws!(digit),
            str::from_utf8
          ),
          FromStr::from_str
        )
      | parens
      )
    );

    // We read an initial factor and for each time we find
    // a * or / operator followed by another factor, we do
    // the math by folding everything
    named!(term <i64>, do_parse!(
        init: factor >>
        res:  fold_many0!(
            pair!(alt!(tag!("*") | tag!("/")), factor),
            init,
            |acc, (op, val): (&[u8], i64)| {
                if (op[0] as char) == '*' { acc * val } else { acc / val }
            }
        ) >>
        (res)
      )
    );

    named!(expr <i64>, do_parse!(
        init: term >>
        res:  fold_many0!(
            pair!(alt!(tag!("+") | tag!("-")), term),
            init,
            |acc, (op, val): (&[u8], i64)| {
                if (op[0] as char) == '+' { acc + val } else { acc - val }
            }
        ) >>
        (res)
      )
    );

    fn main() {
      assert_eq!(expr(b"1+2"),         IResult::Done(&b""[..], 3));
      assert_eq!(expr(b"12+6-4+3"),    IResult::Done(&b""[..], 17));
      assert_eq!(expr(b"1+2*3+4"),     IResult::Done(&b""[..], 11));

      assert_eq!(expr(b"(2)"),         IResult::Done(&b""[..], 2));
      assert_eq!(expr(b"2*(3+4)"),     IResult::Done(&b""[..], 14));
      assert_eq!(expr(b"2*2/(5-1)+3"), IResult::Done(&b""[..], 4));
    }
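The example above gets operator precedence by layering expr over term over factor. As a rough illustration of that layering, here is a hand-written, std-only sketch of the same grammar. It does not use nom: the functions `skip_ws`, `factor`, `term`, `expr`, and `eval` and the `Option`-based return type are all assumptions made for this sketch, and checked arithmetic is substituted for the plain `*`, `/`, `+`, `-` of the original so that overflow or division by zero yields `None` instead of a panic.

```rust
// Hypothetical std-only counterpart of the grammar above; not nom's API.
// Every parser returns the remaining input and the parsed value, or None.

fn skip_ws(input: &[u8]) -> &[u8] {
    let n = input.iter().take_while(|b| b.is_ascii_whitespace()).count();
    &input[n..]
}

// factor := digits | "(" expr ")", with surrounding whitespace ignored
fn factor(input: &[u8]) -> Option<(&[u8], i64)> {
    let input = skip_ws(input);
    let n = input.iter().take_while(|b| b.is_ascii_digit()).count();
    if n > 0 {
        let text = std::str::from_utf8(&input[..n]).ok()?;
        let value = text.parse::<i64>().ok()?;
        return Some((skip_ws(&input[n..]), value));
    }
    // No digits: fall back to a parenthesized expression.
    let (&open, rest) = input.split_first()?;
    if open != b'(' {
        return None;
    }
    let (rest, value) = expr(rest)?;
    let (&close, rest) = skip_ws(rest).split_first()?;
    if close != b')' {
        return None;
    }
    Some((skip_ws(rest), value))
}

// term := factor (("*" | "/") factor)*
fn term(input: &[u8]) -> Option<(&[u8], i64)> {
    let (mut rest, mut acc) = factor(input)?;
    loop {
        let op = match rest.first() {
            Some(&b) if b == b'*' || b == b'/' => b,
            _ => return Some((rest, acc)),
        };
        match factor(&rest[1..]) {
            Some((next, val)) => {
                acc = if op == b'*' { acc.checked_mul(val)? } else { acc.checked_div(val)? };
                rest = next;
            }
            // Operator without an operand: stop folding, leave the operator unconsumed.
            None => return Some((rest, acc)),
        }
    }
}

// expr := term (("+" | "-") term)*
fn expr(input: &[u8]) -> Option<(&[u8], i64)> {
    let (mut rest, mut acc) = term(input)?;
    loop {
        let op = match rest.first() {
            Some(&b) if b == b'+' || b == b'-' => b,
            _ => return Some((rest, acc)),
        };
        match term(&rest[1..]) {
            Some((next, val)) => {
                acc = if op == b'+' { acc.checked_add(val)? } else { acc.checked_sub(val)? };
                rest = next;
            }
            None => return Some((rest, acc)),
        }
    }
}

// Convenience wrapper: succeeds only if the whole input was consumed.
fn eval(src: &str) -> Option<i64> {
    let (rest, value) = expr(src.as_bytes())?;
    if rest.is_empty() { Some(value) } else { None }
}
```

Because `term` is called from `expr` and `factor` from `term`, multiplication and division bind tighter than addition and subtraction, which is why `eval("1+2*3+4")` gives `Some(11)` just as the nom version yields 11.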
Reexports
| pub use self::simple_errors::*; | 
| pub use self::methods::*; | 
| pub use self::bits::*; | 
| pub use self::whitespace::*; | 
Modules
| bits | Bit level parsers and combinators | 
| methods | Method macro combinators | 
| simple_errors | Error management | 
| whitespace | Support for whitespace delimited formats | 
Macros
| add_error | Add an error if the child parser fails | 
| alt | |
| alt_complete | This is a combination of the  | 
| apply | emulate function currying:  | 
| apply_m | emulate function currying for method calls on structs | 
| bits | |
| call | Used to wrap common expressions and function as macros | 
| call_m | Used to call methods, then move self back into self | 
| chain | |
| char | matches one character: `char!(char) => &[u8] -> IResult<&[u8], char>` | 
| closure | Wraps a parser in a closure | 
| complete | replaces a  | 
| cond | |
| cond_reduce | |
| cond_with_error | |
| consumer_from_parser | |
| count | |
| count_fixed | |
| dbg | Prints a message if the parser fails | 
| dbg_dmp | Prints a message and the input if the parser fails | 
| delimited | |
| do_parse | |
| eat_separator | helper macros to build a separator parser | 
| eof | |
| error_code | creates a parse error from a  | 
| error_node | creates a parse error from a  | 
| error_node_position | creates a parse error from a  | 
| error_position | creates a parse error from a  | 
| escaped | |
| escaped_transform | |
| expr_opt | |
| expr_res | |
| fix_error | translate parser result from IResult to IResult with a custom type | 
| flat_map | |
| fold_many0 | |
| fold_many1 | |
| fold_many_m_n | |
| i16 | if the parameter is nom::Endianness::Big, parse a big endian i16 integer, otherwise a little endian i16 integer | 
| i32 | if the parameter is nom::Endianness::Big, parse a big endian i32 integer, otherwise a little endian i32 integer | 
| i64 | if the parameter is nom::Endianness::Big, parse a big endian i64 integer, otherwise a little endian i64 integer | 
| is_a | |
| is_a_s | |
| is_not | |
| is_not_s | |
| length_bytes | `length_bytes!(&[T] -> IResult<&[T], nb>) => &[T] -> IResult<&[T], &[T]>` gets a number from the first parser, then extracts that many bytes from the remaining stream | 
| length_count | |
| length_data | |
| length_value | |
| many0 | |
| many1 | |
| many_m_n | |
| many_till | |
| map | |
| map_opt | |
| map_res | |
| method | Makes a method from a parser combination | 
| named | Makes a function from a parser combination | 
| named_attr | Makes a function from a parser combination, with attributes | 
| none_of | matches anything but the provided characters | 
| not | |
| one_of | matches one of the provided characters | 
| opt | |
| opt_res | |
| pair | |
| peek | |
| permutation | |
| preceded | |
| recognize | |
| return_error | Prevents backtracking if the child parser fails | 
| sep | sep is the parser rewriting macro for whitespace separated formats | 
| separated_list | |
| separated_nonempty_list | |
| separated_pair | |
| switch | |
| tag | |
| tag_bits | matches an integer pattern to a bitstream. The number of bits of the input to compare must be specified | 
| tag_no_case | |
| tag_no_case_s | |
| tag_s | |
| take | |
| take_bits | |
| take_s | |
| take_str | |
| take_till | |
| take_till_s | |
| take_until | |
| take_until_and_consume | |
| take_until_and_consume_s | |
| take_until_either | |
| take_until_either_and_consume | |
| take_until_s | |
| take_while | |
| take_while1 | |
| take_while1_s | |
| take_while_s | |
| tap | |
| terminated | |
| try_parse | A bit like  | 
| tuple | |
| u16 | if the parameter is nom::Endianness::Big, parse a big endian u16 integer, otherwise a little endian u16 integer | 
| u32 | if the parameter is nom::Endianness::Big, parse a big endian u32 integer, otherwise a little endian u32 integer | 
| u64 | if the parameter is nom::Endianness::Big, parse a big endian u64 integer, otherwise a little endian u64 integer | 
| value | |
| wrap_sep | |
| ws | |
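Several entries in the table above (tag, alt, pair) are combinators: small parsers, or operations that build a larger parser out of smaller ones. The sketch below shows that idea with plain std-only functions and closures instead of macros. It is not how nom implements these macros; the `PResult` alias and the function signatures are assumptions chosen for illustration, and only the names mirror the macro names.

```rust
// Hypothetical function-based combinators; names mirror the macros above,
// but the signatures and the Option-based result are assumptions.

// Remaining input plus parsed value on success, None on failure.
type PResult<'a, T> = Option<(&'a [u8], T)>;

// Succeeds if the input starts with `expected`, returning the matched bytes.
fn tag<'a>(expected: &'static [u8]) -> impl Fn(&'a [u8]) -> PResult<'a, &'a [u8]> {
    move |input: &'a [u8]| {
        if input.starts_with(expected) {
            Some((&input[expected.len()..], &input[..expected.len()]))
        } else {
            None
        }
    }
}

// Tries `p`; if it fails, tries `q` on the same input.
fn alt<'a, T>(
    p: impl Fn(&'a [u8]) -> PResult<'a, T>,
    q: impl Fn(&'a [u8]) -> PResult<'a, T>,
) -> impl Fn(&'a [u8]) -> PResult<'a, T> {
    move |input: &'a [u8]| p(input).or_else(|| q(input))
}

// Runs `p`, then `q` on what `p` left over, and returns both values.
fn pair<'a, A, B>(
    p: impl Fn(&'a [u8]) -> PResult<'a, A>,
    q: impl Fn(&'a [u8]) -> PResult<'a, B>,
) -> impl Fn(&'a [u8]) -> PResult<'a, (A, B)> {
    move |input: &'a [u8]| {
        let (rest, a) = p(input)?;
        let (rest, b) = q(rest)?;
        Some((rest, (a, b)))
    }
}
```

With these definitions, `alt(tag(b"+"), tag(b"-"))` yields a parser that accepts either sign, and `pair(tag(b"("), tag(b")"))` one that accepts an empty pair of parentheses, which is the same composition style the macro example at the top of this page uses.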
Structs
| ChainConsumer | ChainConsumer takes a consumer C1 R -> S, and a consumer C2 S -> T, and makes a consumer R -> T by applying C2 on C1's result | 
| FileProducer | |
| MapConsumer | MapConsumer takes a function S -> T and applies it on a consumer producing values of type S | 
| MemProducer | A MemProducer generates values from an in memory byte buffer | 
| ProducerRepeat | ProducerRepeat takes a single value, and generates it at each step | 
Enums
| CompareResult | indicates whether a comparison was successful, an error, or if more data was needed | 
| ConsumerState | Stores a consumer's current computation state | 
| Endianness | Configurable endianness | 
| ErrorKind | indicates which parser returned an error | 
| FileProducerState | |
| IError | This is the same as IResult, but without Done | 
| IResult | Holds the result of parsing functions | 
| Input | |
| Move | |
| Needed | Contains information on needed data if a parser returned  | 
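The enums above separate a finished parse from a failed one and from one that needs more data, which is what makes streaming use possible. The std-only sketch below illustrates that three-way split with a local stand-in type. `Outcome`, its variant payloads, and `length_prefixed` are hypothetical names for this illustration and are not nom's definitions of IResult or Needed.

```rust
// Local stand-in for a streaming parse result; not nom's IResult.
#[derive(Debug, PartialEq)]
enum Outcome<'a, T> {
    // Parsing finished: remaining input and the parsed value.
    Done(&'a [u8], T),
    // Input ended too early: number of additional bytes required.
    Incomplete(usize),
    // Input is present but invalid.
    Error,
}

// Parses one length byte followed by that many payload bytes.
// A zero length is treated as invalid here, purely to show the Error case.
fn length_prefixed<'a>(input: &'a [u8]) -> Outcome<'a, &'a [u8]> {
    let (&len, rest) = match input.split_first() {
        Some(parts) => parts,
        None => return Outcome::Incomplete(1),
    };
    let len = len as usize;
    if len == 0 {
        return Outcome::Error;
    }
    if rest.len() < len {
        // Tell the caller how much more data to fetch before retrying.
        return Outcome::Incomplete(len - rest.len());
    }
    Outcome::Done(&rest[len..], &rest[..len])
}
```

A caller reading from a socket or file could react to `Incomplete(n)` by fetching at least `n` more bytes and calling the parser again on the extended buffer, rather than treating a short read as a hard failure.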
Traits
| AsBytes | |
| AsChar | transforms common types to a char for basic token parsing | 
| Compare | abstracts comparison operations | 
| Consumer | The Consumer trait wraps a computation and its state | 
| FindSubstring | look for a substring in self | 
| FindToken | look for self in the given input stream | 
| GetInput | |
| GetOutput | |
| HexDisplay | |
| InputIter | abstracts common iteration operations on the input type | 
| InputLength | abstract method to calculate the input length | 
| InputTake | abstracts slicing operations | 
| Offset | useful functions to calculate the offset between slices and show a hexdump of a slice | 
| Producer | The producer wraps a data source, like file or network, and applies a consumer on it | 
| Slice | slicing operations using ranges | 
Functions
| alpha | Recognizes lowercase and uppercase alphabetic characters: a-zA-Z | 
| alphanumeric | Recognizes numerical and alphabetic characters: 0-9a-zA-Z | 
| anychar | |
| be_f32 | Recognizes big endian 4 bytes floating point number | 
| be_f64 | Recognizes big endian 8 bytes floating point number | 
| be_i16 | Recognizes big endian signed 2 bytes integer | 
| be_i32 | Recognizes big endian signed 4 bytes integer | 
| be_i64 | Recognizes big endian signed 8 bytes integer | 
| be_i8 | Recognizes a signed 1 byte integer (equivalent to take!(1)) | 
| be_u16 | Recognizes big endian unsigned 2 bytes integer | 
| be_u32 | Recognizes big endian unsigned 4 bytes integer | 
| be_u64 | Recognizes big endian unsigned 8 bytes integer | 
| be_u8 | Recognizes an unsigned 1 byte integer (equivalent to take!(1)) | 
| begin | |
| code_from_offset | |
| crlf | |
| digit | Recognizes numerical characters: 0-9 | 
| eol | |
| error_to_u32 | |
| hex_digit | Recognizes hexadecimal numerical characters: 0-9, A-F, a-f | 
| hex_u32 | Recognizes a hex-encoded integer | 
| is_alphabetic | Tests if byte is ASCII alphabetic: A-Z, a-z | 
| is_alphanumeric | Tests if byte is ASCII alphanumeric: A-Z, a-z, 0-9 | 
| is_digit | Tests if byte is ASCII digit: 0-9 | 
| is_hex_digit | Tests if byte is ASCII hex digit: 0-9, A-F, a-f | 
| is_oct_digit | Tests if byte is ASCII octal digit: 0-7 | 
| is_space | Tests if byte is ASCII space or tab | 
| le_f32 | Recognizes little endian 4 bytes floating point number | 
| le_f64 | Recognizes little endian 8 bytes floating point number | 
| le_i16 | Recognizes little endian signed 2 bytes integer | 
| le_i32 | Recognizes little endian signed 4 bytes integer | 
| le_i64 | Recognizes little endian signed 8 bytes integer | 
| le_i8 | Recognizes a signed 1 byte integer (equivalent to take!(1)) | 
| le_u16 | Recognizes little endian unsigned 2 bytes integer | 
| le_u32 | Recognizes little endian unsigned 4 bytes integer | 
| le_u64 | Recognizes little endian unsigned 8 bytes integer | 
| le_u8 | Recognizes an unsigned 1 byte integer (equivalent to take!(1)) | 
| line_ending | Recognizes an end of line (both '\n' and '\r\n') | 
| multispace | Recognizes spaces, tabs, carriage returns and line feeds | 
| newline | Matches a newline character '\n' | 
| non_empty | Recognizes non empty buffers | 
| not_line_ending | |
| oct_digit | Recognizes octal characters: 0-7 | 
| print_codes | |
| reset_color | |
| rest | Return the remaining input. | 
| rest_s | Return the remaining input, for strings. | 
| shift | |
| sized_buffer | |
| slice_to_offsets | |
| space | Recognizes spaces and tabs | 
| tab | Matches a tab character '\t' | 
| tag_cl | |
| write_color |
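The be_* and le_* functions above differ only in the byte order they assume, and the u16, u32, i16 and similar macros pick between the two based on an endianness parameter. The std-only sketch below shows what that choice means for a 2-byte and a 4-byte unsigned integer. The local `Endianness` enum, `read_u16`, and `read_u32` are stand-ins written for this example, not nom's types or functions.

```rust
// Local stand-in for a configurable byte order; not nom::Endianness.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Endianness {
    Big,
    Little,
}

// Reads a u16 from the front of `input` in the requested byte order.
// Returns the remaining input and the value, or None if fewer than 2 bytes remain.
fn read_u16(input: &[u8], order: Endianness) -> Option<(&[u8], u16)> {
    if input.len() < 2 {
        return None;
    }
    let bytes = [input[0], input[1]];
    let value = match order {
        Endianness::Big => u16::from_be_bytes(bytes),
        Endianness::Little => u16::from_le_bytes(bytes),
    };
    Some((&input[2..], value))
}

// Same idea for a 4-byte unsigned integer.
fn read_u32(input: &[u8], order: Endianness) -> Option<(&[u8], u32)> {
    if input.len() < 4 {
        return None;
    }
    let bytes = [input[0], input[1], input[2], input[3]];
    let value = match order {
        Endianness::Big => u32::from_be_bytes(bytes),
        Endianness::Little => u32::from_le_bytes(bytes),
    };
    Some((&input[4..], value))
}
```

For the bytes `01 02`, the big-endian reading is `0x0102` and the little-endian reading is `0x0201`; the remaining input is the same in both cases.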