Crate nom
nom, eating data byte by byte
nom is a parser combinator library with a focus on safe parsing, streaming patterns, and zero-copy operation wherever possible.
The code is available on GitHub.
There are a few guides with more details about the design of nom, how to write parsers, or the error management system.
If you are upgrading to nom 2.0, please read the migration document.
See also the FAQ.
Example
#[macro_use]
extern crate nom;

use nom::{IResult,digit};

// Parser definition

use std::str;
use std::str::FromStr;

// We parse any expr surrounded by parens, ignoring all whitespace around those
named!(parens<i64>, ws!(delimited!( tag!("("), expr, tag!(")") )) );

// We transform an integer string into an i64, ignoring surrounding whitespace.
// We look for a sequence of digits, and try to convert it.
// If either str::from_utf8 or FromStr::from_str fails,
// we fall back to the parens parser defined above
named!(factor<i64>, alt!(
    map_res!(
      map_res!(
        ws!(digit),
        str::from_utf8
      ),
      FromStr::from_str
    )
  | parens
  )
);

// We read an initial factor and for each time we find
// a * or / operator followed by another factor, we do
// the math by folding everything
named!(term <i64>, do_parse!(
    init: factor >>
    res:  fold_many0!(
        pair!(alt!(tag!("*") | tag!("/")), factor),
        init,
        |acc, (op, val): (&[u8], i64)| {
            if (op[0] as char) == '*' { acc * val } else { acc / val }
        }
    ) >>
    (res)
  )
);

named!(expr <i64>, do_parse!(
    init: term >>
    res:  fold_many0!(
        pair!(alt!(tag!("+") | tag!("-")), term),
        init,
        |acc, (op, val): (&[u8], i64)| {
            if (op[0] as char) == '+' { acc + val } else { acc - val }
        }
    ) >>
    (res)
  )
);

fn main() {
  assert_eq!(expr(b"1+2"),         IResult::Done(&b""[..], 3));
  assert_eq!(expr(b"12+6-4+3"),    IResult::Done(&b""[..], 17));
  assert_eq!(expr(b"1+2*3+4"),     IResult::Done(&b""[..], 11));

  assert_eq!(expr(b"(2)"),         IResult::Done(&b""[..], 2));
  assert_eq!(expr(b"2*(3+4)"),     IResult::Done(&b""[..], 14));
  assert_eq!(expr(b"2*2/(5-1)+3"), IResult::Done(&b""[..], 4));
}
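The `term` and `expr` parsers above both start from an initial value and fold a sequence of (operator, operand) pairs into it from left to right. The following is a minimal std-only sketch of that folding step alone, without any parsing. The function `fold_ops` is a hypothetical helper introduced for illustration and is not part of nom; it also assumes, unlike the example above, that an unknown operator byte leaves the accumulator unchanged.

```rust
/// Hypothetical helper (not part of nom): folds (operator, operand) pairs
/// into an initial value, left to right, the way the closures passed to
/// `fold_many0!` in the example above do.
fn fold_ops(init: i64, rest: &[(u8, i64)]) -> i64 {
    rest.iter().fold(init, |acc, &(op, val)| match op {
        b'+' => acc + val,
        b'-' => acc - val,
        b'*' => acc * val,
        b'/' => acc / val, // integer division; panics if `val` is 0
        _ => acc,          // assumption: unknown operators are ignored
    })
}
```

For instance, `fold_ops(12, &[(b'+', 6), (b'-', 4), (b'+', 3)])` evaluates `12+6-4+3` in the same left-associative order as the `expr` parser.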
Reexports
pub use self::simple_errors::*; |
pub use self::methods::*; |
pub use self::bits::*; |
pub use self::whitespace::*; |
Modules
bits |
Bit level parsers and combinators |
methods |
Method macro combinators |
simple_errors |
Error management |
whitespace |
Support for whitespace delimited formats |
Macros
add_return_error |
Add an error if the child parser fails |
alt |
Try a list of parsers and return the result of the first successful one |
alt_complete |
Is equivalent to the |
apply |
emulate function currying: |
apply_m |
emulate function currying for method calls on structs |
bits |
|
call |
Used to wrap common expressions and functions as macros |
call_m |
Used to call methods, then move self back into self |
char |
matches one character: `char!(char) => &[u8] -> IResult<&[u8], char>` |
closure |
Wraps a parser in a closure |
complete |
replaces a |
cond |
|
cond_reduce |
|
cond_with_error |
|
consumer_from_parser | |
count |
|
count_fixed |
|
dbg |
Prints a message if the parser fails |
dbg_dmp |
Prints a message and the input if the parser fails |
delimited |
|
do_parse |
|
eat_separator |
helper macros to build a separator parser |
eof |
|
error_code |
creates a parse error from a |
error_node |
creates a parse error from a |
error_node_position |
creates a parse error from a |
error_position |
creates a parse error from a |
escaped |
|
escaped_transform |
|
expr_opt |
|
expr_res |
|
fix_error |
translate parser result from IResult to IResult with a custom type |
flat_map |
|
fold_many0 |
|
fold_many1 |
|
fold_many_m_n |
|
i16 |
if the parameter is nom::Endianness::Big, parse a big endian i16 integer, otherwise a little endian i16 integer |
i32 |
if the parameter is nom::Endianness::Big, parse a big endian i32 integer, otherwise a little endian i32 integer |
i64 |
if the parameter is nom::Endianness::Big, parse a big endian i64 integer, otherwise a little endian i64 integer |
is_a |
|
is_a_s |
|
is_not |
|
is_not_s |
|
length_bytes |
|
length_count |
|
length_data |
|
length_value |
|
many0 |
|
many1 |
|
many_m_n |
|
many_till |
|
map |
|
map_opt |
|
map_res |
|
method |
Makes a method from a parser combination |
named |
Makes a function from a parser combination |
named_args |
Makes a function from a parser combination with arguments. |
named_attr |
Makes a function from a parser combination, with attributes |
none_of |
matches anything but the provided characters |
not |
|
one_of |
matches one of the provided characters |
opt |
|
opt_res |
|
pair |
|
parse_to |
|
peek |
|
permutation |
|
preceded |
|
recognize |
|
return_error |
Prevents backtracking if the child parser fails |
sep |
sep is the parser rewriting macro for whitespace separated formats |
separated_list |
|
separated_nonempty_list |
|
separated_pair |
|
switch |
|
tag |
|
tag_bits |
matches an integer pattern to a bitstream. The number of bits of the input to compare must be specified |
tag_no_case |
|
tag_no_case_s |
|
tag_s |
|
take |
|
take_bits |
|
take_s |
|
take_str |
|
take_till |
|
take_till1 |
|
take_till1_s |
|
take_till_s |
|
take_until |
|
take_until1 |
|
take_until_and_consume |
|
take_until_and_consume1 |
|
take_until_and_consume_s |
|
take_until_either |
|
take_until_either_and_consume |
|
take_until_s |
|
take_while |
|
take_while1 |
|
take_while1_s |
|
take_while_s |
|
tap |
|
terminated |
|
try_parse |
A bit like |
tuple |
|
u16 |
if the parameter is nom::Endianness::Big, parse a big endian u16 integer, otherwise a little endian u16 integer |
u32 |
if the parameter is nom::Endianness::Big, parse a big endian u32 integer, otherwise a little endian u32 integer |
u64 |
if the parameter is nom::Endianness::Big, parse a big endian u64 integer, otherwise a little endian u64 integer |
value |
|
verify |
|
wrap_sep | |
ws |
|
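Two of the combinators listed above, `tag` (match a fixed byte sequence) and `alt` (try alternatives in order and keep the first success), can be pictured with plain functions. The sketch below is a std-only illustration and not nom's implementation: `tag_sketch` and `alt_sketch` are hypothetical names, failure is modelled as `None`, and the case of needing more input is deliberately left out.

```rust
/// Hypothetical sketch of a `tag`-style matcher: if `input` starts with
/// `expected`, return (remaining input, matched bytes).
fn tag_sketch<'a>(expected: &[u8], input: &'a [u8]) -> Option<(&'a [u8], &'a [u8])> {
    if input.starts_with(expected) {
        let (matched, rest) = input.split_at(expected.len());
        Some((rest, matched))
    } else {
        None
    }
}

/// Hypothetical sketch of an `alt`-style ordered choice: every candidate is
/// tried against the same starting input, and the first match wins.
fn alt_sketch<'a>(candidates: &[&[u8]], input: &'a [u8]) -> Option<(&'a [u8], &'a [u8])> {
    candidates.iter().find_map(|c| tag_sketch(c, input))
}
```

Because each candidate is tried against the original input, a failed alternative consumes nothing, which is the backtracking behaviour an ordered choice relies on.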
Structs
ChainConsumer |
ChainConsumer takes a consumer C1 R -> S, and a consumer C2 S -> T, and makes a consumer R -> T by applying C2 on C1's result |
FileProducer | |
MapConsumer |
MapConsumer takes a function S -> T and applies it on a consumer producing values of type S |
MemProducer |
A MemProducer generates values from an in memory byte buffer |
ProducerRepeat |
ProducerRepeat takes a single value, and generates it at each step |
Enums
CompareResult |
indicates whether a comparison was successful, an error, or if more data was needed |
ConsumerState |
Stores a consumer's current computation state |
Endianness |
Configurable endianness |
ErrorKind |
indicates which parser returned an error |
FileProducerState | |
IError |
This is the same as IResult, but without Done |
IResult |
Holds the result of parsing functions |
Input | |
Move | |
Needed |
Contains information on needed data if a parser returned |
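The enums above describe a parse result that can succeed with the remaining input, fail, or report that more data is needed, which is what makes streaming parsers possible. The following std-only sketch illustrates that three-way outcome on a tiny binary header. Everything in it is an assumption made for illustration: the `Outcome` enum and its variant names, the `header` function, the magic byte `0x7F`, and the choice to report the number of missing bytes are not taken from nom.

```rust
/// Hypothetical three-state result, loosely mirroring the idea behind
/// `IResult` and `Needed`; this is not nom's definition.
#[derive(Debug, PartialEq)]
enum Outcome<'a, T> {
    /// Parsing succeeded: remaining input and the parsed value.
    Done(&'a [u8], T),
    /// The input is definitely not valid.
    Error,
    /// More input is required; carries the number of missing bytes.
    Incomplete(usize),
}

/// Parses a made-up header: one magic byte (0x7F) followed by a
/// big-endian u16 length.
fn header(input: &[u8]) -> Outcome<'_, u16> {
    const MAGIC: u8 = 0x7F;
    match input {
        // Nothing received yet: all three bytes are missing.
        [] => Outcome::Incomplete(3),
        // Wrong magic byte: more data cannot fix this.
        [first, ..] if *first != MAGIC => Outcome::Error,
        // Full header available: decode and hand back the rest.
        [_, hi, lo, rest @ ..] => Outcome::Done(rest, u16::from_be_bytes([*hi, *lo])),
        // Correct magic byte, but the length field is still truncated.
        _ => Outcome::Incomplete(3 - input.len()),
    }
}
```

A caller reading from a socket or file could retry `header` after appending more bytes whenever it sees `Incomplete`, and give up only on `Error`.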
Traits
AsBytes | |
AsChar |
transforms common types to a char for basic token parsing |
Compare |
abstracts comparison operations |
Consumer |
The Consumer trait wraps a computation and its state |
FindSubstring |
look for a substring in self |
FindToken |
look for self in the given input stream |
GetInput | |
GetOutput | |
HexDisplay | |
InputIter |
abstracts common iteration operations on the input type |
InputLength |
abstract method to calculate the input length |
InputTake |
abstracts slicing operations |
Offset |
useful functions to calculate the offset between slices and show a hexdump of a slice |
ParseTo |
used to integrate str's parse() method |
Producer |
The producer wraps a data source, like file or network, and applies a consumer on it |
Slice |
slicing operations using ranges |
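Since parsers hand back sub-slices of their input rather than copies, the position of a result can be recovered from the slices themselves, which is the kind of calculation the `Offset` entry above describes. Below is a minimal std-only sketch of that idea; `offset_of` is a hypothetical function, not the trait's actual method, and it assumes the caller wants `None` when the second slice does not lie inside the first.

```rust
/// Hypothetical helper: byte offset of `inner` from the start of `outer`,
/// or `None` if `inner` does not lie entirely within `outer`.
fn offset_of(outer: &[u8], inner: &[u8]) -> Option<usize> {
    // Compare addresses as integers; no pointer is dereferenced.
    let start = outer.as_ptr() as usize;
    let pos = inner.as_ptr() as usize;
    if pos >= start && pos + inner.len() <= start + outer.len() {
        Some(pos - start)
    } else {
        None
    }
}
```

With `let data = b"hello world";` and `let rest = &data[6..];`, calling `offset_of(data, rest)` reports how many bytes were consumed before `rest` begins.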
Functions
alpha |
Recognizes one or more lowercase and uppercase alphabetic characters: a-zA-Z |
alphanumeric |
Recognizes one or more numerical and alphabetic characters: 0-9a-zA-Z |
anychar | |
be_f32 |
Recognizes big endian 4 bytes floating point number |
be_f64 |
Recognizes big endian 8 bytes floating point number |
be_i8 |
Recognizes a signed 1 byte integer (equivalent to take!(1)) |
be_i16 |
Recognizes big endian signed 2 bytes integer |
be_i32 |
Recognizes big endian signed 4 bytes integer |
be_i64 |
Recognizes big endian signed 8 bytes integer |
be_u8 |
Recognizes an unsigned 1 byte integer (equivalent to take!(1)) |
be_u16 |
Recognizes big endian unsigned 2 bytes integer |
be_u24 |
Recognizes big endian unsigned 3 byte integer |
be_u32 |
Recognizes big endian unsigned 4 bytes integer |
be_u64 |
Recognizes big endian unsigned 8 bytes integer |
begin | |
code_from_offset | |
crlf | |
digit |
Recognizes one or more numerical characters: 0-9 |
double |
Recognizes a floating point number in a byte string and returns an f64 |
double_s |
Recognizes a floating point number in a string and returns an f64 |
eol | |
error_to_u32 | |
float |
Recognizes a floating point number in a byte string and returns an f32 |
float_s |
Recognizes a floating point number in a string and returns an f32 |
hex_digit |
Recognizes one or more hexadecimal numerical characters: 0-9, A-F, a-f |
hex_u32 |
Recognizes a hex-encoded integer |
is_alphabetic |
Tests if byte is ASCII alphabetic: A-Z, a-z |
is_alphanumeric |
Tests if byte is ASCII alphanumeric: A-Z, a-z, 0-9 |
is_digit |
Tests if byte is ASCII digit: 0-9 |
is_hex_digit |
Tests if byte is ASCII hex digit: 0-9, A-F, a-f |
is_oct_digit |
Tests if byte is ASCII octal digit: 0-7 |
is_space |
Tests if byte is ASCII space or tab |
le_f32 |
Recognizes little endian 4 bytes floating point number |
le_f64 |
Recognizes little endian 8 bytes floating point number |
le_i8 |
Recognizes a signed 1 byte integer (equivalent to take!(1)) |
le_i16 |
Recognizes little endian signed 2 bytes integer |
le_i32 |
Recognizes little endian signed 4 bytes integer |
le_i64 |
Recognizes little endian signed 8 bytes integer |
le_u8 |
Recognizes an unsigned 1 byte integer (equivalent to take!(1)) |
le_u16 |
Recognizes little endian unsigned 2 bytes integer |
le_u24 |
Recognizes little endian unsigned 3 byte integer |
le_u32 |
Recognizes little endian unsigned 4 bytes integer |
le_u64 |
Recognizes little endian unsigned 8 bytes integer |
line_ending |
Recognizes an end of line (both '\n' and '\r\n') |
multispace |
Recognizes one or more spaces, tabs, carriage returns and line feeds |
newline |
Matches a newline character '\n' |
non_empty |
Recognizes non empty buffers |
not_line_ending | |
oct_digit |
Recognizes one or more octal characters: 0-7 |
print_codes | |
reset_color | |
rest |
Return the remaining input. |
rest_s |
Return the remaining input, for strings. |
shift | |
sized_buffer | |
slice_to_offsets | |
space |
Recognizes one or more spaces and tabs |
tab |
Matches a tab character '\t' |
tag_cl | |
write_color |
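The `be_*` and `le_*` functions listed above differ only in the byte order they assume. As a rough std-only illustration of what that means for 2-byte and 3-byte unsigned integers, the sketch below decodes a prefix of a byte slice and returns the remaining input alongside the value. The `sketch_` functions are hypothetical stand-ins written for this example, not nom's functions, and they signal insufficient input with `None` rather than nom's result type.

```rust
/// Hypothetical stand-in for a big-endian u16 reader:
/// the most significant byte comes first.
fn sketch_be_u16(input: &[u8]) -> Option<(&[u8], u16)> {
    if input.len() < 2 {
        return None;
    }
    let (head, rest) = input.split_at(2);
    Some((rest, u16::from_be_bytes([head[0], head[1]])))
}

/// Hypothetical stand-in for a little-endian u16 reader:
/// the least significant byte comes first.
fn sketch_le_u16(input: &[u8]) -> Option<(&[u8], u16)> {
    if input.len() < 2 {
        return None;
    }
    let (head, rest) = input.split_at(2);
    Some((rest, u16::from_le_bytes([head[0], head[1]])))
}

/// Hypothetical stand-in for a big-endian 3-byte reader. There is no
/// native 24-bit type, so the value is widened into a u32.
fn sketch_be_u24(input: &[u8]) -> Option<(&[u8], u32)> {
    if input.len() < 3 {
        return None;
    }
    let (head, rest) = input.split_at(3);
    let value = (u32::from(head[0]) << 16) | (u32::from(head[1]) << 8) | u32::from(head[2]);
    Some((rest, value))
}
```

The same two bytes `[0x12, 0x34]` decode to `0x1234` with the big-endian reader and to `0x3412` with the little-endian one.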