Crate combine
This crate contains parser combinators, roughly based on the Haskell library parsec.
A parser in this library can be described as a function which takes some input and, if it
is successful, returns a value together with the remaining input.
A parser combinator is a function which takes one or more parsers and returns a new parser.
For instance, the many parser can be used to convert a parser for single digits into one that
parses multiple digits. By modeling parsers in this way it becomes easy to compose complex
parsers in an almost declarative way.
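The description above can be sketched with plain functions. The following is a simplified model written only against the standard library, not combine's actual Parser trait or its implementation of many; the names `ParseOutput`, `digit` and `many` here are illustrative assumptions.

```rust
// Simplified model (an assumption for illustration, not combine's real types):
// a parser is a function from input to an optional (value, remaining input) pair.
type ParseOutput<'a, T> = Option<(T, &'a str)>;

/// Parses a single ASCII digit and returns it with the rest of the input.
fn digit(input: &str) -> ParseOutput<'_, char> {
    let c = input.chars().next()?;
    if c.is_ascii_digit() {
        Some((c, &input[c.len_utf8()..]))
    } else {
        None
    }
}

/// A combinator: takes a parser and returns a new parser which applies it
/// zero or more times, collecting the results.
fn many<'a, T>(
    parser: impl Fn(&'a str) -> ParseOutput<'a, T>,
) -> impl Fn(&'a str) -> ParseOutput<'a, Vec<T>> {
    move |mut input: &'a str| {
        let mut values = Vec::new();
        while let Some((value, rest)) = parser(input) {
            // Stop if the inner parser succeeded without consuming anything,
            // otherwise the loop would never terminate.
            if rest.len() == input.len() {
                break;
            }
            values.push(value);
            input = rest;
        }
        Some((values, input))
    }
}
```

In this model, `many(digit)` applied to `"123abc"` yields the three digits together with the remaining input `"abc"`, and applied to `"abc"` it succeeds with an empty list and leaves the input untouched.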
Overview
combine limits itself to creating LL(1) parsers (it is possible to opt in to LL(k) parsing
using the try combinator), which makes the parsers easy to reason about in both function and
performance while sacrificing some generality. In addition to making the parsers you construct
easier to reason about, the library also uses the knowledge of being an LL parser to
automatically construct good error messages.
extern crate combine;
use combine::Parser;
use combine::stream::state::State;
use combine::parser::char::{digit, letter};

const MSG: &'static str = r#"Parse error at line: 1, column: 1
Unexpected `|`
Expected `digit` or `letter`
"#;

fn main() {
    // Wrapping a `&str` with `State` provides automatic line and column tracking. If `State`
    // was not used the positions would instead only be pointers into the `&str`
    if let Err(err) = digit().or(letter()).easy_parse(State::new("|")) {
        assert_eq!(MSG, format!("{}", err));
    }
}
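To make the LL(1) restriction and the opt-in backtracking more concrete, here is a minimal sketch of the idea using only the standard library. It is a model under stated assumptions, not combine's implementation: `Outcome`, `literal`, `or` and `attempt` are hypothetical names, and `attempt` stands in for the try combinator because `try` is a reserved keyword in the Rust 2018 edition and later. The one rule taken from this page is that a choice fails if an applied parser consumes input before failing.

```rust
/// Outcome of a parser in this simplified model (not combine's real types).
#[derive(Debug, PartialEq)]
enum Outcome<'a> {
    /// Success, carrying the remaining input.
    Success(&'a str),
    /// Failure before any input was consumed; alternatives may still be tried.
    EmptyErr,
    /// Failure after some input was consumed; an LL(1) choice gives up here.
    ConsumedErr,
}

/// Matches `expected` character by character, consuming input as it goes.
fn literal<'a>(expected: &'static str) -> impl Fn(&'a str) -> Outcome<'a> {
    move |input: &'a str| {
        let matched = expected
            .chars()
            .zip(input.chars())
            .take_while(|(e, i)| e == i)
            .count();
        if matched == expected.chars().count() {
            Outcome::Success(&input[expected.len()..])
        } else if matched == 0 {
            Outcome::EmptyErr
        } else {
            Outcome::ConsumedErr
        }
    }
}

/// LL(1)-style choice: the second parser only runs if the first one failed
/// without consuming any input.
fn or<'a>(
    first: impl Fn(&'a str) -> Outcome<'a>,
    second: impl Fn(&'a str) -> Outcome<'a>,
) -> impl Fn(&'a str) -> Outcome<'a> {
    move |input: &'a str| match first(input) {
        Outcome::EmptyErr => second(input),
        other => other,
    }
}

/// Opt-in backtracking: reports a consumed failure as an empty one, so an
/// enclosing `or` may try its next alternative from the original position.
fn attempt<'a>(parser: impl Fn(&'a str) -> Outcome<'a>) -> impl Fn(&'a str) -> Outcome<'a> {
    move |input: &'a str| match parser(input) {
        Outcome::ConsumedErr => Outcome::EmptyErr,
        other => other,
    }
}
```

With these definitions, `or(literal("let"), literal("lambda"))` applied to `"lambda"` fails with `ConsumedErr`, because the first alternative consumed the leading `l` before failing. Wrapping the first alternative as `attempt(literal("let"))` lets the second alternative run and succeed.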
This library currently contains five modules:
combinator contains the aforementioned parser combinators and thus the main building blocks for creating any sort of complex parser. It consists of free functions such as many and satisfy, as well as a few methods on the Parser trait, such as or, which are more natural to use as method calls.
error contains the Parser and Stream traits, which are the core abstractions in combine, as well as various structs dealing with input streams and errors. You usually only need to use this module if you want more control over parsing and input streams.
char and byte provide parsers specifically working with streams of characters (char) and bytes (u8) respectively. As a few examples, they have parsers for accepting digits, letters or whitespace.
range provides some zero-copy parsers for RangeStreams.
Examples
extern crate combine;
use combine::parser::char::{spaces, digit, char};
use combine::{many1, sep_by, Parser};
use combine::stream::easy;

fn main() {
    //Parse spaces first and use the with method to only keep the result of the next parser
    let integer = spaces()
        //parse a string of digits into an i32
        .with(many1(digit()).map(|string: String| string.parse::<i32>().unwrap()));

    //Parse integers separated by commas, skipping whitespace
    let mut integer_list = sep_by(integer, spaces().skip(char(',')));

    //Call parse with the input to execute the parser
    let input = "1234, 45,78";
    let result: Result<(Vec<i32>, &str), easy::ParseError<&str>> =
        integer_list.easy_parse(input);
    match result {
        Ok((value, _remaining_input)) => println!("{:?}", value),
        Err(err) => println!("{}", err)
    }
}
If we need a parser that is mutually recursive we can define a free function which can in
turn be used as a parser by passing it to the parser function, which turns a function with
the correct signature into a parser. In this case we define expr to work on any type of
Stream, which is combine's way of abstracting over different data sources such as
array slices, string slices, iterators etc. If you instead only need to parse strings
already in memory you could define expr as fn expr(input: &str) -> ParseResult<Expr, &str>.
#[macro_use]
extern crate combine;
use combine::parser::char::{char, letter, spaces};
use combine::{between, many1, parser, sep_by, Parser};
use combine::error::ParseResult;
use combine::stream::{Stream, Positioned};
use combine::stream::state::State;

#[derive(Debug, PartialEq)]
pub enum Expr {
    Id(String),
    Array(Vec<Expr>),
    Pair(Box<Expr>, Box<Expr>)
}

// The `parser!` macro can be used to define parser producing functions in most cases
// (for more advanced uses standalone functions can be defined to handle parsing)
parser!{
    fn expr[I]()(I) -> Expr
    where [I: Stream<Item=char>]
    {
        let word = many1(letter());

        //Creates a parser which parses a char and skips any trailing whitespace
        let lex_char = |c| char(c).skip(spaces());

        let comma_list = sep_by(expr(), lex_char(','));
        let array = between(lex_char('['), lex_char(']'), comma_list);

        //We can use tuples to run several parsers in sequence
        //The resulting type is a tuple containing each parsers output
        let pair = (lex_char('('),
                    expr(),
                    lex_char(','),
                    expr(),
                    lex_char(')'))
            .map(|t| Expr::Pair(Box::new(t.1), Box::new(t.3)));

        word.map(Expr::Id)
            .or(array.map(Expr::Array))
            .or(pair)
            .skip(spaces())
    }
}

fn main() {
    let result = expr()
        .parse("[[], (hello, world), [rust]]");
    let expr = Expr::Array(vec![
          Expr::Array(Vec::new())
        , Expr::Pair(Box::new(Expr::Id("hello".to_string())),
                     Box::new(Expr::Id("world".to_string())))
        , Expr::Array(vec![Expr::Id("rust".to_string())])
    ]);
    assert_eq!(result, Ok((expr, "")));
}
Reexports
pub extern crate byteorder;
pub extern crate either;
Modules
byte | Module containing parsers specialized on byte streams.
char | Module containing parsers specialized on character streams.
combinator | Re-exported parsers for compatibility with older versions
easy | Stream wrapper which provides an informative and easy to use error type.
error | Error types and traits which define what kind of errors combine parsers may emit
parser | A collection of both concrete parsers as well as parser combinators.
range | Module containing zero-copy parsers.
regex | Module containing regex parsers on streams returning ranges of
stream | Traits and implementations of arbitrary data streams.
Macros
choice | Takes a number of parsers and tries to apply them each in order. Fails if all the parsers fail or if an applied parser consumes input before failing.
parser | Declares a named parser which can easily be reused.
struct_parser | Sequences multiple parsers and builds a struct out of them.
Traits
ParseError | Trait which defines a combine parse error.
Parser | By implementing the
Positioned | A type which has a position.
RangeStream | A
Stream | A stream of tokens which can be duplicated
StreamOnce |
Functions
any | Parses any token.
between | Parses
chainl1 | Parses
chainr1 | Parses
choice | Takes a tuple, a slice or an array of parsers and tries to apply them each in order. Fails if all the parsers fail or if an applied parser consumes input before failing.
count | Parses
count_min_max | Parses
env_parser | Constructs a parser out of an environment and a function which needs the given environment to do the parsing. This is commonly useful to allow multiple parsers to share some environment while still allowing the parsers to be written in separate functions.
eof | Succeeds only if the stream is at end of input, fails otherwise.
look_ahead |
many | Parses
many1 | Parses
none_of | Extract one token and succeeds if it is not part of
not_followed_by | Succeeds only if
one_of | Extract one token and succeeds if it is part of
optional | Parses
parser | Wraps a function, turning it into a parser.
position | Parser which just returns the current position in the stream.
satisfy | Parses a token and succeeds depending on the result of
satisfy_map | Parses a token and passes it to
sep_by | Parses
sep_by1 | Parses
sep_end_by | Parses
sep_end_by1 | Parses
skip_count | Parses
skip_count_min_max | Parses
skip_many | Parses
skip_many1 | Parses
token | Parses a character and succeeds if the character is equal to
tokens | Parses multiple tokens.
try |
unexpected | Always fails with
value | Always returns the value
Type Definitions
ConsumedResult | A
ParseResult | A type alias over the specific