Crate parser_combinators

This crate contains parser combinators, roughly based on the Haskell library parsec.

A parser in this library can be described as a function which takes some input and, if it is successful, returns a value together with the remaining input. A parser combinator is a function which takes one or more parsers and returns a new parser. For instance, the many parser can be used to convert a parser for single digits into one that parses multiple digits.
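To make these two definitions concrete, here is a minimal sketch in plain Rust that deliberately does not use this crate. The names digit and many below are hypothetical stand-ins written only for illustration; the crate's real versions are built on the Parser trait and report errors rather than returning Option.

```rust
// Hypothetical stand-alone sketch; these functions are NOT this crate's API.

/// A parser as a plain function: takes input and, on success, returns a
/// value together with the remaining input.
fn digit(input: &str) -> Option<(char, &str)> {
    let mut chars = input.chars();
    match chars.next() {
        Some(c) if c.is_ascii_digit() => Some((c, chars.as_str())),
        _ => None,
    }
}

/// A parser combinator: takes a parser for one item and returns a new
/// parser that applies it zero or more times, collecting the values.
/// (Assumes `parser` consumes input whenever it succeeds.)
fn many<'a, T>(
    parser: impl Fn(&'a str) -> Option<(T, &'a str)>,
) -> impl Fn(&'a str) -> (Vec<T>, &'a str) {
    move |mut input: &'a str| {
        let mut values = Vec::new();
        while let Some((value, rest)) = parser(input) {
            values.push(value);
            input = rest;
        }
        (values, input)
    }
}

fn main() {
    // Convert the single-digit parser into one that parses multiple digits.
    let digits = many(digit);
    let (values, rest) = digits("123abc");
    println!("{:?} {:?}", values, rest);
}
```

The sketch returns the unconsumed input alongside the value, which is what lets the output of one parser be fed into the next.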

§Overview

This library is currently split into three modules.

  • primitives contains the Parser trait as well as various structs dealing with input streams and errors.

  • combinator contains the aforementioned parser combinators and thus the main building blocks for creating more complex parsers. It consists of free functions as well as the ParserExt trait, which provides a few functions that are more naturally used through method calls.

  • char is the last module. It provides parsers that work specifically with streams of characters. For example, it has parsers for accepting digits, letters or whitespace.

§Examples

 extern crate parser_combinators;
 use parser_combinators::{spaces, many1, sep_by, digit, char, Parser, ParserExt, ParseError};
 
 fn main() {
     let input = "1234, 45,78";
     let spaces = spaces();
     // Parse spaces first, then use the `with` method to keep only the result of the next parser
     let integer = spaces.clone()
         // Parse a string of digits into an i32
         .with(many1(digit()).map(|string: String| string.parse::<i32>().unwrap()));
     // Parse integers separated by commas, skipping whitespace
     let mut integer_list = sep_by(integer, spaces.skip(char(',')));
 
     // Call parse with the input to execute the parser
     let result: Result<(Vec<i32>, &str), ParseError<char>> = integer_list.parse(input);
     match result {
         Ok((value, _remaining_input)) => println!("{:?}", value),
         Err(err) => println!("{}", err)
     }
 }

If we need a mutually recursive parser, we can define a free function which can in turn be used as a parser internally. (Note that the function must be explicitly cast; this should no longer be necessary once the changes in rustc that make orphan checking less restrictive are implemented.)

expr is written fully generically here, which may not be necessary in a specific implementation. The Stream trait is predefined to work with array slices, string slices and iterators, meaning that in this case it could be defined as fn expr(input: State<&str>) -> ParseResult<Expr, &str>

 extern crate parser_combinators;
 use parser_combinators::{between, char, letter, spaces, many1, parser, sep_by, Parser, ParserExt,
 ParseResult};
 use parser_combinators::primitives::{State, Stream};

 #[derive(Debug, PartialEq)]
 enum Expr {
     Id(String),
     Array(Vec<Expr>),
     Pair(Box<Expr>, Box<Expr>)
 }

 fn expr<I>(input: State<I>) -> ParseResult<Expr, I>
     where I: Stream<Item=char> {
     let word = many1(letter());
     // Creates a parser which parses a char and skips any trailing whitespace
     let lex_char = |c| char(c).skip(spaces());
     let comma_list = sep_by(parser(expr::<I>), lex_char(','));
     let array = between(lex_char('['), lex_char(']'), comma_list);
     // We can use tuples to run several parsers in sequence.
     // The resulting type is a tuple containing each parser's output.
     let pair = (lex_char('('), parser(expr::<I>), lex_char(','), parser(expr::<I>), lex_char(')'))
         .map(|t| Expr::Pair(Box::new(t.1), Box::new(t.3)));
     word.map(Expr::Id)
         .or(array.map(Expr::Array))
         .or(pair)
         .skip(spaces())
         .parse_state(input)
 }
 
 fn main() {
     let result = parser(expr)
         .parse("[[], (hello, world), [rust]]");
     let expr = Expr::Array(vec![
           Expr::Array(Vec::new())
         , Expr::Pair(Box::new(Expr::Id("hello".to_string())),
                      Box::new(Expr::Id("world".to_string())))
         , Expr::Array(vec![Expr::Id("rust".to_string())])
     ]);
     assert_eq!(result, Ok((expr, "")));
 }

Modules§

char
Module containing parsers specialized for character streams
combinator
Module containing all specific parsers
primitives
Module containing the primitive types which are used to create and compose more advanced parsers

Structs§

ParseError
Struct which holds information about an error that occurred at a specific position. Can hold multiple instances of Error if more than one error occurred at the position.
State
The State<I> struct keeps track of the current position in the stream I

Traits§

Parser
By implementing the Parser trait a type says that it can be used to parse an input stream into the type Output.
ParserExt
Extension trait which provides functions that are more conveniently used through method calls

Functions§

alpha_num
Parses either an alphabetic letter or a digit
any
Parses any token
between
Parses open followed by parser followed by close. Returns the value of parser.
chainl1
Parses p one or more times separated by op. The value returned is the one produced by the left-associative application of op.
char
Parses a character and succeeds if the character is equal to c
choice
Takes an array of parsers and tries each of them in turn. Fails if all parsers fail or when a parser fails with a consumed state.
crlf
Parses carriage return and newline, returning the newline character.
digit
Parses a digit from a stream containing characters
from_iter
Converts an Iterator into a stream.
hex_digit
Parses a hexadecimal digit, accepting both uppercase and lowercase
letter
Parses an alphabetic letter
lower
Parses a lowercase letter
many
Parses p zero or more times, returning a collection with the values from p. If the returned collection cannot be inferred, type annotations must be supplied, either by annotating the resulting type binding, let collection: Vec<_> = ..., or by specializing when calling many, many::<Vec<_>, _>(...)
many1
Parses p one or more times, returning a collection with the values from p. If the returned collection cannot be inferred, type annotations must be supplied, either by annotating the resulting type binding, let collection: Vec<_> = ..., or by specializing when calling many1, many1::<Vec<_>, _>(...)
newline
Parses a newline character
not_followed_by
Succeeds only if parser fails. Never consumes any input.
oct_digit
Parses an octal digit
optional
Returns Some(value) on success and None on parse failure (always succeeds)
parser
Wraps a function, turning it into a parser. Mainly needed to turn closures into parsers, as function types can be cast to function pointers to make them usable as parsers.
satisfy
Parses a token and succeeds depending on the result of predicate
sep_by
Parses parser zero or more times separated by separator, returning a collection with the values from parser. If the returned collection cannot be inferred, type annotations must be supplied, either by annotating the resulting type binding, let collection: Vec<_> = ..., or by specializing when calling sep_by, sep_by::<Vec<_>, _, _>(...)
skip_many
Parses p zero or more times ignoring the result
skip_many1
Parses p one or more times ignoring the result
space
Parses whitespace
spaces
Skips over zero or more spaces
string
Parses the string s
tab
Parses a tab character
token
Parses a character and succeeds if the character is equal to c
try
try behaves like p, except that if p returns an error after consuming input, it acts as if the parser had not consumed any input
unexpected
Always fails with message as the error. Never consumes any input.
upper
Parses an uppercase letter
value
Always returns the value v without consuming any input.
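
The left-associative application described for chainl1 can be illustrated with a stand-alone sketch. It is written in plain Rust and does not use this crate; number, minus, sub and this chainl1 are hypothetical helpers for illustration only, with a simplified signature (a plain function taking the input as an argument and returning Option) rather than the crate's combinator.

```rust
// Hypothetical stand-alone sketch; NOT this crate's implementation of chainl1.

/// Parses an unsigned decimal integer from the front of `input`.
fn number(input: &str) -> Option<(i32, &str)> {
    let end = input
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(input.len());
    let value = input[..end].parse().ok()?; // fails on an empty digit run
    Some((value, &input[end..]))
}

fn sub(l: i32, r: i32) -> i32 {
    l - r
}

/// Parses '-' and returns the function used to combine two operands.
fn minus(input: &str) -> Option<(fn(i32, i32) -> i32, &str)> {
    let rest = input.strip_prefix('-')?;
    Some((sub as fn(i32, i32) -> i32, rest))
}

/// Parses `operand` one or more times separated by `operator`, folding the
/// results from the left: a op b op c is evaluated as (a op b) op c.
fn chainl1<'a>(
    operand: impl Fn(&'a str) -> Option<(i32, &'a str)>,
    operator: impl Fn(&'a str) -> Option<(fn(i32, i32) -> i32, &'a str)>,
    input: &'a str,
) -> Option<(i32, &'a str)> {
    let (mut acc, mut rest) = operand(input)?;
    while let Some((op, after_op)) = operator(rest) {
        match operand(after_op) {
            Some((rhs, after_rhs)) => {
                acc = op(acc, rhs);
                rest = after_rhs;
            }
            // No operand after the operator: stop before the operator.
            None => break,
        }
    }
    Some((acc, rest))
}

fn main() {
    // Left associativity: (10 - 3) - 2, not 10 - (3 - 2).
    let result = chainl1(number, minus, "10-3-2");
    println!("{:?}", result);
}
```

Subtraction is used because it makes the grouping observable: a right-associative fold of the same input would yield a different value.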

Type Aliases§

ParseResult