Trait chumsky::Parser[][src]

pub trait Parser<I, O> {
    type Error: Error<I>;
Show 23 methods fn parse_inner<S: Stream<I, <Self::Error as Error<I>>::Span>>(
        &self,
        stream: &mut S,
        errors: &mut Vec<Self::Error>
    ) -> (usize, Result<(O, Option<Self::Error>), Self::Error>)
    where
        Self: Sized
; fn parse_recovery<S: IntoStream<I, <Self::Error as Error<I>>::Span>>(
        &self,
        stream: S
    ) -> (Option<O>, Vec<Self::Error>)
    where
        Self: Sized
, { ... }
fn parse<S: IntoStream<I, <Self::Error as Error<I>>::Span>>(
        &self,
        stream: S
    ) -> Result<O, Vec<Self::Error>>
    where
        Self: Sized
, { ... }
fn map<U, F: Fn(O) -> U>(self, f: F) -> Map<Self, F, O>
    where
        Self: Sized
, { ... }
fn map_with_span<U, F: Fn(O, Option<<Self::Error as Error<I>>::Span>) -> U>(
        self,
        f: F
    ) -> MapWithSpan<Self, F, O>
    where
        Self: Sized
, { ... }
fn map_err<F: Fn(Self::Error) -> Self::Error>(self, f: F) -> MapErr<Self, F>
    where
        Self: Sized
, { ... }
fn labelled<L: Into<<Self::Error as Error<I>>::Pattern> + Clone>(
        self,
        label: L
    ) -> Label<Self, L>
    where
        Self: Sized
, { ... }
fn to<U: Clone>(self, x: U) -> To<Self, O, U>
    where
        Self: Sized
, { ... }
fn foldl<A, B, F: Fn(A, B) -> A>(self, f: F) -> Foldl<Self, F, A, B>
    where
        Self: Parser<I, (A, Vec<B>)> + Sized
, { ... }
fn foldr<'a, A, B, F: Fn(A, B) -> B + 'a>(
        self,
        f: F
    ) -> Foldr<Self, F, A, B>
    where
        Self: Parser<I, (Vec<A>, B)> + Sized
, { ... }
fn ignored(self) -> Ignored<Self, O>
    where
        Self: Sized
, { ... }
fn collect<C: FromIterator<O::Item>>(self) -> Map<Self, fn(_: O) -> C, O>
    where
        Self: Sized,
        O: IntoIterator
, { ... }
fn then<U, P: Parser<I, U>>(self, other: P) -> Then<Self, P>
    where
        Self: Sized
, { ... }
fn chain<T, U, P: Parser<I, U, Error = Self::Error>>(
        self,
        other: P
    ) -> Map<Then<Self, P>, fn(_: (O, U)) -> Vec<T>, (O, U)>
    where
        Self: Sized,
        U: Chain<T>,
        O: Chain<T>
, { ... }
fn flatten<T, Inner>(self) -> Map<Self, fn(_: O) -> Vec<T>, O>
    where
        Self: Sized,
        O: IntoIterator<Item = Inner>,
        Inner: IntoIterator<Item = T>
, { ... }
fn padding_for<U, P: Parser<I, U>>(
        self,
        other: P
    ) -> PaddingFor<Self, P, O, U>
    where
        Self: Sized
, { ... }
fn padded_by<U, P: Parser<I, U>>(self, other: P) -> PaddedBy<Self, P, O, U>
    where
        Self: Sized
, { ... }
fn delimited_by(self, start: I, end: I) -> DelimitedBy<Self, I>
    where
        Self: Sized
, { ... }
fn or<P: Parser<I, O>>(self, other: P) -> Or<Self, P>
    where
        Self: Sized
, { ... }
fn or_not(self) -> OrNot<Self>
    where
        Self: Sized
, { ... }
fn repeated(self) -> Repeated<Self>
    where
        Self: Sized
, { ... }
fn repeated_at_least(self, n: usize) -> Repeated<Self>
    where
        Self: Sized
, { ... }
fn boxed<'a>(self) -> BoxedParser<'a, I, O, Self::Error>
    where
        Self: Sized + 'a
, { ... }
}
Expand description

A trait implemented by parsers.

Parsers take a stream of tokens of type I and attempt to parse them into a value of type O. In doing so, they may encounter errors. These need not be fatal to the parsing process: syntactic errors can be recovered from and a valid output may still be generated alongside any syntax errors that were encountered along the way. Usually, this output comes in the form of an Abstract Syntax Tree (AST).

Parsers currently only support LL(1) grammars. More concretely, this means that the rules that compose this parser are only permitted to ‘look’ a single token into the future to determine the path through the grammar rules to be taken by the parser. Unlike other techniques, such as recursive decent, arbitrary backtracking is not permitted. The reasons for this are numerous, but perhaps the most obvious is that it makes error detection and recovery significantly simpler and easier. In the future, this crate may be extended to support more complex grammars.

LL(1) parsers by themselves are not particularly powerful. Indeed, even very old languages such as C cannot parsed by an LL(1) parser in a single pass. However, this limitation quickly vanishes (and, indeed, makes the design of both the language and the parser easier) when one introduces multiple passes. For example, C compilers generally have a lexical pass prior to the main parser that groups the input characters into tokens.

Associated Types

The type of errors emitted by this parser.

Required methods

Parse a stream with all the bells & whistles. You can use this to implement your own parser combinators. Note that both the signature and semantic requirements of this function are very likely to change in later versions. Where possible, prefer more ergonomic combinators provided elsewhere in the crate rather than implementing your own.

Provided methods

Parse an iterator of tokens, yielding an output if possible, and any errors encountered along the way.

If you don’t care about producing an output if errors are encountered, use Parser::parse instead.

Parse an iterator of tokens, yielding an output or any errors that were encountered along the way.

If you wish to attempt to produce an output even if errors are encountered, use Parser::parse_recovery.

Map the output of this parser to aanother value.

Examples

use chumsky::prelude::*;

#[derive(Debug, PartialEq)]
enum Token { Word(String), Num(u64) }

let word = filter::<_, _, Simple<char>>(|c: &char| c.is_alphabetic())
    .repeated_at_least(1)
    .collect::<String>()
    .map(Token::Word);

let num = filter::<_, _, Simple<char>>(|c: &char| c.is_ascii_digit())
    .repeated_at_least(1)
    .collect::<String>()
    .map(|s| Token::Num(s.parse().unwrap()));

let token = word.or(num);

assert_eq!(token.parse("test"), Ok(Token::Word("test".to_string())));
assert_eq!(token.parse("42"), Ok(Token::Num(42)));

Map the output of this parser to another value, making use of the pattern’s span.

Map the primary error of this parser to another value.

This does not map error emitted by sub-patterns within the parser.

Label the pattern parsed by this parser for more useful error messages.

This is useful when you want to give users a more useful description of an expected pattern than simply a list of possible initial tokens. For example, it’s common to use the term “expression” at a catch-all for a number of different constructs in many languages.

This does not label recovered errors generated by sub-patterns within the parser, only error directly emitted by the parser.

This does not label errors where the labelled pattern consumed input (i.e: in unambiguous cases).

Examples

use chumsky::prelude::*;

let frac = text::digits()
    .chain(just('.'))
    .chain::<char, _, _>(text::digits())
    .collect::<String>()
    .padded_by(end())
    .labelled("number");

assert_eq!(frac.parse("42.3"), Ok("42.3".to_string()));
assert_eq!(frac.parse("hello"), Err(vec![Simple::expected_label_found(Some(0..1), "number", Some('h'))]));
assert_eq!(frac.parse("42!"), Err(vec![Simple::expected_token_found(Some(2..3), vec!['.'], Some('!'))]));

Transform all outputs of this parser to a pretermined value.

Examples

use chumsky::prelude::*;

#[derive(Clone, Debug, PartialEq)]
enum Op { Add, Sub, Mul, Div }

let op = just::<_, Simple<char>>('+').to(Op::Add)
    .or(just('-').to(Op::Sub))
    .or(just('*').to(Op::Mul))
    .or(just('/').to(Op::Div));

assert_eq!(op.parse("+"), Ok(Op::Add));
assert_eq!(op.parse("/"), Ok(Op::Div));

Left-fold the output of the parser into a single value, where the output is of type (_, Vec<_>).

Examples

use chumsky::prelude::*;

let int = text::int::<Simple<char>>()
    .collect::<String>()
    .map(|s| s.parse().unwrap());

let sum = int
    .then(just('+').padding_for(int).repeated())
    .foldl(|a, b| a + b);

assert_eq!(sum.parse("1+12+3+9"), Ok(25));
assert_eq!(sum.parse("6"), Ok(6));

Right-fold the output of the parser into a single value, where the output is of type (Vec<_>, _).

Examples

use chumsky::prelude::*;

let int = text::int::<Simple<char>>()
    .collect::<String>()
    .map(|s| s.parse().unwrap());

let signed = just('+').to(1)
    .or(just('-').to(-1))
    .repeated()
    .then(int)
    .foldr(|a, b| a * b);

assert_eq!(signed.parse("3"), Ok(3));
assert_eq!(signed.parse("-17"), Ok(-17));
assert_eq!(signed.parse("--+-+-5"), Ok(5));

Ignore the output of this parser, yielding () as an output instead.

This can be used to reduce the cost of passing by avoiding unnecessary allocations (most collections containing ZSTs do not allocate). For example, it’s common to want to ignore whitespace in many grammars.

Examples

use chumsky::prelude::*;

// A parser that parses any number of whitespace characters without allocating
let whitespace = filter::<_, _, Simple<char>>(|c: &char| c.is_whitespace())
    .ignored()
    .repeated();

assert_eq!(whitespace.parse("    "), Ok(vec![(); 4]));
assert_eq!(whitespace.parse("  hello"), Ok(vec![(); 2]));

Collect the output of this parser into a collection.

This is commonly useful for collecting Vec<char> outputs into Strings.

Examples

use chumsky::prelude::*;

let word = filter::<_, _, Simple<char>>(|c: &char| c.is_alphabetic())
    .repeated()
    .collect::<String>();

assert_eq!(word.parse("hello"), Ok("hello".to_string()));

Parse one thing and then another thing, yielding a tuple of the two outputs.

Examples

use chumsky::prelude::*;

let word = filter::<_, _, Simple<char>>(|c: &char| c.is_alphabetic())
    .repeated_at_least(1)
    .collect::<String>();
let two_words = word.padded_by(just(' ')).then(word);

assert_eq!(two_words.parse("dog cat"), Ok(("dog".to_string(), "cat".to_string())));
assert!(two_words.parse("hedgehog").is_err());

Parse one thing and then another thing, attempting to chain the two outputs into a Vec.

Examples

use chumsky::prelude::*;

let int = just('-').or_not()
    .chain(filter::<_, _, Simple<char>>(|c: &char| c.is_ascii_digit() && *c != '0')
        .chain(filter::<_, _, Simple<char>>(|c: &char| c.is_ascii_digit()).repeated()))
    .or(just('0').map(|c| vec![c]))
    .padded_by(end())
    .collect::<String>()
    .map(|s| s.parse().unwrap());

assert_eq!(int.parse("0"), Ok(0));
assert_eq!(int.parse("415"), Ok(415));
assert_eq!(int.parse("-50"), Ok(-50));
assert!(int.parse("-0").is_err());
assert!(int.parse("05").is_err());

Flatten a nested collection.

Parse one thing and then another thing, yielding only the output of the latter.

Examples

use chumsky::prelude::*;

let zeroes = filter::<_, _, Simple<char>>(|c: &char| *c == '0').ignored().repeated();
let digits = filter(|c: &char| c.is_ascii_digit()).repeated();
let integer = zeroes
    .padding_for(digits)
    .collect::<String>()
    .map(|s| s.parse().unwrap());

assert_eq!(integer.parse("00064"), Ok(64));
assert_eq!(integer.parse("32"), Ok(32));

Parse one thing and then another thing, yielding only the output of the former.

Examples

use chumsky::prelude::*;

let word = filter::<_, _, Simple<char>>(|c: &char| c.is_alphabetic())
    .repeated_at_least(1)
    .collect::<String>();

let punctuated = word
    .padded_by(just('!').or(just('?')).or_not());

let sentence = punctuated
    .padded() // Allow for whitespace gaps
    .repeated();

assert_eq!(
    sentence.parse("hello! how are you?"),
    Ok(vec![
        "hello".to_string(),
        "how".to_string(),
        "are".to_string(),
        "you".to_string(),
    ]),
);

Parse the pattern surrounded by the given delimiters, performing error recovery where possible.

If parsing of the inner pattern is successful, the output is Some(_). If an error occurs, the output is None.

The delimiters are assumed to allow nesting, so error recovery will attempt to balance the delimiters where possible. A syntax error within the delimiters should not prevent correct parsing of tokens beyond the delimiters.

Examples

use chumsky::prelude::*;

// A LISP-style S-expression
#[derive(Debug, PartialEq)]
enum SExpr {
    Error,
    Ident(String),
    Num(u64),
    List(Vec<SExpr>),
}

let ident = filter::<_, _, Simple<char>>(|c: &char| c.is_alphabetic())
    .repeated_at_least(1)
    .collect::<String>();

let num = text::int()
    .collect::<String>()
    .map(|s| s.parse().unwrap());

let s_expr = recursive(|s_expr| s_expr
    .padded()
    .repeated()
    .delimited_by('(', ')')
    .map(|list| list.map_or(SExpr::Error, SExpr::List))
    .or(ident.map(SExpr::Ident))
    .or(num.map(SExpr::Num)));

// A valid input
assert_eq!(
    s_expr.parse_recovery("(add (mul 42 3) 15)"),
    (
        Some(SExpr::List(vec![
            SExpr::Ident("add".to_string()),
            SExpr::List(vec![
                SExpr::Ident("mul".to_string()),
                SExpr::Num(42),
                SExpr::Num(3),
            ]),
            SExpr::Num(15),
        ])),
        Vec::new(), // No errors!
    ),
);

// An input with a syntax error at position 11! Thankfully, we're able to recover
// and still produce a useful output for later compilation stages (i.e: type-checking).
assert_eq!(
    s_expr.parse_recovery("(add (mul ! 3) 15)"),
    (
        Some(SExpr::List(vec![
            SExpr::Ident("add".to_string()),
            SExpr::Error,
            SExpr::Num(15),
        ])),
        vec![Simple::expected_token_found(Some(10..11), vec!['(', '0', ')'], Some('!'))], // A syntax error!
    ),
);

Parse one thing or, on failure, another thing.

Examples

use chumsky::prelude::*;

let op = just::<_, Simple<char>>('+')
    .or(just('-'))
    .or(just('*'))
    .or(just('/'));

assert_eq!(op.parse("+"), Ok('+'));
assert_eq!(op.parse("/"), Ok('/'));
assert!(op.parse("!").is_err());

Attempt to parse something, but only if it exists.

If parsing of the pattern is successful, the output is Some(_). Otherwise, the output is None.

Examples

use chumsky::prelude::*;

let word = filter::<_, _, Simple<char>>(|c: &char| c.is_alphabetic())
    .repeated_at_least(1)
    .collect::<String>();

let word_or_question = word
    .then(just('?').or_not());

assert_eq!(word_or_question.parse("hello?"), Ok(("hello".to_string(), Some('?'))));
assert_eq!(word_or_question.parse("wednesday"), Ok(("wednesday".to_string(), None)));

Parse an expression any number of times (including zero times).

Input is eagerly parsed. Be aware that the parser will accept no occurences of the pattern too. Consider using Parser::repeated_at_least instead if it better suits your use-case.

Examples

use chumsky::prelude::*;

let num = filter::<_, _, Simple<char>>(|c: &char| c.is_ascii_digit())
    .repeated_at_least(1)
    .collect::<String>()
    .map(|s| s.parse().unwrap());

let sum = num.then(just('+').padding_for(num).repeated())
    .foldl(|a, b| a + b);

assert_eq!(sum.parse("2+13+4+0+5"), Ok(24));

Parse an expression at least a given number of times.

Input is eagerly parsed. If n is 0, this function is equivalent to Parser::repeated

Examples

use chumsky::prelude::*;

let long_word = filter::<_, _, Simple<char>>(|c: &char| c.is_alphabetic())
    .repeated_at_least(5)
    .collect::<String>();

assert_eq!(long_word.parse("hello"), Ok("hello".to_string()));
assert!(long_word.parse("hi").is_err());

Box the parser, yielding a parser that performs parsing through dynamic dispatch.

Boxing a parser might be useful for:

  • Passing a parser over an FFI boundary

  • Getting around compiler implementation problems with long types such as this.

  • Places where you need to name the type of a parser

Boxing a parser is loosely equivalent to boxing other combinators, such as Iterator.

Implementors