Trait chumsky::Parser

pub trait Parser<I: Clone, O> {
    type Error: Error<I>;
fn parse_recovery<'a, Iter: Iterator<Item = (I, <Self::Error as Error<I>>::Span)> + 'a, S: Into<Stream<'a, I, <Self::Error as Error<I>>::Span, Iter>>>(
        &self,
        stream: S
    ) -> (Option<O>, Vec<Self::Error>)
    where
        Self: Sized
, { ... }
fn parse_recovery_verbose<'a, Iter: Iterator<Item = (I, <Self::Error as Error<I>>::Span)> + 'a, S: Into<Stream<'a, I, <Self::Error as Error<I>>::Span, Iter>>>(
        &self,
        stream: S
    ) -> (Option<O>, Vec<Self::Error>)
    where
        Self: Sized
, { ... }
fn parse<'a, Iter: Iterator<Item = (I, <Self::Error as Error<I>>::Span)> + 'a, S: Into<Stream<'a, I, <Self::Error as Error<I>>::Span, Iter>>>(
        &self,
        stream: S
    ) -> Result<O, Vec<Self::Error>>
    where
        Self: Sized
, { ... }
fn debug<T: Display + 'static>(self, x: T) -> Debug<Self>
    where
        Self: Sized
, { ... }
fn map<U, F: Fn(O) -> U>(self, f: F) -> Map<Self, F, O>
    where
        Self: Sized
, { ... }
fn map_with_span<U, F: Fn(O, <Self::Error as Error<I>>::Span) -> U>(
        self,
        f: F
    ) -> MapWithSpan<Self, F, O>
    where
        Self: Sized
, { ... }
fn map_err<F: Fn(Self::Error) -> Self::Error>(self, f: F) -> MapErr<Self, F>
    where
        Self: Sized
, { ... }
fn try_map<U, F: Fn(O, <Self::Error as Error<I>>::Span) -> Result<U, Self::Error>>(
        self,
        f: F
    ) -> TryMap<Self, F, O>
    where
        Self: Sized
, { ... }
fn validate<F: Fn(O, <Self::Error as Error<I>>::Span, &mut dyn FnMut(Self::Error)) -> O>(
        self,
        f: F
    ) -> Validate<Self, F>
    where
        Self: Sized
, { ... }
fn labelled<L: Into<<Self::Error as Error<I>>::Label> + Clone>(
        self,
        label: L
    ) -> Label<Self, L>
    where
        Self: Sized
, { ... }
fn to<U: Clone>(self, x: U) -> To<Self, O, U>
    where
        Self: Sized
, { ... }
fn foldl<A, B, F: Fn(A, B::Item) -> A>(self, f: F) -> Foldl<Self, F, A, B>
    where
        Self: Parser<I, (A, B)> + Sized,
        B: IntoIterator
, { ... }
fn foldr<'a, A, B, F: Fn(A::Item, B) -> B + 'a>(
        self,
        f: F
    ) -> Foldr<Self, F, A, B>
    where
        Self: Parser<I, (A, B)> + Sized,
        A: IntoIterator,
        A::IntoIter: DoubleEndedIterator
, { ... }
fn ignored(self) -> Ignored<Self, O>
    where
        Self: Sized
, { ... }
fn collect<C: FromIterator<O::Item>>(self) -> Map<Self, fn(_: O) -> C, O>
    where
        Self: Sized,
        O: IntoIterator
, { ... }
fn then<U, P: Parser<I, U>>(self, other: P) -> Then<Self, P>
    where
        Self: Sized
, { ... }
fn chain<T, U, P: Parser<I, U, Error = Self::Error>>(
        self,
        other: P
    ) -> Map<Then<Self, P>, fn(_: (O, U)) -> Vec<T>, (O, U)>
    where
        Self: Sized,
        U: Chain<T>,
        O: Chain<T>
, { ... }
fn flatten<T, Inner>(self) -> Map<Self, fn(_: O) -> Vec<T>, O>
    where
        Self: Sized,
        O: IntoIterator<Item = Inner>,
        Inner: IntoIterator<Item = T>
, { ... }
fn ignore_then<U, P: Parser<I, U>>(
        self,
        other: P
    ) -> IgnoreThen<Self, P, O, U>
    where
        Self: Sized
, { ... }
fn then_ignore<U, P: Parser<I, U>>(
        self,
        other: P
    ) -> ThenIgnore<Self, P, O, U>
    where
        Self: Sized
, { ... }
fn padded_by<U, P: Parser<I, U, Error = Self::Error> + Clone>(
        self,
        other: P
    ) -> ThenIgnore<IgnoreThen<P, Self, U, O>, P, O, U>
    where
        Self: Sized
, { ... }
fn delimited_by(self, start: I, end: I) -> DelimitedBy<Self, I>
    where
        Self: Sized
, { ... }
fn or<P: Parser<I, O>>(self, other: P) -> Or<Self, P>
    where
        Self: Sized
, { ... }
fn recover_with<S: Strategy<I, O, Self::Error>>(
        self,
        strategy: S
    ) -> Recovery<Self, S>
    where
        Self: Sized
, { ... }
fn or_not(self) -> OrNot<Self>
    where
        Self: Sized
, { ... }
fn repeated(self) -> Repeated<Self>
    where
        Self: Sized
, { ... }
fn separated_by<U, P: Parser<I, U>>(
        self,
        other: P
    ) -> SeparatedBy<Self, P, U>
    where
        Self: Sized
, { ... }
fn boxed<'a>(self) -> BoxedParser<'a, I, O, Self::Error>
    where
        Self: Sized + 'a
, { ... }
}

A trait implemented by parsers.

Parsers take a stream of tokens of type I and attempt to parse them into a value of type O. In doing so, they may encounter errors. These need not be fatal to the parsing process: syntactic errors can be recovered from and a valid output may still be generated alongside any syntax errors that were encountered along the way. Usually, this output comes in the form of an Abstract Syntax Tree (AST).

You should not need to implement this trait by hand. If you cannot combine existing combinators (in particular, custom) to create the combinator you’re looking for, please open an issue! If you really need to implement this trait, please check the documentation in the source: some implementation details have been deliberately hidden.

Associated Types

The type of errors emitted by this parser.

Provided methods

Parse a stream of tokens, yielding an output if possible, and any errors encountered along the way.

If you don’t care about producing an output if errors are encountered, use Parser::parse instead.

Although the signature of this function looks complicated, it’s simpler than you think! You can pass a &[I], a &str, or a Stream to it.

Parse a stream of tokens, yielding an output if possible, and any errors encountered along the way. Unlike Parser::parse_recovery, this function will produce verbose debugging output as it executes.

If you don’t care about producing an output if errors are encountered, use Parser::parse instead.

Although the signature of this function looks complicated, it’s simpler than you think! You can pass a &[I], a &str, or a Stream to it.

You’ll probably want to make sure that this doesn’t end up in production code: it exists only to help you debug your parser. Additionally, its API is quite likely to change in future versions.

Parse a stream of tokens, yielding an output or any errors that were encountered along the way.

If you wish to attempt to produce an output even if errors are encountered, use Parser::parse_recovery.

Although the signature of this function looks complicated, it’s simpler than you think! You can pass a &[I], a &str, or a Stream to it.
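
The difference between the lenient result of Parser::parse_recovery and the strict result of Parser::parse can be modelled with the standard library alone. The sketch below is an assumption about how the two shapes relate, not the crate's code, and `strict_from_recovery` is a hypothetical helper name.

```rust
/// Collapse a recovery-style result `(Option<O>, Vec<E>)` into a strict
/// `Result<O, Vec<E>>`: any recorded error fails the whole parse, even if
/// an output was recovered.
/// Hypothetical helper for illustration; not part of chumsky.
fn strict_from_recovery<O, E>(recovered: (Option<O>, Vec<E>)) -> Result<O, Vec<E>> {
    let (output, errors) = recovered;
    if errors.is_empty() {
        // No errors were recorded, so an output is expected here.
        // A missing output still maps to `Err` (with an empty error list).
        output.ok_or(errors)
    } else {
        Err(errors)
    }
}
```

In this model a recovered output is discarded as soon as one error exists, which is why the lenient form is the one to use when partial results matter.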

Include this parser in the debugging output produced by Parser::parse_recovery_verbose.

You’ll probably want to make sure that this doesn’t end up in production code: it exists only to help you debug your parser. Additionally, its API is quite likely to change in future versions. Use this parser like a print statement, to display whatever you pass as the argument ‘x’.

Map the output of this parser to another value.

Examples
#[derive(Debug, PartialEq)]
enum Token { Word(String), Num(u64) }

let word = filter::<_, _, Cheap<char>>(|c: &char| c.is_alphabetic())
    .repeated().at_least(1)
    .collect::<String>()
    .map(Token::Word);

let num = filter::<_, _, Cheap<char>>(|c: &char| c.is_ascii_digit())
    .repeated().at_least(1)
    .collect::<String>()
    .map(|s| Token::Num(s.parse().unwrap()));

let token = word.or(num);

assert_eq!(token.parse("test"), Ok(Token::Word("test".to_string())));
assert_eq!(token.parse("42"), Ok(Token::Num(42)));

Map the output of this parser to another value, making use of the pattern’s overall span.

This is very useful when generating an AST that attaches a span to each AST node.
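
To make the idea of span-carrying AST nodes concrete, here is a small standard-library sketch that does not use chumsky. `Spanned` and `spanned_words` are hypothetical names, and the spans are byte ranges, which is an assumption of this sketch rather than a statement about the crate's span type.

```rust
use std::ops::Range;

/// A node paired with the input range it was read from.
#[derive(Debug, PartialEq)]
struct Spanned<T> {
    node: T,
    span: Range<usize>,
}

/// Split the input into alphabetic words, attaching to each word the byte
/// range it covers in the input.
fn spanned_words(input: &str) -> Vec<Spanned<String>> {
    let mut out = Vec::new();
    let mut start = None;
    for (i, c) in input.char_indices() {
        match (c.is_alphabetic(), start) {
            // A word begins here.
            (true, None) => start = Some(i),
            // A word just ended: record it together with its span.
            (false, Some(s)) => {
                out.push(Spanned { node: input[s..i].to_string(), span: s..i });
                start = None;
            }
            _ => {}
        }
    }
    // Flush a word that runs to the end of the input.
    if let Some(s) = start {
        out.push(Spanned { node: input[s..].to_string(), span: s..input.len() });
    }
    out
}
```

Keeping the span next to the node lets later stages, such as type checking, point error messages at the right part of the source.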

Map the primary error of this parser to another value.

This function is most useful when using a custom error type, allowing you to augment errors according to context.
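
As an illustration of augmenting errors according to context, the following standard-library sketch tags an error with the construct that was being parsed when it arose. `ParseError` and `in_context` are hypothetical names invented for this sketch; it does not use chumsky's error types.

```rust
/// A hypothetical custom error type that records where in the grammar it arose.
#[derive(Debug, PartialEq)]
struct ParseError {
    message: String,
    context: Vec<&'static str>,
}

/// Tag any error in `result` with the construct being parsed, leaving
/// successful results untouched.
fn in_context<T>(
    result: Result<T, ParseError>,
    construct: &'static str,
) -> Result<T, ParseError> {
    result.map_err(|mut e| {
        e.context.push(construct);
        e
    })
}
```

Applied at each level of a grammar, a helper like this builds up a trail such as "expression, function body" that a diagnostic can print.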

After a successful parse, apply a fallible function to the output. If the function produces an error, treat it as a parsing error.

Examples
let byte = text::int::<_, Simple<char>>(10)
    .try_map(|s, span| s
        .parse::<u8>()
        .map_err(|e| Simple::custom(span, format!("{}", e))));

assert!(byte.parse("255").is_ok());
assert!(byte.parse("256").is_err()); // Out of range

Validate an output, producing non-terminal errors if it does not fulfil certain criteria.

Examples
let large_int = text::int::<char, _>(10)
    .map(|s| s.parse().unwrap())
    .validate(|x: u32, span, emit| {
        if x < 256 { emit(Simple::custom(span, format!("{} must be 256 or higher.", x))) }
        x
    });

assert_eq!(large_int.parse("537"), Ok(537));
assert!(large_int.parse("243").is_err());
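
The closure shape accepted by validate (an output, its span, and an emit callback, returning an output) can be exercised on its own. This standard-library sketch assumes a `Range<usize>` span and uses the hypothetical helpers `run_validator` and `check_large`; it models the calling convention only, not the crate's implementation.

```rust
use std::ops::Range;

/// Run a validator closure: it receives the output, its span, and an `emit`
/// callback, and must still return an output.
/// Hypothetical stand-in for illustration; not chumsky's implementation.
fn run_validator<O, E, F>(output: O, span: Range<usize>, f: F) -> (O, Vec<E>)
where
    F: Fn(O, Range<usize>, &mut dyn FnMut(E)) -> O,
{
    let mut errors = Vec::new();
    // Each call to `emit` records an error without aborting the computation.
    let output = f(output, span, &mut |e: E| errors.push(e));
    (output, errors)
}

/// Flag values below 256 while passing the value through unchanged.
fn check_large(x: u32, span: Range<usize>, emit: &mut dyn FnMut(String)) -> u32 {
    if x < 256 {
        emit(format!("{} at {:?} must be 256 or higher", x, span));
    }
    x
}
```

Because the validator always returns a value, the caller receives both an output and the list of emitted errors, which is what makes such errors non-terminal.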

Label the pattern parsed by this parser for more useful error messages.

This is useful when you want to give users a more useful description of an expected pattern than simply a list of possible initial tokens. For example, it’s common to use the term “expression” as a catch-all for a number of different constructs in many languages.

This does not label recovered errors generated by sub-patterns within the parser, only errors directly emitted by the parser.

This does not label errors where the labelled pattern consumed input (i.e., in unambiguous cases).

Examples
let frac = text::digits(10)
    .chain(just('.'))
    .chain::<char, _, _>(text::digits(10))
    .collect::<String>()
    .then_ignore(end())
    .labelled("number");

assert_eq!(frac.parse("42.3"), Ok("42.3".to_string()));
assert_eq!(frac.parse("hello"), Err(vec![Cheap::expected_input_found(0..1, None, Some('h')).with_label("number")]));
assert_eq!(frac.parse("42!"), Err(vec![Cheap::expected_input_found(2..3, Some('.'), Some('!')).with_label("number")]));

Transform all outputs of this parser to a predetermined value.

Examples
#[derive(Clone, Debug, PartialEq)]
enum Op { Add, Sub, Mul, Div }

let op = just::<_, Cheap<char>>('+').to(Op::Add)
    .or(just('-').to(Op::Sub))
    .or(just('*').to(Op::Mul))
    .or(just('/').to(Op::Div));

assert_eq!(op.parse("+"), Ok(Op::Add));
assert_eq!(op.parse("/"), Ok(Op::Div));

Left-fold the output of the parser into a single value.

The output of the original parser must be of type (A, impl IntoIterator<Item = B>).

Examples
let int = text::int::<char, Cheap<char>>(10)
    .map(|s| s.parse().unwrap());

let sum = int
    .then(just('+').ignore_then(int).repeated())
    .foldl(|a, b| a + b);

assert_eq!(sum.parse("1+12+3+9"), Ok(25));
assert_eq!(sum.parse("6"), Ok(6));
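
The required output shape, (A, impl IntoIterator<Item = B>), and the reduction applied to it can be sketched with the standard library alone. `foldl_output` is a hypothetical helper that models what happens to the parser's output; it is not the crate's implementation.

```rust
/// Model of the shape `foldl` expects: an initial value paired with an
/// iterable of further items, reduced from the left.
/// Hypothetical helper for illustration; not chumsky's implementation.
fn foldl_output<A, B, F>(output: (A, B), f: F) -> A
where
    B: IntoIterator,
    F: Fn(A, B::Item) -> A,
{
    let (init, rest) = output;
    // Combine the accumulator with each item in order, left to right.
    rest.into_iter().fold(init, f)
}
```

With subtraction as the folding function, `(10, vec![1, 2, 3])` reduces as ((10 - 1) - 2) - 3, which shows the left-associativity.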

Right-fold the output of the parser into a single value.

The output of the original parser must be of type (impl IntoIterator<Item = A>, B). Because right-folds work backwards, the iterator must implement DoubleEndedIterator so that it can be reversed.

Examples
let int = text::int::<char, Cheap<char>>(10)
    .map(|s| s.parse().unwrap());

let signed = just('+').to(1)
    .or(just('-').to(-1))
    .repeated()
    .then(int)
    .foldr(|a, b| a * b);

assert_eq!(signed.parse("3"), Ok(3));
assert_eq!(signed.parse("-17"), Ok(-17));
assert_eq!(signed.parse("--+-+-5"), Ok(5));
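
The mirrored shape, (impl IntoIterator<Item = A>, B), and the reason for the DoubleEndedIterator bound can also be modelled with the standard library. `foldr_output` is a hypothetical helper for this sketch, not the crate's implementation.

```rust
/// Model of the shape `foldr` expects: an iterable of items followed by a
/// final value, reduced from the right.
/// Hypothetical helper for illustration; not chumsky's implementation.
fn foldr_output<A, B, F>(output: (A, B), f: F) -> B
where
    A: IntoIterator,
    A::IntoIter: DoubleEndedIterator,
    F: Fn(A::Item, B) -> B,
{
    let (items, last) = output;
    // Walk the items back to front so the rightmost item is combined first.
    items.into_iter().rev().fold(last, |acc, item| f(item, acc))
}
```

With subtraction as the folding function, `(vec![1, 2, 3], 0)` reduces as 1 - (2 - (3 - 0)), which shows the right-associativity.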

Ignore the output of this parser, yielding () as an output instead.

This can be used to reduce the cost of parsing by avoiding unnecessary allocations (most collections containing ZSTs do not allocate). For example, it’s common to want to ignore whitespace in many grammars (see text::whitespace).

Examples
// A parser that parses any number of whitespace characters without allocating
let whitespace = filter::<_, _, Cheap<char>>(|c: &char| c.is_whitespace())
    .ignored()
    .repeated();

assert_eq!(whitespace.parse("    "), Ok(vec![(); 4]));
assert_eq!(whitespace.parse("  hello"), Ok(vec![(); 2]));

Collect the output of this parser into a type implementing FromIterator.

This is commonly useful for collecting Vec<char> outputs into Strings, or [(T, U)] into a HashMap, and is analogous to Iterator::collect.

Examples
let word = filter::<_, _, Cheap<char>>(|c: &char| c.is_alphabetic()) // This parser produces an output of `char`
    .repeated() // This parser produces an output of `Vec<char>`
    .collect::<String>(); // But `Vec<char>` is less useful than `String`, so convert to the latter

assert_eq!(word.parse("hello"), Ok("hello".to_string()));
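
The HashMap case mentioned above follows the same pattern. This standard-library sketch models the conversion applied to a parser's output; `collect_output` and `pairs_to_map` are hypothetical helpers, and the `(String, u32)` pair type is chosen only for illustration.

```rust
use std::collections::HashMap;
use std::iter::FromIterator;

/// Model of `collect`: turn any iterable output into a chosen container.
/// Hypothetical helper for illustration; not chumsky's implementation.
fn collect_output<O, C>(output: O) -> C
where
    O: IntoIterator,
    C: FromIterator<O::Item>,
{
    output.into_iter().collect()
}

/// Collect key-value pairs, such as a parser with output
/// `Vec<(String, u32)>` might produce, into a map.
fn pairs_to_map(pairs: Vec<(String, u32)>) -> HashMap<String, u32> {
    collect_output(pairs)
}
```

As with Iterator::collect, the target container is selected by the type requested at the call site.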

Parse one thing and then another thing, yielding a tuple of the two outputs.

Examples
let word = filter::<_, _, Cheap<char>>(|c: &char| c.is_alphabetic())
    .repeated().at_least(1)
    .collect::<String>();
let two_words = word.then_ignore(just(' ')).then(word);

assert_eq!(two_words.parse("dog cat"), Ok(("dog".to_string(), "cat".to_string())));
assert!(two_words.parse("hedgehog").is_err());

Parse one thing and then another thing, attempting to chain the two outputs into a Vec.

Examples
let int = just('-').or_not()
    .chain(filter::<_, _, Cheap<char>>(|c: &char| c.is_ascii_digit() && *c != '0')
        .chain(filter::<_, _, Cheap<char>>(|c: &char| c.is_ascii_digit()).repeated()))
    .or(just('0').map(|c| vec![c]))
    .then_ignore(end())
    .collect::<String>()
    .map(|s| s.parse().unwrap());

assert_eq!(int.parse("0"), Ok(0));
assert_eq!(int.parse("415"), Ok(415));
assert_eq!(int.parse("-50"), Ok(-50));
assert!(int.parse("-0").is_err());
assert!(int.parse("05").is_err());

Flatten a nested collection.

The use-cases of this method are broadly similar to those of Iterator::flatten.
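
Since no example accompanies this method, here is a standard-library sketch of the transformation its signature describes: a collection of collections becomes a single Vec. `flatten_output` is a hypothetical helper that mirrors the trait bounds of flatten; it is not the crate's implementation.

```rust
/// Model of `flatten`: concatenate a collection of collections into one `Vec`.
/// Hypothetical helper for illustration; not chumsky's implementation.
fn flatten_output<O, Inner, T>(output: O) -> Vec<T>
where
    O: IntoIterator<Item = Inner>,
    Inner: IntoIterator<Item = T>,
{
    output.into_iter().flatten().collect()
}
```

Because Option also implements IntoIterator, the same bounds cover a Vec<Option<T>>, where flattening drops the None entries.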

Parse one thing and then another thing, yielding only the output of the latter.

Examples
let zeroes = filter::<_, _, Cheap<char>>(|c: &char| *c == '0').ignored().repeated();
let digits = filter(|c: &char| c.is_ascii_digit()).repeated();
let integer = zeroes
    .ignore_then(digits)
    .collect::<String>()
    .map(|s| s.parse().unwrap());

assert_eq!(integer.parse("00064"), Ok(64));
assert_eq!(integer.parse("32"), Ok(32));

Parse one thing and then another thing, yielding only the output of the former.

Examples
let word = filter::<_, _, Cheap<char>>(|c: &char| c.is_alphabetic())
    .repeated().at_least(1)
    .collect::<String>();

let punctuated = word
    .then_ignore(just('!').or(just('?')).or_not());

let sentence = punctuated
    .padded() // Allow for whitespace gaps
    .repeated();

assert_eq!(
    sentence.parse("hello! how are you?"),
    Ok(vec![
        "hello".to_string(),
        "how".to_string(),
        "are".to_string(),
        "you".to_string(),
    ]),
);

Parse a pattern, but with an instance of another pattern on either end, yielding the output of the inner.

Examples
let ident = text::ident::<_, Simple<char>>()
    .padded_by(just('!'));

assert_eq!(ident.parse("!hello!"), Ok("hello".to_string()));
assert!(ident.parse("hello!").is_err());
assert!(ident.parse("!hello").is_err());
assert!(ident.parse("hello").is_err());

Parse the pattern surrounded by the given delimiters.

Examples
// A LISP-style S-expression
#[derive(Debug, PartialEq)]
enum SExpr {
    Ident(String),
    Num(u64),
    List(Vec<SExpr>),
}

let ident = filter::<_, _, Cheap<char>>(|c: &char| c.is_alphabetic())
    .repeated().at_least(1)
    .collect::<String>();

let num = text::int(10)
    .map(|s: String| s.parse().unwrap());

let s_expr = recursive(|s_expr| s_expr
    .padded()
    .repeated()
    .map(SExpr::List)
    .delimited_by('(', ')')
    .or(ident.map(SExpr::Ident))
    .or(num.map(SExpr::Num)));

// A valid input
assert_eq!(
    s_expr.parse_recovery("(add (mul 42 3) 15)"),
    (
        Some(SExpr::List(vec![
            SExpr::Ident("add".to_string()),
            SExpr::List(vec![
                SExpr::Ident("mul".to_string()),
                SExpr::Num(42),
                SExpr::Num(3),
            ]),
            SExpr::Num(15),
        ])),
        Vec::new(), // No errors!
    ),
);

Parse one thing or, on failure, another thing.

If the first parser produces an error, even if recovered, both parsers will be tried and the ‘most correct’ result of either will be produced. ‘Most correct’ is not a well-defined term and its meaning may change in future versions. For now, it means that non-terminal errors are preferred over terminal errors and the number of errors is minimised, where possible. Failing this, the parser will look at which parser made the most progress through the input and choose which result to use based on that. The fact that this behaviour is not rigorously defined is not a problem: by its nature, it only occurs when the parser encounters an error and so can never result in valid syntax failing to be parsed.

Examples
let op = just::<_, Cheap<char>>('+')
    .or(just('-'))
    .or(just('*'))
    .or(just('/'));

assert_eq!(op.parse("+"), Ok('+'));
assert_eq!(op.parse("/"), Ok('/'));
assert!(op.parse("!").is_err());

Apply a fallback recovery strategy to this parser should it fail.

There is no silver bullet for error recovery, so this function allows you to specify one of several different strategies at the location of your choice.

Note that for implementation reasons, adding an error recovery strategy can cause a parser to ‘over-commit’, missing potentially valid alternative parse routes (TODO: document this and explain why and when it happens). Rest assured that this case is generally quite rare and only happens for very loose, almost-ambiguous syntax. If you run into cases that you believe should parse but do not, try removing or moving recovery strategies to fix the problem.

Examples
#[derive(Debug, PartialEq)]
enum Expr {
    Error,
    Int(String),
    List(Vec<Expr>),
}

let expr = recursive::<_, _, _, _, Simple<char>>(|expr| expr
    .separated_by(just(','))
    .delimited_by('[', ']')
    .map(Expr::List)
    // If parsing a list expression fails, recover at the next delimiter, generating an error AST node
    .recover_with(nested_delimiters('[', ']', [], |_| Expr::Error))
    .or(text::int(10).map(Expr::Int))
    .padded());

assert!(expr.parse("five").is_err()); // Text is not a valid expression in this language...
assert!(expr.parse("[1, 2, 3]").is_ok()); // ...but lists and numbers are!

// This input has two syntax errors...
let (ast, errors) = expr.parse_recovery("[[1, two], [3, four]]");
// ...and error recovery allows us to catch both of them!
assert_eq!(errors.len(), 2);
// Additionally, the AST we get back still has useful information.
assert_eq!(ast, Some(Expr::List(vec![Expr::Error, Expr::Error])));

Attempt to parse something, but only if it exists.

If parsing of the pattern is successful, the output is Some(_). Otherwise, the output is None.

Examples
let word = filter::<_, _, Cheap<char>>(|c: &char| c.is_alphabetic())
    .repeated().at_least(1)
    .collect::<String>();

let word_or_question = word
    .then(just('?').or_not());

assert_eq!(word_or_question.parse("hello?"), Ok(("hello".to_string(), Some('?'))));
assert_eq!(word_or_question.parse("wednesday"), Ok(("wednesday".to_string(), None)));

Parse an expression any number of times (including zero times).

Input is eagerly parsed. Be aware that the parser also accepts zero occurrences of the pattern. Consider using Repeated::at_least instead if it better suits your use-case.

Examples
let num = filter::<_, _, Cheap<char>>(|c: &char| c.is_ascii_digit())
    .repeated().at_least(1)
    .collect::<String>()
    .map(|s| s.parse().unwrap());

let sum = num.then(just('+').ignore_then(num).repeated())
    .foldl(|a, b| a + b);

assert_eq!(sum.parse("2+13+4+0+5"), Ok(24));

Parse an expression, separated by another, any number of times.

You can use SeparatedBy::allow_leading or SeparatedBy::allow_trailing to allow leading or trailing separators.

Examples
let shopping = text::ident::<_, Simple<char>>()
    .padded()
    .separated_by(just(','));

assert_eq!(shopping.parse("eggs"), Ok(vec!["eggs".to_string()]));
assert_eq!(shopping.parse("eggs, flour, milk"), Ok(vec!["eggs".to_string(), "flour".to_string(), "milk".to_string()]));

See SeparatedBy::allow_leading and SeparatedBy::allow_trailing for more examples.

Box the parser, yielding a parser that performs parsing through dynamic dispatch.

Boxing a parser might be useful for:

  • Dynamically building up parsers at runtime

  • Improving compilation times (Rust can struggle to compile code containing very long types)

  • Passing a parser over an FFI boundary

  • Getting around compiler implementation problems with long types such as this.

  • Places where you need to name the type of a parser

Boxing a parser is broadly equivalent to boxing other combinators, such as Iterator.
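
The trade-offs listed above come from type erasure, which can be demonstrated without chumsky. In this toy standard-library model a "parser" is just a boxed closure that reports how many bytes it accepts; `ToyParser`, `keyword`, `digits`, and `first_match` are hypothetical names and say nothing about how BoxedParser is implemented.

```rust
/// A toy parser: reports how many bytes of the input it accepts, if any.
/// Hypothetical stand-in for a boxed parser type; not chumsky's `BoxedParser`.
type ToyParser<'a> = Box<dyn Fn(&str) -> Option<usize> + 'a>;

/// Build a parser that accepts exactly `word` at the start of the input.
fn keyword<'a>(word: &'a str) -> ToyParser<'a> {
    Box::new(move |input: &str| {
        if input.starts_with(word) { Some(word.len()) } else { None }
    })
}

/// Build a parser that accepts a non-empty run of ASCII digits.
fn digits() -> ToyParser<'static> {
    Box::new(|input: &str| {
        let len = input.bytes().take_while(|b| b.is_ascii_digit()).count();
        if len > 0 { Some(len) } else { None }
    })
}

/// Try alternatives assembled at runtime. Boxing gives differently-shaped
/// closures one nameable type, so they can share a slice or `Vec`.
fn first_match(parsers: &[ToyParser<'_>], input: &str) -> Option<usize> {
    parsers.iter().find_map(|p| p(input))
}
```

Each call goes through a vtable rather than being inlined, which is the cost paid for a short, nameable type.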
