Trait Parser

Source

pub trait Parser<I, S = ()> {
    type O;

    // Required method
    fn parse(self, input: I, state: &mut S) -> Option<(Self::O, I)>;
}

Expand description

A parser takes input and mutable state, and maybe yields an output and remaining input.

§History

This trait, and especially its function Parser::parse shapes everything that goes on in this crate. It is inspired to a great extent by a rhyme:

A Parser for Things
is a function from Strings
to Lists of Pairs
of Things and Strings!
(Fritz Ruehr, “Dr. Seuss on Parser Monads”)

Ruehr uses the following type signature for parsers in Haskell:

type Parser a = String -> [(a, String)]

The function signature in this crate diverges from that in several aspects:

It generalises the type of input to be I, whereas it is always String in the Haskell case.
It takes some mutable reference to a state S, which is passed to every sub-parser and can be very useful, for example to keep track of errors.
The largest change, however, is the output type, which is an Option instead of a list in Haskell.

The last change deserves some justification. I tried for some time to recreate Ruehr’s approach, using Iterator as output type, because this most closely resembles Haskell’s lazy lists. (The codename for this experiment was “parasite”, standing for “parsers as iterators”.) I imagined the ideal type signature to resemble this:

trait Parser<I> {
    type O;
    fn parse(&self, x: I) -> Box<dyn Iterator<Item = (Self::O, I)>>;
}

However, as I pursued with this approach, I found myself quickly in lifetime and typing hell. Everything was so hard that I stepped back and asked myself:

Is this too much Voodoo?
(Terry Davis, “Hardest Question in Programming”)

I then thought that for my purposes, I could actually do with a single output (instead of arbitrarily many), thus eliminating lifetime hell that stems from returning a dyn Iterator. And thus parcours was born.

§The Importance of Being Your`self`

You might ask why Parser::parse takes self instead of &self. After all, the former consumes a parser when calling it, whereas the latter allows calling a parser multiple times.

To answer your question, curious reader, let us suppose that the Parser trait was defined as follows:

pub trait Parser<I, S = ()> {
    type O;
    fn parse(&self, input: I, state: &mut S) -> Option<(Self::O, I)>;
}

This is basically the same as what we have in parcours, only that parse takes &self, not self.

The problem with this approach is that it effectively prohibits the construction of parsers that take FnOnce closures. That is because we cannot call an FnOnce if we have a reference to it:

fn call_fn_once(f: &impl FnOnce()) {
    f()
}

And that means that if we have only a reference to a parser, the parser cannot call any FnOnce closure that it might possess. Which is logical, if you consider that a function taking &self can be called arbitrarily often on some object, yet FnOnce means that it can be called, well, only once. This clashes.

So what’s the big deal about FnOnce? Can’t we just use Fn and take &self? Yes, we can. That’s actually the first thing I tried. But this has a double negative impact on performance. First, while it allows us to take a Box<dyn Parser> and call parse on it, I found in benchmarks that just putting a single Box around a (heavily used) parser increased runtime by about 10%. So just being able to box a parser might tempt people to do it, without realising the negative impact on performance. Furthermore, when we use Fn, we tend to have to move and clone more data. Consider the following example, which is a very inefficient implementation of core::iter::once:

pub fn stupid_once<T>(x: T) -> impl Iterator<Item = T> {
    core::iter::once(0).map(|_| x)
}

Here, the compiler errors lead us quickly to the following solution:

pub fn stupid_once<T: Clone>(x: T) -> impl Iterator<Item = T> {
    core::iter::once(0).map(move |_| x.clone())
}

This means that we have to clone x here, even if we know that it will be returned just once. That is because Iterator::map does not accept FnOnce. This pattern showed up a lot when using the alternative Parser trait where parse() takes &self. Aside from looking verbose, it is bad for performance.

On the other hand, take the following implementation:

pub fn nice_once<T>(x: T) -> Option<T> {
    Some(0).map(|_| x)
}

Here, because Option::map takes FnOnce, we neither have to move nor clone x.

All this means that once there was the decision for Parser::parse to return Option, there was a clear incentive to take self in Parser::parse in order to allow for FnOnce.

Required Associated Types§

Source

type O

Output of the parser.

Required Methods§

Source

fn parse(self, input: I, state: &mut S) -> Option<(Self::O, I)>

Parse a value of type Self::O.

Trait ParserCopy item path

§History

§The Importance of Being Yourself

Required Associated Types§

type O

Required Methods§

fn parse(self, input: I, state: &mut S) -> Option<(Self::O, I)>

Implementors§

impl<'a, S, F: FnMut(&u8, &mut S) -> bool> Parser<&'a str, S> for TakeWhile<F>

type O = &'a str

impl<'a, S, P: Parser<&'a str, S>> Parser<&'a str, S> for parcours::str::WithConsumed<P>

type O = (<P as Parser<&'a str, S>>::O, &'a str)

impl<'a, T, S, F: FnOnce(&T, &mut S) -> bool> Parser<&'a [T], S> for FirstFilter<F>

type O = &'a T

impl<'a, T, S, P: Parser<&'a [T], S>> Parser<&'a [T], S> for parcours::slice::WithConsumed<P>

type O = (<P as Parser<&'a [T], S>>::O, &'a [T])

impl<'a, T, U, S, F: FnOnce(&T, &mut S) -> Option<U>> Parser<&'a [T], S> for FirstFilterMap<F>

type O = U

impl<I, S, O, F: FnOnce(I, &mut S) -> Option<(O, I)>> Parser<I, S> for FromFn<F>

type O = O

impl<I, S, P1: Parser<I, S>> Parser<I, S> for All<(P1,)>

type O = (<P1 as Parser<I, S>>::O,)

impl<I, S, P1: Parser<I, S>, P2: Parser<I, S>> Parser<I, S> for All<(P1, P2)>

type O = (<P1 as Parser<I, S>>::O, <P2 as Parser<I, S>>::O)

impl<I, S, P1: Parser<I, S>, P2: Parser<I, S>, F: FnOnce(P1::O) -> P2> Parser<I, S> for AndThen<P1, F>

type O = <P2 as Parser<I, S>>::O

impl<I, S, P1: Parser<I, S>, P2: Parser<I, S>, P3: Parser<I, S>> Parser<I, S> for All<(P1, P2, P3)>

type O = (<P1 as Parser<I, S>>::O, <P2 as Parser<I, S>>::O, <P3 as Parser<I, S>>::O)

impl<I, S, P1: Parser<I, S>, P2: Parser<I, S>, P3: Parser<I, S>, P4: Parser<I, S>> Parser<I, S> for All<(P1, P2, P3, P4)>

type O = (<P1 as Parser<I, S>>::O, <P2 as Parser<I, S>>::O, <P3 as Parser<I, S>>::O, <P4 as Parser<I, S>>::O)

impl<I, S, P1: Parser<I, S>, P2: Parser<I, S>, P3: Parser<I, S>, P4: Parser<I, S>, P5: Parser<I, S>> Parser<I, S> for All<(P1, P2, P3, P4, P5)>

type O = (<P1 as Parser<I, S>>::O, <P2 as Parser<I, S>>::O, <P3 as Parser<I, S>>::O, <P4 as Parser<I, S>>::O, <P5 as Parser<I, S>>::O)

impl<I, S, P1: Parser<I, S>, P2: Parser<I, S>, P3: Parser<I, S>, P4: Parser<I, S>, P5: Parser<I, S>, P6: Parser<I, S>> Parser<I, S> for All<(P1, P2, P3, P4, P5, P6)>

type O = (<P1 as Parser<I, S>>::O, <P2 as Parser<I, S>>::O, <P3 as Parser<I, S>>::O, <P4 as Parser<I, S>>::O, <P5 as Parser<I, S>>::O, <P6 as Parser<I, S>>::O)

impl<I, S, P1: Parser<I, S>, P2: Parser<I, S>, P3: Parser<I, S>, P4: Parser<I, S>, P5: Parser<I, S>, P6: Parser<I, S>, P7: Parser<I, S>> Parser<I, S> for All<(P1, P2, P3, P4, P5, P6, P7)>

type O = (<P1 as Parser<I, S>>::O, <P2 as Parser<I, S>>::O, <P3 as Parser<I, S>>::O, <P4 as Parser<I, S>>::O, <P5 as Parser<I, S>>::O, <P6 as Parser<I, S>>::O, <P7 as Parser<I, S>>::O)

impl<I, S, P1: Parser<I, S>, P2: Parser<I, S>, P3: Parser<I, S>, P4: Parser<I, S>, P5: Parser<I, S>, P6: Parser<I, S>, P7: Parser<I, S>, P8: Parser<I, S>> Parser<I, S> for All<(P1, P2, P3, P4, P5, P6, P7, P8)>

type O = (<P1 as Parser<I, S>>::O, <P2 as Parser<I, S>>::O, <P3 as Parser<I, S>>::O, <P4 as Parser<I, S>>::O, <P5 as Parser<I, S>>::O, <P6 as Parser<I, S>>::O, <P7 as Parser<I, S>>::O, <P8 as Parser<I, S>>::O)

impl<I, S, P1: Parser<I, S>, P2: Parser<I, S>, P3: Parser<I, S>, P4: Parser<I, S>, P5: Parser<I, S>, P6: Parser<I, S>, P7: Parser<I, S>, P8: Parser<I, S>, P9: Parser<I, S>> Parser<I, S> for All<(P1, P2, P3, P4, P5, P6, P7, P8, P9)>

type O = (<P1 as Parser<I, S>>::O, <P2 as Parser<I, S>>::O, <P3 as Parser<I, S>>::O, <P4 as Parser<I, S>>::O, <P5 as Parser<I, S>>::O, <P6 as Parser<I, S>>::O, <P7 as Parser<I, S>>::O, <P8 as Parser<I, S>>::O, <P9 as Parser<I, S>>::O)

impl<I, S, P: Parser<I, S>, F: FnOnce(&P::O) -> bool> Parser<I, S> for Filter<P, F>

type O = <P as Parser<I, S>>::O

impl<I, S, P: Parser<I, S>, F: FnOnce() -> P> Parser<I, S> for Lazy<F>

type O = <P as Parser<I, S>>::O

impl<I, S, P: Parser<I, S>, O, F: FnOnce(P::O) -> Option<O>> Parser<I, S> for FilterMap<P, F>

type O = O

impl<I, S, P: Parser<I, S>, O, F: FnOnce(P::O) -> O> Parser<I, S> for Map<P, F>

type O = O

impl<I, S, P: Parser<I, S>, O, F: FnOnce(P::O, &mut S) -> O> Parser<I, S> for MapWith<P, F>

type O = O

type O = <Last as Parser<I, S>>::O

type O = <Last as Parser<I, S>>::O

type O = <Last as Parser<I, S>>::O

type O = <Last as Parser<I, S>>::O

type O = <Last as Parser<I, S>>::O

type O = <Last as Parser<I, S>>::O

type O = <Last as Parser<I, S>>::O

impl<I: Clone, S, L1, R1, F1, L2, R2, F2, Last: Parser<I, S>> Parser<I, S> for Decide<((L1, F1), (L2, F2)), Last>where L1: Parser<I, S>, R1: Parser<I, S, O = Last::O>, F1: FnOnce(L1::O) -> R1, L2: Parser<I, S>, R2: Parser<I, S, O = Last::O>, F2: FnOnce(L2::O) -> R2,

type O = <Last as Parser<I, S>>::O

impl<I: Clone, S, L1, R1, F1, Last: Parser<I, S>> Parser<I, S> for Decide<((L1, F1),), Last>where L1: Parser<I, S>, R1: Parser<I, S, O = Last::O>, F1: FnOnce(L1::O) -> R1,

type O = <Last as Parser<I, S>>::O

impl<I: Clone, S, O, P1: Parser<I, S, O = O>> Parser<I, S> for Any<(P1,)>

type O = O

impl<I: Clone, S, O, P1: Parser<I, S, O = O>, P2: Parser<I, S, O = O>> Parser<I, S> for Any<(P1, P2)>

type O = O

impl<I: Clone, S, O, P1: Parser<I, S, O = O>, P2: Parser<I, S, O = O>, P3: Parser<I, S, O = O>> Parser<I, S> for Any<(P1, P2, P3)>

type O = O

impl<I: Clone, S, O, P1: Parser<I, S, O = O>, P2: Parser<I, S, O = O>, P3: Parser<I, S, O = O>, P4: Parser<I, S, O = O>> Parser<I, S> for Any<(P1, P2, P3, P4)>

type O = O

impl<I: Clone, S, O, P1: Parser<I, S, O = O>, P2: Parser<I, S, O = O>, P3: Parser<I, S, O = O>, P4: Parser<I, S, O = O>, P5: Parser<I, S, O = O>> Parser<I, S> for Any<(P1, P2, P3, P4, P5)>

type O = O

impl<I: Clone, S, O, P1: Parser<I, S, O = O>, P2: Parser<I, S, O = O>, P3: Parser<I, S, O = O>, P4: Parser<I, S, O = O>, P5: Parser<I, S, O = O>, P6: Parser<I, S, O = O>> Parser<I, S> for Any<(P1, P2, P3, P4, P5, P6)>

type O = O

impl<I: Clone, S, O, P1: Parser<I, S, O = O>, P2: Parser<I, S, O = O>, P3: Parser<I, S, O = O>, P4: Parser<I, S, O = O>, P5: Parser<I, S, O = O>, P6: Parser<I, S, O = O>, P7: Parser<I, S, O = O>> Parser<I, S> for Any<(P1, P2, P3, P4, P5, P6, P7)>

type O = O

impl<I: Clone, S, O, P1: Parser<I, S, O = O>, P2: Parser<I, S, O = O>, P3: Parser<I, S, O = O>, P4: Parser<I, S, O = O>, P5: Parser<I, S, O = O>, P6: Parser<I, S, O = O>, P7: Parser<I, S, O = O>, P8: Parser<I, S, O = O>> Parser<I, S> for Any<(P1, P2, P3, P4, P5, P6, P7, P8)>

type O = O

type O = O

impl<I: Clone, S, O, P: Parser<I, S, O = O>, const N: usize> Parser<I, S> for Any<[P; N]>

type O = O

Trait Parser

§The Importance of Being Your`self`

impl<I: Clone, S, L1, R1, F1, L2, R2, F2, Last: Parser<I, S>> Parser<I, S> for Decide<((L1, F1), (L2, F2)), Last>
where L1: Parser<I, S>, R1: Parser<I, S, O = Last::O>, F1: FnOnce(L1::O) -> R1, L2: Parser<I, S>, R2: Parser<I, S, O = Last::O>, F2: FnOnce(L2::O) -> R2,

impl<I: Clone, S, L1, R1, F1, Last: Parser<I, S>> Parser<I, S> for Decide<((L1, F1),), Last>
where L1: Parser<I, S>, R1: Parser<I, S, O = Last::O>, F1: FnOnce(L1::O) -> R1,

impl<I: Clone, S, P, Sep, O: Extend<P::O>, OF> Parser<I, S> for SeparatedBy<P, Sep, OF>
where P: Parser<I, S> + Clone, Sep: Parser<I, S> + Clone, OF: FnOnce() -> O,

impl<I: Clone, S, P: Parser<I, S>, Sep: Parser<I, S>, O: Extend<P::O>, PF, SepF, OF> Parser<I, S> for SeparateBy<PF, SepF, OF>
where PF: FnMut() -> P, SepF: FnMut() -> Sep, OF: FnOnce() -> O,