Crate parcours

Parser Combinators for Unique Results.

This crate provides building blocks to create parsers (and lexers) that return at most one output.

§Quickstart

The following shows a CSV line parser realised in parcours:

use parcours::{Parser, Combinator};

// CSV line parser               👇 input 👇 state
fn csv_line<'a>() -> impl Parser<&'a str, (), O = Vec<&'a str>> {
    use parcours::str::{matches, take_while};
    take_while(|c, _| *c != b',').separated_by(matches(","))
}

assert_eq!(
    //               👇 input                👇 state
    csv_line().parse("apples,oranges,pears", &mut ()),
    //    👇 output                           👇 remaining input
    Some((vec!["apples", "oranges", "pears"], ""))
);

§Examples

parcours provides a few examples, each designed to teach a handful of concepts:

  • JSON: zero-copy string parsing, combinators
  • Lambda calculus: error handling & mutable state
  • bc: separate lexer/parser, precedence climbing

§Topics

§Error Recovery

My opinion on error recovery has been shaped by reading Don’t Panic! Better, Fewer, Syntax Errors for LR Parsers by Lukas Diekmann and Laurence Tratt, who write in their introduction:

When error recovery works well, it is a useful productivity gain. Unfortunately, most current error recovery approaches are simplistic. The most common grammar-neutral approach to error recovery are those algorithms described as “panic mode” […] which skip input until the parser finds something it is able to parse. A more grammar-specific variation of this idea is to skip input until a pre-determined synchronisation token (e.g. ‘;’ in Java) is reached […], or to try inserting a single synchronisation token. Such strategies are often unsuccessful, leading to a cascade of spurious syntax errors […]. Programmers quickly learn that only the location of the first error in a file — not the reported repair, nor the location of subsequent errors — can be relied upon to be accurate.

Their proposed solution targets LR parsers, so it cannot be adapted to parcours. Therefore, instead of adopting a “simplistic” error recovery approach, I leave all control over error recovery to the user of parcours. While parcours can emit multiple error messages thanks to its mutable state, I have found it most useful to emit only the first syntax error message, as I find myself looking only at the first syntax error anyway (as observed in the quotation above). If you want multiple parse errors that are derived automatically and are of good quality, consider the grmtools suite by the same authors.
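As a rough illustration of the "first error only" approach, the following sketch keeps a single error slot in the mutable state that is threaded through parsing. It deliberately does not use parcours: the parser is a hand-written function, and `State`, `SyntaxError`, `number` and `sum` are hypothetical names introduced only for this example.

```rust
// Hypothetical sketch, independent of parcours: parsers are plain functions
// that take the input plus mutable state and maybe return an output.

#[derive(Debug, PartialEq)]
struct SyntaxError {
    /// Byte offset into the full input at which parsing failed.
    offset: usize,
    message: &'static str,
}

/// Mutable parser state that remembers only the first reported error.
#[derive(Debug, Default)]
struct State {
    first_error: Option<SyntaxError>,
}

impl State {
    fn report(&mut self, offset: usize, message: &'static str) {
        // Later errors are dropped on purpose: only the first is kept.
        if self.first_error.is_none() {
            self.first_error = Some(SyntaxError { offset, message });
        }
    }
}

/// Parse one or more ASCII digits at the start of `input`.
/// `full` is the complete input, used only to compute error offsets.
fn number<'a>(full: &str, input: &'a str, state: &mut State) -> Option<(u32, &'a str)> {
    let offset = full.len() - input.len();
    let len = input.bytes().take_while(u8::is_ascii_digit).count();
    if len == 0 {
        state.report(offset, "expected a number");
        return None;
    }
    let (digits, rest) = input.split_at(len);
    match digits.parse::<u32>() {
        Ok(n) => Some((n, rest)),
        Err(_) => {
            state.report(offset, "number out of range");
            None
        }
    }
}

/// Parse numbers separated by '+', such as "1+2+3", and return their sum.
fn sum(full: &str, state: &mut State) -> Option<u32> {
    let (mut total, mut rest) = number(full, full, state)?;
    while let Some(after_plus) = rest.strip_prefix('+') {
        let (n, r) = number(full, after_plus, state)?;
        total = total.checked_add(n)?;
        rest = r;
    }
    if !rest.is_empty() {
        state.report(full.len() - rest.len(), "expected '+' or end of input");
        return None;
    }
    Some(total)
}
```

Calling `sum("1+x+", &mut state)` on a fresh `State` returns `None` and leaves an error at offset 2 in `state.first_error`. Collecting every error instead would only require swapping the `Option` for a `Vec`, which is the kind of decision that is left to the user.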

Modules§

  • Create new parsers by combining existing ones.
  • Precedence climbing for parsing expressions with binary operators.
  • Parsers for slice input.
  • Parsers for &str input.

Macros§

  • Lazily construct a parser from a function.
  • Pattern matching for successful cases.

Structs§

Traits§

  • A combinator combines parsers to form new ones.
  • A parser takes input and mutable state, and maybe yields an output and remaining input.

Functions§

  • Return outputs of all provided parsers, if all succeed.
  • Return output of the first provided parser that succeeds.
  • Construct a parser from a function.
  • Lazily construct a parser from a function.
  • Apply a parser p() as often as possible.
  • Apply a parser p() as often as possible, separated by sep().
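To make the last description more concrete, here is a minimal sketch of what "apply a parser as often as possible, separated by a separator" can mean. It is not the parcours implementation and does not use its `Parser` trait: parsers are modelled as plain functions from input to an optional output and remaining input, mutable state is omitted, and `field` and `comma` are hypothetical helpers. Details such as the handling of a trailing separator are assumptions of this sketch.

```rust
// Hypothetical sketch, not the parcours implementation: a parser is any
// function from input to an optional (output, remaining input) pair.

/// Apply `p` as often as possible, with `sep` between consecutive items.
/// Returns all outputs of `p` together with the remaining input.
fn separated_by<'a, O, S>(
    p: impl Fn(&'a str) -> Option<(O, &'a str)>,
    sep: impl Fn(&'a str) -> Option<(S, &'a str)>,
    input: &'a str,
) -> (Vec<O>, &'a str) {
    let mut out = Vec::new();
    let Some((first, mut rest)) = p(input) else {
        return (out, input);
    };
    out.push(first);
    loop {
        // A separator is consumed only if another item follows it.
        let Some((_, after_sep)) = sep(rest) else { break };
        let Some((item, after_item)) = p(after_sep) else { break };
        // Guard against looping forever when neither parser consumes input.
        if after_item.len() == rest.len() {
            break;
        }
        out.push(item);
        rest = after_item;
    }
    (out, rest)
}

/// Hypothetical helper: take everything up to the next comma (possibly nothing).
fn field(input: &str) -> Option<(&str, &str)> {
    let end = input.find(',').unwrap_or(input.len());
    Some(input.split_at(end))
}

/// Hypothetical helper: match a single comma.
fn comma(input: &str) -> Option<((), &str)> {
    input.strip_prefix(',').map(|rest| ((), rest))
}
```

With these helpers, `separated_by(field, comma, "apples,oranges,pears")` yields the three fields and an empty remaining input, mirroring the Quickstart example.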