Module chumsky::pratt

source ·
Available on crate feature pratt only.
Expand description

Utilities for parsing expressions using Pratt parsing.

“Who am I? What is my purpose in life? Does it really, cosmically speaking, matter if I don’t get up and go to work?”

Pratt parsing is a powerful technique for defining and parsing operators of varying arity, precedence, and associativity. Unlike precedence climbing, which defines operator precedence by structurally composing parsers of decreasing precedence, Pratt parsing defines precedence through a numerical ‘binding power’ that determines how strongly operators should bind to the operands around them.

Pratt parsers are defined with the Parser::pratt method.

When writing pratt parsers, it is necessary to first define an ‘atomic’ operand used by the parser for building up expressions. In most languages, atoms are simple, self-delimiting patterns such as numeric and string literals, identifiers, or parenthesised expressions. Once an atom has been defined, operators can also be defined that operate upon said atoms.

Fold functions

Because operators bind atoms together, pratt parsers require you to specify, for each operator, a function that combines its operands together into a syntax tree. These functions are given as the last arguments of infix, prefix, and postfix.

Fold functions have several overloads, allowing you to make use of only the operands, the operands and the operators, and even additionally a Span that covers the entire operation. See the documentation for each function to see which fold signatures can be used.

Examples

use chumsky::prelude::*;
use chumsky::pratt::*;
use chumsky::extra;

enum Expr {
    Add(Box<Self>, Box<Self>),
    Sub(Box<Self>, Box<Self>),
    Pow(Box<Self>, Box<Self>),
    Neg(Box<Self>),
    Factorial(Box<Self>),
    Deref(Box<Self>),
    Literal(i32),
}

impl std::fmt::Display for Expr {
    fn fmt(&self, f: &mut core::fmt::Formatter<'_>) -> core::fmt::Result {
        match self {
            Self::Literal(literal) => write!(f, "{literal}"),
            Self::Factorial(left) => write!(f, "({left}!)"),
            Self::Deref(left) => write!(f, "(*{left})"),
            Self::Neg(right) => write!(f, "(-{right})"),
            Self::Add(left, right) => write!(f, "({left} + {right})"),
            Self::Sub(left, right) => write!(f, "({left} - {right})"),
            Self::Pow(left, right) => write!(f, "({left} ^ {right})"),
        }
    }
}

let atom = text::int::<_, _, extra::Err<Simple<char>>>(10)
    .from_str()
    .unwrapped()
    .map(Expr::Literal)
    .padded();

let op = |c| just(c).padded();

let expr = atom.pratt((
    // We want factorial to happen before any negation, so we need its precedence to be higher than `Expr::Neg`.
    postfix(4, op('!'), |lhs| Expr::Factorial(Box::new(lhs))),
    // Just like in math, we want that if we write -x^2, our parser parses that as -(x^2), so we need it to have
    // exponents bind tighter than our prefix operators.
    infix(right(3), op('^'), |l, r| Expr::Pow(Box::new(l), Box::new(r))),
    // Notice the conflict with our `Expr::Sub`. This will still parse correctly. We want negation to happen before
    // `+` and `-`, so we set its precedence higher.
    prefix(2, op('-'), |rhs| Expr::Neg(Box::new(rhs))),
    prefix(2, op('*'), |rhs| Expr::Deref(Box::new(rhs))),
    // Our `-` and `+` bind the weakest, meaning that even if they occur first in an expression, they will be the
    // last executed.
    infix(left(1), op('+'), |l, r| Expr::Add(Box::new(l), Box::new(r))),
    infix(left(1), op('-'), |l, r| Expr::Sub(Box::new(l), Box::new(r))),
))
    .map(|x| x.to_string());

assert_eq!(
    expr.parse("*1 + -2! - -3^2").into_result(),
    Ok("(((*1) + (-(2!))) - (-(3 ^ 2)))".to_string()),
);

Structs

Enums

Functions