gazelle-parser 0.1.0

LR parser generator with runtime operator precedence and natural lexer feedback
Documentation

Gazelle

An LR parser generator for Rust with runtime operator precedence and natural lexer feedback.

What Makes Gazelle Different

1. Runtime Operator Precedence

Traditional grammars encode precedence through structure - one rule per level:

expr: add_expr;
add_expr: add_expr '+' mul_expr | mul_expr;
mul_expr: mul_expr '*' unary_expr | unary_expr;
// ... 10 more levels for a full language

Gazelle's prec terminals carry precedence at runtime:

grammar! {
    grammar Calc {
        terminals {
            NUM: Num,
            prec OP: Op,  // precedence attached to each token
        }
        expr: Expr = expr OP expr @binop | NUM @literal;
    }
}

One rule. The lexer provides precedence per token:

'+' => CalcTerminal::Op('+', Precedence::Left(1)),
'*' => CalcTerminal::Op('*', Precedence::Left(2)),

This enables user-defined operators at runtime - see examples/calculator.rs.

2. Natural Lexer Feedback

The infamous C typedef problem: is T * x a multiplication or pointer declaration? The lexer needs parser state to decide.

Traditional parsers hide the parse loop, requiring globals or hacks. Gazelle uses a push-based API - you drive the loop:

loop {
    let token = lexer.next(&actions.ctx)?;  // lexer sees current state
    parser.push(token, &mut actions)?;       // actions update state
    // next iteration: lexer sees updated state
}

No magic. The lexer and parser share state through actions, and you control when each runs. See examples/c11/ for a complete C11 parser using this for typedef disambiguation.

3. Clean Separation of Grammar and Actions

The grammar is pure grammar:

grammar! {
    grammar Calc {
        terminals { NUM: Num, prec OP: Op, LPAREN, RPAREN }

        expr: Expr = expr OP expr @binop
                   | NUM @literal
                   | LPAREN expr RPAREN;  // passthrough - inner expr flows through
    }
}

Actions are split into a types trait and an actions trait:

impl CalcTypes for Evaluator {
    type Num = f64;
    type Op = char;
    type Expr = f64;
}

impl CalcActions for Evaluator {
    fn binop(&mut self, left: f64, op: char, right: f64) -> Result<f64, ParseError> {
        Ok(match op {
            '+' => left + right,
            '*' => left * right,
            _ => panic!("unknown op"),
        })
    }

    fn literal(&mut self, n: f64) -> Result<f64, ParseError> { Ok(n) }
    // No paren method needed - passthrough!
}

Action methods return Result — the error type defaults to ParseError but can be customized for domain-specific errors.

This gives you:

  • Full IDE support in action code (autocomplete, type hints, go-to-definition)
  • Compile errors point to your code, not generated code
  • Multiple implementations (interpreter, AST builder, pretty-printer)
  • Grammars reusable across different backends

4. Parser Generator as a Library

Most parser generators are build tools. Gazelle exposes table construction as a library:

use gazelle::{parse_grammar, ErrorContext};
use gazelle::grammar::GrammarBuilder;
use gazelle::table::CompiledTable;
use gazelle::runtime::{Parser, Token};

// Option 1: Parse grammar from string
let grammar = parse_grammar(r#"
    grammar Expr {
        start expr;
        terminals { NUM, PLUS, STAR }
        expr = expr PLUS term | term;
        term = term STAR factor | factor;
        factor = NUM;
    }
"#).unwrap();

// Option 2: Build programmatically
let mut gb = GrammarBuilder::new();
let num = gb.t("NUM");
let plus = gb.t("PLUS");
let expr = gb.nt("expr");
gb.rule(expr, vec![expr, plus, expr]);
gb.rule(expr, vec![num]);
let grammar = gb.build();

// Build tables and parse
let compiled = CompiledTable::build(&grammar);
let mut parser = Parser::new(compiled.table());

let num_id = compiled.symbol_id("NUM").unwrap();
// ... push tokens with parser.maybe_reduce() and parser.shift()

Enables grammar analyzers, conflict debuggers, or parsers for dynamic grammars. See examples/runtime_grammar.rs for complete examples.

Examples

Calculator with User-Defined Operators

$ cargo run --example calculator

> operator ^ pow right 3;
defined: ^ = pow right 3

> 2 ^ 3 ^ 2;
512                        // right-assoc: 2^(3^2) = 2^9

> x = 2 * 3 ^ 2;
x = 18                     // ^ binds tighter than *

C11 Parser

A complete C11 parser demonstrating:

  • Lexer feedback for typedef disambiguation (Jourdan's approach)
  • Dynamic precedence collapsing 10+ expression levels into one rule
  • Full C11 test suite (41 tests)
$ cargo test --example c11

The expression grammar:

// Traditional C grammar: 10+ cascading rules
// Gazelle: one rule with prec terminals
assignment_expression = cast_expression
                      | assignment_expression BINOP assignment_expression
                      | assignment_expression STAR assignment_expression
                      | assignment_expression QUESTION expression COLON assignment_expression;

Usage

use gazelle::{ParseError, Precedence};
use gazelle_macros::grammar;

grammar! {
    grammar MyParser {
        start expr;
        terminals {
            NUM: Num,  // terminal with payload
            LPAREN, RPAREN,
            prec OP: Op,  // precedence terminal with payload
        }

        expr: Expr = expr OP expr @binop
                   | NUM @num
                   | LPAREN expr RPAREN;  // passthrough
    }
}

struct Eval;

impl MyParserTypes for Eval {
    type Num = i32;
    type Op = char;
    type Expr = i32;
}

impl MyParserActions for Eval {
    fn num(&mut self, n: i32) -> Result<i32, ParseError> { Ok(n) }
    fn binop(&mut self, l: i32, op: char, r: i32) -> Result<i32, ParseError> {
        Ok(match op { '+' => l + r, '*' => l * r, _ => 0 })
    }
}

fn main() {
    let mut parser = MyParserParser::<Eval>::new();
    let mut actions = Eval;

    // Push tokens (you control the loop)
    parser.push(MyParserTerminal::Num(2), &mut actions).unwrap();
    parser.push(MyParserTerminal::Op('+', Precedence::Left(1)), &mut actions).unwrap();
    parser.push(MyParserTerminal::Num(3), &mut actions).unwrap();

    let result = parser.finish(&mut actions).map_err(|(_, e)| e).unwrap();
    assert_eq!(result, 5);
}

When to Use Gazelle

Good fit:

  • Languages with user-definable operators or precedence
  • C-family languages needing lexer feedback (typedef disambiguation)
  • Complex expression grammars you want to simplify
  • When you want full IDE support in semantic actions

Consider alternatives for:

  • Simple formats (JSON, TOML) - nom or pest may be simpler
  • Error recovery focus - chumsky or tree-sitter
  • Maximum ecosystem maturity - lalrpop

License

MIT