Gazelle
An LR parser generator for Rust with runtime operator precedence and natural lexer feedback.
What Makes Gazelle Different
1. Runtime Operator Precedence
Traditional grammars encode precedence through structure - one rule per level:
expr: add_expr;
add_expr: add_expr '+' mul_expr | mul_expr;
mul_expr: mul_expr '*' unary_expr | unary_expr;
// ... 10 more levels for a full language
Gazelle's prec terminals carry precedence at runtime:
gazelle!
One rule. The lexer provides precedence per token:
'+' => Op,
'*' => Op,
This enables user-defined operators at runtime - see examples/c11_calculator.rs.
2. Natural Lexer Feedback
The infamous C typedef problem: is T * x a multiplication or pointer declaration? The lexer needs parser state to decide.
Traditional parsers hide the parse loop, requiring globals or hacks. Gazelle uses a push-based API - you drive the loop:
loop
No magic. The lexer and parser share state through actions, and you control when each runs. See examples/c11/ for a complete C11 parser using this for typedef disambiguation.
3. Clean Separation of Grammar and Actions
The grammar is pure grammar:
gazelle!
Actions are split into a Types trait and per-node Action impls:
Action methods return Result - the error type is declared as type Error: From<ParseError> on the Types trait.
This gives you:
- Full IDE support in action code (autocomplete, type hints, go-to-definition)
- Compile errors point to your code, not generated code
- Multiple implementations (interpreter, AST builder, pretty-printer)
- Grammars reusable across different backends
4. CST to AST — You Choose
Parser generators typically force a choice: either you get a concrete syntax tree (CST) that you walk and transform, or you write untyped action callbacks and build your own result from scratch.
This reflects a real distinction, but not the one people usually describe. The conventional definition — "a CST keeps all tokens, an AST drops some" — confuses a consequence for a cause. A better way to think about it:
- A CST is a concrete tree: an actual data structure with nodes and children, faithfully representing the grammar's productions.
- An AST is the same structural information represented abstractly — through a sequence of typed reduction calls rather than a materialized tree.
When the tree is abstract, you only capture what you care about. A calculator folds Add(2, 3) straight into 5.0 — the parentheses, the operator token, even the child nodes were never stored because there was never a tree to store them in. The information loss traditionally associated with ASTs isn't a design choice; it's what naturally happens when you only materialize what you need.
Gazelle makes this a smooth continuum rather than a binary choice. The gazelle! macro generates an enum for each nonterminal whose variants mirror the grammar's alternatives, but the variant fields are generic — their types come from your Types trait implementation:
// Generated by gazelle! — variants match the grammar, types are yours
Set the associated type to the generated enum itself, and a blanket impl kicks in — you get a full CST with zero action code:
Set it to f64, and you write a reducer that folds values during parsing — a fully custom AST (or no tree at all):
Set it to Ignore, and the node is discarded — useful for validation-only parsing. Mix and match freely across nonterminals: auto-box some, evaluate others, ignore the rest. The same grammar, the same generated enums — the representation depends entirely on what types you plug in.
5. Conflict Diagnostics with Examples
When a grammar has shift/reduce or reduce/reduce conflicts, Gazelle shows concrete example inputs with two bracketings — one for each parse:
Shift/reduce conflict on 'ELSE':
Shift: stmt -> IF COND THEN stmt • ELSE stmt (wins)
Reduce: stmt -> IF COND THEN stmt •
Example: IF COND THEN IF COND THEN stmt • ELSE stmt
Shift: IF COND THEN IF COND THEN (stmt ELSE stmt)
Reduce: IF COND THEN (IF COND THEN stmt) ELSE stmt (reduce to stmt)
The example is found by BFS on the raw DFA — shortest viable prefix to the conflict state, then a joint suffix search that drives both parser interpretations to acceptance. The dot marks the conflict point. When the grammar is LR(k>1) rather than ambiguous (no single string has two parses), separate examples are shown for each interpretation.
6. Parser Generator as a Library
Most parser generators are build tools. Gazelle exposes table construction as a library:
use ;
use GrammarBuilder;
use CompiledTable;
use ;
// Option 1: Parse grammar from string
let grammar = parse_grammar.unwrap;
// Option 2: Build programmatically
let mut gb = new;
let num = gb.t;
let plus = gb.t;
let expr = gb.nt;
gb.rule;
gb.rule;
let grammar = gb.build;
// Build tables and parse
let compiled = build;
let mut parser = new;
let num_id = compiled.symbol_id.unwrap;
// ... push tokens with parser.maybe_reduce() and parser.shift()
Enables grammar analyzers, conflict debuggers, or parsers for dynamic grammars. See examples/runtime_grammar.rs for complete examples.
Examples
Calculator with User-Defined Operators
$ cargo run --example c11_calculator
> operator ^ pow right 3;
defined: ^ = pow right 3
> 2 ^ 3 ^ 2;
512 // right-assoc: 2^(3^2) = 2^9
> x = 2 * 3 ^ 2;
x = 18 // ^ binds tighter than *
C11 Parser
A complete C11 parser demonstrating:
- Lexer feedback for typedef disambiguation (Jourdan's approach)
- Dynamic precedence collapsing 10+ expression levels into one rule
- Full C11 test suite (41 tests)
$ cargo test --example c11
The expression grammar:
// Traditional C grammar: 10+ cascading rules
// Gazelle: one rule with prec terminals
assignment_expression = cast_expression
| assignment_expression BINOP assignment_expression
| assignment_expression STAR assignment_expression
| assignment_expression QUESTION expression COLON assignment_expression;
Usage
use ;
use gazelle;
gazelle!
;
When to Use Gazelle
Good fit:
- Languages with user-definable operators or precedence
- C-family languages needing lexer feedback (typedef disambiguation)
- Complex expression grammars you want to simplify
- When you want full IDE support in semantic actions
Consider alternatives for:
- Simple formats (JSON, TOML) - nom or pest may be simpler
- Maximum ecosystem maturity - lalrpop or tree-sitter
License
MIT