Gazelle
An LR parser generator for Rust with runtime operator precedence and natural lexer feedback.
Getting Started
Add both crates to your Cargo.toml:
[]
= "0.9"
= "0.9"
gazelle-parser provides the runtime (parser, tokens, precedence, error handling). gazelle-macros provides the gazelle! proc macro that generates parser code from your grammar. Both are required for typical use.
For runtime-only usage (dynamic grammars without the macro), gazelle-parser alone is sufficient — see Parser Generator as a Library.
What Makes Gazelle Different
1. Runtime Operator Precedence
Traditional grammars encode precedence through structure - one rule per level:
expr: add_expr;
add_expr: add_expr '+' mul_expr | mul_expr;
mul_expr: mul_expr '*' unary_expr | unary_expr;
// ... 10 more levels for a full language
Gazelle's prec terminals carry precedence at runtime:
gazelle!
One rule. The lexer provides precedence per token:
'+' => Op,
'*' => Op,
This enables user-defined operators at runtime - see examples/c11_calculator.rs.
2. Natural Lexer Feedback
The infamous C typedef problem: is T * x a multiplication or pointer declaration? The lexer needs parser state to decide.
Traditional parsers hide the parse loop, requiring globals or hacks. Gazelle uses a push-based API - you drive the loop:
loop
No magic. The lexer and parser share state through actions, and you control when each runs. See examples/c11/ for a complete C11 parser using this for typedef disambiguation.
3. Clean Separation of Grammar and Actions
The grammar is pure grammar:
gazelle!
Actions are split into an ErrorType trait, a Types trait, and per-node Action impls:
Action methods return Result — when actions are infallible, use type Error = core::convert::Infallible.
This gives you:
- Full IDE support in action code (autocomplete, type hints, go-to-definition)
- Compile errors point to your code, not generated code
- Multiple implementations (interpreter, AST builder, pretty-printer)
- Grammars reusable across different backends
4. CST to AST — You Choose
Parser generators typically force a choice: either you get a concrete syntax tree (CST) that you walk and transform, or you write untyped action callbacks and build your own result from scratch.
This reflects a real distinction, but not the one people usually describe. The conventional definition — "a CST keeps all tokens, an AST drops some" — confuses a consequence for a cause. A better way to think about it:
- A CST is a concrete tree: an actual data structure with nodes and children, faithfully representing the grammar's productions.
- An AST is the same structural information represented abstractly — through a sequence of typed reduction calls rather than a materialized tree.
When the tree is abstract, you only capture what you care about. A calculator folds Add(2, 3) straight into 5.0 — the parentheses, the operator token, even the child nodes were never stored because there was never a tree to store them in. The information loss traditionally associated with ASTs isn't a design choice; it's what naturally happens when you only materialize what you need.
Gazelle makes this a smooth continuum rather than a binary choice. The gazelle! macro generates an enum for each nonterminal whose variants mirror the grammar's alternatives, but the variant fields are generic — their types come from your Types trait implementation:
// Generated by gazelle! — variants match the grammar, types are yours
Set the associated type to the generated enum itself, and a blanket impl kicks in — you get a full CST with zero action code:
Set it to f64, and you write a reducer that folds values during parsing — a fully custom AST (or no tree at all):
Set it to Ignore, and the node is discarded — useful for validation-only parsing. Mix and match freely across nonterminals: auto-box some, evaluate others, ignore the rest. The same grammar, the same generated enums — the representation depends entirely on what types you plug in.
5. Declarative Conflict Resolution
For grammars with intentional ambiguities, terminal modifiers resolve S/R conflicts at the grammar level:
terminals {
shift ELSE, // dangling else: always shift
prec OP: _, // operators: runtime precedence
conflict TOK: _, // caller decides per-token
}
shift and reduce bake the decision into the parse table — zero runtime cost. prec and conflict defer to per-token resolution at parse time. No %expect hacks — the intent is explicit in the grammar.
6. Conflict Diagnostics with Examples
When a grammar has shift/reduce or reduce/reduce conflicts, Gazelle shows concrete example inputs with two bracketings — one for each parse:
Shift/reduce conflict on 'ELSE':
Shift: stmt -> IF COND THEN stmt • ELSE stmt (wins)
Reduce: stmt -> IF COND THEN stmt •
Example: IF COND THEN IF COND THEN stmt • ELSE stmt
Shift: IF COND THEN IF COND THEN (stmt ELSE stmt)
Reduce: IF COND THEN (IF COND THEN stmt) ELSE stmt (reduce to stmt)
The example is found by BFS on the raw DFA — shortest viable prefix to the conflict state, then a joint suffix search that drives both parser interpretations to acceptance. The dot marks the conflict point. When the grammar is LR(k>1) rather than ambiguous (no single string has two parses), separate examples are shown for each interpretation.
7. Parser Generator as a Library
Most parser generators are build tools. Gazelle exposes table construction as a library:
use ;
use CompiledTable;
use ;
let grammar = parse_grammar.unwrap;
// Build tables and parse
let compiled = build.unwrap;
let mut parser = new;
let num_id = compiled.symbol_id.unwrap;
// ... push tokens with parser.maybe_reduce() and parser.shift()
Enables grammar analyzers, conflict debuggers, or parsers for dynamic grammars. See examples/runtime_grammar.rs for complete examples.
Examples
Calculator with User-Defined Operators
$ cargo run --example c11_calculator
> operator ^ pow right 3;
defined: ^ = pow right 3
> 2 ^ 3 ^ 2;
512 // right-assoc: 2^(3^2) = 2^9
> x = 2 * 3 ^ 2;
x = 18 // ^ binds tighter than *
C11 Parser
A complete C11 parser demonstrating:
- Lexer feedback for typedef disambiguation (Jourdan's approach)
- Dynamic precedence collapsing 10+ expression levels into one rule
- Full C11 test suite (41 tests)
$ cargo test --example c11
The expression grammar:
// Traditional C grammar: 10+ cascading rules
// Gazelle: one rule with prec terminals
assignment_expression = cast_expression
| assignment_expression BINOP assignment_expression
| assignment_expression STAR assignment_expression
| assignment_expression QUESTION expression COLON assignment_expression;
Usage
use Precedence;
use gazelle;
gazelle!
;
When to Use Gazelle
Good fit:
- Languages with user-definable operators or precedence
- C-family languages needing lexer feedback (typedef disambiguation)
- Complex expression grammars you want to simplify
- When you want full IDE support in semantic actions
Consider alternatives for:
- Simple formats (JSON, TOML) - nom or pest may be simpler
- Maximum ecosystem maturity - lalrpop or tree-sitter
License
MIT