RustyLR
RustyLR will provide you a LR(1) and LALR(1) Deterministic Finite Automata (DFA) generator from Context Free Grammar (CFGs).
[dependencies]
rusty_lr = "0.7.0"
Features
- pure Rust implementation
- readable error messages, both for grammar building and parsing
- compile-time DFA construction from CFGs ( with proc-macro )
- customizable reducing action
- resolving conflicts of ambiguous grammar
- tracing parser action with callback, also error handling
Sample
In example/calculator/parser.rs
,
use lr1;
use lalr1;
// this define struct `EParser`
// where 'E' is the start symbol
lalr1!
In example/calculator/src/main.rs
,
The result will be:
[Num(3)] [Plus] [Num(4)]
[Num(3), Plus, Num(4)]
[Num(1)] [Plus] [Num(2), Star, LParen, Num(3), Plus, Num(4), RParen]
[Num(1), Plus, Num(2), Star, LParen, Num(3), Plus, Num(4), RParen]
15
userdata: 2
Build Deterministic Finite Automata (DFA) from Context Free Grammar (CFG)
This section will describe how to build DFA from CFGs, on runtime.
1. Define terminal and non-terminal symbols
// must implement these traits
// must implement these traits
/// impl Display for TermType, NonTermType will make related ProductionRule, error message Display-able
Or simply, you can use char
or u8
as terminal, and &'static str
or String
as non-terminal.
Any type that implements traits above can be used as terminal and non-terminal symbols.
2. Define production rules
Consider the following context free grammar:
A -> A + A (reduce left)
A -> M
This grammar can be written as:
/// type alias
type Token = Token;
/// create grammar
let mut grammar = new;
grammar.add_rule;
grammar.add_rule;
/// set reduce type
grammar.set_reduce_type;
Note that the production rule A -> A + A
has a shift/reduce conflict, and the reduce type is set to ReduceType::Left
for terminal symbol Plus
to resolve the conflict. Default will cause an error when a conflict occurs.
reduce/reduce conflict (e.g. duplicated rules) will be always an error.
3. Build DFA
Calling grammar.build()
or grammar.build_lalr()
will build the DFA from the CFGs.
let parser: Parser = match grammar.build ;
You must explicitly specify the Augmented non-terminal symbol, and the production rule
Augmented -> StartSymbol $
must be defined in the grammar.
The returned Parser
struct contains the DFA and the production rules(cloned). It is completely independent from the Grammar
struct, so you can drop the Grammar
struct, or export the Parser
struct to another module.
4. Error messages
The Error
type returned from Grammar::build()
will contain the error information.
Error
is Display
if both Term
and NonTerm
is Display
, and It is Debug
if both Term
and NonTerm
is Debug
.
For Shift/Reduce conflicts,
Build failed: Shift/Reduce Conflict
NextTerm: '0'
Reduce Rule:
"Num" -> "Digit"
Shift Rules:
"Digit" -> '0' • /Lookaheads: '\0', '0'
Try rearanging the rules or set ReduceType to Terminal '0' to resolve the conflict.
For Reduce/Reduce Conflicts,
Build failed: Reduce/Reduce Conflict with lookahead: '\0'
Production Rule1:
"Num" -> "Digit"
Production Rule2:
"Num" -> "Digit"
Parse input sequence with generated DFA
For given input sequence, you can start parsing with Parser::begin()
method. Once you get the Context
from begin()
, you will feed the input sequence to the parser with parser.feed()
method.
let terms = vec!;
// start parsing
let mut context = parser.begin;
// feed input sequence
for term in terms
Note that EOF
token is feeded at the end of sequence, and the augmented rule Augmented -> StartSymbol $
will not be reduced since there are no lookahead symbols.
Parse with callback
For complex error handling and tracing parser action, you can implement Callback
trait and pass it to *_callback(...)
methods.
let terms = vec!;
// start parsing
let mut context = parser.begin;
let mut callback = ParserCallback ;
// feed input sequence
for term in terms
The result will be:
Reduce by P -> Num
Reduce by M -> P
Reduce by A -> M
Reduce by P -> Num
Reduce by M -> P
Reduce by P -> Num
Reduce by M -> P
Reduce by A -> M
Reduce by P -> Num
Reduce by M -> P
Reduce by A -> M
Reduce by A -> A + A
Reduce by E -> A
Reduce by P -> ( E )
Reduce by M -> P
Reduce by M -> M * M
Reduce by A -> M
Reduce by A -> A + A
Reduce by E -> A