rustlr 0.4.0

LR Parser Generator with Advanced Options
Documentation
# grammar for "C-plus-minus", updated version

auto
grammarname cpm
#absyntype String
nonterminals STAT STATLIST EXPR EXPRLIST
terminals x y z cin cout ; ( ) >> ERROR
lexterminal LLANGLE <<
topsym STATLIST
errsym ERROR
#resync ;

STATLIST --> STAT+
STAT --> cout LLANGLE EXPRLIST:s ; 

EXPR --> x | y | z
EXPR --> ( EXPR )
EXPRLIST --> EXPR<LLANGLE+>
#EXPRLIST --> EXPR:(s) {s}
#EXPRLIST --> EXPRLIST LLANGLE EXPR

STAT:ErrorStat --> ERROR ;
#STATLIST --> ERROR STAT
EXPR:ErrorExpr --> ( ERROR )
#EXPR --> EXPR ERROR )

!//injected into parser:
!mod cpm_ast;
!use std::io::{Write};
!
!fn readln()-> String {
!  let mut s = String::new();
!  let r = std::io::stdin().read_line(&mut s);
!  s
!}
!
!fn main() {
!   print!("Write something in C+- : ");
!   std::io::stdout().flush().unwrap();
!   let input = readln();
!   let mut lexer1 = cpmlexer::from_str(&input);
!   let mut parser1 = make_parser();
!   let res = parse_with(&mut parser1, &mut lexer1);
!   //let res = parse_train_with(&mut parser1, &mut lexer1);
!   println!("parsing success: {}",!parser1.error_occurred());
!   println!("Result: {:?}",res);
!}//main

EOF


Everything after 'EOF' is ignored and can be used for comments.

Rustlr uses two methods of error recovery, which this toy language
experiments with.  The first method requires the designation of a
special terminal symbol as an error recovery symbol, using the
directive "errsym" or "errorsymbol".  This symbol is assumed to not
conflict with actual input tokens.  The error symbol can appear at
most once on the right-hand side of a production rule, followed by
zero or more terminal symbols.  When an error is encountered (a
lookahead symbol with no defined transition in the LR/LALR
finite-state machine), the parser looks down the parse stack for a
state that has a transition defined on the symbol.  It truncates the
stack and performs all possible reductions until a state that can
"shift" the error symbol (ERROR in this example) is found, and
simuluates a shift of the next state associated with the error symbol
onto the stack.  It then skips lookheads until a valid transition is
found for the new state.  Since only terminal symbols may follow the
error symbol, eventually one of the error productions will be reduced.
These productions can have semantic actions that report specific
errors.

The second method of error recovery is rather straightforward: the
grammar can define one or more "resynchronization" terminal symbols
using the "resynch" directive.  When an error is encountered, the
parser skips ahead past the first resynchronization symbol.  Then it
looks down the parse stack for a state that has a valid transition on
the next symbol.  For languages that ends statements with a ;
(semicolon), the ; is the natural choice as the resynch symbol.  The
parser will report an error message for the current statement, then
skip over to the next statement.

The second (resynch) method of error recovery is attempted if the
first method fails or if no error symbol is defined.

If both methods of error-recovery fail, the parser simply skips input
tokens until a suitable action is found.

One can experiment with this grammar with "C+-" input such as
cout << x ; cout y ; cin >> ; cout << y << z ;