rustlr 0.5.1 - Docs.rs

# Sample Grammar for matching brackets (), [] and {}

# the "absyntype" is the rust type of the value returned by the parse function.
# in this example, the absyntype will represent the number of pairs of matching
# brackets of types (), [] and {}

absyntype (u32,u32,u32)

# the absyntype can be any type that implements the Default trait. 
# (u32,u32,u32) defaults to (0,0,0)

# This grammar contains 3 non-terminal symbols, although the parser generator
# may create additional symbols.
nonterminals E S WS

# Nonterminals and terminals must be declared. 
terminals ( ) [ ] LBRACE RBRACE Whitespace

# topsym is the top or start symbol of the grammar
topsym S

# on error, the parser will skip past the next whitespace and keep parsing
resync Whitespace

# The following are the grammar production rules.

E --> ( S:(a,b,c) ) {(a+1,b,c)}
E --> [ S:(a,b,c) ] {(a,b+1,c)}
E --> LBRACE S:(a,b,c) RBRACE  {(a,b,c+1)}
S --> WS { (0,0,0) }
S --> S:(a,b,c)  E:(p,q,r)  {(a+p,b+q,c+r)}
WS -->
WS --> WS Whitespace


!use rustlr::charscanner;
!//anything on a ! line is injected verbatim into the generated parser
!
!fn main() {
!  let argv:Vec<String> = std::env::args().collect(); // command-line args
!  let input = &argv[1];
!  let mut parser1 = make_parser();
!  let mut lexer1 = charscanner::new(input,true);
!  lexer1.modify = |c| { match c { //modify symbols to grammar terminal names
!    "{" => "LBRACE", 
!    "}" => "RBRACE",
!    _ if c.chars().next().unwrap().is_whitespace() => "Whitespace",
!    _ => c
!  }};
!     
!  let result = parser1.parse(&mut lexer1);
!  if !parser1.error_occurred() {
!    println!("parsed successfully with result {:?}",&result);
!  }
!  else {println!("parsing failed; partial result is {:?}",&result);}
!}//main


EOF


Everything after "EOF" is ignored and can be used for more comments:

Each grammar rule has a an associated "semantic action" enclosed in {}'s.
The semantic action is a piece of rust code that must return a value of
the declared abysyntype.  If no action is specified, one will be created
that returns absyntype::default() - the chosen absyntype must implement
the Default trait. 

Although not used in this grammar, multiple productions for the same
left-hand side nonterminal may be specified on one line using the |
character.  It is also possible to specify a rule on multiple lines
using ==> instead of --> and ending with <==

Note that {, } and | cannot also be used as terminal symbols.  The grammar
must specify other names such as LBRACE and RBRACE and the lexical scanner
must translate "{" and "}" into these tokens.

Rustlr defines a 'Lexer' trait, and allows the use of any lexical
analyzer (tokenizer) that implements Lexer.  The only actual tokenizer
it defines that implements lexer is charlexer.  This lexer takes a
string and returns each character as a separate token, with the option
to skip whitespace characters.  It also has a `modify` function that
can itself be modified so that some characters are transformed into other
tokens.  charlexer is adequate for simple examples.  

In this example, we've used the ! lines to inject enough code into the
generated parser so it's a self-contained program.  Normally, however,
only a few ! lines are needed for 'use' type declarations.  In the
main() injected into this parser, we use charlexer with the option of
keeping whitespace characters because they're used in the grammar.  We
also change the modify function to return tokens that match the names
of the terminal symbols in the grammar.

Each grammar symbol, terminal as well as nonterminal will always have
associated with it a value of absyntype.  When non are specified, 
absyntype::default() is automatically used.  We can match the value with a
simple pattern (without whitespaces) such as (a,b,c).  Most often, however,
only one variable name is used.  The semantic action has access to these
values when forming the return value.  The semantic action is otherwise
injected verbatim into the generated parser: rustlr is not responsible if
Rust cannot compile your semantic actions.

To generate a parser for this grammar with the rustlr executable
(cargo install rustlr):

> rustlr brackets.grammar lalr

Or, from inside a program you can call the function
`rustlr::rustler("brackets","lalr");`

This produces a file bracketsparser.rs.  Since this file contains a main, you
can cargo build a new crate and replace main.rs with the contents of the
parser file.  Be sure to insert rustlr = "0.1.3" under [dependencies] in
the crate's Cargo.toml.