gramatica
This crate provides a binary to compile grammars into Rust code and a library implementing Earley's parsing algorithm to parse the grammars specified.
Usage
This crate is gramatica. To use it, install it to obtain the gramatica_compiler binary, and add gramatica to the dependencies in your project's Cargo.toml.
[dependencies]
gramatica = "0.2.1"
Then, if you have written a grammar file example.rsg, execute gramatica_compiler example.rsg > example.rs. Afterwards you may use the generated file example.rs as a Rust source file.
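If you prefer to drive the compilation step from Rust (for instance from a small helper program), the command above can be sketched with the standard library as follows. The helper names compiler_command and compile_grammar are hypothetical and not part of gramatica; the sketch only assumes that gramatica_compiler writes the generated code to standard output, as the shell redirection above implies.

```rust
use std::fs;
use std::io;
use std::process::Command;

/// Builds the command `gramatica_compiler <grammar_path>`, the invocation shown above.
/// (Hypothetical helper, not part of the crate.)
fn compiler_command(grammar_path: &str) -> Command {
    let mut command = Command::new("gramatica_compiler");
    command.arg(grammar_path);
    command
}

/// Hypothetical helper: runs the compiler and writes its standard output to
/// `output_path`, which is what the shell redirection `> example.rs` does.
fn compile_grammar(grammar_path: &str, output_path: &str) -> io::Result<()> {
    let output = compiler_command(grammar_path).output()?;
    if !output.status.success() {
        return Err(io::Error::new(
            io::ErrorKind::Other,
            format!("gramatica_compiler failed on {}", grammar_path),
        ));
    }
    fs::write(output_path, &output.stdout)
}

fn main() {
    // Requires gramatica_compiler to be installed and example.rsg to exist.
    match compile_grammar("example.rsg", "example.rs") {
        Ok(()) => println!("generated example.rs"),
        Err(error) => eprintln!("could not generate example.rs: {}", error),
    }
}
```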
Recent changes
- It is now possible to use bindings and mutable references, as in the rule (LPar, a @ Left(_), Right(ref mut b), RPar) => (std::mem::take(a),std::mem::take(b)).
- Added parser::cursor, to be used instead of source_index to avoid indexing over Unicode strings.
- Improved management of large files.
- Added a verbosity argument to Parser::parse.
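To see why indexing over Unicode strings is delicate, here is a small sketch that uses only the Rust standard library. It does not show how parser::cursor works internally. It only illustrates the problem that byte offsets and character positions differ in UTF-8 text. The function byte_offset_of_char is a hypothetical helper.

```rust
/// Returns the byte offset at which the character number `char_position` starts,
/// or `None` if the text has fewer characters. (Hypothetical helper.)
fn byte_offset_of_char(text: &str, char_position: usize) -> Option<usize> {
    text.char_indices().nth(char_position).map(|(offset, _)| offset)
}

fn main() {
    let text = "año = 1";
    // 'ñ' takes two bytes in UTF-8, so byte and character counts differ.
    println!("bytes: {}, chars: {}", text.len(), text.chars().count());
    // Byte 2 falls inside 'ñ'; `get` reports the invalid boundary by returning None,
    // whereas `&text[2..]` would panic.
    println!("slice at byte 2: {:?}", text.get(2..));
    // Walking the characters and keeping their byte offsets is always safe.
    if let Some(offset) = byte_offset_of_char(text, 2) {
        println!("third char starts at byte {}: {:?}", offset, &text[offset..]);
    }
}
```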
Example: calculator
The classical example is to implement a calculator.
//This is just a Rust header that is copied literally
extern crate gramatica;
use Ordering;
use ;
//Here the proper grammar begins.
//These lines are processed by gramatica_compiler to generate the Token enum and the parsing tables.
//We begin with terminal tokens (symbols that are not on the left of any rule but have a literal representation).
//For this example all terminals are regular expressions. The first argument of re_terminal! is the type entry, as used in an enum.
re_terminal!;
re_terminal!;
re_terminal!;
re_terminal!;
re_terminal!;
re_terminal!;
re_terminal!;
re_terminal!;
re_terminal!;
re_terminal!;//Otherwise skip spaces
//Now it is the turn of nonterminal tokens. The first one is the default start symbol.
//These have rules written as match clauses, with the pattern being the reduction of the nonterminal token and the expression being the value the token takes when reducing.
//In this case the type of the symbol is empty and so is the expression
nonterminal Input
//Although the value type of Line is empty we may have code executed on the reduction
nonterminal Line
//Finally a token with value type. Each rule creates the value in a different way.
//Most rules are annotated to avoid ambiguities
nonterminal Expression
//The ordering macro-like sets the order of application of the previously annotated rules
ordering!;
//Finally an example of using the grammar to parse some lines from stdin.
//We could do this or something similar in a different file if we desired to.
use BufRead;
Advanced Lexer
To define terminal tokens not expressible with regular expressions you may use the following. It must contain a _match function returning an option containing the number of chars matched and the value of the token.
terminal LitChar
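As a rough illustration of that contract, the following standalone function recognizes a character literal at the start of the input and returns the number of chars consumed together with the token value. It is a sketch in plain Rust, not grammar-file syntax: the name match_lit_char, its single-argument signature, and the handled escapes are assumptions, and the _match function inside a terminal block may take additional parameters.

```rust
/// Hypothetical standalone matcher for character literals such as 'a' or '\n'.
/// On success it returns how many chars were consumed and the token value.
fn match_lit_char(source: &str) -> Option<(usize, char)> {
    let mut characters = source.chars();
    // The literal must open with a single quote.
    if characters.next()? != '\'' {
        return None;
    }
    let first = characters.next()?;
    let (size, value) = if first == '\\' {
        // Escaped literal: quote, backslash, escaped char, quote (4 chars).
        let escaped = match characters.next()? {
            'n' => '\n',
            't' => '\t',
            other => other,
        };
        (4, escaped)
    } else {
        // Plain literal: quote, char, quote (3 chars).
        (3, first)
    };
    // The literal must close with a single quote.
    if characters.next()? == '\'' {
        Some((size, value))
    } else {
        None
    }
}

fn main() {
    println!("{:?}", match_lit_char("'a' + 'b'"));
    println!("{:?}", match_lit_char("'\\n'"));
    println!("{:?}", match_lit_char("not a literal"));
}
```

Note that the size counts chars rather than bytes, so a literal such as 'ñ' still reports 3.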
Since version 0.1.1 there is also a keyword_terminal! macro:
keyword_terminal!;
Parsing values as match clauses
Each rule is written as a match clause, whose ending expression is the value that the nonterminal token gets after being parsed. For example, to parse a list of statements:
nonterminal Stmts
Reductions only execute if they are part of the final syntactic tree.
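The way a list value grows through such rules can be modeled in plain Rust. This is only a sketch of the idea, not grammar-file syntax: the Stmt type and the functions reduce_single and reduce_append are hypothetical stand-ins for a base rule (one statement forms a list) and a recursive rule (a list followed by a statement forms a longer list).

```rust
/// Hypothetical statement type; a real grammar would carry its own value type.
#[derive(Clone, Debug, PartialEq)]
enum Stmt {
    Print(String),
    Assign(String, i64),
}

/// Models a base rule: a single statement reduces to a one-element list.
fn reduce_single(stmt: Stmt) -> Vec<Stmt> {
    vec![stmt]
}

/// Models a recursive rule: a list followed by a statement reduces to a longer list.
fn reduce_append(mut stmts: Vec<Stmt>, stmt: Stmt) -> Vec<Stmt> {
    stmts.push(stmt);
    stmts
}

fn main() {
    let parsed = [
        Stmt::Assign("x".to_string(), 1),
        Stmt::Print("x".to_string()),
    ];
    let mut iter = parsed.iter().cloned();
    let first = iter.next().expect("at least one statement");
    // Apply the base rule once, then the recursive rule for every further statement.
    let list = iter.fold(reduce_single(first), reduce_append);
    println!("{:?}", list);
}
```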
Precedence by annotations
To avoid ambiguities you have two options: ensure that the grammar does not contain them, or prioritize rules by introducing annotations. In the calculator example we have seen two kinds:
- #[priority(p_name)] to declare a rule with priority p_name. Later there should be a macro-like ordering!(p_0,p_1,p_2,...) to indicate that p_0 should reduce before p_1.
- #[associativity(left/right)] to decide how to proceed when nesting the same rule.
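To make the effect of priority and associativity concrete, here is a small self-contained evaluator. It uses precedence climbing, a different technique from the Earley parsing this crate implements, and is meant only to show how the two notions change the result of an expression. The operator table, the function names, and the restriction to single-digit operands are all assumptions of the sketch.

```rust
#[derive(Clone, Copy, PartialEq)]
enum Assoc {
    Left,
    Right,
}

/// Illustrative table: higher priority binds tighter.
fn operator_info(op: char) -> Option<(u8, Assoc)> {
    match op {
        '+' | '-' => Some((1, Assoc::Left)),
        '*' | '/' => Some((2, Assoc::Left)),
        '^' => Some((3, Assoc::Right)),
        _ => None,
    }
}

fn apply(op: char, left: f64, right: f64) -> f64 {
    match op {
        '+' => left + right,
        '-' => left - right,
        '*' => left * right,
        '/' => left / right,
        _ => left.powf(right),
    }
}

/// Evaluates single-digit operands separated by operators (no spaces or parentheses).
fn evaluate(chars: &[char], position: &mut usize, min_priority: u8) -> Option<f64> {
    let mut left = chars.get(*position)?.to_digit(10)? as f64;
    *position += 1;
    while let Some(&op) = chars.get(*position) {
        let (priority, assoc) = operator_info(op)?;
        if priority < min_priority {
            break;
        }
        *position += 1;
        // Left associativity: the right operand may only contain tighter operators.
        // Right associativity: it may also contain the same operator again.
        let next_min = if assoc == Assoc::Left { priority + 1 } else { priority };
        let right = evaluate(chars, position, next_min)?;
        left = apply(op, left, right);
    }
    Some(left)
}

fn calculate(text: &str) -> Option<f64> {
    let chars: Vec<char> = text.chars().collect();
    let mut position = 0;
    let value = evaluate(&chars, &mut position, 1)?;
    if position == chars.len() { Some(value) } else { None }
}

fn main() {
    println!("{:?}", calculate("1-2-3")); // left associative: (1-2)-3
    println!("{:?}", calculate("1+2*3")); // * has higher priority than +
    println!("{:?}", calculate("2^3^2")); // right associative: 2^(3^2)
}
```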
Example: Parsing JSON
extern crate gramatica;
use Ordering;
use ;
//See https://www.json.org/
use Rc;
//We define an auxiliary type to store JSON values
// ---- Start of the grammar ----
keyword_terminal!;
keyword_terminal!;
keyword_terminal!;
re_terminal!;
terminal LitStr
re_terminal!;
re_terminal!;
re_terminal!;
re_terminal!;
re_terminal!;
re_terminal!;
re_terminal!;//Otherwise skip spaces
nonterminal Object
nonterminal Members
nonterminal Pair
nonterminal Array
nonterminal Elements
nonterminal Value
// ---- End of the grammar ----
use ;
//As an example, we parse stdin for a JSON object
Example: Parsing basic XML
//A very basic xml grammar
extern crate gramatica;
use Ordering;
use ;
// see https://www.w3.org/People/Bos/meta-bnf
// also http://cs.lmu.edu/~ray/notes/xmlgrammar/
use Rc;
//We define an auxiliary type to store XML elements
// ---- Start of the grammar ----
re_terminal!;
re_terminal!;
terminal LitStr
re_terminal!;
re_terminal!;
re_terminal!;
re_terminal!;
re_terminal!;
re_terminal!;
nonterminal Document
nonterminal Element
nonterminal EmptyElemTag
nonterminal Attributes
nonterminal Attribute
nonterminal STag
nonterminal ETag
nonterminal Content
nonterminal Contents
nonterminal MaybeSpace
nonterminal CharData
// ---- End of the grammar ----
use ;
//As an example, we parse stdin for an XML element