Crate rustlr

rustlr is a parser generator that can create LALR(1) as well as full LR(1) parsers. It also recognizes operator precedence and associativity declarations, which allow the use of some ambiguous grammars. Parsers have optional access to external state information, which allows them to recognize more than just context-free languages. A classical method of error recovery is used. The parser generator can produce a full LR(1) parser from the grammar of an early version of Java (Java 1.4) in approximately 10-20 seconds on contemporary processors.

This crate does not form an API: most of the exported items are required only by the generated parsers. The user needs to provide a grammar and a lexical analyzer that implements the Lexer trait. The only lexer provided is a simple one (charlexer) that returns the individual characters of a string.

Example

Given the grammar at https://cs.hofstra.edu/~cscccl/rustlr_project/calculator.grammar,

 rustlr calculator.grammar lr1

generates an LR(1) parser as a Rust program (https://cs.hofstra.edu/~cscccl/rustlr_project/calculatorparser.rs). This program includes a make_parser function, which can be used as in

 let mut scanner = Exprscanner::new(&sourcefile);
 let mut parser1 = make_parser();
 let absyntree = parser1.parse(&mut scanner);

Here, Exprscanner is a structure that must implement the Lexer trait required by the generated parser.

A relatively self-contained grammar, which also shows how to use its generated parser, is at https://cs.hofstra.edu/~cscccl/rustlr_project/cpm.grammar.

A detailed tutorial is being prepared at https://cs.hofstra.edu/~cscccl/rustlr_project/ that will explain the format of grammars and how to generate and use parsers for several sample languages. The examples in the tutorial use basic_lexer (https://docs.rs/basic_lexer/0.1.2/basic_lexer/), which was written by the same author, but other tokenizers, such as scanlex (https://docs.rs/scanlex/0.1.4/scanlex/), can easily be adapted as well.

Structs

This structure is expected to be returned by the lexical analyzer (Lexer objects). Furthermore, the .sym field of a Lextoken must match the name of a terminal symbol specified in the grammar that defines the language. AT is the type of the value attached to the token, which is usually some enum that distinguishes between numbers, keywords, alphanumeric symbols and other symbols. See the tutorial and examples at https://cs.hofstra.edu/~cscccl/rustlr_project on how to define the right kind of AT.
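As a rough illustration of what such an AT might look like, the sketch below defines a value enum and builds tokens from single words. Everything here is an assumption made for illustration: the local Lextoken is a stand-in that only mirrors the description above (a sym field plus an attached value), and the names Tokval and classify, the keywords "let" and "in", and the terminal names "num" and "var" are hypothetical, not taken from the crate or the tutorial.

```rust
// Stand-in for the crate's Lextoken, written from the description above.
// The real definition may differ; the Default bound is an assumption.
#[derive(Debug, Clone, PartialEq)]
pub struct Lextoken<AT: Default> {
    pub sym: String, // must match the name of a terminal symbol in the grammar
    pub value: AT,   // the value attached to the token
}

// Hypothetical AT: distinguishes numbers, keywords, alphanumeric and other symbols.
#[derive(Debug, Clone, PartialEq)]
pub enum Tokval {
    Num(i64),
    Keyword(String),
    Alphanum(String),
    Symbol(String),
    Nothing,
}

impl Default for Tokval {
    fn default() -> Self {
        Tokval::Nothing
    }
}

// Classify one word into a token. The terminal names "num" and "var" are
// assumed to be declared in a hypothetical grammar.
pub fn classify(word: &str) -> Lextoken<Tokval> {
    if let Ok(n) = word.parse::<i64>() {
        return Lextoken { sym: "num".to_string(), value: Tokval::Num(n) };
    }
    if word == "let" || word == "in" {
        // keywords are their own terminal symbols
        return Lextoken { sym: word.to_string(), value: Tokval::Keyword(word.to_string()) };
    }
    if !word.is_empty() && word.chars().all(|c| c.is_alphanumeric() || c == '_') {
        return Lextoken { sym: "var".to_string(), value: Tokval::Alphanum(word.to_string()) };
    }
    // anything else (operators, punctuation) is passed through as its own terminal
    Lextoken { sym: word.to_string(), value: Tokval::Symbol(word.to_string()) }
}
```

The point of the design is that sym carries only the grammar-level category, while the enum keeps the concrete value that semantic actions may need later.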

This structure is only exported because it is required by the generated parsers. There is no reason to use it in other programs.

This is the structure created by the generated parser. The generated parser program will contain a make_parser function that returns this structure. Most of the pub items, however, are only exported to support the operation of the parser and should not be accessed directly. Only the functions RuntimeParser::parse, RuntimeParser::report, RuntimeParser::abort and RuntimeParser::error_occurred should be called directly from user programs. Only the field RuntimeParser::exstate should be accessed by user programs.

This is a sample Lexer implementation designed to return every character in a string as a separate token, and is used in small grammars for testing and illustration purposes. It is assumed that the characters read are defined as terminal symbols in the grammar.
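The idea of a character-level lexer can be sketched as follows. This is not the crate's charlexer: the Lextoken struct and Lexer trait below are local stand-ins, and the method name nextsym, the Default bound, and the type CharScanner are assumptions made so that the example compiles on its own.

```rust
// Local stand-ins; the crate's own Lextoken and Lexer may be defined differently.
pub struct Lextoken<AT: Default> {
    pub sym: String,
    pub value: AT,
}

pub trait Lexer<AT: Default> {
    // Returns the next token, or None at end of input (assumed signature).
    fn nextsym(&mut self) -> Option<Lextoken<AT>>;
}

// Hypothetical scanner that yields every character as a separate token.
pub struct CharScanner {
    chars: Vec<char>,
    pos: usize,
}

impl CharScanner {
    pub fn new(input: &str) -> Self {
        CharScanner { chars: input.chars().collect(), pos: 0 }
    }
}

impl Lexer<char> for CharScanner {
    fn nextsym(&mut self) -> Option<Lextoken<char>> {
        let c = *self.chars.get(self.pos)?; // None once the input is exhausted
        self.pos += 1;
        // The character itself serves as the terminal symbol name, so the
        // grammar is assumed to declare each such character as a terminal.
        Some(Lextoken { sym: c.to_string(), value: c })
    }
}

// Usage: collect the terminal names produced for an input string.
pub fn all_syms(input: &str) -> Vec<String> {
    let mut scanner = CharScanner::new(input);
    let mut syms = Vec::new();
    while let Some(tok) = scanner.nextsym() {
        syms.push(tok.sym);
    }
    syms
}
```

Note that this sketch does not skip whitespace; every character, including a space, becomes a token.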

Enums

This enum is only exported because it’s used by the generated parsers. There is no reason to use it in other programs.

Traits

This trait defines the interface that any lexical analyzer must be adapted to.

Functions

This function is only exported because it’s used by the generated parsers.

This is the only function that can invoke the parser generator externally, without running rustlr (rustlr::main) directly. It expects to find a file of the form grammarname.grammar. The option argument can currently only be “lr1” or “lalr”. It generates a parser in a file named grammarnameparser.rs.
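The file-naming convention and the accepted options described above can be summarized in a small helper. The function parser_file_names is hypothetical and is not part of the crate; it only restates the convention (grammarname.grammar in, grammarnameparser.rs out) and does not invoke the generator.

```rust
// Hypothetical helper: given a grammar name and an option, return the expected
// input file name and the generated parser's file name, or an error for an
// option other than "lr1" or "lalr".
pub fn parser_file_names(grammarname: &str, option: &str) -> Result<(String, String), String> {
    match option {
        "lr1" | "lalr" => Ok((
            format!("{}.grammar", grammarname),   // file the generator expects to find
            format!("{}parser.rs", grammarname),  // file the generator writes
        )),
        other => Err(format!("unsupported option: {}", other)),
    }
}
```

For the calculator example, the name "calculator" with option "lr1" corresponds to reading calculator.grammar and writing calculatorparser.rs.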