pub struct SyntaxDirectedTranslatorBuilder<LexemeType, Context, Satellite>where
LexemeType: AutomaticallyHandled,{ /* private fields */ }
Expand description
An interface for specifying and compiling a SyntaxDirectedTranslator.
An instance of this type serve as a “contractor” for building a LALR parser. It exports an API to communicate the specifications of the parser, and when you’re done - its build can be used to compile a matching instance of SyntaxDirectedTranslator.
§Features
§Bindings
A group of terminal symbols (i.e. lexeme types) can be associated with a binding, which can then be attached to the grammar’s production rules. Bindings are used to resolve shift-reduce conflicts: when deciding between shifting by a bound terminal or reducing by a bound rule, the one with higher priority binding is selected. If both have the same binding - we reduce if it is left-associated, and shift if it is right-associated.
This can be used to implement associativity and precedence of operators. For example - one left-associated higher-priority binding for multiplication and division, and another left-associated lower-priority binding for addition and subtraction.
§Dubs
To make the parser’s programmatic specification cleaner, the notion of dubs was introduced: instead of referring to a terminal, nonterminal, or a binding, by some complex Rust object, these are referred to by their “dubs” - character strings that identify them.
§Translation Scheme Specifications
The compiled parser will be capable of translating a given sequence of lexemes into some higher-level representation (AST, IR sequence, etc.). We refer to this higher level representation as satellite data (as it escorts subtrees of the input’s syntax tree). Client code is free to determine this target representation and the translation logic.
This translation logic is carried out recursively: a subtree of the input syntax tree is translated into satellite data as a function of the translations of its own subtrees. We call such translation functions satellite reducers, as they execute after each reduction of grammar symbols by the underlying LR parser. At the leaves, lexemes are directly translated to satellite data using dedicated functions we call leaf satellite builders.
All of these translation functions receive a mutable reference to the translation context as an additional argument, which is a user-defined object that’s used to hold global knowledge about the translated input, such as symbol tables and general statistics.
All of this translation logic is of course user-defined, providing client code with great flexibility when building custom syntax-directed translators.
§Example
Check out the example at parsing.
§Conflict Resolution Policy
- Reduce-reduce conflicts cannot be resolved.
- Shift-reduce conflicts are resolved with “shift” by default, unless otherwise specified by bindings.
Implementations§
Source§impl<LexemeType, Context, Satellite> SyntaxDirectedTranslatorBuilder<LexemeType, Context, Satellite>where
LexemeType: AutomaticallyHandled,
impl<LexemeType, Context, Satellite> SyntaxDirectedTranslatorBuilder<LexemeType, Context, Satellite>where
LexemeType: AutomaticallyHandled,
Sourcepub fn new() -> Self
pub fn new() -> Self
Create a blank SyntaxDirectedTranslatorBuilder, with no registered specifications.
Sourcepub fn dub_lexeme_type(&mut self, lexeme_type: LexemeType, dub: &str)
pub fn dub_lexeme_type(&mut self, lexeme_type: LexemeType, dub: &str)
Dub with a category of lexemes.
Remember: lexeme types serve as terminal symbols for the underlying LALR parser. This function, in effect, dubs a terminal symbol in the grammar.
§Panics
If some other grammar symbol is already identified by this dub.
Sourcepub fn dub_lexeme_types<'a>(
&mut self,
lexeme_type_dubs: impl Iterator<Item = (LexemeType, &'a str)>,
)
pub fn dub_lexeme_types<'a>( &mut self, lexeme_type_dubs: impl Iterator<Item = (LexemeType, &'a str)>, )
Dub multiple categories of lexemes at once.
§Panics
If a new dub is already used to dub another grammar symbol.
Sourcepub fn new_nonterminal(&mut self, dub: &str)
pub fn new_nonterminal(&mut self, dub: &str)
Create a new nonterminal grammar symbol, associated with the given dub.
§Panics
If some other grammar symbol is already identified by this dub.
Sourcepub fn new_nonterminals<'a>(&mut self, dubs: impl Iterator<Item = &'a str>)
pub fn new_nonterminals<'a>(&mut self, dubs: impl Iterator<Item = &'a str>)
Create multiple nonterminals at once, associated with the given dubs.
§Panics
If a new dub is already used to dub another grammar symbol.
Sourcepub fn set_start_nonterminal(&mut self, dub: &str)
pub fn set_start_nonterminal(&mut self, dub: &str)
Set the start nonterminal of the underlying grammar.
The nonterminal is identified by its dub
.
§Panics
If no nonterminal is associated with this dub.
Sourcepub fn new_binding(
&mut self,
bound_lexeme_types_dubs: Vec<&str>,
associativity: Associativity,
binding_dub: &str,
)
pub fn new_binding( &mut self, bound_lexeme_types_dubs: Vec<&str>, associativity: Associativity, binding_dub: &str, )
Register a new binding.
Note that the order of registration determines the priority of the bindings: bindings registered earlier have higher priority over bindings registered later.
§Arguments
bound_lexeme_types_dubs
- the dubs of the terminal grammar symbols (lexeme categories) that are bound to this binding.associativity
- the binding’s associativity.binding_dub
- a dub, used to identify this binding.
§Panics
If the binding’s dub is already used to dub another binding, or if no terminal symbol is
dubbed with one of the bound_lexeme_types_dubs
.
Sourcepub fn set_leaf_satellite_builder<F>(
&mut self,
lexeme_type_dub: &str,
builder: F,
)
pub fn set_leaf_satellite_builder<F>( &mut self, lexeme_type_dub: &str, builder: F, )
Set the function that’ll transform lexemes of a certain type into satellite data.
When a lexeme of the category dubbed lexeme_type_dub
will be encountered, builder
will
be called with the lexeme’s contents (alongside a reference to the translation context), and
the output will serve as the satellite data attached to this “terminal symbol”.
§Panics
If lexeme_type_dub
isn’t a known lexeme-type dub.
Sourcepub fn set_default_leaf_satellite_builder<F>(&mut self, builder: F)
pub fn set_default_leaf_satellite_builder<F>(&mut self, builder: F)
Set a leaf-satellite builder to process lexeme types for which no builder was set with SyntaxDirectedTranslatorBuilder::set_leaf_satellite_builder.
Sourcepub fn register_rule<F>(
&mut self,
lhs: &str,
rhs: Vec<&str>,
satellite_reducer: F,
)
pub fn register_rule<F>( &mut self, lhs: &str, rhs: Vec<&str>, satellite_reducer: F, )
Add a production rule to the underlying context-free grammar.
§Arguments
lhs
- the dub of the nonterminal in the left-hand side of the production.rhs
- a sequence of grammar-symbols dubs (terminals and nonterminals), that appear in the right-hand side of the production.satellite_reducer
- when reducing by the new rule, this function will be called with the satellite data of the right-hand side grammar symbols (and the translation context), and its output will serve as the satellite data attached to the left-hand side nonterminal.
§Panics
If any the given dubs isn’t known as a matching dub.
Sourcepub fn register_bound_rule<F>(
&mut self,
lhs: &str,
rhs: Vec<&str>,
binding_dub: &str,
satellite_reducer: F,
)
pub fn register_bound_rule<F>( &mut self, lhs: &str, rhs: Vec<&str>, binding_dub: &str, satellite_reducer: F, )
Register a production rule with an associated binding.
The associated binding is identified by its dub - binding_dub
. The rest of the arguments
are identical to SyntaxDirectedTranslatorBuilder::register_rule.
§Panics
If any the given dubs isn’t known as a matching dub.
Sourcepub fn register_identity_rule(&mut self, lhs: &str, rhs: &str)
pub fn register_identity_rule(&mut self, lhs: &str, rhs: &str)
Register a production rule for which the satellite reducer copies the satellite data from its 1-length RHS to its LHS.
This is commonly used for productions, such as expression -> integer_literal
.
§Arguments
lhs
- the dub of the nonterminal in the left-hand side of the production.rhs
- the dub of the only grammar-symbol in the right-hand side of the production.
§Panics
If any the given dubs isn’t known as a matching dub.
Sourcepub fn register_empty_rule<F>(
&mut self,
lhs: &str,
default_satellite_builder: F,
)where
F: Fn(&mut Context) -> Satellite + 'static,
pub fn register_empty_rule<F>(
&mut self,
lhs: &str,
default_satellite_builder: F,
)where
F: Fn(&mut Context) -> Satellite + 'static,
Register a production rule whose RHS is empty.
This is commonly used with nonterminals that represent lists that can be empty. For
example statements_list -> _nothing_
.
§Arguments
lhs
- the dub of the nonterminal in the left-hand side of the production.default_satellite_builder
- when reducing by this rule, this will be called (with the translation context), and its output will serve as the satellite data attached to the left-hand side nonterminal.
§Panics
If lhs
is not a known dub for a nonterminal.
Sourcepub fn build(self) -> SyntaxDirectedTranslator<LexemeType, Context, Satellite>
pub fn build(self) -> SyntaxDirectedTranslator<LexemeType, Context, Satellite>
Compile the set specifications into a functioning SyntaxDirectedTranslator.