Struct parol_runtime::lexer::tokenizer::Tokenizer
source · pub struct Tokenizer {
pub error_token_type: TerminalIndex,
/* private fields */
}Expand description
The Tokenizer creates a specially formatted regular expression that can be used for tokenizing an input string.
Fields§
§error_token_type: TerminalIndexThis is the token index for the special error token. Its value isn’t constant and depends on the given token count. It is always the last token that is tried to match and usually indicates an error.
Implementations§
source§impl Tokenizer
impl Tokenizer
sourcepub fn build(
augmented_terminals: &[&str],
scanner_specifics: &[&str],
scanner_terminal_indices: &[usize]
) -> Result<Self>
pub fn build( augmented_terminals: &[&str], scanner_specifics: &[&str], scanner_terminal_indices: &[usize] ) -> Result<Self>
Creates a new Tokenizer object from augmented terminals and scanner specific information.
Arguments
augmented_terminals
All valid terminals of the grammar. These include the specific common terminals
EOI, NEW_LINE, WHITESPACE, LINE_COMMENT, BLOCK_COMMENT with the value
UNMATCHABLE_TOKEN to provide consistent index handling for all scanner states.
scanner_specifics
The values of the five scanner specific common terminals EOI, NEW_LINE, WHITESPACE,
LINE_COMMENT and BLOCK_COMMENT
scanner_terminal_indices
The indices of token types belonging to this scanner state. These indices are pointing into
augmented_terminals.
Errors
This function will return an error if the regex patterns can’t be compiled.