§Core support library for lexers and parsers generated by parlex-gen.
§Overview
Parlex is a suite of Rust-based tools for the generation of efficient lexical analyzers and parsers. The project comprises two complementary crates: parlex, the core support library, and parlex-gen, which provides the ALEX lexer generator and the ASLR parser generator. Together, these components form a cohesive and extensible framework for language parsing, analysis, and compiler front-end development.
The system is inspired by the classic lex (flex) and yacc (bison) utilities written for C, but provides a Rust-based implementation that is more composable and improves upon ambiguity resolution. Unlike lex and yacc, which interleave user-defined code with automatically generated code, Parlex maintains a strict separation between specification and implementation: grammar rules and lexer definitions are explicitly named, and user code refers to them symbolically.
The ALEX lexer generator offers expressive power comparable to that of lex or flex. It employs Rust’s standard regular expression libraries to construct deterministic finite automata (DFAs) that efficiently recognize lexical patterns at runtime. ALEX supports multiple lexical states, enabling precise and context-sensitive tokenization.
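To make the idea of multiple lexical states concrete, here is a minimal hand-written sketch of a tokenizer that switches between two modes. It is not ALEX's specification format or its generated code; `Mode`, `Tok`, `flush_word`, and `tokenize` are hypothetical names chosen for illustration, and the matching is done by hand rather than by a DFA.

```rust
/// Hypothetical lexer modes for this sketch; not parlex or ALEX types.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Mode {
    Code,
    Comment,
}

/// Hypothetical token type for this sketch.
#[derive(Debug, PartialEq)]
enum Tok {
    Word(String),
    Comment(String),
}

/// Emits the buffered characters as a `Word` token, if there are any.
fn flush_word(buf: &mut String, toks: &mut Vec<Tok>) {
    if !buf.is_empty() {
        toks.push(Tok::Word(std::mem::take(buf)));
    }
}

/// Tokenizes `input`; `/*` enters `Mode::Comment` and `*/` returns to `Mode::Code`.
/// The same characters are treated differently depending on the active mode.
fn tokenize(input: &str) -> Vec<Tok> {
    let mut mode = Mode::Code;
    let mut toks = Vec::new();
    let mut buf = String::new();
    let mut rest = input;

    while !rest.is_empty() {
        match mode {
            Mode::Code => {
                if let Some(r) = rest.strip_prefix("/*") {
                    flush_word(&mut buf, &mut toks);
                    mode = Mode::Comment;
                    rest = r;
                } else {
                    let c = rest.chars().next().expect("rest is non-empty");
                    if c.is_whitespace() {
                        flush_word(&mut buf, &mut toks);
                    } else {
                        buf.push(c);
                    }
                    rest = &rest[c.len_utf8()..];
                }
            }
            Mode::Comment => {
                if let Some(r) = rest.strip_prefix("*/") {
                    toks.push(Tok::Comment(std::mem::take(&mut buf)));
                    mode = Mode::Code;
                    rest = r;
                } else {
                    let c = rest.chars().next().expect("rest is non-empty");
                    buf.push(c);
                    rest = &rest[c.len_utf8()..];
                }
            }
        }
    }

    // End of input: emit whatever is still buffered for the active mode.
    match mode {
        Mode::Code => flush_word(&mut buf, &mut toks),
        Mode::Comment => toks.push(Tok::Comment(buf)), // unterminated comment
    }
    toks
}
```

Calling `tokenize("a /* b c */ d")` yields a word, a comment whose text keeps its inner whitespace, and another word; the words inside the comment are never emitted as `Word` tokens because the whitespace rule only applies in `Mode::Code`.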
The ASLR parser generator implements the SLR(1) parsing algorithm, which is somewhat less general than the LALR(1) approach used by yacc and bison. However, ASLR introduces an important improvement: it supports dynamic runtime resolution of shift/reduce ambiguities, providing greater flexibility in languages such as Prolog, where operator definitions may be introduced or redefined dynamically.
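The following sketch illustrates what resolving a shift/reduce ambiguity at runtime can look like when operator definitions live in a mutable table. It is an assumption-laden illustration, not ASLR's actual mechanism or API: `OpTable`, `Assoc`, `Resolution`, and `resolve` are hypothetical names, and the convention that a higher number binds tighter is chosen for this example (Prolog itself uses the opposite convention).

```rust
use std::collections::HashMap;

/// Associativity of a binary operator (hypothetical type for this sketch).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Assoc {
    Left,
    Right,
}

/// Outcome of resolving a shift/reduce ambiguity.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Resolution {
    Shift,
    Reduce,
}

/// Hypothetical operator table that can change while parsing is in progress.
/// In this sketch, a higher precedence number binds tighter.
struct OpTable {
    ops: HashMap<String, (u32, Assoc)>,
}

impl OpTable {
    fn new() -> Self {
        OpTable { ops: HashMap::new() }
    }

    /// Defines or redefines an operator.
    fn define(&mut self, name: &str, prec: u32, assoc: Assoc) {
        self.ops.insert(name.to_string(), (prec, assoc));
    }

    /// With `E stack_op E` on the stack and `lookahead` as the next token,
    /// decides whether to reduce first or shift the lookahead.
    /// Returns `None` if either operator is not currently defined.
    fn resolve(&self, stack_op: &str, lookahead: &str) -> Option<Resolution> {
        let (left_prec, left_assoc) = *self.ops.get(stack_op)?;
        let (right_prec, _) = *self.ops.get(lookahead)?;
        Some(if left_prec > right_prec {
            Resolution::Reduce
        } else if left_prec < right_prec {
            Resolution::Shift
        } else {
            // Equal precedence: associativity of the operator on the stack decides.
            match left_assoc {
                Assoc::Left => Resolution::Reduce,
                Assoc::Right => Resolution::Shift,
            }
        })
    }
}
```

Because the decision is made by consulting the table at parse time rather than being baked into the parse tables, redefining an operator with `define` changes how later occurrences of the same conflict are resolved. A statically resolved yacc-style grammar cannot do this without regenerating the parser.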
The parlex crate serves as the core runtime and support library for the generated lexers and parsers. It defines the traits, data structures, and runtime abstractions that underpin the generated code, ensuring consistent behavior and interoperability across user-defined grammars. All lexers and parsers produced by the parlex-gen tools depend on this crate, and users extend its interfaces to build custom language processors based on their generated components.
§Usage
Add this to your Cargo.toml:
[dependencies]
parlex = "0.1"
This crate is typically used in conjunction with code generated by parlex-gen. See the parlex-gen documentation for information on generating lexers and parsers.
§Example
use parlex::*;
// Your generated lexer and parser code will use parlex traits and types.
// Example usage depends on your specific grammar and generated code.
§License
Copyright (c) 2005–2025 IKH Software, Inc.
Released under the terms of the GNU Lesser General Public License, version 3.0 or (at your option) any later version (LGPL-3.0-or-later).
§See Also
- parlex-gen: code generation tools (alex and aslr)
- arena-terms-parser: real-world example using ALEX and ASLR
Structs§
- Lexer: Core lexer implementation and execution engine.
- LexerStats: Statistics collected by the lexer during processing.
- Parser: Core parser implementation and execution engine.
- ParserStats: Statistics collected during the parsing process.
Enums§
- LexerError: Represents all possible errors that can occur during lexical analysis.
- ParserAction: Parser action used by the parsing automaton.
- ParserError: Represents all possible errors that can occur during parsing.
Traits§
- LexerData: Defines the data and configuration used by a lexer. Provides access to lexer modes, rules, DFA data, and lookup utilities.
- LexerDriver: Defines the core lexer interface responsible for processing input streams and producing tokens.
- LexerMode: A trait representing a lexer mode used during lexical analysis. Each mode can be converted into an index and defines a total count of modes.
- LexerRule: A trait representing a lexer rule used for token matching. Each rule can be converted into an index and defines a total count of rules along with a designated end rule.
- ParserAmbigID: A trait representing an identifier for an ambiguity in the grammar.
- ParserData: Defines the data and configuration used by a parser. Provides access to parser states, productions, ambiguities, and lookup tables.
- ParserDriver: Core interface for a parser driver.
- ParserProdID: A trait representing an identifier for a grammar production rule.
- ParserStateID: A trait representing an identifier for a parser state.
- ParserTokenID: A trait representing an identifier for a terminal or nonterminal token in the grammar.
- Token: A trait representing a token in lexical analysis or parsing.
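Several of the traits above are described as convertible into an index and as defining a total count. The sketch below shows why that pairing is useful, using a stand-in trait: `Indexed`, `Mode`, and `Counters` are hypothetical names defined here, and parlex's real trait signatures may well differ, so consult the individual trait pages for the actual definitions.

```rust
use std::marker::PhantomData;

/// Hypothetical stand-in for an "index plus total count" trait.
/// This is NOT parlex's `LexerMode` or `LexerRule` definition.
trait Indexed: Copy {
    /// Total number of distinct values.
    const COUNT: usize;
    /// Dense index in `0..Self::COUNT`.
    fn index(self) -> usize;
}

/// Hypothetical set of lexer modes.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Mode {
    Expr,
    Str,
    Comment,
}

impl Indexed for Mode {
    const COUNT: usize = 3;
    fn index(self) -> usize {
        // Fieldless enum variants cast to their discriminants: 0, 1, 2.
        self as usize
    }
}

/// Dense per-value counters stored in a flat vector, with no hashing needed.
struct Counters<T: Indexed> {
    slots: Vec<u64>,
    _marker: PhantomData<T>,
}

impl<T: Indexed> Counters<T> {
    fn new() -> Self {
        Counters { slots: vec![0; T::COUNT], _marker: PhantomData }
    }

    fn bump(&mut self, key: T) {
        self.slots[key.index()] += 1;
    }

    fn get(&self, key: T) -> u64 {
        self.slots[key.index()]
    }
}
```

With an index and a known count, generic code can allocate a table of exactly `COUNT` entries up front and address it in constant time, which is the usual reason table-driven lexers and parsers identify modes, rules, states, and productions this way.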