Rusty Regex
Rusty Regex is a regex engine written without dependencies. It compiles a regex pattern into an Abstract Syntax Tree (AST), then builds a Non-deterministic Finite Automaton (NFA), converts that NFA into a Deterministic Finite Automaton (DFA) via subset construction, and finally uses the DFA to match input strings.
Supported Features
- Literals & Wildcards
- Literal characters (e.g.,
"abc"
) - Wildcard
.
matching any single character
- Literal characters (e.g.,
- Character Classes
- Explicit classes: e.g.,
[abc]
- Ranges: e.g.,
[a-c]
- Explicit classes: e.g.,
- Predefined Classes
\d
for digits\w
for word characters (letters, digits, underscore)
- Quantifiers
- Zero or more (
*
) - One or more (
+
) - Zero or one (
?
) - Exact count (
{n}
) - Range (
{m,n}
or{m,}
)
- Zero or more (
- Grouping & Capturing
- Parentheses for grouping (all groups are capturing by default)
- Alternation
- The pipe operator (
|
) for alternatives
- The pipe operator (
- Anchors
- Start (
^
) and end ($
) anchors to match input positions
- Start (
Design Overview
Rusty Regex works in several phases:
- Parsing: Converts the regex string into an AST.
- NFA Construction: Transforms the AST into an NFA.
- DFA Conversion: Converts the NFA into a DFA using subset construction.
- Matching: Simulates the DFA over input strings (scanning for matches).
Example Usage
Below is a simple example demonstrating how to compile a regex and test input strings:
use Regex;