Rusty Regex
Rusty Regex is a minimalistic regex engine written in Rust. It compiles a regex pattern into an Abstract Syntax Tree (AST), then builds a Non-deterministic Finite Automaton (NFA), converts that NFA into a Deterministic Finite Automaton (DFA) via subset construction, and finally uses the DFA to match input strings.
Supported Features
- Literals & Wildcards
- Literal characters (e.g.,
"abc") - Wildcard
.matching any single character
- Literal characters (e.g.,
- Character Classes
- Explicit classes: e.g.,
[abc] - Ranges: e.g.,
[a-c]
- Explicit classes: e.g.,
- Predefined Classes
\dfor digits\wfor word characters (letters, digits, underscore)
- Quantifiers
- Zero or more (
*) - One or more (
+) - Zero or one (
?) - Exact count (
{n}) - Range (
{m,n}or{m,})
- Zero or more (
- Grouping & Capturing
- Parentheses for grouping (all groups are capturing by default)
- Alternation
- The pipe operator (
|) for alternatives
- The pipe operator (
- Anchors
- Start (
^) and end ($) anchors to match input positions
- Start (
Design Overview
Rusty Regex works in several phases:
- Parsing: Converts the regex string into an AST.
- NFA Construction: Transforms the AST into an NFA.
- DFA Conversion: Converts the NFA into a DFA using subset construction.
- Matching: Simulates the DFA over input strings (scanning for matches).
Example Usage
Below is a simple example demonstrating how to compile a regex and test input strings:
use Regex;