ReXile
A blazing-fast regex engine with 10-100x faster compilation speed
ReXile is a lightweight regex alternative that achieves exceptional compilation speed while maintaining competitive matching performance:
- 10-100x faster compilation - Load patterns instantly
- Competitive matching - 1.4-1.9x faster on simple patterns
- Dot wildcard support - Full `.`, `.*`, `.+` implementation with backtracking
- Only 2 dependencies - `memchr` and `aho-corasick` for SIMD primitives
- Smart backtracking - Handles complex patterns with quantifiers
- Perfect for parsers - Ideal for GRL, DSL, and rule engines
Key Features:
- Literal searches with SIMD acceleration
- Multi-pattern matching (alternations)
- Character classes with negation
- Quantifiers (`*`, `+`, `?`)
- Dot wildcard (`.`, `.*`, `.+`) with backtracking
- Escape sequences (`\d`, `\w`, `\s`, etc.)
- Sequences and groups
- Word boundaries (`\b`, `\B`)
- Anchoring (`^`, `$`)
- Capturing groups - Auto-detection and extraction
Purpose
ReXile is a high-performance regex engine optimized for fast compilation:
- Lightning-fast compilation - 10-100x faster than the `regex` crate
- Competitive matching - Faster on simple patterns, acceptable on complex ones
- Ideal for parsers - GRL, DSL, rule engines with dynamic patterns
- Minimal dependencies - Only `memchr` + `aho-corasick` for SIMD primitives
- Memory efficient - 15x less compilation memory
- Full control - Custom optimizations for specific use cases
Performance Highlights
Compilation Speed (vs regex crate):
- Pattern `[a-zA-Z_]\w*`: 104.7x faster
- Pattern `\d+`: 46.5x faster
- Pattern `(\w+)\s*(>=|<=|==|!=|>|<)\s*(.+)`: 40.7x faster
- Pattern `.*test.*`: 15.3x faster
- Overall: 10-100x faster compilation
Matching Speed:
- Simple patterns (`\d+`, `\w+`): 1.4-1.9x faster
- Complex patterns with backtracking: 2-10x slower (acceptable off the hot path)
- A good trade-off for parsers and rule engines
Use Case Example (Load 1000 GRL rules):
- regex crate: ~2 seconds compilation
- rexile: ~0.02 seconds (100x faster startup!)
Memory Comparison:
- Compilation: 15x less memory (128 KB vs 1920 KB)
- Peak memory: 5x less in stress tests (0.12 MB vs 0.62 MB)
- Search operations: Equal memory efficiency
When to Use ReXile:
- Parsers & lexers (fast token matching + instant startup)
- Rule engines with dynamic patterns (100x faster rule loading)
- DSL compilers (GRL, business rules)
- Applications with many patterns (instant initialization)
- Memory-constrained environments (15x less memory)
- Non-hot-path matching (acceptable trade-off for 100x faster compilation)
Quick Start
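use rexile::Pattern;
// Pattern and input strings below are illustrative.
// `find` is assumed to return the first match as Some((start, end)).
// Literal matching with SIMD acceleration
let pattern = Pattern::new("hello").unwrap();
assert!(pattern.is_match("hello world"));
assert_eq!(pattern.find("say hello"), Some((4, 9)));
// Multi-pattern matching (aho-corasick fast path)
let multi = Pattern::new("import|export|function").unwrap();
assert!(multi.is_match("export default"));
// Dot wildcard matching (with backtracking)
let dot = Pattern::new("a.c").unwrap();
assert!(dot.is_match("abc")); // . matches 'b'
assert!(dot.is_match("a_c")); // . matches '_'
// Greedy quantifiers with dot
let greedy = Pattern::new("a.*c").unwrap();
assert!(greedy.is_match("abc")); // .* matches 'b'
assert!(greedy.is_match("a12345c")); // .* matches '12345'
let plus = Pattern::new("a.+c").unwrap();
assert!(plus.is_match("abc")); // .+ matches 'b' (requires at least one char)
assert!(!plus.is_match("ac")); // .+ needs at least 1 character
// Digit matching (DigitRun fast path - 1.4-1.9x faster than regex!)
let digits = Pattern::new(r"\d+").unwrap();
let matches = digits.find_all("Order #12345 costs $49.99");
// Returns: [(7, 12), (20, 22), (23, 25)]
// Identifier matching (IdentifierRun fast path)
let ident = Pattern::new(r"[a-zA-Z_]\w*").unwrap();
assert!(ident.is_match("variable_name_123"));
// Quoted strings (QuotedString fast path - 1.4-1.9x faster!)
let quoted = Pattern::new(r#""[^"]+""#).unwrap();
assert!(quoted.is_match(r#"say "hello world""#));
// Word boundaries
let word = Pattern::new(r"\btest\b").unwrap();
assert!(word.is_match("this is a test"));
assert!(!word.is_match("testing"));
// Anchors
let exact = Pattern::new("^hello$").unwrap();
assert!(exact.is_match("hello"));
assert!(!exact.is_match("hello world"));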
Cached API (Recommended for Hot Paths)
For patterns used repeatedly in hot loops:
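use rexile;
// Argument order (pattern, text), the `find` helper, and all strings are assumptions.
// Automatically cached - compile once, reuse forever
assert!(rexile::is_match("test", "this is a test").unwrap());
assert_eq!(rexile::find("world", "hello world").unwrap(), Some((6, 11)));
// Perfect for parsers and lexers
for line in log_lines {
    if rexile::is_match("ERROR", line).unwrap() {
        // handle the matching line
    }
}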
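The compile-once, reuse-afterwards behaviour described above can be sketched with the standard library alone. This is only an illustration of the caching idea, not ReXile's actual implementation: `Compiled`, `cached`, and `cached_is_match` are hypothetical names, and the "compiled pattern" here is just a literal substring.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, OnceLock};

/// Hypothetical stand-in for a compiled pattern (a plain literal here).
struct Compiled {
    needle: String,
}

impl Compiled {
    fn compile(pattern: &str) -> Compiled {
        Compiled { needle: pattern.to_string() }
    }

    fn is_match(&self, text: &str) -> bool {
        text.contains(&self.needle)
    }
}

/// Process-wide cache keyed by pattern source text.
fn cache() -> &'static Mutex<HashMap<String, Arc<Compiled>>> {
    static CACHE: OnceLock<Mutex<HashMap<String, Arc<Compiled>>>> = OnceLock::new();
    CACHE.get_or_init(|| Mutex::new(HashMap::new()))
}

/// Return the cached pattern, compiling it only on first use.
fn cached(pattern: &str) -> Arc<Compiled> {
    let mut map = cache().lock().unwrap();
    if let Some(existing) = map.get(pattern) {
        return Arc::clone(existing);
    }
    let compiled = Arc::new(Compiled::compile(pattern));
    map.insert(pattern.to_string(), Arc::clone(&compiled));
    compiled
}

/// Convenience wrapper: look up (or compile) the pattern, then match.
fn cached_is_match(pattern: &str, text: &str) -> bool {
    cached(pattern).is_match(text)
}
```

Calling `cached_is_match("ERROR", line)` inside a loop pays the compile cost only on the first iteration; later calls take the lock and do a hash lookup.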
Supported Features
Fast Path Optimizations (10 Types)
ReXile uses JIT-style specialized implementations for common patterns:
| Fast Path | Pattern Example | Performance vs regex |
|---|---|---|
| Literal | `"hello"` | Competitive (SIMD) |
| LiteralPlusWhitespace | `"rule "` | Competitive |
| DigitRun | `\d+` | 1.4-1.9x faster |
| IdentifierRun | `[a-zA-Z_]\w*` | 104.7x faster compilation |
| QuotedString | `"[^"]+"` | 1.4-1.9x faster |
| WordRun | `\w+` | Competitive |
| DotWildcard | `.`, `.*`, `.+` | With backtracking |
| Alternation | `foo\|bar\|baz` | 2x slower (acceptable) |
| LiteralWhitespaceQuoted | Complex | Competitive |
| LiteralWhitespaceDigits | Complex | Competitive |
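To make the idea of a specialized fast path concrete, here is a minimal sketch of what a `\d+` matcher can look like when it skips the general regex machinery. It is plain byte scanning with the standard library, not ReXile's memchr-based code; the function name `find_digit_runs` is hypothetical.

```rust
/// Return the (start, end) byte offsets of every maximal run of ASCII digits.
fn find_digit_runs(text: &str) -> Vec<(usize, usize)> {
    let bytes = text.as_bytes();
    let mut runs = Vec::new();
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i].is_ascii_digit() {
            // Start of a run: extend it as far as the digits go.
            let start = i;
            while i < bytes.len() && bytes[i].is_ascii_digit() {
                i += 1;
            }
            runs.push((start, i));
        } else {
            i += 1;
        }
    }
    runs
}
```

For the input `"Order #12345 costs $49.99"` this returns `[(7, 12), (20, 22), (23, 25)]`. The scan is a single forward pass with no backtracking, which is why patterns of this shape are good candidates for a dedicated matcher.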
Regex Features
| Feature | Example | Status |
|---|---|---|
| Literal strings | `hello`, `world` | Supported |
| Alternation | `foo\|bar\|baz` | Supported (aho-corasick) |
| Start anchor | `^start` | Supported |
| End anchor | `end$` | Supported |
| Exact match | `^exact$` | Supported |
| Character classes | `[a-z]`, `[0-9]`, `[^abc]` | Supported |
| Quantifiers | `*`, `+`, `?` | Supported |
| Dot wildcard | `.`, `.*`, `.+` | Supported (v0.2.0) |
| Escape sequences | `\d`, `\w`, `\s`, `\.`, `\n`, `\t` | Supported |
| Sequences | `ab+c*`, `\d+\w*` | Supported |
| Groups | `(abc)`, `(?:...)` | Supported |
| Word boundaries | `\b`, `\B` | Supported |
| Capturing groups | Extract `(group)` | Supported (v0.2.0) |
| Bounded quantifiers | `{n}`, `{n,m}` | Planned |
| Lookahead/lookbehind | `(?=...)`, `(?<=...)` | Planned |
| Backreferences | `\1`, `\2` | Planned |
Performance Benchmarks
Compilation Speed (Primary Advantage)
Pattern Compilation Benchmark (vs regex crate):
| Pattern | rexile | regex | Speedup |
|---|---|---|---|
| `[a-zA-Z_]\w*` | 95.2 ns | 9.97 µs | 104.7x faster |
| `\d+` | 86.7 ns | 4.03 µs | 46.5x faster |
| `(\w+)\s*(>=\|<=\|==\|!=\|>\|<)\s*(.+)` | 471 ns | 19.2 µs | 40.7x faster |
| `.*test.*` | 148 ns | 2.27 µs | 15.3x faster |
Overall: 10-100x faster compilation - Perfect for dynamic patterns!
Matching Speed
Simple Patterns (Fast paths):
- Pattern `\d+` on "12345": 1.4-1.9x faster
- Pattern `\w+` on "variable": 1.4-1.9x faster
- Pattern `"[^"]+"` on quoted strings: Competitive
Complex Patterns (Backtracking):
- Pattern `a.+c` on "abc": 2-5x slower (acceptable)
- Pattern `.*test.*` on long strings: 2-10x slower (acceptable)
- Trade-off: 100x faster compilation vs 2-10x slower matching on complex patterns
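The cost of backtracking is easier to see in code. The sketch below matches the shape `<prefix>.+<suffix>` (for example `a.+c`) at the start of the input: the greedy `.+` first takes everything it can, then gives characters back one at a time until the suffix fits. It is a simplified illustration, not ReXile's matcher; `match_dot_plus` is a hypothetical helper, and it assumes `.` matches any character.

```rust
/// Match `<prefix>.+<suffix>` anchored at the start of `text`.
/// Returns the end offset of the (greedy) match, if there is one.
fn match_dot_plus(prefix: &str, suffix: &str, text: &str) -> Option<usize> {
    let rest = text.strip_prefix(prefix)?;
    // Every position where `.+` could stop, after consuming at least one char.
    let mut ends: Vec<usize> = rest
        .char_indices()
        .map(|(i, c)| i + c.len_utf8())
        .collect();
    // Try the longest candidate first; each failure backtracks by one char.
    while let Some(end) = ends.pop() {
        if rest[end..].starts_with(suffix) {
            return Some(prefix.len() + end + suffix.len());
        }
    }
    None
}
```

On `"abc"` the first attempt (`.+` = `"bc"`) fails and the second (`.+` = `"b"`) succeeds, so even a three-character input needs a retry. On long inputs those retries add up, which is in line with the slowdown reported for complex patterns.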
Use Case Performance
Loading 1000 GRL Rules:
- regex crate: ~2 seconds (2ms per pattern)
- rexile: ~0.02 seconds (20 µs per pattern)
- Result: 100x faster startup! Perfect for parsers and rule engines.
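The startup figures above are simply per-pattern cost multiplied by rule count. A small sketch of that arithmetic, using the per-pattern numbers quoted in this section (the helper `startup_cost` is hypothetical):

```rust
use std::time::Duration;

/// Total compile time for `count` patterns at a fixed per-pattern cost.
fn startup_cost(per_pattern: Duration, count: u32) -> Duration {
    per_pattern * count
}
```

With 1000 rules, `startup_cost(Duration::from_millis(2), 1000)` is 2 seconds and `startup_cost(Duration::from_micros(20), 1000)` is 20 milliseconds, a ratio of 100.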
Memory Comparison
Test 1: Pattern Compilation (10 patterns):
- regex: 1920 KB in 7.89 ms
- ReXile: 128 KB in 370 µs
- Result: 15x less memory, 21x faster
Test 2: Search Operations (5 patterns × 139 KB corpus):
- Both: 0 bytes memory delta
- Result: Equal efficiency
Test 3: Stress Test (50 patterns × 500 KB corpus):
- regex: 0.62 MB peak in 46 ms
- ReXile: 0.12 MB peak in 27 ms
- Result: 5x less peak memory, 1.7x faster
When ReXile Wins
- Simple patterns (`\d+`, `\w+`) - 1.4-1.9x faster matching
- Fast compilation - 10-100x faster pattern compilation (huge win!)
- Identifiers (`[a-zA-Z_]\w*`) - 104.7x faster compilation
- Memory efficiency - 15x less for compilation, 5x less peak
- Instant startup - Load 1000 patterns in 0.02s vs 2s (100x faster)
- Dot wildcards - Full `.`, `.*`, `.+` support with backtracking
When regex Wins
- Complex patterns with backtracking - ReXile is 2-10x slower (acceptable trade-off)
- Alternations (`when|then`) - ReXile is 2x slower
- Hot-path matching - For performance-critical matching, regex may be better
Architecture
Pattern → Parser → AST → Fast Path Detection → Specialized Matcher
                                                      ↓
    DigitRun (memchr SIMD scanning)
    IdentifierRun (direct byte scanning)
    QuotedString (memchr + validation)
    Alternation (aho-corasick automaton)
    Literal (memchr SIMD)
    ... 5 more fast paths
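The fast path detection stage in the pipeline above can be pictured as a classifier that picks a matcher before any matching happens. The sketch below is a deliberately simplified assumption: it inspects the pattern string directly rather than an AST, covers only a few of the listed fast paths, and uses hypothetical names (`FastPath`, `detect_fast_path`) that are not taken from ReXile's source.

```rust
#[derive(Debug, PartialEq)]
enum FastPath {
    Literal(String),
    Alternation(Vec<String>),
    DigitRun,
    WordRun,
    IdentifierRun,
    /// No specialization applies: use the general backtracking matcher.
    General,
}

/// True if `s` is non-empty and contains no regex metacharacters.
fn is_plain_literal(s: &str) -> bool {
    !s.is_empty() && !s.chars().any(|c| r"\.+*?()[]{}|^$".contains(c))
}

/// Pick a specialized matcher for well-known pattern shapes.
fn detect_fast_path(pattern: &str) -> FastPath {
    match pattern {
        r"\d+" => FastPath::DigitRun,
        r"\w+" => FastPath::WordRun,
        r"[a-zA-Z_]\w*" => FastPath::IdentifierRun,
        p if is_plain_literal(p) => FastPath::Literal(p.to_string()),
        p if p.contains('|') && p.split('|').all(is_plain_literal) => {
            FastPath::Alternation(p.split('|').map(String::from).collect())
        }
        _ => FastPath::General,
    }
}
```

Here `detect_fast_path("foo|bar|baz")` yields an `Alternation` of three literals, while `detect_fast_path("a.+c")` falls through to `General`. Because classification is a handful of comparisons rather than automaton construction, a design along these lines keeps compilation cheap.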
Run benchmarks yourself:
Installation
Add to your Cargo.toml:
[dependencies]
rexile = "0.2"
Examples
Literal Search
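// Inputs are illustrative; `find` is assumed to return Some((start, end))
let p = Pattern::new("needle").unwrap();
assert!(p.is_match("needle in a haystack"));
assert_eq!(p.find("where is the needle?"), Some((13, 19)));
// Find all occurrences
let matches = p.find_all("needle and needle");
assert_eq!(matches.len(), 2);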
Multi-Pattern (Alternation)
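// Fast multi-pattern search using aho-corasick (keywords are illustrative)
let keywords = Pattern::new("import|export|function").unwrap();
assert!(keywords.is_match("export default"));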
Anchored Patterns
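// Must start with pattern (inputs are illustrative)
let starts = Pattern::new("^Hello").unwrap();
assert!(starts.is_match("Hello World"));
assert!(!starts.is_match("Say Hello"));
// Must end with pattern
let ends = Pattern::new("World$").unwrap();
assert!(ends.is_match("Hello World"));
assert!(!ends.is_match("World Peace"));
// Exact match
let exact = Pattern::new("^exact$").unwrap();
assert!(exact.is_match("exact"));
assert!(!exact.is_match("not exact"));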
Cached API (Best for Repeated Patterns)
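// First call compiles and caches (argument order and strings are assumptions)
rexile::is_match("pattern", "text with pattern").unwrap();
// Subsequent calls reuse cached pattern (zero compile cost)
rexile::is_match("pattern", "another text").unwrap();
rexile::is_match("pattern", "more text").unwrap();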
More examples: See the examples/ directory for:
- basic_usage.rs - Core API walkthrough
- log_processing.rs - Log analysis patterns
- performance.rs - Performance comparison
Run an example with `cargo run --example basic_usage`.
Use Cases
ReXile is production-ready for:
Ideal Use Cases
- Parsers and lexers - 21x faster pattern compilation, competitive matching
- Rule engines - Simple pattern matching in business rules (original use case!)
- Log processing - Fast keyword and pattern extraction
- Dynamic patterns - Applications that compile patterns at runtime
- Memory-constrained environments - 15x less compilation memory
- Low-latency applications - Predictable performance, no JIT warmup
Perfect Patterns for ReXile
- Fast compilation: All patterns compile 10-100x faster
- Simple matching: `\d+`, `\w+` (1.4-1.9x faster matching)
- Identifiers: `[a-zA-Z_]\w*` (104.7x faster compilation!)
- Dot wildcards: `.`, `.*`, `.+` with proper backtracking
- Keyword search: `rule\s+`, `function\s+`
- Many patterns: Load 1000 patterns instantly (100x faster startup)
Consider regex crate for
- Complex alternations (ReXile 2x slower)
- Very sparse patterns (ReXile up to 1.44x slower)
- Unicode properties (`\p{L}` - not yet supported)
- Advanced features (lookahead, backreferences - not yet supported)
Contributing
Contributions welcome! ReXile is actively maintained and evolving.
Current focus:
- Core regex features complete
- Dot wildcard (`.`, `.*`, `.+`) with backtracking - v0.2.0
- Capturing groups - Auto-detection and extraction - v0.2.0
- 10-100x faster compilation
- Advanced features: bounded quantifiers `{n,m}`, lookahead, Unicode support
How to contribute:
- Check issues for open tasks
- Run tests: `cargo test`
- Run benchmarks: `cargo run --release --example per_file_grl_benchmark`
- Submit PR with benchmarks showing performance impact
Priority areas:
- Bounded quantifiers (`{n}`, `{n,m}`)
- More fast path patterns
- Unicode support
- Documentation improvements
- Non-greedy quantifiers (`*?`, `+?`)
License
Licensed under either of:
- MIT License (LICENSE-MIT or http://opensource.org/licenses/MIT)
- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
at your option.
Credits
Built on top of:
- memchr by Andrew Gallant - SIMD-accelerated substring search
- aho-corasick by Andrew Gallant - Multi-pattern matching automaton
Developed for the rust-rule-engine project, providing fast pattern matching for GRL (Grule Rule Language) parsing and business rule evaluation.
Performance Philosophy: ReXile achieves competitive performance through intelligent specialization rather than complex JIT compilation:
- 10 hand-optimized fast paths for common patterns
- SIMD acceleration via memchr
- Pre-built automatons for alternations
- Zero-copy iterator design
- Minimal metadata overhead
Status: Production Ready (v0.2.0)
- Compilation Speed: 10-100x faster than regex crate
- Matching Speed: 1.4-1.9x faster on simple patterns
- Memory: 15x less compilation, 5x less peak
- Features: Core regex + dot wildcard + capturing groups
- Testing: 77 unit tests passing, comprehensive benchmarks
- Real-world validated: GRL parsing, rule engines, DSL compilers