Crate rexile


§ReXile 🦎

A blazing-fast regex engine that compiles patterns 10-100x faster than the regex crate

ReXile is a lightweight regex alternative optimized for fast compilation while maintaining competitive matching performance.

§Quick Start

use rexile::Pattern;

// Literal matching with SIMD acceleration
let pattern = Pattern::new("hello").unwrap();
assert!(pattern.is_match("hello world"));

// Digit matching (1.4-1.9x faster than regex!)
let digits = Pattern::new(r"\d+").unwrap();
let matches = digits.find_all("Order #12345 costs $67.89");
assert_eq!(matches, vec![(7, 12), (20, 22), (23, 25)]);

// Quoted string via a negated character class
let quoted = Pattern::new(r#""[^"]+""#).unwrap();
assert!(quoted.is_match(r#"say "hello world""#));

§Performance Highlights

Compilation Speed (vs regex crate):

  • Pattern [a-zA-Z_]\w*: 104.7x faster compilation
  • Pattern \d+: 46.5x faster compilation
  • Average: 10-100x faster compilation

Memory Usage:

  • Compilation: 15x less memory (128 KB vs 1920 KB)
  • Peak memory: 5x less in stress tests
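Figures like these depend on the machine and the patterns used, so it can be worth re-measuring them locally. The following is a minimal sketch of a timing helper, assuming nothing beyond the standard library; `median_time` is a hypothetical name and not part of ReXile.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper (not part of ReXile): runs `build` `iters` times
/// and returns the median wall-clock time of a single call.
fn median_time<T>(iters: usize, mut build: impl FnMut() -> T) -> Duration {
    assert!(iters > 0, "need at least one iteration");
    let mut samples: Vec<Duration> = (0..iters)
        .map(|_| {
            let start = Instant::now();
            let value = build();
            let elapsed = start.elapsed();
            // Keep the optimizer from discarding the work being timed.
            std::hint::black_box(value);
            elapsed
        })
        .collect();
    samples.sort();
    samples[samples.len() / 2]
}
```

To compare compilation speed, one could pass a closure such as `|| Pattern::new(r"\d+")` for ReXile and the equivalent `regex::Regex::new` call for the regex crate, then compare the two medians. A dedicated benchmarking harness will give more reliable numbers than this sketch.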

§Fast Path Optimizations

ReXile uses 10 specialized fast paths for common patterns:

Pattern         Fast Path                    Performance
\d+             DigitRun                     1.4-1.9x faster
"[^"]+"         QuotedString                 2.44x faster
[a-zA-Z_]\w*    IdentifierRun                104.7x faster compilation
\w+             WordRun                      Competitive
foo|bar|baz     Alternation (aho-corasick)   2x slower (acceptable)
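To illustrate what a fast path buys, here is a minimal sketch of a specialized matcher for `\d+`. It is not ReXile's implementation: the function name `find_digit_runs` is hypothetical, it uses a plain byte loop rather than memchr SIMD, and it assumes `\d` means ASCII digits only.

```rust
/// Hypothetical stand-in for a `\d+` fast path (not ReXile's actual code).
/// Returns the byte range `(start, end)` of every maximal run of ASCII
/// digits in `haystack`, with `end` exclusive.
fn find_digit_runs(haystack: &str) -> Vec<(usize, usize)> {
    let bytes = haystack.as_bytes();
    let mut runs = Vec::new();
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i].is_ascii_digit() {
            let start = i;
            // Extend the run for as long as the current byte is a digit.
            while i < bytes.len() && bytes[i].is_ascii_digit() {
                i += 1;
            }
            runs.push((start, i));
        } else {
            i += 1;
        }
    }
    runs
}
```

For the Quick Start input `"Order #12345 costs $67.89"` this returns `[(7, 12), (20, 22), (23, 25)]`, the same ranges shown there. Because the loop needs no automaton or backtracking state, there is essentially nothing to build at compile time, which is the general idea behind dispatching common patterns to dedicated matchers.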

§Supported Features

  • ✅ Literal searches with SIMD acceleration
  • ✅ Multi-pattern matching (alternations)
  • ✅ Character classes with negation ([a-z], [^abc])
  • ✅ Quantifiers (*, +, ?)
  • ✅ Range quantifiers ({n}, {n,}, {n,m})
  • ✅ Case-insensitive flag ((?i))
  • ✅ Escape sequences (\d, \w, \s, etc.)
  • ✅ Sequences and groups
  • ✅ Word boundaries (\b, \B)
  • ✅ Anchoring (^, $)

§Use Cases

ReXile is production-ready for:

  • ✅ Parsers & lexers - 10-100x faster compilation, instant startup
  • ✅ Rule engines - Original use case (GRL parsing)
  • ✅ Log processing - Fast keyword extraction
  • ✅ Dynamic patterns - Applications that compile patterns at runtime
  • ✅ Memory-constrained environments - 15x less compilation memory
  • ✅ Low-latency applications - Predictable performance

§Cached API

For patterns used repeatedly in hot loops:

use rexile;

// Automatically cached - compile once, reuse forever
assert!(rexile::is_match("test", "this is a test").unwrap());
assert_eq!(rexile::find("world", "hello world").unwrap(), Some((6, 11)));
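How the cache is implemented is not described here. As a rough sketch of the compile-once idea, assuming a process-wide map keyed by the pattern text, something like the following would behave similarly. `Compiled`, `cache`, `get_cached`, and `cached_find` are hypothetical names, and `Compiled` handles only literal patterns to keep the example self-contained.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, OnceLock};

/// Hypothetical stand-in for a compiled pattern; it only stores a literal.
struct Compiled {
    literal: String,
}

impl Compiled {
    fn new(pattern: &str) -> Self {
        Compiled { literal: pattern.to_string() }
    }

    /// Byte range of the first occurrence of the literal, if any.
    fn find(&self, text: &str) -> Option<(usize, usize)> {
        text.find(self.literal.as_str())
            .map(|start| (start, start + self.literal.len()))
    }
}

/// Process-wide cache keyed by the pattern's source text (an assumption).
fn cache() -> &'static Mutex<HashMap<String, Arc<Compiled>>> {
    static CACHE: OnceLock<Mutex<HashMap<String, Arc<Compiled>>>> = OnceLock::new();
    CACHE.get_or_init(|| Mutex::new(HashMap::new()))
}

/// Returns the cached pattern, compiling it on first use.
fn get_cached(pattern: &str) -> Arc<Compiled> {
    let mut map = cache().lock().unwrap();
    if let Some(hit) = map.get(pattern) {
        return Arc::clone(hit);
    }
    let compiled = Arc::new(Compiled::new(pattern));
    map.insert(pattern.to_string(), Arc::clone(&compiled));
    compiled
}

/// Convenience wrapper: look up (or compile) the pattern, then search.
fn cached_find(pattern: &str, text: &str) -> Option<(usize, usize)> {
    get_cached(pattern).find(text)
}
```

In this sketch, repeated calls with the same pattern string share one `Arc<Compiled>`, so the compilation cost is paid once. Each lookup still takes a lock and hashes the pattern text, so in a very tight loop it may be cheaper to hold on to a compiled `Pattern` directly.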

§Architecture

Pattern → Parser → AST → Fast Path Detection → Specialized Matcher
                                                       ↓
                                    DigitRun (memchr SIMD)
                                    IdentifierRun (direct bytes)
                                    QuotedString (memchr + validation)
                                    Alternation (aho-corasick)
                                    ... 6 more fast paths

Dependencies: Only memchr and aho-corasick for SIMD primitives
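The fast path detection stage in the pipeline above can be pictured as a classification step that picks a matcher for each pattern. The sketch below is illustrative only: it classifies the pattern's source text rather than an AST, covers just the fast paths named in this documentation, and the names `FastPath` and `detect_fast_path` are hypothetical.

```rust
/// Hypothetical subset of the fast paths named in the diagram.
#[derive(Debug, PartialEq)]
enum FastPath {
    DigitRun,
    IdentifierRun,
    QuotedString,
    WordRun,
    /// Alternation of plain literals, e.g. `foo|bar|baz`.
    Alternation(Vec<String>),
    /// No specialized matcher applies; fall back to a general engine.
    General,
}

/// Picks a fast path from the pattern text (a real engine would inspect the AST).
fn detect_fast_path(pattern: &str) -> FastPath {
    match pattern {
        r"\d+" => FastPath::DigitRun,
        r"[a-zA-Z_]\w*" => FastPath::IdentifierRun,
        r#""[^"]+""# => FastPath::QuotedString,
        r"\w+" => FastPath::WordRun,
        _ => {
            let branches: Vec<&str> = pattern.split('|').collect();
            // Treat a branch as a literal only if it is non-empty and alphanumeric.
            let all_literal = branches
                .iter()
                .all(|b| !b.is_empty() && b.chars().all(|c| c.is_ascii_alphanumeric()));
            if branches.len() > 1 && all_literal {
                FastPath::Alternation(branches.iter().map(|b| b.to_string()).collect())
            } else {
                FastPath::General
            }
        }
    }
}
```

Under these assumptions, `detect_fast_path("foo|bar|baz")` yields the `Alternation` variant with three literals, while a pattern such as `a.*b` falls through to `General`. Matching on exact strings is deliberately simplistic; it only shows the dispatch shape, not how ReXile recognizes equivalent patterns.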

§When to Use ReXile vs regex

Choose ReXile for:

  • Digit extraction (\d+) - 3.57x faster
  • Quoted strings ("[^"]+") - 2.44x faster
  • Identifiers ([a-zA-Z_]\w*) - Much faster
  • Dynamic pattern compilation - 21x faster
  • Memory-constrained environments - 15x less memory

Choose regex crate for:

  • Complex alternations (ReXile 2x slower)
  • Unicode properties (\p{L} - not yet supported)
  • Advanced features (lookahead, backreferences - not yet supported)

§License

Licensed under either of MIT or Apache-2.0 at your option.

§Re-exports

pub use optimization::literal;
pub use optimization::prefilter;

§Modules

  • optimization - Optimization module: fast paths and performance optimizations

§Structs

  • CaptureGroup - A capture group in the pattern
  • Captures - A set of captured substrings from a single match
  • CapturesIter - Iterator over captures for each match
  • FindIter - Iterator over pattern matches
  • Match - A single match in the haystack
  • Pattern - Main ReXile pattern type
  • SplitIter - Iterator over text split by pattern matches

§Enums

  • PatternError

§Functions

  • find
  • get_pattern
  • is_match

§Type Aliases

  • ReXile - Type alias for convenience