Crate regexr

§regexr

A regex engine built for tokenization and LLM text processing.

§When to Use

  • Use the regex crate for general-purpose regular expressions
  • Use regexr when building tokenizers or text pipelines that need lookarounds without giving up performance

§Features

  • Lookarounds: (?=...), (?!...), (?<=...), (?<!...)
  • Unicode properties: \p{L}, \p{N}, \p{M}, etc.
  • SIMD acceleration: AVX2, SSSE3, WASM v128
  • JIT compilation: Cranelift backend (native targets)
  • ReDoS protection: Bounded execution via memoization
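
To make "bounded execution via memoization" concrete, here is a minimal sketch of a memoized backtracking matcher for a toy pattern language (literal bytes, `.`, and `x*`). It is not regexr's engine, and the names `memo_match` and `go` are hypothetical. It only illustrates the general idea: caching the result for each (pattern index, text index) pair means every pair is evaluated at most once, so the work is bounded by pattern length times text length.

```rust
use std::collections::HashMap;

/// Whole-string match of `text` against a toy pattern language:
/// literal bytes, `.` (any byte), and `x*` (zero or more of `x`).
/// Illustrative only; this is NOT how regexr is implemented.
pub fn memo_match(pattern: &str, text: &str) -> bool {
    let mut memo: HashMap<(usize, usize), bool> = HashMap::new();
    go(pattern.as_bytes(), text.as_bytes(), 0, 0, &mut memo)
}

fn go(
    p: &[u8],
    t: &[u8],
    pi: usize,
    ti: usize,
    memo: &mut HashMap<(usize, usize), bool>,
) -> bool {
    // Reuse a previously computed answer for this (pattern, text) position.
    if let Some(&cached) = memo.get(&(pi, ti)) {
        return cached;
    }
    let result = if pi == p.len() {
        // Pattern exhausted: succeed only if the text is exhausted too.
        ti == t.len()
    } else {
        let first = ti < t.len() && (p[pi] == b'.' || p[pi] == t[ti]);
        if pi + 1 < p.len() && p[pi + 1] == b'*' {
            // Either skip `x*` entirely, or consume one byte and stay on `x*`.
            go(p, t, pi + 2, ti, memo) || (first && go(p, t, pi, ti + 1, memo))
        } else {
            first && go(p, t, pi + 1, ti + 1, memo)
        }
    };
    memo.insert((pi, ti), result);
    result
}
```

A pattern such as `a*a*a*a*b` against a long run of `a` characters forces a naive backtracker to retry the same positions many times; with the cache, each position pair is resolved once.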

§Quick Start

use regexr::Regex;

let re = Regex::new(r"\d{3}-\d{2}-\d{4}").unwrap();
assert!(re.is_match("123-45-6789"));

// Lookahead
let re = Regex::new(r"foo(?=bar)").unwrap();
assert!(re.is_match("foobar"));

// Find all matches (`re` is still the lookahead pattern, so only a "foo" followed by "bar" is reported)
for m in re.find_iter("foobar foobaz") {
    println!("Found: {}", m.as_str());
}

// Capture groups
let re = Regex::new(r"(\w+)@(\w+)\.(\w+)").unwrap();
if let Some(caps) = re.captures("user@example.com") {
    println!("User: {}, Domain: {}", &caps[1], &caps[2]);
}
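
The lookahead in the Quick Start is zero-width: `foo(?=bar)` requires "bar" to follow but does not include it in the match. The sketch below reproduces that behaviour for literal strings using only the standard library, as an illustration of the semantics. `find_with_lookahead` is a hypothetical helper, not part of regexr.

```rust
/// Returns `(byte_offset, matched_text)` for every occurrence of `needle`
/// that is immediately followed by `ahead`. This mirrors `needle(?=ahead)`
/// for literal strings: `ahead` is checked but never consumed.
/// Hypothetical helper for illustration; not a regexr API.
pub fn find_with_lookahead<'a>(
    haystack: &'a str,
    needle: &str,
    ahead: &str,
) -> Vec<(usize, &'a str)> {
    let mut out = Vec::new();
    if needle.is_empty() {
        return out;
    }
    let mut start = 0;
    while let Some(pos) = haystack[start..].find(needle) {
        let begin = start + pos;
        let end = begin + needle.len();
        if haystack[end..].starts_with(ahead) {
            // The match covers only `needle`; `ahead` stays unconsumed.
            out.push((begin, &haystack[begin..end]));
            start = end;
        } else {
            // Lookahead failed: retry from the next character, as a regex engine would.
            let step = haystack[begin..].chars().next().map_or(1, char::len_utf8);
            start = begin + step;
        }
    }
    out
}
```

Calling `find_with_lookahead("foobar foobaz", "foo", "bar")` yields a single entry, `(0, "foo")`: the second "foo" is rejected because "baz" follows it, and the reported text never includes "bar".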

§Feature Flags

  • simd: Native SIMD (AVX2/SSSE3)
  • jit: JIT compilation via Cranelift (native only)
  • wasm-simd: WASM SIMD (v128)
  • wasm-slim: Minimal WASM build
  • advanced-cache: Advanced LRU cache with moka (high-concurrency scenarios)
  • parallel: Parallel execution with rayon
  • full: All optimizations
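
Feature flags are enabled through the dependency entry in Cargo.toml. The fragment below is a sketch: the feature names come from the list above, while the version requirement is a placeholder assumption and should be replaced with the release you actually depend on.

```toml
[dependencies]
# "*" is a placeholder version requirement (an assumption, not a recommendation);
# pin it to a concrete regexr release.
regexr = { version = "*", features = ["simd", "jit"] }
```

For a WebAssembly target, the same entry would presumably list wasm-simd or wasm-slim instead, since jit is described as native only.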

Modules§

  • analyzer: Pattern analysis and classification.
  • backtrack: Backtracking regex engine (Layer 2).
  • bytes: Bytes-based regex matching for binary data and LLM tokenization.
  • bytes_factory: Engine factory for bytes-mode regex execution.
  • cache: Pluggable regex caching.
  • engine: Multi-engine regex execution system.
  • parser: PCRE pattern parser.
  • simd: SIMD-accelerated scanning.
  • util: Utility functions and types.

Structs§

  • Captures: Capture groups from a match.
  • Error: An error that occurred during regex compilation or matching.
  • Match: A single match in the input text.
  • Regex: A compiled regular expression.
  • RegexBuilder: A builder for configuring and compiling a regex.

Enums§

  • EngineChoice: Engine selection for regex compilation.
  • ErrorKind: The kind of regex error.

Functions§

  • compile: Compile a regex pattern with default options.
  • escape: Escape all regex metacharacters in a string.
  • is_match: Check if a pattern matches anywhere in the input.
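
As an illustration of what escaping regex metacharacters involves, here is a standard-library-only sketch. It is not regexr's `escape`: the function name `escape_sketch` and the exact set of characters treated as metacharacters are assumptions, and the crate's own function may cover a different set.

```rust
/// Prefixes each metacharacter with a backslash so the result can be
/// embedded in a pattern and matched literally.
/// Hypothetical sketch; the metacharacter set is an assumption and may
/// differ from what regexr's `escape` handles.
pub fn escape_sketch(input: &str) -> String {
    const META: &[char] = &[
        '\\', '.', '+', '*', '?', '(', ')', '|', '[', ']', '{', '}', '^', '$',
    ];
    let mut out = String::with_capacity(input.len());
    for c in input.chars() {
        if META.contains(&c) {
            out.push('\\');
        }
        out.push(c);
    }
    out
}
```

This matters when user-supplied text is spliced into a pattern: escaping `1+1=2?` produces `1\+1=2\?`, which matches the literal string instead of treating `+` and `?` as quantifiers.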

Type Aliases§

  • Result: A specialized Result type for regexr operations.