Crate safe_regex
A safe regular expression library.
§Features
- `forbid(unsafe_code)`
- Good test coverage (~80%)
- Runtime is linear.
- Memory usage is constant. Does not allocate.
- Compiles your regular expression to a simple Rust function
- Rust compiler checks and optimizes the matcher
- Supports basic regular expression syntax:
  - Any byte: `.`
  - Sequences: `abc`
  - Classes: `[-ab0-9]`, `[^ab]`
  - Repetition: `a?`, `a*`, `a+`, `a{1}`, `a{1,}`, `a{,1}`, `a{1,2}`, `a{,}`
  - Alternates: `a|b|c`
  - Capturing groups: `a(bc)?`
  - Non-capturing groups: `a(?:bc)?`
- `no_std`, by omitting the default `"std"` feature
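To make "compiles your regular expression to a simple Rust function" more concrete, here is a hand-written sketch of what a matcher for `[ab][0-9]*` could look like. It is an illustration only: the function name `is_match_ab_digits` is hypothetical, and this is not the code that the `regex!` macro generates. It visits each input byte at most once and does not allocate.

```rust
/// Hand-written matcher for the expression `[ab][0-9]*`.
/// Hypothetical illustration; NOT the output of the `regex!` macro.
fn is_match_ab_digits(data: &[u8]) -> bool {
    // An empty input cannot satisfy the leading `[ab]`.
    let Some((first, rest)) = data.split_first() else {
        return false;
    };
    // `[ab]`: the first byte must be `a` or `b`.
    if !matches!(*first, b'a' | b'b') {
        return false;
    }
    // `[0-9]*`: every remaining byte must be an ASCII digit.
    rest.iter().all(|b| b.is_ascii_digit())
}
```

Under these assumptions, `is_match_ab_digits(b"a42")` returns `true` and `is_match_ab_digits(b"X")` returns `false`.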
§Limitations
- Only works on byte slices, not strings.
- Partially optimized. Runtime is about 10 times slower than the `regex` crate. Here are relative runtimes measured with `safe-regex-rs/bench`, run on a 2018 MacBook Pro:

  | `regex` | `safe_regex` | expression |
  | --- | --- | --- |
  | 1 | 6 | find phone num `.*([0-9]{3})[-. ]?([0-9]{3})[-. ]?([0-9]{4}).*` |
  | 1 | 20 | find date time `.*([0-9]+)-([0-9]+)-([0-9]+) ([0-9]+):([0-9]+).*` |
  | 1 | 0.75 | parse date time `([0-9]+)-([0-9]+)-([0-9]+) ([0-9]+):([0-9]+)` |
  | 1 | 50 | check PEM Base64 `[a-zA-Z0-9+/]{0,64}=*` |
  | 1 | 20-500 | substring search `.*(2G8H81RFNZ).*` |
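Because matching works on byte slices only, text held in a `&str` has to be passed in as bytes, and any byte slice that comes back has to be converted if a string is needed. The sketch below shows one way to do that round trip with the standard library. The function `leading_digits` is a hypothetical stand-in with the same `&[u8]` in, `&[u8]` out shape; it is not part of this crate.

```rust
use std::str;

/// Hypothetical stand-in for a byte-slice matcher: returns the leading run
/// of ASCII digits in `data`. Not part of safe_regex.
fn leading_digits(data: &[u8]) -> &[u8] {
    let end = data
        .iter()
        .position(|b| !b.is_ascii_digit())
        .unwrap_or(data.len());
    &data[..end]
}

/// Runs the byte-level function on a string and converts the result back.
fn leading_digits_str(text: &str) -> Option<&str> {
    // `str::as_bytes` borrows the UTF-8 bytes without copying.
    let bytes = leading_digits(text.as_bytes());
    // `str::from_utf8` validates the bytes before handing back a `&str`.
    str::from_utf8(bytes).ok()
}
```

The validation step matters in general: a byte-level match can end in the middle of a multi-byte UTF-8 character, in which case `str::from_utf8` returns an error instead of an invalid string.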
§Alternatives
- `regex`
  - Mature & Popular
  - Maintained by the core Rust language developers
  - Contains `unsafe` code.
  - Allocates
  - Compiles your regular expression at runtime at first use.
  - Subsequent uses must retrieve it from the cache.
- `pcre2`
  - Uses PCRE library which is written in unsafe C.
- `regular-expression`
  - No documentation
- `rec`
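For contrast with compile-time matchers, the "compile at first use, then retrieve from the cache" pattern mentioned above can be sketched with `std::sync::OnceLock`. This is a generic illustration of the pattern, not the `regex` crate's implementation: `CompiledPattern` and `cached_pattern` are hypothetical names, and the "compilation" here merely stores a literal.

```rust
use std::sync::OnceLock;

/// Hypothetical stand-in for an expression compiled at runtime.
/// It only matches one literal; a real engine would build an automaton.
struct CompiledPattern {
    literal: Vec<u8>,
}

impl CompiledPattern {
    /// Pretend compilation step: allocates and runs at runtime.
    fn compile(pattern: &str) -> Self {
        CompiledPattern {
            literal: pattern.as_bytes().to_vec(),
        }
    }

    fn is_match(&self, data: &[u8]) -> bool {
        data == self.literal.as_slice()
    }
}

/// Compiles on the first call; later calls fetch the cached value.
fn cached_pattern() -> &'static CompiledPattern {
    static CACHE: OnceLock<CompiledPattern> = OnceLock::new();
    CACHE.get_or_init(|| CompiledPattern::compile("abc"))
}
```

Every call to `cached_pattern()` goes through the `OnceLock`, which is the per-use retrieval cost the list above refers to.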
§Cargo Geiger Safety Report
§Examples
// Test whether a byte slice matches an expression with no capturing groups.
use safe_regex::{regex, Matcher0};
let matcher: Matcher0<_> =
    regex!(br"[ab][0-9]*");
assert!(matcher.is_match(b"a42"));
assert!(!matcher.is_match(b"X"));

// Separate example: get the capturing groups as byte slices or as index ranges.
use safe_regex::{regex, Matcher3};
let matcher: Matcher3<_> =
    regex!(br"([ab])([0-9]*)(suffix)?");
let (prefix, digits, suffix) =
    matcher.match_slices(b"a42").unwrap();
assert_eq!(b"a", prefix);
assert_eq!(b"42", digits);
// The optional third group did not match, so its slice is empty.
assert_eq!(b"", suffix);
let (prefix_range, digits_r, suffix_r)
    = matcher.match_ranges(b"a42").unwrap();
assert_eq!(0..1_usize, prefix_range);
assert_eq!(1..3_usize, digits_r);
// Likewise, the unmatched group's range is the empty range 0..0.
assert_eq!(0..0_usize, suffix_r);
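In the example above, each range returned by `match_ranges` appears to index the same bytes that `match_slices` returns for that group. The following standard-library sketch recovers a slice from a range; `slice_of` is a hypothetical helper, not a crate function.

```rust
use std::ops::Range;

/// Returns the bytes of `data` covered by `range`, or `None` if the range
/// falls outside `data`. Hypothetical helper, not part of safe_regex.
fn slice_of<'a>(data: &'a [u8], range: &Range<usize>) -> Option<&'a [u8]> {
    // `slice::get` performs the bounds check instead of panicking.
    data.get(range.clone())
}
```

With the input `b"a42"`, the ranges `0..1`, `1..3` and `0..0` from the example give `b"a"`, `b"42"` and the empty slice. Ranges are handy when the caller needs positions, for instance to edit the input around a match.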
§Changelog
- v0.3.0 - Add `assert_match` and default `std` feature.
- v0.2.6 - Fix some Clippy warnings on `regex!` macro invocation sites.
- v0.2.5 - Fix `no_std`. Thank you, Soares Chen! github.com/soareschen gitlab.com/soareschen-informal
- v0.2.4
  - Bug fixes, reducing performance.
  - Optimize non-match runtime.
- v0.2.3
  - Rename `match_all` -> `match_slices`.
  - Add `match_ranges`.
- v0.2.2 - Simplify `match_all` return type
- v0.2.1 - Non-capturing groups, bug fixes
- v0.2.0
  - Linear-time & constant-memory algorithm! :)
  - Work around rustc optimizer hang on regexes with exponential execution paths like "a{,30}". See `src/bin/uncompilable/main.rs`.
- v0.1.1 - Bug fixes and more tests.
- v0.1.0 - First published version
§TO DO
- 11+ capturing groups
- Increase coverage
- Add fuzzing tests
- Common character classes: whitespace, letters, punctuation, etc.
- Match strings
- Repeated capturing groups: `(ab|cd)*`. Idea: Return a `MatcherNIter` struct that is an iterator that returns `MatcherN` structs.
- Implement optimizations explained in https://swtch.com/%7Ersc/regexp/regexp3.html . Some of the code already exists in `tests/dfa_single_pass.rs` and `tests/nfa_without_capturing.rs`.
- Once const generics are stable, use the feature to simplify some types.
- Once trait bounds on `const fn` parameters are stable, make the `MatcherN::new` functions `const`.
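As a rough idea of how const generics could simplify types that are otherwise repeated per group count, the sketch below uses one struct for any number of capturing groups. `Captures` and its fields are hypothetical names chosen for illustration; this is not the crate's design or a planned API.

```rust
use std::ops::Range;

/// Hypothetical capture container, generic over the number of groups `N`.
/// Not part of safe_regex.
struct Captures<'a, const N: usize> {
    data: &'a [u8],
    ranges: [Range<usize>; N],
}

impl<'a, const N: usize> Captures<'a, N> {
    /// Returns one byte slice per capturing group.
    /// Panics if a stored range lies outside `data`.
    fn slices(&self) -> [&'a [u8]; N] {
        let data = self.data;
        std::array::from_fn(|i| &data[self.ranges[i].clone()])
    }
}
```

A value such as `Captures { data: b"a42", ranges: [0..1, 1..3, 0..0] }` then has type `Captures<'_, 3>`, and `slices()` returns an array of three slices that can be destructured with `let [prefix, digits, suffix] = ...`.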
§Development
- An overview of how this library works: https://news.ycombinator.com/item?id=27301320
Macros§
- `regex`: Compiles a regular expression into a Rust type.
Structs§
- `Matcher0`: A compiled regular expression with no capturing groups.
- `Matcher1`: A compiled regular expression with 1 capturing group.
- `Matcher2`: A compiled regular expression with 2 capturing groups.
- `Matcher3`: A compiled regular expression with 3 capturing groups.
- `Matcher4`: A compiled regular expression with 4 capturing groups.
- `Matcher5`: A compiled regular expression with 5 capturing groups.
- `Matcher6`: A compiled regular expression with 6 capturing groups.
- `Matcher7`: A compiled regular expression with 7 capturing groups.
- `Matcher8`: A compiled regular expression with 8 capturing groups.
- `Matcher9`: A compiled regular expression with 9 capturing groups.
- `Matcher10`: A compiled regular expression with 10 capturing groups.
Traits§
- Provides an `is_match` function.