safe-regex-0.2.0 has been yanked.
safe-regex
A safe regular expression library.
Features
- `forbid(unsafe_code)`
- Good test coverage (~80%)
- Runtime is linear. Memory usage is constant.
  Runtime and memory usage are both `O(n * r * g)` where
  - `n` is the length of the data to check
  - `r` is the length of the regex
  - `g` is the number of capturing groups in the regex
- Does not allocate
- `no_std`
- Rust compiler checks and optimizes the matcher
- Supports basic regular expression syntax:
  - Any byte: `.`
  - Sequences: `abc`
  - Classes: `[-ab0-9]`, `[^ab]`
  - Repetition: `a?`, `a*`, `a+`, `a{1}`, `a{1,}`, `a{,1}`, `a{1,2}`, `a{,}`
  - Alternates: `a|b|c`
  - Capturing groups: `a(b*)?`
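The repetition forms listed above differ only in their lower and upper bounds. The sketch below is a hand-written, standard-library-only illustration of those bounds for a single byte. It assumes the conventional reading (`a{,1}` is zero or one, `a{1,}` is one or more, `a{,}` is zero or more) and that the whole input must match. The helper `matches_repeat` is hypothetical and is not part of this crate's API or implementation.

```rust
/// Hypothetical helper, not part of safe-regex: returns true when `data`
/// consists solely of `byte`, repeated at least `min` times and, when
/// `max` is `Some`, at most `max` times. `None` means "no upper bound".
fn matches_repeat(data: &[u8], byte: u8, min: usize, max: Option<usize>) -> bool {
    let count = data.len();
    data.iter().all(|&b| b == byte)
        && count >= min
        && max.map_or(true, |m| count <= m)
}

fn main() {
    // a? and a{,1}: zero or one occurrence (assumed reading).
    assert!(matches_repeat(b"", b'a', 0, Some(1)));
    assert!(!matches_repeat(b"aa", b'a', 0, Some(1)));

    // a+ and a{1,}: one or more occurrences.
    assert!(matches_repeat(b"aaa", b'a', 1, None));
    assert!(!matches_repeat(b"", b'a', 1, None));

    // a{1,2}: between one and two occurrences.
    assert!(matches_repeat(b"aa", b'a', 1, Some(2)));
    assert!(!matches_repeat(b"aaa", b'a', 1, Some(2)));

    // a* and a{,}: zero or more occurrences.
    assert!(matches_repeat(b"", b'a', 0, None));

    // Any other byte in the input makes the match fail.
    assert!(!matches_repeat(b"ab", b'a', 0, None));
}
```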
Limitations
- Only works on byte slices, not strings.
- Partially optimized. Runtime is about 10 times slower than the `regex` crate. Here are relative runtimes measured with `safe-regex-rs/bench` run on a 2018 MacBook Pro:

| `regex` | `safe_regex` | expression |
| ----- | ---------- | ---------- |
| 1 | 6 | find phone num `.*([0-9]{3})[-. ]?([0-9]{3})[-. ]?([0-9]{4}).*` |
| 1 | 18 | find date time `.*([0-9]+)-([0-9]+)-([0-9]+) ([0-9]+):([0-9]+).*` |
| 1 | 0.9 | parse date time `([0-9]+)-([0-9]+)-([0-9]+) ([0-9]+):([0-9]+)` |
| 1 | 30 | check PEM Base64 `[a-zA-Z0-9+/=]{0,64}=*` |
| 1 | 20-550 | substring search `.*(2G8H81RFNZ).*` |
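Because matching works on byte slices only, string data has to be converted at the boundary. The sketch below shows one way to do that with the standard library alone: `str::as_bytes` to go in, `std::str::from_utf8` to come back out. The function `leading_digits` is a hypothetical stand-in for any matcher that takes `&[u8]` and returns a captured sub-slice; it is not part of this crate.

```rust
use std::str;

/// Hypothetical stand-in for a byte-oriented matcher (not part of
/// safe-regex): returns the run of ASCII digits at the start of `data`.
fn leading_digits(data: &[u8]) -> &[u8] {
    let end = data
        .iter()
        .position(|b| !b.is_ascii_digit())
        .unwrap_or(data.len());
    &data[..end]
}

fn main() {
    let text: &str = "2024 rest";

    // &str -> &[u8]: a borrow of the same memory, no copy.
    let bytes: &[u8] = text.as_bytes();

    let captured: &[u8] = leading_digits(bytes);

    // &[u8] -> &str: validates UTF-8. ASCII digits are always valid UTF-8,
    // but arbitrary captured bytes may not be, so handle the error in
    // real code instead of calling `expect`.
    let year: &str = str::from_utf8(captured).expect("ASCII digits are valid UTF-8");
    assert_eq!(year, "2024");
}
```

Note that a capture can end in the middle of a multi-byte UTF-8 character when the pattern matches arbitrary bytes, in which case `from_utf8` returns an error rather than a `&str`.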
Alternatives
- `regex`
  - Mature & Popular
  - Maintained by the core Rust language developers
  - Contains `unsafe` code.
- `pcre2`
  - Uses PCRE library which is written in unsafe C.
- `regular-expression`
  - No documentation
- `rec`
Cargo Geiger Safety Report
Metric output format: x/y
x = unsafe code used by the build
y = total unsafe code found in the crate
Symbols:
🔒 = No `unsafe` usage found, declares #![forbid(unsafe_code)]
❓ = No `unsafe` usage found, missing #![forbid(unsafe_code)]
☢️ = `unsafe` usage found
Functions Expressions Impls Traits Methods Dependency
0/0 0/0 0/0 0/0 0/0 🔒 safe-regex 0.2.0
0/0 0/0 0/0 0/0 0/0 🔒 └── safe-regex-macro 0.2.0
0/0 0/0 0/0 0/0 0/0 🔒 ├── safe-proc-macro2 1.0.24
0/0 0/0 0/0 0/0 0/0 🔒 │ └── unicode-xid 0.2.1
0/0 0/0 0/0 0/0 0/0 🔒 └── safe-regex-compiler 0.2.0
0/0 0/0 0/0 0/0 0/0 🔒 ├── safe-proc-macro2 1.0.24
0/0 0/0 0/0 0/0 0/0 🔒 └── safe-quote 1.0.9
0/0 0/0 0/0 0/0 0/0 🔒 └── safe-proc-macro2 1.0.24
0/0 0/0 0/0 0/0 0/0
Examples
use ;
let matcher: =
regex!;
assert!;
assert!;
use ;
let matcher: =
regex!;
let =
matcher.match_all.unwrap;
assert_eq!;
assert_eq!;
Changelog
- v0.2.0
  - Linear-time & constant-memory algorithm! :)
  - Work around rustc optimizer hang on regexes with exponential execution paths like "a{,30}". See `src/bin/uncompilable/main.rs`.
- v0.1.1 - Bug fixes and more tests.
- v0.1.0 - First published version
TO DO
- DONE - Read about regular expressions
- DONE - Read about NFAs, https://swtch.com/~rsc/regexp/
- DONE - Design API
- DONE - Implement
- DONE - Add integration tests
- Simplify `match_all` return type
- Non-capturing groups
- 11+ capturing groups
- Increase coverage
- Add fuzzing tests
- Common character classes: whitespace, letters, punctuation, etc.
- Match strings
- Implement optimizations explained in https://swtch.com/%7Ersc/regexp/regexp3.html . Some of the code already exists in `tests/dfa_single_pass.rs` and `tests/nfa_without_capturing.rs`.
- Once const generics are stable, use the feature to simplify some types.
- Once trait bounds on `const fn` parameters are stable, make the `MatcherN::new` functions `const`.
Release Process
- Edit `Cargo.toml` and bump version number.
- Run `../release.sh`
License: Apache-2.0