A safe regular expression library.
Features
- forbid(unsafe_code)
- Good test coverage (~80%)
- Runtime is linear.
- Memory usage is constant. Does not allocate.
- Compiles your regular expression to a simple Rust function
- Rust compiler checks and optimizes the matcher
- Supports basic regular expression syntax:
  - Any byte: .
  - Sequences: abc
  - Classes: [-ab0-9], [^ab]
  - Repetition: a?, a*, a+, a{1}, a{1,}, a{,1}, a{1,2}, a{,}
  - Alternates: a|b|c
  - Capturing groups: a(bc)?
  - Non-capturing groups: a(?:bc)?
- no_std, by omitting the default "std" feature
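To make "compiles your regular expression to a simple Rust function" more concrete, here is a hand-written sketch of the kind of function one could write for the expression ([0-9]+)-([0-9]+). It is an illustration only: the name match_two_numbers is hypothetical, and this is not the code the regex! macro actually generates. It shows how a single pass over the bytes gives linear runtime, and how returning index ranges avoids any allocation.

```rust
use std::ops::Range;

/// Hypothetical hand-written matcher for `([0-9]+)-([0-9]+)`.
/// Returns the byte ranges of both groups when the whole input matches.
/// One forward pass over `data`, a few integers of state, no allocation.
fn match_two_numbers(data: &[u8]) -> Option<(Range<usize>, Range<usize>)> {
    let mut i = 0;
    // First group: one or more ASCII digits.
    while i < data.len() && data[i].is_ascii_digit() {
        i += 1;
    }
    if i == 0 {
        return None;
    }
    let first = 0..i;
    // Literal '-' between the groups.
    if data.get(i) != Some(&b'-') {
        return None;
    }
    i += 1;
    // Second group: one or more ASCII digits, up to the end of the input.
    let start = i;
    while i < data.len() && data[i].is_ascii_digit() {
        i += 1;
    }
    if i == start || i != data.len() {
        return None;
    }
    Some((first, start..i))
}
```

This shortcut works because digits and '-' never overlap, so the function never has to backtrack. Expressions with ambiguous paths need more bookkeeping than this sketch shows.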
Limitations
- Only works on byte slices, not strings.
- Partially optimized. Runtime is about 10 times slower than the regex crate. Here are relative runtimes measured with safe-regex-rs/bench run on a 2018 Macbook Pro:

  regex  safe_regex  expression
  1      6           find phone num    .*([0-9]{3})[-. ]?([0-9]{3})[-. ]?([0-9]{4}).*
  1      20          find date time    .*([0-9]+)-([0-9]+)-([0-9]+) ([0-9]+):([0-9]+).*
  1      0.75        parse date time   ([0-9]+)-([0-9]+)-([0-9]+) ([0-9]+):([0-9]+)
  1      50          check PEM Base64  [a-zA-Z0-9+/]{0,64}=*
  1      20-500      substring search  .*(2G8H81RFNZ).*
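The byte-slice limitation can usually be worked around with the standard library: pass text.as_bytes() to the matcher and convert captured bytes back with std::str::from_utf8. The sketch below uses a hypothetical stand-in function, leading_digits, in place of a real matcher so that it compiles without the crate. Only the conversion steps are the point.

```rust
use std::str;

/// Hypothetical stand-in for a byte-slice matcher: returns the leading
/// run of ASCII digits, or None if the input does not start with a digit.
fn leading_digits(data: &[u8]) -> Option<&[u8]> {
    let end = data
        .iter()
        .position(|b| !b.is_ascii_digit())
        .unwrap_or(data.len());
    if end == 0 {
        None
    } else {
        Some(&data[..end])
    }
}

/// Runs the byte-slice matcher on a `&str` and converts the capture back.
fn leading_digits_str(text: &str) -> Option<&str> {
    let captured = leading_digits(text.as_bytes())?;
    // A byte-oriented capture can in general split a multi-byte UTF-8
    // character, so validate the bytes instead of assuming they are text.
    str::from_utf8(captured).ok()
}
```

Validation matters because `.` matches any byte, so a capture such as (.) can end in the middle of a multi-byte character.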
Alternatives
- regex
  - Mature & Popular
  - Maintained by the core Rust language developers
  - Contains unsafe code.
  - Allocates
  - Compiles your regular expression at runtime at first use.
  - Subsequent uses must retrieve it from the cache.
- pcre2
  - Uses PCRE library which is written in unsafe C.
- regular-expression
  - No documentation
- rec
Cargo Geiger Safety Report
Metric output format: x/y
x = unsafe code used by the build
y = total unsafe code found in the crate
Symbols:
🔒 = No `unsafe` usage found, declares #![forbid(unsafe_code)]
❓ = No `unsafe` usage found, missing #![forbid(unsafe_code)]
☢️ = `unsafe` usage found
Functions Expressions Impls Traits Methods Dependency
0/0 0/0 0/0 0/0 0/0 🔒 safe-regex 0.3.0
0/0 0/0 0/0 0/0 0/0 🔒 └── safe-regex-macro 0.3.0
0/0 0/0 0/0 0/0 0/0 🔒 ├── safe-proc-macro2 1.0.68
0/0 0/0 0/0 0/0 0/0 🔒 │ └── unicode-xid 0.2.4
0/0 0/0 0/0 0/0 0/0 🔒 └── safe-regex-compiler 0.3.0
0/0 0/0 0/0 0/0 0/0 🔒 ├── safe-proc-macro2 1.0.68
0/0 0/0 0/0 0/0 0/0 🔒 └── safe-quote 1.0.15
0/0 0/0 0/0 0/0 0/0 🔒 └── safe-proc-macro2 1.0.68
0/0 0/0 0/0 0/0 0/0
Examples
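use safe_regex::{regex, Matcher0};
// Matcher0 has no capturing groups; it only reports whether the input matches.
let matcher: Matcher0<_> =
    regex!(br"[ab][0-9]*");
assert!(matcher.is_match(b"a42"));
assert!(!matcher.is_match(b"X"));

use safe_regex::{regex, Matcher3};
// Matcher3 has three capturing groups.
let matcher: Matcher3<_> =
    regex!(br"([ab])([0-9]*)(suffix)?");
// match_slices returns the bytes captured by each group.
let (prefix, digits, suffix) =
    matcher.match_slices(b"a42").unwrap();
assert_eq!(b"a", prefix);
assert_eq!(b"42", digits);
assert_eq!(b"", suffix);
// match_ranges returns the index range captured by each group.
let (prefix_range, digits_r, suffix_r)
    = matcher.match_ranges(b"a42").unwrap();
assert_eq!(0..1_usize, prefix_range);
assert_eq!(1..3_usize, digits_r);
assert_eq!(0..0_usize, suffix_r);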
Changelog
- v0.3.0 - Add assert_match and default std feature.
- v0.2.6 - Fix some Clippy warnings on regex! macro invocation sites.
- v0.2.5 - Fix no_std. Thank you, Soares Chen! github.com/soareschen gitlab.com/soareschen-informal
- v0.2.4
  - Bug fixes, reducing performance.
  - Optimize non-match runtime.
- v0.2.3
  - Rename match_all -> match_slices.
  - Add match_ranges.
- v0.2.2 - Simplify match_all return type
- v0.2.1 - Non-capturing groups, bug fixes
- v0.2.0
  - Linear-time & constant-memory algorithm! :)
  - Work around rustc optimizer hang on regexes with exponential execution paths like "a{,30}". See src/bin/uncompilable/main.rs.
- v0.1.1 - Bug fixes and more tests.
- v0.1.0 - First published version
TO DO
- 11+ capturing groups
- Increase coverage
- Add fuzzing tests
- Common character classes: whitespace, letters, punctuation, etc.
- Match strings
- Repeated capturing groups: (ab|cd)*. Idea: Return a MatcherNIter struct that is an iterator that returns MatcherN structs.
- Implement optimizations explained in https://swtch.com/%7Ersc/regexp/regexp3.html . Some of the code already exists in tests/dfa_single_pass.rs and tests/nfa_without_capturing.rs.
- Once const generics are stable, use the feature to simplify some types.
- Once trait bounds on const fn parameters are stable, make the MatcherN::new functions const.
Development
- An overview of how this library works: https://news.ycombinator.com/item?id=27301320
License: Apache-2.0