This crate provides tools for compiling and using regular expressions.
It is intended as a simple, compiler-checked alternative to the regex crate: regular expressions are checked and compiled at compile time, but only POSIX Extended Regular Expressions* are supported.
§Usage
use ere::prelude::*;

const PHONE_REGEX: Regex<2> = compile_regex!(r"^(\+1 )?[0-9]{3}-[0-9]{3}-[0-9]{4}$");
fn test() {
    assert!(PHONE_REGEX.test("012-345-6789"));
    assert!(PHONE_REGEX.test("987-654-3210"));
    assert!(PHONE_REGEX.test("+1 555-555-5555"));
    assert!(PHONE_REGEX.test("123-555-9876"));
    assert!(!PHONE_REGEX.test("abcd"));
    assert!(!PHONE_REGEX.test("0123456789"));
    assert!(!PHONE_REGEX.test("012--345-6789"));
    assert!(!PHONE_REGEX.test("(555) 555-5555"));
    assert!(!PHONE_REGEX.test("1 555-555-5555"));
}
const COLOR_REGEX: Regex<5> = compile_regex!(
    r"^#?([[:xdigit:]]{2})([[:xdigit:]]{2})([[:xdigit:]]{2})([[:xdigit:]]{2})?$"
);
fn exec() {
    assert_eq!(
        COLOR_REGEX.exec("#000000"),
        Some([
            Some("#000000"),
            Some("00"),
            Some("00"),
            Some("00"),
            None,
        ]),
    );
    assert_eq!(
        COLOR_REGEX.exec("1F2e3D"),
        Some([
            Some("1F2e3D"),
            Some("1F"),
            Some("2e"),
            Some("3D"),
            None,
        ]),
    );
    assert_eq!(
        COLOR_REGEX.exec("ffffff80"),
        Some([
            Some("ffffff80"),
            Some("ff"),
            Some("ff"),
            Some("ff"),
            Some("80"),
        ]),
    );
    assert_eq!(COLOR_REGEX.exec("green"), None);
    assert_eq!(COLOR_REGEX.exec("%FFFFFF"), None);
    assert_eq!(COLOR_REGEX.exec("#2"), None);
}
To minimize memory overhead and binary size, it is recommended to create a single instance of each regular expression (using a const variable) rather than creating multiple instances.
*Some features are not fully implemented, such as POSIX-mode ambiguous submatch rules (we currently use greedy mode, which is the much more common and efficient method). See the roadmap for more details.
§Alternatives
ere is intended as an alternative to regex that provides compile-time checking and regex compilation. However, ere is less featureful, so here are a few reasons you might prefer regex:
- You require more complex regular expressions with features like backreferences and word boundary checking (which are unavailable in POSIX EREs).
- You need run-time-compiled regular expressions (such as when provided by the user).
- Your regular expression runs significantly more efficiently on a specific regex engine not currently available in ere.
Modules§
- config
- dfa_u8: Implements a statically-built DFA over u8s. Executing a DFA is much faster than running an NFA because only a single thread has to be tracked. However, since the upper bound of a DFA’s size is exponential in the number of NFA states, we may need to cancel static construction if the DFA becomes too large.
- fixed_offset: This is a highly efficient implementation for regexes where capture groups always have the same offset and the same length (in bytes). This also means the text has a fixed length.
- flat_lockstep_nfa: Implements an NFA-like regex engine over chars. The engine keeps all threads in lockstep (all threads are at the same input index), and the NFA’s epsilon transitions are flattened to a single epsilon transition between symbols (including handling anchors and capture tags).
- flat_lockstep_nfa_u8: Implements an NFA-like regex engine over u8s. The engine keeps all threads in lockstep (all threads are at the same input index), and the NFA’s epsilon transitions are flattened to a single epsilon transition between symbols (including handling anchors and capture tags).
- nfa_static: Implements a version of WorkingNFA that can be serialized statically into a binary.
- one_pass_u8: This implements an engine for one-pass regexes.
- parse_tree: Implements the ERE parser and primitive types (like Atom).
- prelude: Includes the basic things you’ll need.
- simplified_tree: Implements a simplified intermediate representation of a regular expression.
- visualization: Implements visualization features for NFAs. Typically not used outside debug.
- working_nfa: Implements the primary compile-time intermediate WorkingNFA structure for optimization.
- working_u8_dfa: Working data structure for a tagged DFA over u8s. Primarily intended for use at compile time, converted from [crate::U8NFA].
- working_u8_nfa: Implements a u8-based version of crate::working_nfa.
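To make the dfa_u8 description above more concrete, here is a minimal, self-contained sketch of a table-driven DFA over bytes whose transition table is built in a const context. It does not use ere's API and is not ere's implementation: the hard-coded pattern ^[0-9]+-[0-9]+$ and the names STATES, DEAD, ACCEPT, build_table and is_match are assumptions made up for illustration.

```rust
// Hypothetical sketch, not ere's implementation: a DFA for ^[0-9]+-[0-9]+$.
// States: 0 = start, 1 = first digit run, 2 = just saw '-',
//         3 = second digit run (accepting), 4 = dead.
const STATES: usize = 5;
const ACCEPT: u8 = 3;
const DEAD: u8 = 4; // sink state: no match is possible once entered

// Builds the full byte-indexed transition table in a const context.
const fn build_table() -> [[u8; 256]; STATES] {
    let mut table = [[DEAD; 256]; STATES];
    let mut b = b'0' as usize;
    while b <= b'9' as usize {
        table[0][b] = 1;
        table[1][b] = 1;
        table[2][b] = 3;
        table[3][b] = 3;
        b += 1;
    }
    table[1][b'-' as usize] = 2;
    table
}

// Evaluated at compile time, so the finished table is stored in the binary.
static TABLE: [[u8; 256]; STATES] = build_table();

fn is_match(input: &str) -> bool {
    let mut state = 0u8;
    for &byte in input.as_bytes() {
        // One table lookup per byte: a DFA tracks exactly one state.
        state = TABLE[state as usize][byte as usize];
        if state == DEAD {
            return false;
        }
    }
    state == ACCEPT
}
```

With this sketch, is_match("12-345") returns true, while "12-" and "12--3" are rejected. The table costs 256 bytes per state, which hints at why static construction may have to be abandoned when the number of DFA states grows too large.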
Macros§
- compile_regex: This is the primary entrypoint to the ere crate. Checks and compiles a regular expression into a Regex<N>.
- compile_regex_dfa_u8: Checks and compiles a regular expression into a [ere_core::Regex<N>] with the [ere_core::dfa_u8] engine. Unless you specifically want this engine, you might want to use compile_regex! instead.
- compile_regex_fixed_offset: Checks and compiles a regular expression into a [ere_core::Regex<N>] with the [ere_core::fixed_offset] engine. Unless you specifically want this engine, you might want to use compile_regex! instead.
- compile_regex_flat_lockstep_nfa: Checks and compiles a regular expression into a [ere_core::Regex<N>] with the [ere_core::flat_lockstep_nfa] engine. Unless you specifically want this engine, you might want to use compile_regex! instead.
- compile_regex_flat_lockstep_nfa_u8: Checks and compiles a regular expression into a [ere_core::Regex<N>] with the [ere_core::flat_lockstep_nfa_u8] engine. Unless you specifically want this engine, you might want to use compile_regex! instead.
- compile_regex_u8onepass: Checks and compiles a regular expression into a [ere_core::Regex<N>] with the [ere_core::one_pass_u8] engine. Unless you specifically want this engine, you might want to use compile_regex! instead.
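The flat_lockstep_nfa engines selectable above are described as keeping all threads at the same input index. As a rough illustration of that idea only, the following self-contained sketch simulates a small hand-written NFA for ^(a|b)*abb$ in lockstep, using a bitmask as the thread set. It does not use ere's API or data structures; the pattern, the state numbering, and the names ACCEPT, step and is_match are assumptions, and capture tags and anchors are left out.

```rust
// Hypothetical sketch, not ere's implementation: lockstep simulation of an
// NFA for ^(a|b)*abb$ whose epsilon transitions are already flattened away.
// States: 0 = loop over (a|b)*, 1 = saw "a", 2 = saw "ab", 3 = saw "abb".
const ACCEPT: usize = 3;

/// Returns the set of states (as a bitmask) reachable from `state` on `byte`.
fn step(state: usize, byte: u8) -> u8 {
    match (state, byte) {
        (0, b'a') => 0b0011, // stay in the loop, or start matching "abb"
        (0, b'b') => 0b0001, // stay in the loop
        (1, b'b') => 0b0100,
        (2, b'b') => 0b1000,
        _ => 0, // this thread dies
    }
}

fn is_match(input: &str) -> bool {
    // Bit i set means a thread is currently in state i.
    let mut threads: u8 = 0b0001;
    for &byte in input.as_bytes() {
        // Every live thread consumes the same byte before any thread moves
        // on, so all threads always share one input index.
        let mut next = 0u8;
        for state in 0..=ACCEPT {
            if threads & (1u8 << state) != 0 {
                next |= step(state, byte);
            }
        }
        threads = next;
        if threads == 0 {
            return false;
        }
    }
    threads & (1u8 << ACCEPT) != 0
}
```

Here is_match("babb") returns true and is_match("abba") returns false. Because the thread set is bounded by the number of NFA states, the input is scanned once without backtracking, at the cost of visiting several states per byte.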
Structs§
Functions§
- __compile_regex: Tries to pick the best engine.
- __compile_regex_engine_dfa_u8: Always uses the dfa_u8 engine.
- __compile_regex_engine_fixed_offset: Always uses the fixed_offset engine.
- __compile_regex_engine_flat_lockstep_nfa: Always uses the flat_lockstep_nfa engine.
- __compile_regex_engine_flat_lockstep_nfa_u8: Always uses the flat_lockstep_nfa_u8 engine.
- __compile_regex_engine_one_pass_u8: Always uses the one_pass_u8 engine.
- __construct_regex: Intended to be used in macros only.