Crate ere

This crate provides tools for compiling and using regular expressions. It is intended as a simple, compiler-checked alternative to the regex crate: expressions are validated and compiled at compile time, but only POSIX Extended Regular Expressions* are supported.

§Usage

use ere::prelude::*;

const PHONE_REGEX: Regex<2> = compile_regex!(r"^(\+1 )?[0-9]{3}-[0-9]{3}-[0-9]{4}$");
fn test() {
    assert!(PHONE_REGEX.test("012-345-6789"));
    assert!(PHONE_REGEX.test("987-654-3210"));
    assert!(PHONE_REGEX.test("+1 555-555-5555"));
    assert!(PHONE_REGEX.test("123-555-9876"));

    assert!(!PHONE_REGEX.test("abcd"));
    assert!(!PHONE_REGEX.test("0123456789"));
    assert!(!PHONE_REGEX.test("012--345-6789"));
    assert!(!PHONE_REGEX.test("(555) 555-5555"));
    assert!(!PHONE_REGEX.test("1 555-555-5555"));
}

const COLOR_REGEX: Regex<5> = compile_regex!(
    r"^#?([[:xdigit:]]{2})([[:xdigit:]]{2})([[:xdigit:]]{2})([[:xdigit:]]{2})?$"
);
fn exec() {
    assert_eq!(
        COLOR_REGEX.exec("#000000"),
        Some([
            Some("#000000"),
            Some("00"),
            Some("00"),
            Some("00"),
            None,
        ]),
    );
    assert_eq!(
        COLOR_REGEX.exec("1F2e3D"),
        Some([
            Some("1F2e3D"),
            Some("1F"),
            Some("2e"),
            Some("3D"),
            None,
        ]),
    );
    assert_eq!(
        COLOR_REGEX.exec("ffffff80"),
        Some([
            Some("ffffff80"),
            Some("ff"),
            Some("ff"),
            Some("ff"),
            Some("80"),
        ]),
    );

    assert_eq!(COLOR_REGEX.exec("green"), None);
    assert_eq!(COLOR_REGEX.exec("%FFFFFF"), None);
    assert_eq!(COLOR_REGEX.exec("#2"), None);
}

To minimize memory overhead and binary size, it is recommended to create a single instance of each regular expression (using a const variable) and reuse it, rather than compiling the same expression in several places.

*Some features are not fully implemented, such as POSIX-mode ambiguous submatch rules (we currently use greedy mode, which is the much more common and efficient method). See the roadmap for more details.

§Alternatives

ere is intended as an alternative to regex that provides compile-time checking and regex compilation. However, ere has fewer features, so here are a few reasons you might prefer regex:

  • You require features outside POSIX ERE syntax, such as word boundary assertions. (Backreferences are also absent from POSIX EREs, but the regex crate does not support them either.)
  • You need run-time-compiled regular expressions (such as when provided by the user).
  • Your regular expression runs significantly more efficiently on a specific regex engine not currently available in ere.

Modules§

config
dfa_u8
Implements a statically built DFA over u8s. Executing a DFA is much faster than running an NFA because only a single thread has to be tracked. However, since the upper bound of a DFA’s size is exponential in the number of NFA states, static construction may need to be cancelled if the DFA becomes too large.
fixed_offset
This is a highly-efficient implementation for regexes where capture groups always have the same offset and the same length (in bytes). This also means the text has a fixed length.
flat_lockstep_nfa
Implements an NFA-like regex engine over chars. The engine keeps all threads in lockstep (all threads are at the same input index), and the NFA’s epsilon transitions are flattened to a single epsilon transition between symbols (including handling anchors and capture tags).
flat_lockstep_nfa_u8
Implements an NFA-like regex engine over u8s. The engine keeps all threads in lockstep (all threads are at the same input index), and the NFA’s epsilon transitions are flattened to a single epsilon transition between symbols (including handling anchors and capture tags).
nfa_static
Implements a version of WorkingNFA that can be serialized statically into a binary.
one_pass_u8
This implements an engine for one-pass regexes.
parse_tree
Implements the ERE parser and primitive types (like Atom).
prelude
Includes the basic things you’ll need.
simplified_tree
Implements a simplified intermediate representation of a regular expression.
visualization
Implements visualization features for NFAs. Typically not used outside debug.
working_nfa
Implements the primary compile-time intermediate WorkingNFA structure for optimization.
working_u8_dfa
Working data structure for a tagged DFA over u8s. Primarily intended for use at compile time, converted from crate::U8NFA.
working_u8_nfa
Implements u8-based version of crate::working_nfa.
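
The trade-off between the lockstep NFA engines and the dfa_u8 engine described above can be illustrated with a small standalone sketch. It is not taken from ere's source: the Nfa and Dfa types, the determinize function, its max_states parameter, and the example language (byte strings over a and b whose second-to-last byte is a) are all hypothetical, and real engines additionally handle anchors and capture tags. The sketch assumes an NFA whose epsilon transitions have already been flattened away.

```rust
use std::collections::HashMap;

/// Hypothetical epsilon-free NFA over bytes. State sets are bitmasks,
/// so this sketch assumes at most 64 states.
struct Nfa {
    transitions: Vec<(usize, u8, usize)>, // (from, byte, to)
    start: usize,
    accept: u64, // bitmask of accepting states
}

impl Nfa {
    /// Advances every live thread by one byte; all threads share one input index.
    fn step(&self, states: u64, byte: u8) -> u64 {
        let mut next = 0u64;
        for &(from, b, to) in &self.transitions {
            if b == byte && states & (1u64 << from) != 0 {
                next |= 1u64 << to;
            }
        }
        next
    }

    /// Lockstep simulation of a whole-input match, without backtracking.
    fn test(&self, input: &[u8]) -> bool {
        let mut states = 1u64 << self.start;
        for &byte in input {
            states = self.step(states, byte);
        }
        states & self.accept != 0
    }
}

/// Hypothetical DFA: one transition map per state; state 0 is the start state.
struct Dfa {
    table: Vec<HashMap<u8, usize>>,
    accepting: Vec<bool>,
}

impl Dfa {
    /// Only a single current state is tracked per input byte.
    fn test(&self, input: &[u8]) -> bool {
        let mut state = 0;
        for byte in input {
            match self.table[state].get(byte) {
                Some(&next) => state = next,
                None => return false,
            }
        }
        self.accepting[state]
    }
}

/// Subset construction that gives up (returns None) as soon as more than
/// `max_states` DFA states would be needed.
fn determinize(nfa: &Nfa, max_states: usize) -> Option<Dfa> {
    let start = 1u64 << nfa.start;
    let mut ids: HashMap<u64, usize> = HashMap::from([(start, 0)]);
    let mut sets = vec![start];
    let mut table: Vec<HashMap<u8, usize>> = vec![HashMap::new()];
    let mut i = 0;
    while i < sets.len() {
        let set = sets[i];
        for &(_, byte, _) in &nfa.transitions {
            let next = nfa.step(set, byte);
            if next == 0 || table[i].contains_key(&byte) {
                continue;
            }
            let id = match ids.get(&next).copied() {
                Some(id) => id,
                None => {
                    if sets.len() >= max_states {
                        return None; // DFA grew too large: cancel construction
                    }
                    let id = sets.len();
                    ids.insert(next, id);
                    sets.push(next);
                    table.push(HashMap::new());
                    id
                }
            };
            table[i].insert(byte, id);
        }
        i += 1;
    }
    let accepting = sets.iter().map(|s| s & nfa.accept != 0).collect();
    Some(Dfa { table, accepting })
}

/// Example NFA: accepts strings over {a, b} whose second-to-last byte is `a`.
fn second_to_last_is_a() -> Nfa {
    Nfa {
        transitions: vec![(0, b'a', 0), (0, b'b', 0), (0, b'a', 1), (1, b'a', 2), (1, b'b', 2)],
        start: 0,
        accept: 1 << 2,
    }
}
```

In this sketch the three-state NFA determinizes into four DFA states, so calling determinize with a limit of 3 is cancelled while a limit of 4 succeeds. A caller that gets None back would presumably fall back to an NFA-based engine.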

Macros§

compile_regex
This is the primary entrypoint to the ere crate. Checks and compiles a regular expression into a Regex<N>.
compile_regex_dfa_u8
Checks and compiles a regular expression into an ere_core::Regex<N> with the ere_core::dfa_u8 engine. Unless you specifically want this engine, you might want to use compile_regex! instead.
compile_regex_fixed_offset
Checks and compiles a regular expression into an ere_core::Regex<N> with the ere_core::fixed_offset engine. Unless you specifically want this engine, you might want to use compile_regex! instead.
compile_regex_flat_lockstep_nfa
Checks and compiles a regular expression into an ere_core::Regex<N> with the ere_core::flat_lockstep_nfa engine. Unless you specifically want this engine, you might want to use compile_regex! instead.
compile_regex_flat_lockstep_nfa_u8
Checks and compiles a regular expression into an ere_core::Regex<N> with the ere_core::flat_lockstep_nfa_u8 engine. Unless you specifically want this engine, you might want to use compile_regex! instead.
compile_regex_u8onepass
Checks and compiles a regular expression into an ere_core::Regex<N> with the ere_core::one_pass_u8 engine. Unless you specifically want this engine, you might want to use compile_regex! instead.
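
To give a feel for why an engine such as fixed_offset can be cheap, here is a hand-written sketch of what matching amounts to when every capture group has a fixed byte offset and length. It is an illustration only, not ere's generated code: the function name exec_fixed and the pattern ^([0-9]{3})-([0-9]{4})$ are hypothetical, and the return type merely mirrors the shape of the exec results shown in the usage section.

```rust
/// Hand-written matcher for the hypothetical pattern `^([0-9]{3})-([0-9]{4})$`.
/// Any match is exactly 8 bytes long, group 1 is always bytes 0..3, and
/// group 2 is always bytes 4..8, so no thread or state tracking is needed.
fn exec_fixed(text: &str) -> Option<[Option<&str>; 3]> {
    let bytes = text.as_bytes();
    // A fixed-offset pattern implies a fixed total length.
    if bytes.len() != 8 {
        return None;
    }
    // Check each position against the byte class required at that offset.
    let digits_ok = bytes[0..3]
        .iter()
        .chain(&bytes[4..8])
        .all(|b| b.is_ascii_digit());
    if !digits_ok || bytes[3] != b'-' {
        return None;
    }
    // All bytes are now known to be ASCII, so slicing the &str at these
    // offsets cannot split a multi-byte character.
    Some([Some(text), Some(&text[0..3]), Some(&text[4..8])])
}
```

With this sketch, exec_fixed("555-1234") yields the whole match plus the groups "555" and "1234", while inputs of any other length are rejected after a single length comparison.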

Structs§

Regex
A regular expression (specifically, a POSIX ERE).

Functions§

__compile_regex
Tries to pick the best engine.
__compile_regex_engine_dfa_u8
Always uses the dfa_u8 engine
__compile_regex_engine_fixed_offset
Always uses the fixed_offset engine
__compile_regex_engine_flat_lockstep_nfa
Always uses the flat_lockstep_nfa engine
__compile_regex_engine_flat_lockstep_nfa_u8
Always uses the flat_lockstep_nfa_u8 engine
__compile_regex_engine_one_pass_u8
Always uses the one_pass_u8 engine
__construct_regex
Intended to be used in macros only.