proc-macro-regex
A proc macro regex library to match an arbitrary string or byte array to a regular expression.
Usage
Add this to your Cargo.toml:
[dependencies]
proc-macro-regex = "~1.1.0"
Example
The macro regex! creates a function of the given name which takes a string or byte array and returns true if the argument matches the regex, otherwise false.
use proc_macro_regex::regex;
/// Create the function with the signature:
/// fn regex_email(s: &str) -> bool;
regex!;
The given regex works the same as in the regex crate. If ^ is at the beginning of the regex and $ is at the end, the whole string must match; otherwise, the function checks whether the string contains a match for the regex.
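The difference between the anchored and unanchored forms can be illustrated with a small hand-written sketch. It does not use the macro; the pattern `ab+` and the function names `matches_whole` and `contains_match` are hypothetical and only mimic the two behaviours described above.

```rust
/// Hand-written stand-in for the anchored pattern `^ab+$`:
/// the whole input must be an 'a' followed by one or more 'b'.
fn matches_whole(s: &str) -> bool {
    let bytes = s.as_bytes();
    bytes.len() >= 2 && bytes[0] == b'a' && bytes[1..].iter().all(|&b| b == b'b')
}

/// Hand-written stand-in for the unanchored pattern `ab+`:
/// it is enough that "ab" occurs somewhere in the input.
fn contains_match(s: &str) -> bool {
    s.as_bytes().windows(2).any(|w| w == b"ab")
}
```

With these definitions, an input such as "xabby" is rejected by the anchored variant but accepted by the unanchored one.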
How it works
The macro builds a deterministic finite automaton (DFA) that parses the given input. Depending on the size of the DFA and the characters in the regex, it generates either a lookup table or a code-based implementation (binary search). If the lookup table would be larger than 65536 bytes (this limit can be changed), the code-based implementation is used. The code-based implementation is also used if the regex contains any Unicode (non-ASCII) character.
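To make the lookup-table idea concrete, here is a minimal hand-written sketch of a table-driven DFA. It is not the macro's actual output: the pattern `^[0-9]+\.[0-9]+$`, the names `TABLE`, `class_of` and `is_decimal`, and the grouping of input bytes into three classes are all assumptions made to keep the example short. A generated table may well be indexed by the raw byte instead.

```rust
/// State that can never lead to a match.
const DEAD: u8 = 4;

/// Transition table: rows are states, columns are input classes
/// (0 = ASCII digit, 1 = '.', 2 = any other byte).
const TABLE: [[u8; 3]; 5] = [
    [1, DEAD, DEAD],    // 0: start
    [1, 2, DEAD],       // 1: inside the integer part
    [3, DEAD, DEAD],    // 2: just after the '.'
    [3, DEAD, DEAD],    // 3: inside the fraction (accepting)
    [DEAD, DEAD, DEAD], // 4: dead state
];

/// Map an input byte to its column in `TABLE`.
fn class_of(b: u8) -> usize {
    match b {
        b'0'..=b'9' => 0,
        b'.' => 1,
        _ => 2,
    }
}

/// Run the DFA over the input; accept only if it ends in state 3.
fn is_decimal(s: &str) -> bool {
    let mut state = 0u8;
    for &b in s.as_bytes() {
        state = TABLE[state as usize][class_of(b)];
        if state == DEAD {
            return false; // no way back from the dead state
        }
    }
    state == 3
}
```

Each input byte costs one table lookup, which is why this form is attractive as long as the table stays small.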
The following macro generates the following code:
regex!;
Generates:
To limit the lookup table to at most 256 bytes, pass the limit as a third argument. In this example, a code-based implementation (binary search) of the DFA is generated as a result.
regex!;
Generates:
To change the visibility of the function, add the keywords at the beginning of the arguments.
regex!;
Generates:
To parse a byte array instead of a string, pass a byte string.
regex!;
Generates:
The generated code should work with #![no_std], too.
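The code-based (binary search) variant described in this section can be pictured with another hand-written sketch. Again, this is not what the macro emits: the pattern `^[a-zα-ω]+$`, the `Edge` struct and the function names are hypothetical. The transitions are stored as sorted character ranges and looked up with a binary search, which also works for non-ASCII characters. The sketch uses no heap allocation and no imports, so it stays within what a #![no_std] crate could use.

```rust
/// One transition: an inclusive character range and the state it leads to.
struct Edge {
    lo: char,
    hi: char,
    next: usize,
}

/// Transitions sorted by `lo`. State 0 is the start state and state 1 the
/// accepting state; in this tiny DFA both states share the same edges.
static EDGES: [Edge; 2] = [
    Edge { lo: 'a', hi: 'z', next: 1 },
    Edge { lo: 'α', hi: 'ω', next: 1 },
];

/// Binary search for the range containing `c`.
fn step(c: char) -> Option<usize> {
    let (mut lo, mut hi) = (0usize, EDGES.len());
    while lo < hi {
        let mid = lo + (hi - lo) / 2;
        let e = &EDGES[mid];
        if c < e.lo {
            hi = mid; // `c` lies before this range
        } else if c > e.hi {
            lo = mid + 1; // `c` lies after this range
        } else {
            return Some(e.next);
        }
    }
    None
}

/// Accept non-empty strings of lowercase Latin or Greek letters.
fn is_lower_latin_or_greek(s: &str) -> bool {
    let mut state = 0usize;
    for c in s.chars() {
        match step(c) {
            Some(next) => state = next,
            None => return false, // no transition: reject
        }
    }
    state == 1
}
```

Compared with a lookup table, each step costs a few comparisons instead of one indexed load, but the memory footprint grows with the number of ranges rather than with the size of the alphabet.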
proc-macro-regex vs regex
Advantages:
- Compile-time (no runtime initialization, no lazy-static)
- Generated code that does not contain any dependencies
- No heap allocation
- Approximately 12%-68% faster for non-trivial regexes [^1]
[^1]: Tested against the regex crate in benches/compare.rs. For pattern/word matching it is slower because the regex library uses aho-corasick. (See Performance)
Disadvantages:
- Currently, no group captures
- No runtime regex generation
Performance
This is the performance comparison between this crate and the regex crate. To run it yourself, use cargo bench --bench compare.
| Name | proc-macro-regex | regex | Result |
|---|---|---|---|
| | 743.95 MiB/s | 441.67 MiB/s | 68.44 % |
| URL | 584.62 MiB/s | 519.00 MiB/s | 12.64 % |
| IPv6 | 746.92 MiB/s | 473.38 MiB/s | 57.78 % |
This was compiled with rustc 1.53.0-nightly (392ba2ba1 2021-04-17).
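The Result column appears to be the relative throughput gain of proc-macro-regex over regex. Under that assumption, the figures can be reproduced from the two throughput columns with a small helper (the name `gain_percent` is hypothetical):

```rust
/// Relative throughput gain of `ours` over `theirs`, in percent.
/// Both arguments must use the same unit (here MiB/s).
fn gain_percent(ours: f64, theirs: f64) -> f64 {
    (ours / theirs - 1.0) * 100.0
}
```

For instance, `gain_percent(584.62, 519.00)` comes out at roughly 12.64, matching the URL row.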
License
This project is licensed under the BSD-3-Clause license.
Contribution
Any contribution intentionally submitted for inclusion in proc-macro-regex by you shall be licensed as BSD-3-Clause, without any additional terms or conditions.