Crate vectorscan
Wrapper for the vectorscan C regex library.
Quirks
The vectorscan library (originally hyperscan, from Intel) supports high-performance pattern matching using a subset of PCRE syntax. It was originally written for extremely low-latency network traffic monitoring, so it has some interface quirks that may be unfamiliar:
- Vectorscan Callback API: Matches are “returned” to the user when vectorscan executes a user-provided C ABI method call, so overlapping matches and other interactive feedback with the matching engine are much easier to support compared to a synchronous method call.
- Highly Expressive Pattern Set Matching: expression::ExpressionSet supports the full range of searching and matching operations available to individual expression::Expression instances. This is rare: most other regex engines, for example, report only which expressions in a set matched and do not support finding match offsets.
- Mutable State and String Searching: Vectorscan requires the user to explicitly provide a “scratch” space with state::Scratch to each search method. This state is not very large, but most other regex engines attempt to present an interface without any mutable state, even if internally they use constructions like lazy DFAs.
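The callback and scratch-space quirks above can be illustrated with a minimal, self-contained sketch. It does not use this crate's API: `Scratch`, `Flow`, and `scan` are hypothetical names, and a naive literal search stands in for the real matching engine. It only shows the shape of the interface: the caller passes mutable working memory to each search, and every match (including overlapping ones, with offsets) is delivered through a callback that can stop the scan.

```rust
/// Hypothetical reusable working memory, standing in for a scratch space.
#[derive(Default)]
struct Scratch {
    candidates: Vec<usize>,
}

/// Hypothetical signal returned by the match callback.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Flow {
    Continue,
    Stop,
}

/// Naive multi-literal scan (not vectorscan's algorithm). For every end
/// offset, reports each pattern ending there as (pattern index, start, end).
fn scan<F>(patterns: &[&str], haystack: &str, scratch: &mut Scratch, mut on_match: F)
where
    F: FnMut(usize, usize, usize) -> Flow,
{
    let bytes = haystack.as_bytes();
    for end in 1..=bytes.len() {
        // Reuse the caller-provided buffer instead of allocating per step.
        scratch.candidates.clear();
        for (id, pat) in patterns.iter().enumerate() {
            if !pat.is_empty() && bytes[..end].ends_with(pat.as_bytes()) {
                scratch.candidates.push(id);
            }
        }
        for &id in &scratch.candidates {
            let start = end - patterns[id].len();
            if on_match(id, start, end) == Flow::Stop {
                return;
            }
        }
    }
}

fn main() {
    let mut scratch = Scratch::default();
    let mut found = Vec::new();
    scan(&["aba", "ba"], "ababa", &mut scratch, |id, start, end| {
        found.push((id, start, end));
        Flow::Continue
    });
    println!("{:?}", found);
}
```

Because the callback decides whether scanning continues, the caller can stop after the first match or collect every overlapping match without the engine needing a separate method for each case.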
Feature Flags
This library uses spack-rs to configure the build of the vectorscan codebase using spack, so it can be precise about which native dependencies it brings in:
- "static" (default): link against vectorscan statically. Conflicts with "dynamic".
- "dynamic": link against vectorscan dynamically. Conflicts with "static", "chimera", and "alloc". Because of spack’s caching and RPATH rewriting, the same dynamic library can be shared by every dependency of this crate.
- "compiler" (default): bring in the entire libhs library. When disabled, only libhs_runtime is linked, which is unable to compile patterns but can deserialize them; this significantly reduces the size of the code added to the binary.
- "chimera": link against PCRE and add in extra vectorscan code to provide the chimera PCRE-compatible search library. Conflicts with "dynamic" and requires "compiler".
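The conflict and requirement rules in the list above can be summarized as a small checker. This is only an illustrative sketch: `check_features` is a hypothetical helper, not part of this crate or of Cargo, and it encodes just the rules stated for these four linking features.

```rust
/// Hypothetical helper: validates a set of feature names against the
/// linking rules listed above. Returns the first violated rule as an error.
fn check_features(enabled: &[&str]) -> Result<(), String> {
    let has = |name: &str| enabled.iter().any(|f| *f == name);

    if has("static") && has("dynamic") {
        return Err("\"static\" conflicts with \"dynamic\"".to_string());
    }
    if has("dynamic") && has("chimera") {
        return Err("\"dynamic\" conflicts with \"chimera\"".to_string());
    }
    if has("dynamic") && has("alloc") {
        return Err("\"dynamic\" conflicts with \"alloc\"".to_string());
    }
    if has("chimera") && !has("compiler") {
        return Err("\"chimera\" requires \"compiler\"".to_string());
    }
    Ok(())
}

fn main() {
    for set in [
        &["static", "compiler"][..],
        &["dynamic", "chimera", "compiler"][..],
        &["static", "chimera"][..],
    ]
    .iter()
    {
        println!("{:?} -> {:?}", set, check_features(set));
    }
}
```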
Feature flags are also used to gate certain functionality to minimize external dependencies when not in use:
- "alloc": hook into vectorscan’s dynamic memory allocation with crate::alloc. Requires "static" due to modifying process-global hooks.
- "stream" (default): supports stream parsing with crate::stream.
- "vectored" (default): supports vectored mode parsing with Mode::VECTORED.
- "catch-unwind" (default): catches Rust panics in the match callback before they bubble back up to vectorscan to produce undefined behavior.
- "async": provides an async interface over vectorscan’s quirky callback API using tokio, as described in Asynchronous String Scanning.
- "tokio-impls": implements tokio::io::AsyncWrite for stream parsers in crate::stream::channel::AsyncStreamWriter.
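To illustrate the idea behind "catch-unwind", here is a minimal sketch of catching a panic inside a C ABI callback so that it never unwinds into foreign code. It is not this crate's implementation: `CallbackState`, `trampoline`, `fake_engine`, and `scan_with` are hypothetical names, the callback signature is simplified, and `fake_engine` is a plain Rust loop standing in for the C library. The only assumption carried over from the C interface is that a nonzero return value asks the engine to stop matching.

```rust
use std::any::Any;
use std::os::raw::{c_int, c_void};
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical per-scan context handed to the callback as a raw pointer.
struct CallbackState<F> {
    handler: F,
    panic_payload: Option<Box<dyn Any + Send + 'static>>,
}

/// C ABI trampoline: runs the Rust handler under `catch_unwind` and stores
/// any panic payload instead of letting it cross the FFI boundary.
extern "C" fn trampoline<F>(from: u64, to: u64, ctx: *mut c_void) -> c_int
where
    F: FnMut(u64, u64) -> bool,
{
    // Safety: `ctx` is always a valid `*mut CallbackState<F>` in this sketch.
    let state = unsafe { &mut *(ctx as *mut CallbackState<F>) };
    match panic::catch_unwind(AssertUnwindSafe(|| (state.handler)(from, to))) {
        Ok(true) => 0,  // keep matching
        Ok(false) => 1, // handler asked to stop
        Err(payload) => {
            state.panic_payload = Some(payload);
            1 // stop matching; the panic is reported after the scan returns
        }
    }
}

/// Stand-in for the C engine: reports three fixed matches to the callback.
fn fake_engine(cb: extern "C" fn(u64, u64, *mut c_void) -> c_int, ctx: *mut c_void) {
    for &(from, to) in &[(0u64, 3u64), (2, 5), (4, 7)] {
        if cb(from, to, ctx) != 0 {
            break;
        }
    }
}

/// Runs a scan and returns the panic payload, if the handler panicked.
fn scan_with<F>(handler: F) -> Result<(), Box<dyn Any + Send + 'static>>
where
    F: FnMut(u64, u64) -> bool,
{
    let mut state = CallbackState { handler, panic_payload: None };
    fake_engine(
        trampoline::<F>,
        &mut state as *mut CallbackState<F> as *mut c_void,
    );
    match state.panic_payload.take() {
        Some(payload) => Err(payload),
        None => Ok(()),
    }
}

fn main() {
    let result = scan_with(|from, to| {
        if from == 2 {
            panic!("handler failed at {}..{}", from, to);
        }
        true
    });
    if let Err(payload) = result {
        let msg = payload
            .downcast_ref::<String>()
            .map(String::as_str)
            .unwrap_or("<non-string panic>");
        println!("callback panicked: {}", msg);
    }
}
```

Storing the payload and returning it after the scan lets the caller decide whether to resume the panic with `std::panic::resume_unwind` or convert it into an error value.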
Modules
- alloc (feature "alloc"): Routines for overriding the allocators used in several components of vectorscan.
- Compile state machines from expressions or deserialize them from bytes.
- Errors returned by methods in this library.
- expression (feature "compiler"): FFI wrappers for different types of pattern strings.
- flags (feature "compiler"): Integer bitsets used to configure pattern compilation.
- Types used in vectorscan match callbacks.
- Wrappers for types of string data which can be searched and indexed.
- Allocate and initialize mutable scratch space required for string searching.
- stream (feature "stream"): Higher-level wrappers to manage state needed for stream parsing.
Functions
- check_valid_platform (feature "compiler"): Utility function to test the current system architecture.
- chimera_version (feature "chimera"): Utility function for identifying this release version.
- Utility function for identifying this release version.