Struct ExpressionSet

pub struct ExpressionSet<'a> { /* private fields */ }
Available on crate feature compiler only.

Collection of regular expressions.

This is the main entry point to vectorscan’s primary functionality: matching a whole set of patterns against the input at once. Many other regex engines support this poorly, or with fewer features than they offer for single-pattern matching.

This struct provides a by-value builder interface: each configuration method consumes self and returns Self, attaching additional configuration to the initial set of patterns constructed with Self::from_exprs().

Implementations

impl<'a> ExpressionSet<'a>

pub fn from_exprs(exprs: impl IntoIterator<Item = &'a Expression>) -> Self

Construct a pattern set from references to parsed expressions.

The length of this initial exprs argument is returned by Self::len(), and every subsequent configuration method checks that the iterator it receives has the same length:

 use vectorscan::expression::*;

 let a: Expression = "a+".parse().unwrap();
 // Fails due to argument length mismatch:
 ExpressionSet::from_exprs([&a])
   .with_flags([]);
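
The length check can be pictured with a small standalone model. This is only a sketch using the standard library: `ToySet` and `ToyFlags` are hypothetical names, not vectorscan types, and it assumes the mismatch is reported by panicking, which the `Self` (rather than `Result`) return type suggests.

```rust
// Toy model only: hypothetical types, not the vectorscan implementation.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ToyFlags(u8);

#[derive(Debug)]
struct ToySet<'a> {
    exprs: Vec<&'a str>,
    flags: Option<Vec<ToyFlags>>,
}

impl<'a> ToySet<'a> {
    /// The number of patterns is fixed here, once, for the whole builder chain.
    fn from_exprs(exprs: impl IntoIterator<Item = &'a str>) -> Self {
        Self { exprs: exprs.into_iter().collect(), flags: None }
    }

    fn len(&self) -> usize {
        self.exprs.len()
    }

    /// Consumes the builder and returns it, after checking for one flag per pattern.
    fn with_flags(mut self, flags: impl IntoIterator<Item = ToyFlags>) -> Self {
        let flags: Vec<ToyFlags> = flags.into_iter().collect();
        // Assumption: a length mismatch panics instead of returning an error.
        assert_eq!(flags.len(), self.len(), "expected one flags entry per pattern");
        self.flags = Some(flags);
        self
    }

    fn flags(&self) -> Option<&[ToyFlags]> {
        self.flags.as_deref()
    }
}
```

Because every method takes `self` by value, a chain such as `ToySet::from_exprs(["a+", "b+"]).with_flags([ToyFlags(1), ToyFlags(2)])` reads the same way as the real builder calls.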

pub fn with_flags(self, flags: impl IntoIterator<Item = Flags>) -> Self

Provide flags which modify the behavior of each expression.

The length of flags is checked to be the same as Self::len().

If this builder method is not used, Flags::default() will be assigned to all patterns.

 use vectorscan::{expression::*, flags::*, matchers::*};

 // Create two expressions to demonstrate separate flags for each pattern:
 let a: Expression = "a+[^a]".parse()?;
 let b: Expression = "b+[^b]".parse()?;

 // Get the start of match for one pattern, but not the other:
 let db = ExpressionSet::from_exprs([&a, &b])
   .with_flags([Flags::default(), Flags::SOM_LEFTMOST])
   .compile(Mode::BLOCK)?;

 let mut scratch = db.allocate_scratch()?;

 let mut matches: Vec<&str> = Vec::new();
 scratch.scan_sync(&db, "aardvark imbibbe".into(), |m| {
   matches.push(unsafe { m.source.as_str() });
   MatchResult::Continue
 })?;
 // Start of match is preserved for only one pattern:
 assert_eq!(&matches, &["aar", "aardvar", "bi", "bbe"]);
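
The asserted strings follow from how the reported span is formed: without Flags::SOM_LEFTMOST the span handed to the callback begins at the start of the input, while with it the span begins where the match actually started. A minimal standalone sketch of that rule, where `reported_source` is a hypothetical helper and not part of the crate:

```rust
/// Toy model of the text handed to the match callback.
/// `start..end` is the byte range where the pattern actually matched in `input`.
fn reported_source(input: &str, start: usize, end: usize, som_leftmost: bool) -> &str {
    // Without start-of-match tracking, only the end offset is known,
    // so the reported span begins at offset 0.
    let from = if som_leftmost { start } else { 0 };
    &input[from..end]
}
```

For the input `"aardvark imbibbe"`, the second hit of `a+[^a]` covers bytes 5..7 (`"ar"`), but is reported as `"aardvar"` because that pattern uses Flags::default().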

pub fn with_ids(self, ids: impl IntoIterator<Item = ExprId>) -> Self

Assign an ID number to each pattern.

The length of ids is checked to be the same as Self::len(). Multiple patterns can be assigned the same ID.

If this builder method is not used, vectorscan assigns the ID number 0 to every pattern:

 use vectorscan::{expression::*, flags::*, state::*, matchers::*, sources::*};

 // Create two expressions to demonstrate multiple pattern IDs.
 let a: Expression = "a+[^a]".parse()?;
 let b: Expression = "b+[^b]".parse()?;

 // Create one db without ID numbers, and one with them.
 let set1 = ExpressionSet::from_exprs([&a, &b]).compile(Mode::BLOCK)?;
 let set2 = ExpressionSet::from_exprs([&a, &b])
   .with_ids([ExprId(300), ExprId(12)])
   .compile(Mode::BLOCK)?;

 let mut scratch = Scratch::blank();
 scratch.setup_for_db(&set1)?;
 scratch.setup_for_db(&set2)?;

 let msg: ByteSlice = "aardvark imbibbe".into();

 // The first db doesn't differentiate matches by ID number:
 let mut matches1: Vec<ExpressionIndex> = Vec::new();
 scratch.scan_sync(&set1, msg, |m| {
   matches1.push(m.id);
   MatchResult::Continue
 })?;
 assert_eq!(
   &matches1,
   &[ExpressionIndex(0), ExpressionIndex(0), ExpressionIndex(0), ExpressionIndex(0)],
 );

 // The second db returns corresponding ExpressionIndex instances:
 let mut matches2: Vec<ExpressionIndex> = Vec::new();
 scratch.scan_sync(&set2, msg, |m| {
   matches2.push(m.id);
   MatchResult::Continue
 })?;
 assert_eq!(
   &matches2,
   &[ExpressionIndex(300), ExpressionIndex(300), ExpressionIndex(12), ExpressionIndex(12)],
 );
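
The mapping from matching pattern to reported ID can be sketched without the library. `reported_ids` below is a hypothetical helper over plain integers; it only models the rule stated above (ID 0 for everything unless IDs were assigned, and shared IDs allowed).

```rust
/// Toy model: translate the position of each matching pattern in the set
/// into the ID number that the match callback would see.
fn reported_ids(ids: Option<&[u32]>, matched_patterns: &[usize]) -> Vec<u32> {
    matched_patterns
        .iter()
        .map(|&pattern| match ids {
            // IDs were assigned: look up the ID for this pattern's position.
            Some(ids) => ids[pattern],
            // No IDs were assigned: every pattern reports 0.
            None => 0,
        })
        .collect()
}
```

When two patterns share an ID, as in `Some(&[7, 7])`, their matches cannot be told apart in the callback, which is useful when several patterns stand for the same logical rule.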

pub fn with_exts(self, exts: impl IntoIterator<Item = Option<&'a ExprExt>>) -> Self

Optionally assign ExprExt configuration to each pattern.

This is the only available entry point to compiling a database with ExprExt configuration for a given pattern (i.e. the single expression compiler does not support extended configuration).

If Expression::ext_info() succeeds with a given Expression/ExprExt pair, then compiling the same pattern and configuration into a vectorscan database via an expression set with this method is likely but not guaranteed to succeed.

 use vectorscan::{expression::*, flags::*, matchers::*};

 // Apply extended configuration to one version of the pattern, but not the other:
 let a: Expression = "a.*b".parse()?;
 let a_ext = ExprExt::from_min_length(4);
 let set = ExpressionSet::from_exprs([&a, &a])
   .with_exts([Some(&a_ext), None])
   .with_ids([ExprId(1), ExprId(2)])
   .compile(Mode::BLOCK)?;
 let mut scratch = set.allocate_scratch()?;

 // The configured pattern does not match because of its min length attribute:
 let mut matches: Vec<ExpressionIndex> = Vec::new();
 scratch.scan_sync(&set, "ab".into(), |m| {
   matches.push(m.id);
   MatchResult::Continue
 })?;
 assert_eq!(&matches, &[ExpressionIndex(2)]);

 // However, both patterns match a longer input:
 matches.clear();
 scratch.scan_sync(&set, "asssssb".into(), |m| {
   matches.push(m.id);
   MatchResult::Continue
 })?;
 assert_eq!(&matches, &[ExpressionIndex(1), ExpressionIndex(2)]);
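
The effect of the minimum length attribute in this example can be modelled as a filter on the match span. This is a standalone sketch: `passes_min_length` is a hypothetical helper, and it assumes the attribute means "the match must span at least this many bytes from start to end".

```rust
/// Toy model: does a match covering `start..end` satisfy an optional minimum length?
fn passes_min_length(start: usize, end: usize, min_length: Option<usize>) -> bool {
    match min_length {
        // Extended configuration present: the span must be long enough.
        Some(min) => end - start >= min,
        // No extended configuration (the `None` entry): nothing to check.
        None => true,
    }
}
```

With a minimum of 4, the 2-byte match on `"ab"` is rejected for the configured pattern, while the 7-byte match on `"asssssb"` passes for both.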

pub fn compile(self, mode: Mode) -> Result<Database, VectorscanCompileError>

Call Database::compile_multi() with None for the platform.


pub fn len(&self) -> usize

The number of patterns in this set.


pub fn is_empty(&self) -> bool

Whether this set contains no patterns, i.e. whether Self::len() is 0.

Trait Implementations

impl<'a> Clone for ExpressionSet<'a>

fn clone(&self) -> ExpressionSet<'a>

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl<'a> Debug for ExpressionSet<'a>

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

Auto Trait Implementations

impl<'a> Freeze for ExpressionSet<'a>

impl<'a> RefUnwindSafe for ExpressionSet<'a>

impl<'a> !Send for ExpressionSet<'a>

impl<'a> !Sync for ExpressionSet<'a>

impl<'a> Unpin for ExpressionSet<'a>

impl<'a> UnwindSafe for ExpressionSet<'a>

Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬 This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.