Struct ByteSlice

pub struct ByteSlice<'a>(/* private fields */);

A slice of u8 with an associated lifetime.

This is currently implemented as a #[repr(transparent)] wrapper over &'a [u8].

Implementations§

impl<'a> ByteSlice<'a>

§Byte-Oriented Interface

Vectorscan can search over arbitrary byte patterns in any encoding, so Self::from_slice() and Self::as_slice() offer the most general byte-oriented interface. This is particularly useful when matching against non-UTF-8 data, for example with Literal pattern strings (non-literal patterns can also be used to match non-UTF-8 data).

From implementations are also provided to convert from references to native arrays and slices:

 use vectorscan::sources::ByteSlice;

 let b1 = ByteSlice::from_slice(b"asdf");
 let b2: ByteSlice = b"asdf".into();
 let b3: ByteSlice = [b'a', b's', b'd', b'f'].as_ref().into();
 assert_eq!(b1, b2);
 assert_eq!(b2, b3);
 assert_eq!(b1.as_slice(), b"asdf");

Note however that a From implementation is not provided to convert from an array [u8; N] by value, as this wrapper requires a lifetime to associate the data with, even if it’s just the local '_ lifetime or the global 'static lifetime.
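
As a rough illustration of why a by-value conversion cannot exist, the sketch below uses a hypothetical stand-in type (Borrowed, which is not part of this crate) that wraps &'a [u8] in the same way. An owned array has to live somewhere, and the wrapper can only borrow from that location:

```rust
/// Hypothetical stand-in for a borrowed byte wrapper; NOT the crate's type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Borrowed<'a>(&'a [u8]);

impl<'a, const N: usize> From<&'a [u8; N]> for Borrowed<'a> {
    fn from(x: &'a [u8; N]) -> Self {
        // &[u8; N] coerces to &[u8]; the lifetime 'a is carried along.
        Borrowed(x)
    }
}

/// Returns the number of bytes the wrapper borrows.
fn borrowed_len(b: Borrowed<'_>) -> usize {
    b.0.len()
}

fn main() {
    // The array is owned by this stack frame...
    let owned: [u8; 4] = [b'a', b's', b'd', b'f'];
    // ...so the wrapper borrows it, and cannot outlive `owned`.
    let b: Borrowed<'_> = (&owned).into();
    assert_eq!(borrowed_len(b), 4);
    assert_eq!(b, Borrowed(b"asdf"));
}
```

A From<[u8; N]> implementation would receive the array by value and drop it when the conversion returns, leaving nothing for the lifetime to refer to.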

pub const fn from_slice(data: &'a [u8]) -> Self

Wrap a byte slice so it can be used by vectorscan.

This method is const so it can be used to define const values as well as static initializers.
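
A minimal sketch of what const construction allows. To stay self-contained it defines a hypothetical stand-in struct that mirrors only the two documented signatures; the constant names are also made up for illustration:

```rust
/// Hypothetical stand-in mirroring the documented signatures; NOT the crate's type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ByteSlice<'a>(&'a [u8]);

impl<'a> ByteSlice<'a> {
    pub const fn from_slice(data: &'a [u8]) -> Self {
        Self(data)
    }

    pub const fn as_slice(&self) -> &'a [u8] {
        self.0
    }
}

// A const value, evaluated at compile time.
const GREETING: ByteSlice<'static> = ByteSlice::from_slice(b"hello");

// A static initializer; the borrowed array literal gets the 'static lifetime.
static ELF_MAGIC: ByteSlice<'static> = ByteSlice::from_slice(&[0x7f, b'E', b'L', b'F']);

// Results of const fns can feed further constants.
const GREETING_LEN: usize = GREETING.as_slice().len();

fn main() {
    assert_eq!(GREETING.as_slice(), b"hello");
    assert_eq!(ELF_MAGIC.as_slice()[0], 0x7f);
    assert_eq!(GREETING_LEN, 5);
}
```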

pub const fn as_slice(&self) -> &'a [u8]

Extract the byte slice.

A slice can be split into a pointer/length pair which is consumed by vectorscan’s underlying C ABI.
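
The pointer/length split can be sketched with the standard library alone. The helper name to_raw_parts is hypothetical, and the actual foreign call made by this crate is not shown here:

```rust
use std::slice;

/// Split a slice into the raw (pointer, length) pair that a C API typically expects.
fn to_raw_parts(data: &[u8]) -> (*const u8, usize) {
    (data.as_ptr(), data.len())
}

fn main() {
    let data: &[u8] = b"asdf";
    let (ptr, len) = to_raw_parts(data);
    assert_eq!(len, 4);

    // Reassemble the slice to show that no information was lost.
    // SAFETY: ptr and len were taken from `data`, which is still alive here.
    let roundtrip: &[u8] = unsafe { slice::from_raw_parts(ptr, len) };
    assert_eq!(roundtrip, data);
}
```

The raw pointer carries no lifetime, which is why the wrapper keeps the 'a parameter on the Rust side: it is what prevents the data from being freed while the pointer is in use.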

impl<'a> ByteSlice<'a>

§String-Oriented Interface

When vectorscan is used with UTF-8 encoded inputs (e.g. those created with Self::from_str()), it produces UTF-8 encoded match outputs, so Self::as_str() can be invoked safely on match results.

A From implementation is also provided to convert from a native str:

 use vectorscan::sources::ByteSlice;

 let b1 = ByteSlice::from_str("asdf");
 let b2: ByteSlice = "asdf".into();
 assert_eq!(b1, b2);
 assert_eq!(unsafe { b1.as_str() }, "asdf");

§The UTF8 Flag

Vectorscan itself does not assume any particular string encoding. The function of e.g. Flags::UTF8 is to determine which bytes should be included in the state machine, not to declare the encoding of any particular input. This means that the UTF8 flag may be disabled for UTF-8 inputs to produce a much smaller state machine (as it is when using Flags::default()). Note, however, that enabling the UTF8 flag for non-UTF-8 inputs produces undefined behavior.

pub const fn from_str(data: &'a str) -> Self

Wrap a UTF8-encoded byte slice so it can be used by vectorscan.

As with Self::from_slice(), this method is const and can produce const values or static initializers.

pub const unsafe fn as_str(&self) -> &'a str

Extract the byte slice as a str, assuming without checking that it is correctly UTF-8 encoded.

§Safety

This method passes the result of Self::as_slice() to str::from_utf8_unchecked() in order to avoid the overhead of repeatedly validating the underlying string data in the common case where all strings are UTF-8. The caller must therefore ensure that the bytes are valid UTF-8; otherwise the behavior is undefined. Where validity is not certain, the slice may be provided to methods such as str::from_utf8() or String::from_utf8_lossy() that check for UTF-8 validity:

 use vectorscan::sources::ByteSlice;
 use std::{borrow::Cow, str};

 // All-or-nothing UTF8 conversion with error:
 let b = ByteSlice::from_slice(b"asdf");
 let s = str::from_utf8(b.as_slice()).unwrap();
 assert_eq!(s, "asdf");

 // Error-coercing UTF8 conversion which replaces invalid characters:
 let b = ByteSlice::from_slice(b"Hello \xF0\x90\x80World");
 let s: Cow<'_, str> = String::from_utf8_lossy(b.as_slice());
 assert_eq!(s, "Hello �World");
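
A third option, sketched here with only the standard library, is to keep the valid prefix and discard the rest. The helper valid_prefix is a hypothetical name, not part of this crate; it relies on Utf8Error::valid_up_to(), which reports how many leading bytes were valid:

```rust
use std::str;

/// Return the longest prefix of `bytes` that is valid UTF-8.
fn valid_prefix(bytes: &[u8]) -> &str {
    match str::from_utf8(bytes) {
        Ok(s) => s,
        Err(e) => {
            // SAFETY: Utf8Error guarantees that bytes[..valid_up_to()] is valid UTF-8,
            // so skipping the second validation pass is sound here.
            unsafe { str::from_utf8_unchecked(&bytes[..e.valid_up_to()]) }
        }
    }
}

fn main() {
    // Fully valid input is returned unchanged.
    assert_eq!(valid_prefix(b"asdf"), "asdf");
    // The truncated 4-byte sequence after "Hello " ends the valid prefix.
    assert_eq!(valid_prefix(b"Hello \xF0\x90\x80World"), "Hello ");
}
```

This is the same pattern that justifies an unchecked conversion in general: validity is established once, and the unchecked call merely avoids repeating the work.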

impl<'a> ByteSlice<'a>

§Subsetting

Match callbacks return subsets of the input argument. These methods apply a fallible subsetting operation which is used to convert match offsets to substrings.

pub fn index_range( &self, range: impl SliceIndex<[u8], Output = [u8]>, ) -> Option<Self>

Return a subset of the input, or None if the result would be out of range:

 use vectorscan::sources::ByteSlice;

 let b: ByteSlice = "asdf".into();
 let b2 = b.index_range(0..2).unwrap();
 assert_eq!(unsafe { b2.as_str() }, "as");
 assert!(b.index_range(0..5).is_none());

This method is largely intended for internal use inside this library, but is exposed in the public API to make it clear how the match callback converts match offsets to substrings of the original input data.
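
The offset-to-substring conversion can be approximated on a plain byte slice with slice::get(), which already returns None for out-of-range indices. This is only a guess at the shape of the operation under the documented signature, not the crate's actual implementation, and the match offsets below are made up:

```rust
use std::slice::SliceIndex;

/// Fallible subsetting over a plain byte slice.
/// A sketch only: the free function and its body are assumptions.
fn index_range<'a>(
    data: &'a [u8],
    range: impl SliceIndex<[u8], Output = [u8]>,
) -> Option<&'a [u8]> {
    data.get(range)
}

fn main() {
    let input: &[u8] = b"hello world";

    // Hypothetical match offsets, as a match callback might report them.
    let (from, to) = (6, 11);
    assert_eq!(index_range(input, from..to), Some(&b"world"[..]));

    // Any range type that indexes into [u8] and yields [u8] is accepted.
    assert_eq!(index_range(input, ..5), Some(&b"hello"[..]));

    // Out-of-range offsets yield None instead of panicking.
    assert!(index_range(input, 6..12).is_none());
}
```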

Trait Implementations§

impl<'a> Clone for ByteSlice<'a>

fn clone(&self) -> ByteSlice<'a>

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source. Stable since 1.0.0.

impl<'a> Debug for ByteSlice<'a>

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl<'a> Display for ByteSlice<'a>

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl<'a> From<&'a [u8]> for ByteSlice<'a>

fn from(x: &'a [u8]) -> Self

Converts to this type from the input type.

impl<'a, const N: usize> From<&'a [u8; N]> for ByteSlice<'a>

fn from(x: &'a [u8; N]) -> Self

Converts to this type from the input type.

impl<'a> From<&'a str> for ByteSlice<'a>

fn from(x: &'a str) -> Self

Converts to this type from the input type.

impl<'a> Hash for ByteSlice<'a>

fn hash<__H: Hasher>(&self, state: &mut __H)

Feeds this value into the given Hasher.

fn hash_slice<H>(data: &[Self], state: &mut H)
where H: Hasher, Self: Sized,

Feeds a slice of this type into the given Hasher. Stable since 1.3.0.

impl<'a> Ord for ByteSlice<'a>

fn cmp(&self, other: &ByteSlice<'a>) -> Ordering

This method returns an Ordering between self and other.

fn max(self, other: Self) -> Self
where Self: Sized,

Compares and returns the maximum of two values. Stable since 1.21.0.

fn min(self, other: Self) -> Self
where Self: Sized,

Compares and returns the minimum of two values. Stable since 1.21.0.

fn clamp(self, min: Self, max: Self) -> Self
where Self: Sized,

Restrict a value to a certain interval. Stable since 1.50.0.

impl<'a> PartialEq for ByteSlice<'a>

fn eq(&self, other: &ByteSlice<'a>) -> bool

Tests for self and other values to be equal, and is used by ==.

fn ne(&self, other: &Rhs) -> bool

Tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason. Stable since 1.0.0.

impl<'a> PartialOrd for ByteSlice<'a>

fn partial_cmp(&self, other: &ByteSlice<'a>) -> Option<Ordering>

This method returns an ordering between self and other values if one exists.

fn lt(&self, other: &Rhs) -> bool

Tests less than (for self and other) and is used by the < operator. Stable since 1.0.0.

fn le(&self, other: &Rhs) -> bool

Tests less than or equal to (for self and other) and is used by the <= operator. Stable since 1.0.0.

fn gt(&self, other: &Rhs) -> bool

Tests greater than (for self and other) and is used by the > operator. Stable since 1.0.0.

fn ge(&self, other: &Rhs) -> bool

Tests greater than or equal to (for self and other) and is used by the >= operator. Stable since 1.0.0.

impl<'a> Copy for ByteSlice<'a>

impl<'a> Eq for ByteSlice<'a>

impl<'a> StructuralPartialEq for ByteSlice<'a>

Auto Trait Implementations§

impl<'a> Freeze for ByteSlice<'a>

impl<'a> RefUnwindSafe for ByteSlice<'a>

impl<'a> Send for ByteSlice<'a>

impl<'a> Sync for ByteSlice<'a>

impl<'a> Unpin for ByteSlice<'a>

impl<'a> UnwindSafe for ByteSlice<'a>

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬 This is a nightly-only experimental API. (clone_to_uninit)

Performs copy-assignment from self to dest.

impl<Q, K> Equivalent<K> for Q
where Q: Eq + ?Sized, K: Borrow<Q> + ?Sized,

fn equivalent(&self, key: &K) -> bool

Compare self to key and return true if they are equal.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.
impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T> ToString for T
where T: Display + ?Sized,

fn to_string(&self) -> String

Converts the given value to a String.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.