pub struct ByteSlice<'a>(/* private fields */);
A slice of u8 with an associated lifetime.
This is currently implemented as a #[repr(transparent)] wrapper over &'a [u8].
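The layout described above can be sketched with a stand-in type. `ByteSliceSketch` and `layout_matches` are hypothetical names used only for illustration (the real type keeps its fields private), so treat this as a minimal sketch of a transparent newtype over a borrowed byte slice rather than the crate's actual definition:

```rust
use std::mem::{align_of, size_of};

/// Hypothetical stand-in for the documented type; not the crate's definition.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
#[repr(transparent)]
pub struct ByteSliceSketch<'a>(&'a [u8]);

impl<'a> ByteSliceSketch<'a> {
    /// Wrap a borrowed byte slice without copying it.
    pub const fn from_slice(data: &'a [u8]) -> Self {
        Self(data)
    }

    /// Hand back the wrapped slice with its original lifetime.
    pub const fn as_slice(&self) -> &'a [u8] {
        self.0
    }
}

/// Returns true when the wrapper has the same size and alignment as the
/// reference it wraps, which is what #[repr(transparent)] guarantees.
pub fn layout_matches() -> bool {
    size_of::<ByteSliceSketch<'static>>() == size_of::<&'static [u8]>()
        && align_of::<ByteSliceSketch<'static>>() == align_of::<&'static [u8]>()
}
```

Because the wrapper is a single reference, copying it is as cheap as copying a `&[u8]`.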
Implementations§
impl<'a> ByteSlice<'a>
§Byte-Oriented Interface
Vectorscan can search over arbitrary byte patterns in any encoding, so Self::from_slice() and Self::as_slice() offer the most general byte-oriented interface. This may be particularly useful when matching against non-UTF8 data, possibly with Literal pattern strings (although non-literal patterns can also be used to match non-UTF8 data).
From implementations are also provided to convert from references to native arrays and slices:
use vectorscan::sources::ByteSlice;
let b1 = ByteSlice::from_slice(b"asdf");
let b2: ByteSlice = b"asdf".into();
let b3: ByteSlice = [b'a', b's', b'd', b'f'].as_ref().into();
assert_eq!(b1, b2);
assert_eq!(b2, b3);
assert_eq!(b1.as_slice(), b"asdf");
Note however that a From implementation is not provided to convert from an array [u8; N] by value, as this wrapper requires a lifetime to associate the data with, even if it's just the local '_ lifetime or the global 'static lifetime.
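The point about lifetimes may be easier to see with a stand-in type. In this sketch `ByteSliceSketch` and `wrap_owned` are hypothetical names, and the two `From` implementations are assumptions about how such conversions could be written, not the crate's source. An owned array has to live in some binding first, and the wrapper then borrows from that binding:

```rust
/// Hypothetical stand-in for the documented wrapper.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ByteSliceSketch<'a>(&'a [u8]);

impl<'a> From<&'a [u8]> for ByteSliceSketch<'a> {
    fn from(data: &'a [u8]) -> Self {
        Self(data)
    }
}

impl<'a, const N: usize> From<&'a [u8; N]> for ByteSliceSketch<'a> {
    fn from(data: &'a [u8; N]) -> Self {
        // Borrow the array as a slice; the lifetime 'a comes from the reference.
        Self(&data[..])
    }
}

/// The caller keeps ownership of the array; the wrapper only borrows it,
/// so the returned value cannot outlive `storage`.
pub fn wrap_owned(storage: &[u8; 4]) -> ByteSliceSketch<'_> {
    storage.into()
}
```

A by-value `From<[u8; N]>` would have nothing to borrow from once the conversion returned, which is why only reference conversions make sense for a borrowing wrapper.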
pub const fn from_slice(data: &'a [u8]) -> Self
impl<'a> ByteSlice<'a>
§String-Oriented Interface
When vectorscan is being used with UTF8-encoded inputs (e.g. with Self::from_str()), it will produce UTF8-encoded match outputs, and Self::as_str() can be invoked safely on match results.
A From implementation is also provided to convert from a native str:
use vectorscan::sources::ByteSlice;
let b1 = ByteSlice::from_str("asdf");
let b2: ByteSlice = "asdf".into();
assert_eq!(b1, b2);
assert_eq!(unsafe { b1.as_str() }, "asdf");
§The UTF8 Flag
It is important to note that vectorscan itself does not assume any particular string encoding, and the function of e.g. Flags::UTF8 is to determine which bytes should be included in the state machine, not the encoding of any particular input. This means that the UTF8 flag may be disabled for UTF8 inputs to produce a much smaller state machine (as it is when using Flags::default()). Note however that enabling the UTF8 flag for non-UTF8 inputs produces undefined behavior.
pub const fn from_str(data: &'a str) -> Self
Wrap a UTF8-encoded byte slice so it can be used by vectorscan.
As with Self::from_slice(), this method is const and can produce const values or static initializers.
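As a rough illustration of what const construction allows, the sketch below uses a hypothetical stand-in type (`ByteSliceSketch`, with `GREETING` and `KEYWORD` as made-up item names) whose `from_str` is assumed to simply store the string's bytes. It is not the crate's implementation:

```rust
/// Hypothetical stand-in for the documented wrapper.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ByteSliceSketch<'a>(&'a [u8]);

impl<'a> ByteSliceSketch<'a> {
    /// str::as_bytes() is itself a const fn, so this can run at compile time.
    pub const fn from_str(data: &'a str) -> Self {
        Self(data.as_bytes())
    }

    pub const fn as_slice(&self) -> &'a [u8] {
        self.0
    }
}

/// A compile-time constant built from a string literal.
pub const GREETING: ByteSliceSketch<'static> = ByteSliceSketch::from_str("hello");

/// A static initializer; this works because a shared byte slice is Sync.
pub static KEYWORD: ByteSliceSketch<'static> = ByteSliceSketch::from_str("asdf");
```

Both items borrow from string literals, so the wrapper's lifetime is `'static` in each case.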
pub const unsafe fn as_str(&self) -> &'a str
Extract the byte slice, and assert that it is correctly UTF8-encoded.
§Safety
This method passes the result of Self::as_slice() to str::from_utf8_unchecked() in order to avoid the overhead of repeatedly validating the underlying string data in the common case where all strings are UTF-8. Where this is not certain, the slice may be provided to methods such as str::from_utf8() or String::from_utf8_lossy() that check for UTF-8 validity:
use vectorscan::sources::ByteSlice;
use std::{borrow::Cow, str};
// All-or-nothing UTF8 conversion with error:
let b = ByteSlice::from_slice(b"asdf");
let s = str::from_utf8(b.as_slice()).unwrap();
assert_eq!(s, "asdf");
// Error-coercing UTF8 conversion which replaces invalid characters:
let b = ByteSlice::from_slice(b"Hello \xF0\x90\x80World");
let s: Cow<'_, str> = String::from_utf8_lossy(b.as_slice());
assert_eq!(s, "Hello �World");
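When neither failing outright nor substituting replacement characters is desirable, a third checked option is to keep only the valid prefix. The helper below is a hypothetical sketch (`valid_prefix` is not part of this crate) that relies only on the standard library's `Utf8Error::valid_up_to()`; it takes a plain byte slice such as the one returned by as_slice():

```rust
use std::str;

/// Return the longest prefix of `bytes` that is valid UTF-8.
/// Hypothetical helper for illustration; not provided by the library.
pub fn valid_prefix(bytes: &[u8]) -> &str {
    match str::from_utf8(bytes) {
        // The whole input is valid, so return it unchanged.
        Ok(s) => s,
        // valid_up_to() is the byte length of the valid leading portion,
        // so re-validating that prefix succeeds.
        Err(e) => str::from_utf8(&bytes[..e.valid_up_to()]).unwrap_or(""),
    }
}
```

This stays entirely in safe code, at the cost of validating the input on every call.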
impl<'a> ByteSlice<'a>
§Subsetting
Match callbacks return subsets of the input argument. These methods apply a fallible subsetting operation which is used to convert match offsets to substrings.
pub fn index_range(
    &self,
    range: impl SliceIndex<[u8], Output = [u8]>,
) -> Option<Self>
Return a subset of the input, or None if the result would be out of range:
use vectorscan::sources::ByteSlice;
let b: ByteSlice = "asdf".into();
let b2 = b.index_range(0..2).unwrap();
assert_eq!(unsafe { b2.as_str() }, "as");
assert!(b.index_range(0..5).is_none());
This method is largely intended for internal use inside this library, but is exposed in the public API to make it clear how the match callback converts match offsets to substrings of the original input data.
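One plausible way to get this behavior is to delegate to the standard library's fallible `slice::get()`, which accepts the same `SliceIndex` bound. The sketch below assumes that approach and uses a hypothetical stand-in type (`ByteSliceSketch`); the crate's actual implementation is not shown on this page and may differ:

```rust
use std::slice::SliceIndex;

/// Hypothetical stand-in for the documented wrapper.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ByteSliceSketch<'a>(&'a [u8]);

impl<'a> ByteSliceSketch<'a> {
    pub const fn from_slice(data: &'a [u8]) -> Self {
        Self(data)
    }

    pub const fn as_slice(&self) -> &'a [u8] {
        self.0
    }

    /// Fallible subsetting: slice::get() returns None instead of panicking
    /// when the range falls outside the data.
    pub fn index_range(
        &self,
        range: impl SliceIndex<[u8], Output = [u8]>,
    ) -> Option<Self> {
        // Copy the inner reference out so the result keeps the lifetime 'a.
        let data: &'a [u8] = self.0;
        data.get(range).map(Self)
    }
}
```

The `Output = [u8]` bound admits range types such as `0..2` or `2..` while excluding a bare `usize` index, whose output would be a single `u8` rather than a sub-slice.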
Trait Implementations§
impl<'a> Ord for ByteSlice<'a>
1.21.0 · fn max(self, other: Self) -> Self
where
    Self: Sized,
impl<'a> PartialOrd for ByteSlice<'a>
impl<'a> Copy for ByteSlice<'a>
impl<'a> Eq for ByteSlice<'a>
impl<'a> StructuralPartialEq for ByteSlice<'a>
Auto Trait Implementations§
impl<'a> Freeze for ByteSlice<'a>
impl<'a> RefUnwindSafe for ByteSlice<'a>
impl<'a> Send for ByteSlice<'a>
impl<'a> Sync for ByteSlice<'a>
impl<'a> Unpin for ByteSlice<'a>
impl<'a> UnwindSafe for ByteSlice<'a>
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T
where
    T: Clone,
impl<Q, K> Equivalent<K> for Q
fn equivalent(&self, key: &K) -> bool
Compare self to key and return true if they are equal.