Struct mako::tokenization::hf_tokenizers::tokenizer::normalizer::NormalizedString
source · [−]pub struct NormalizedString { /* private fields */ }
Expand description
A NormalizedString
takes care of processing an “original” string to modify
it and obtain a “normalized” string. It keeps both version of the string,
alignments information between both and provides an interface to retrieve
ranges of each string, using offsets from any of them.
It is possible to retrieve a part of the original string, by indexing it with offsets from the normalized one, and the other way around too. It is also possible to convert offsets from one referential to the other one easily.
Implementations
Return the original string
Return the original offsets
pub fn convert_offsets<T>(&self, range: Range<T>) -> Option<Range<usize>> where
T: RangeBounds<usize> + Clone,
pub fn convert_offsets<T>(&self, range: Range<T>) -> Option<Range<usize>> where
T: RangeBounds<usize> + Clone,
Convert the given offsets range from one referential to the other one:
Original => Normalized
or Normalized => Original
Returns None
when targeting something that is outside range
Return a range of the normalized string
pub fn get_range_original<T>(&self, range: Range<T>) -> Option<&str> where
T: RangeBounds<usize> + Clone,
pub fn get_range_original<T>(&self, range: Range<T>) -> Option<&str> where
T: RangeBounds<usize> + Clone,
Return a range of the original string
pub fn slice<T>(&self, range: Range<T>) -> Option<NormalizedString> where
T: RangeBounds<usize> + Clone,
pub fn slice<T>(&self, range: Range<T>) -> Option<NormalizedString> where
T: RangeBounds<usize> + Clone,
Return a slice of the current NormalizedString If the range is not on char boundaries, return None
pub fn transform_range<T, I>(
&mut self,
range: Range<T>,
dest: I,
initial_offset: usize
) where
T: RangeBounds<usize> + Clone,
I: IntoIterator<Item = (char, isize)>,
pub fn transform_range<T, I>(
&mut self,
range: Range<T>,
dest: I,
initial_offset: usize
) where
T: RangeBounds<usize> + Clone,
I: IntoIterator<Item = (char, isize)>,
Applies transformations to the current normalized version of the string,
while updating the alignments.
This method expect an Iterator yielding each char of the new normalized string
with a change
isize equals to:
1
if this is a new char-N
if the char is right before N removed chars0
if the char is replacing the existing one Since it is possible that the normalized string doesn’t include some of the characters at the beginning of the original one, we need aninitial_offset
which represents the number of removed chars at the very beginning.
Applies transformations to the current normalized version of the string,
while updating the alignments.
This method expect an Iterator yielding each char of the new normalized string
with a change
isize equals to:
1
if this is a new char-N
if the char is right before N removed chars0
if the char is replacing the existing one Since it is possible that the normalized string doesn’t include some of the characters at the beginning of the original one, we need aninitial_offset
which represents the number of removed chars at the very beginning.
Applies filtering over our characters
Calls the given function for each characters
Replace anything that matches the pattern with the given content.
pub fn split<P: Pattern>(
&self,
pattern: P,
behavior: SplitDelimiterBehavior
) -> Result<Vec<NormalizedString>>
pub fn split<P: Pattern>(
&self,
pattern: P,
behavior: SplitDelimiterBehavior
) -> Result<Vec<NormalizedString>>
Split the current string in many subparts. Specify what to do with the delimiter.
Splitting Behavior for the delimiter
The behavior can be one of the followings:
When splitting on '-'
for example, with input the-final--countdown
:
- Removed =>
[ "the", "", "final", "", "", "countdown" ]
- Isolated =>
[ "the", "-", "final", "-", "-", "countdown" ]
- MergedWithPrevious =>
[ "the-", "final-", "-", "countdown" ]
- MergedWithNext =>
[ "the", "-final", "-", "-countdown" ]
Remove any leading and trailing space(s) of the normalized string
Returns the length of the normalized string (counting chars not bytes)
Returns the length of the original string (counting chars not bytes)
Trait Implementations
Returns the “default value” for a type. Read more
Performs the conversion.
Performs the conversion.
This method tests for self
and other
values to be equal, and is used
by ==
. Read more
This method tests for !=
.
Auto Trait Implementations
impl RefUnwindSafe for NormalizedString
impl Send for NormalizedString
impl Sync for NormalizedString
impl Unpin for NormalizedString
impl UnwindSafe for NormalizedString
Blanket Implementations
Mutably borrows from an owned value. Read more
fn fmt_binary(self) -> FmtBinary<Self> where
Self: Binary,
fn fmt_binary(self) -> FmtBinary<Self> where
Self: Binary,
Causes self
to use its Binary
implementation when Debug
-formatted.
fn fmt_display(self) -> FmtDisplay<Self> where
Self: Display,
fn fmt_display(self) -> FmtDisplay<Self> where
Self: Display,
Causes self
to use its Display
implementation when
Debug
-formatted. Read more
fn fmt_lower_exp(self) -> FmtLowerExp<Self> where
Self: LowerExp,
fn fmt_lower_exp(self) -> FmtLowerExp<Self> where
Self: LowerExp,
Causes self
to use its LowerExp
implementation when
Debug
-formatted. Read more
fn fmt_lower_hex(self) -> FmtLowerHex<Self> where
Self: LowerHex,
fn fmt_lower_hex(self) -> FmtLowerHex<Self> where
Self: LowerHex,
Causes self
to use its LowerHex
implementation when
Debug
-formatted. Read more
Causes self
to use its Octal
implementation when Debug
-formatted.
fn fmt_pointer(self) -> FmtPointer<Self> where
Self: Pointer,
fn fmt_pointer(self) -> FmtPointer<Self> where
Self: Pointer,
Causes self
to use its Pointer
implementation when
Debug
-formatted. Read more
fn fmt_upper_exp(self) -> FmtUpperExp<Self> where
Self: UpperExp,
fn fmt_upper_exp(self) -> FmtUpperExp<Self> where
Self: UpperExp,
Causes self
to use its UpperExp
implementation when
Debug
-formatted. Read more
fn fmt_upper_hex(self) -> FmtUpperHex<Self> where
Self: UpperHex,
fn fmt_upper_hex(self) -> FmtUpperHex<Self> where
Self: UpperHex,
Causes self
to use its UpperHex
implementation when
Debug
-formatted. Read more
impl<T> Pipe for T where
T: ?Sized,
impl<T> Pipe for T where
T: ?Sized,
Pipes by value. This is generally the method you want to use. Read more
Borrows self
and passes that borrow into the pipe function. Read more
fn pipe_ref_mut<'a, R>(&'a mut self, func: impl FnOnce(&'a mut Self) -> R) -> R where
R: 'a,
fn pipe_ref_mut<'a, R>(&'a mut self, func: impl FnOnce(&'a mut Self) -> R) -> R where
R: 'a,
Mutably borrows self
and passes that borrow into the pipe function. Read more
fn pipe_borrow<'a, B, R>(&'a self, func: impl FnOnce(&'a B) -> R) -> R where
Self: Borrow<B>,
B: 'a + ?Sized,
R: 'a,
fn pipe_borrow<'a, B, R>(&'a self, func: impl FnOnce(&'a B) -> R) -> R where
Self: Borrow<B>,
B: 'a + ?Sized,
R: 'a,
Borrows self
, then passes self.borrow()
into the pipe function. Read more
fn pipe_borrow_mut<'a, B, R>(
&'a mut self,
func: impl FnOnce(&'a mut B) -> R
) -> R where
Self: BorrowMut<B>,
B: 'a + ?Sized,
R: 'a,
fn pipe_borrow_mut<'a, B, R>(
&'a mut self,
func: impl FnOnce(&'a mut B) -> R
) -> R where
Self: BorrowMut<B>,
B: 'a + ?Sized,
R: 'a,
Mutably borrows self
, then passes self.borrow_mut()
into the pipe
function. Read more
fn pipe_as_ref<'a, U, R>(&'a self, func: impl FnOnce(&'a U) -> R) -> R where
Self: AsRef<U>,
U: 'a + ?Sized,
R: 'a,
fn pipe_as_ref<'a, U, R>(&'a self, func: impl FnOnce(&'a U) -> R) -> R where
Self: AsRef<U>,
U: 'a + ?Sized,
R: 'a,
Borrows self
, then passes self.as_ref()
into the pipe function.
fn pipe_as_mut<'a, U, R>(&'a mut self, func: impl FnOnce(&'a mut U) -> R) -> R where
Self: AsMut<U>,
U: 'a + ?Sized,
R: 'a,
fn pipe_as_mut<'a, U, R>(&'a mut self, func: impl FnOnce(&'a mut U) -> R) -> R where
Self: AsMut<U>,
U: 'a + ?Sized,
R: 'a,
Mutably borrows self
, then passes self.as_mut()
into the pipe
function. Read more
fn pipe_deref<'a, T, R>(&'a self, func: impl FnOnce(&'a T) -> R) -> R where
Self: Deref<Target = T>,
T: 'a + ?Sized,
R: 'a,
fn pipe_deref<'a, T, R>(&'a self, func: impl FnOnce(&'a T) -> R) -> R where
Self: Deref<Target = T>,
T: 'a + ?Sized,
R: 'a,
Borrows self
, then passes self.deref()
into the pipe function.
fn pipe_as_ref<'a, T, R>(&'a self, func: impl FnOnce(&'a T) -> R) -> R where
Self: AsRef<T>,
T: 'a,
R: 'a,
fn pipe_as_ref<'a, T, R>(&'a self, func: impl FnOnce(&'a T) -> R) -> R where
Self: AsRef<T>,
T: 'a,
R: 'a,
Pipes a trait borrow into a function that cannot normally be called in suffix position. Read more
fn pipe_as_mut<'a, T, R>(&'a mut self, func: impl FnOnce(&'a mut T) -> R) -> R where
Self: AsMut<T>,
T: 'a,
R: 'a,
fn pipe_as_mut<'a, T, R>(&'a mut self, func: impl FnOnce(&'a mut T) -> R) -> R where
Self: AsMut<T>,
T: 'a,
R: 'a,
Pipes a trait mutable borrow into a function that cannot normally be called in suffix position. Read more
fn pipe_borrow<'a, T, R>(&'a self, func: impl FnOnce(&'a T) -> R) -> R where
Self: Borrow<T>,
T: 'a,
R: 'a,
fn pipe_borrow<'a, T, R>(&'a self, func: impl FnOnce(&'a T) -> R) -> R where
Self: Borrow<T>,
T: 'a,
R: 'a,
Pipes a trait borrow into a function that cannot normally be called in suffix position. Read more
fn pipe_borrow_mut<'a, T, R>(
&'a mut self,
func: impl FnOnce(&'a mut T) -> R
) -> R where
Self: BorrowMut<T>,
T: 'a,
R: 'a,
fn pipe_borrow_mut<'a, T, R>(
&'a mut self,
func: impl FnOnce(&'a mut T) -> R
) -> R where
Self: BorrowMut<T>,
T: 'a,
R: 'a,
Pipes a trait mutable borrow into a function that cannot normally be called in suffix position. Read more
fn pipe_deref<'a, R>(&'a self, func: impl FnOnce(&'a Self::Target) -> R) -> R where
Self: Deref,
R: 'a,
fn pipe_deref<'a, R>(&'a self, func: impl FnOnce(&'a Self::Target) -> R) -> R where
Self: Deref,
R: 'a,
Pipes a dereference into a function that cannot normally be called in suffix position. Read more
fn pipe_deref_mut<'a, R>(
&'a mut self,
func: impl FnOnce(&'a mut Self::Target) -> R
) -> R where
Self: DerefMut,
R: 'a,
fn pipe_deref_mut<'a, R>(
&'a mut self,
func: impl FnOnce(&'a mut Self::Target) -> R
) -> R where
Self: DerefMut,
R: 'a,
Pipes a mutable dereference into a function that cannot normally be called in suffix position. Read more
Pipes a reference into a function that cannot ordinarily be called in suffix position. Read more
fn tap_borrow<B>(self, func: impl FnOnce(&B)) -> Self where
Self: Borrow<B>,
B: ?Sized,
fn tap_borrow<B>(self, func: impl FnOnce(&B)) -> Self where
Self: Borrow<B>,
B: ?Sized,
Immutable access to the Borrow<B>
of a value. Read more
fn tap_borrow_mut<B>(self, func: impl FnOnce(&mut B)) -> Self where
Self: BorrowMut<B>,
B: ?Sized,
fn tap_borrow_mut<B>(self, func: impl FnOnce(&mut B)) -> Self where
Self: BorrowMut<B>,
B: ?Sized,
Mutable access to the BorrowMut<B>
of a value. Read more
Immutable access to the AsRef<R>
view of a value. Read more
fn tap_ref_mut<R>(self, func: impl FnOnce(&mut R)) -> Self where
Self: AsMut<R>,
R: ?Sized,
fn tap_ref_mut<R>(self, func: impl FnOnce(&mut R)) -> Self where
Self: AsMut<R>,
R: ?Sized,
Mutable access to the AsMut<R>
view of a value. Read more
Immutable access to the Deref::Target
of a value. Read more
Mutable access to the Deref::Target
of a value. Read more
Calls .tap()
only in debug builds, and is erased in release builds.
fn tap_mut_dbg(self, func: impl FnOnce(&mut Self)) -> Self
fn tap_mut_dbg(self, func: impl FnOnce(&mut Self)) -> Self
Calls .tap_mut()
only in debug builds, and is erased in release
builds. Read more
fn tap_borrow_dbg<B>(self, func: impl FnOnce(&B)) -> Self where
Self: Borrow<B>,
B: ?Sized,
fn tap_borrow_dbg<B>(self, func: impl FnOnce(&B)) -> Self where
Self: Borrow<B>,
B: ?Sized,
Calls .tap_borrow()
only in debug builds, and is erased in release
builds. Read more
fn tap_borrow_mut_dbg<B>(self, func: impl FnOnce(&mut B)) -> Self where
Self: BorrowMut<B>,
B: ?Sized,
fn tap_borrow_mut_dbg<B>(self, func: impl FnOnce(&mut B)) -> Self where
Self: BorrowMut<B>,
B: ?Sized,
Calls .tap_borrow_mut()
only in debug builds, and is erased in release
builds. Read more
fn tap_ref_dbg<R>(self, func: impl FnOnce(&R)) -> Self where
Self: AsRef<R>,
R: ?Sized,
fn tap_ref_dbg<R>(self, func: impl FnOnce(&R)) -> Self where
Self: AsRef<R>,
R: ?Sized,
Calls .tap_ref()
only in debug builds, and is erased in release
builds. Read more
fn tap_ref_mut_dbg<R>(self, func: impl FnOnce(&mut R)) -> Self where
Self: AsMut<R>,
R: ?Sized,
fn tap_ref_mut_dbg<R>(self, func: impl FnOnce(&mut R)) -> Self where
Self: AsMut<R>,
R: ?Sized,
Calls .tap_ref_mut()
only in debug builds, and is erased in release
builds. Read more
Provides immutable access for inspection. Read more
Calls tap
in debug builds, and does nothing in release builds.
Provides mutable access for modification. Read more
fn tap_mut_dbg<F, R>(self, func: F) -> Self where
F: FnOnce(&mut Self) -> R,
fn tap_mut_dbg<F, R>(self, func: F) -> Self where
F: FnOnce(&mut Self) -> R,
Calls tap_mut
in debug builds, and does nothing in release builds.
impl<T, U> TapAsRef<U> for T where
U: ?Sized,
impl<T, U> TapAsRef<U> for T where
U: ?Sized,
Provides immutable access to the reference for inspection.
fn tap_ref_dbg<F, R>(self, func: F) -> Self where
Self: AsRef<T>,
F: FnOnce(&T) -> R,
fn tap_ref_dbg<F, R>(self, func: F) -> Self where
Self: AsRef<T>,
F: FnOnce(&T) -> R,
Calls tap_ref
in debug builds, and does nothing in release builds.
fn tap_ref_mut<F, R>(self, func: F) -> Self where
Self: AsMut<T>,
F: FnOnce(&mut T) -> R,
fn tap_ref_mut<F, R>(self, func: F) -> Self where
Self: AsMut<T>,
F: FnOnce(&mut T) -> R,
Provides mutable access to the reference for modification.
fn tap_ref_mut_dbg<F, R>(self, func: F) -> Self where
Self: AsMut<T>,
F: FnOnce(&mut T) -> R,
fn tap_ref_mut_dbg<F, R>(self, func: F) -> Self where
Self: AsMut<T>,
F: FnOnce(&mut T) -> R,
Calls tap_ref_mut
in debug builds, and does nothing in release builds.
impl<T, U> TapBorrow<U> for T where
U: ?Sized,
impl<T, U> TapBorrow<U> for T where
U: ?Sized,
fn tap_borrow<F, R>(self, func: F) -> Self where
Self: Borrow<T>,
F: FnOnce(&T) -> R,
fn tap_borrow<F, R>(self, func: F) -> Self where
Self: Borrow<T>,
F: FnOnce(&T) -> R,
Provides immutable access to the borrow for inspection. Read more
fn tap_borrow_dbg<F, R>(self, func: F) -> Self where
Self: Borrow<T>,
F: FnOnce(&T) -> R,
fn tap_borrow_dbg<F, R>(self, func: F) -> Self where
Self: Borrow<T>,
F: FnOnce(&T) -> R,
Calls tap_borrow
in debug builds, and does nothing in release builds.
fn tap_borrow_mut<F, R>(self, func: F) -> Self where
Self: BorrowMut<T>,
F: FnOnce(&mut T) -> R,
fn tap_borrow_mut<F, R>(self, func: F) -> Self where
Self: BorrowMut<T>,
F: FnOnce(&mut T) -> R,
Provides mutable access to the borrow for modification.
fn tap_borrow_mut_dbg<F, R>(self, func: F) -> Self where
Self: BorrowMut<T>,
F: FnOnce(&mut T) -> R,
fn tap_borrow_mut_dbg<F, R>(self, func: F) -> Self where
Self: BorrowMut<T>,
F: FnOnce(&mut T) -> R,
Calls tap_borrow_mut
in debug builds, and does nothing in release
builds. Read more
Immutably dereferences self
for inspection.
fn tap_deref_dbg<F, R>(self, func: F) -> Self where
Self: Deref,
F: FnOnce(&Self::Target) -> R,
fn tap_deref_dbg<F, R>(self, func: F) -> Self where
Self: Deref,
F: FnOnce(&Self::Target) -> R,
Calls tap_deref
in debug builds, and does nothing in release builds.
fn tap_deref_mut<F, R>(self, func: F) -> Self where
Self: DerefMut,
F: FnOnce(&mut Self::Target) -> R,
fn tap_deref_mut<F, R>(self, func: F) -> Self where
Self: DerefMut,
F: FnOnce(&mut Self::Target) -> R,
Mutably dereferences self
for modification.
fn tap_deref_mut_dbg<F, R>(self, func: F) -> Self where
Self: DerefMut,
F: FnOnce(&mut Self::Target) -> R,
fn tap_deref_mut_dbg<F, R>(self, func: F) -> Self where
Self: DerefMut,
F: FnOnce(&mut Self::Target) -> R,
Calls tap_deref_mut
in debug builds, and does nothing in release
builds. Read more