Struct AugmentationClassify

pub struct AugmentationClassify {}

An augmenter that rewrites the SegmentedToken::kind field so that it reflects the actual content of the token.

It does so by reading the token text (preferring the normalized text) and applying heuristics based on the Unicode GeneralCategoryGroup of the characters it contains.

The following heuristics are applied in the given order:

  1. If it contains Letters or Numbers -> SegmentedTokenKind::AlphaNumeric
  2. If it contains Symbols or Other -> SegmentedTokenKind::Symbol
  3. If it contains Punctuation or Separators -> SegmentedTokenKind::Separator

Exceptions to the usual Unicode classification: \n and \0 are treated as separators.

The Mark category is ignored. If none of the heuristics apply, the token kind is reset to None.
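The ordering of the heuristics can be sketched in plain Rust. This is a minimal, standard-library-only approximation and not the crate's implementation: Kind, Group, group_of and classify are hypothetical names, the category lookup only knows ASCII symbols, ASCII punctuation, the space character and the Combining Diacritical Marks block, and it assumes that "contains" means "at least one character belongs to the group".

```rust
/// Hypothetical stand-in for `SegmentedTokenKind`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kind {
    AlphaNumeric,
    Symbol,
    Separator,
}

/// Coarse, hypothetical stand-in for the Unicode general category groups.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Group {
    LetterOrNumber,
    SymbolOrOther,
    PunctuationOrSeparator,
    Mark,
}

/// Approximate category lookup using only the standard library.
/// Assumption: anything this sketch does not recognise falls into "Other".
fn group_of(c: char) -> Group {
    match c {
        // Documented exceptions: `\n` and `\0` count as separators.
        '\n' | '\0' => Group::PunctuationOrSeparator,
        // Combining Diacritical Marks block, used here as a stand-in for Mark.
        '\u{0300}'..='\u{036F}' => Group::Mark,
        _ if c.is_alphanumeric() => Group::LetterOrNumber,
        // ASCII characters whose general category is a Symbol (Sc, Sk, Sm).
        '$' | '+' | '<' | '=' | '>' | '^' | '`' | '|' | '~' => Group::SymbolOrOther,
        // Remaining ASCII punctuation and the plain space character.
        _ if c.is_ascii_punctuation() || c == ' ' => Group::PunctuationOrSeparator,
        // Control characters and everything else are lumped into "Other".
        _ => Group::SymbolOrOther,
    }
}

/// Applies the three heuristics in their documented order.
fn classify(text: &str) -> Option<Kind> {
    let mut has_symbol = false;
    let mut has_separator = false;
    for c in text.chars() {
        match group_of(c) {
            // Rule 1 has the highest priority, so we can stop early.
            Group::LetterOrNumber => return Some(Kind::AlphaNumeric),
            Group::SymbolOrOther => has_symbol = true,
            Group::PunctuationOrSeparator => has_separator = true,
            // Marks never influence the result.
            Group::Mark => {}
        }
    }
    if has_symbol {
        Some(Kind::Symbol) // Rule 2
    } else if has_separator {
        Some(Kind::Separator) // Rule 3
    } else {
        None // No rule applied: the kind is reset.
    }
}
```

Under these assumptions, classify("a+") yields AlphaNumeric because rule 1 wins over rule 2, classify("+-") yields Symbol, classify(", ") yields Separator, and a token consisting only of a combining mark yields None.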

Implementations

impl AugmentationClassify

pub fn new() -> Self

Create a new classify augmenter with default settings.

Trait Implementations

impl Augmenter for AugmentationClassify

fn augment<'a>(&self, token: SegmentedToken<'a>) -> SegmentedToken<'a>

Applies the augmentation function to the given token and returns the result.

impl Clone for AugmentationClassify

fn clone(&self) -> AugmentationClassify

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for AugmentationClassify

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Default for AugmentationClassify

fn default() -> AugmentationClassify

Returns the “default value” for a type.

Auto Trait Implementations

Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> Segmenter for T
where T: Augmenter,

type SubdivisionIter<'a> = IntoIter<SegmentedToken<'a>>

The iterator type returned by the subdivide function if it has multiple results.

fn subdivide<'a>(&self, token: SegmentedToken<'a>) -> UseOrSubdivide<SegmentedToken<'a>, <T as Segmenter>::SubdivisionIter<'a>>

A method that should split the given token into zero, one or more subtokens.
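How a blanket implementation like this one can turn any augmenter into a segmenter is sketched below. Everything here is a simplified stand-in and not the crate's code: Token replaces SegmentedToken, the variant names Use and Subdivide of UseOrSubdivide are assumptions, std::vec::IntoIter is assumed as the iterator type, and MarkAsWord is a hypothetical augmenter. The sketch assumes that an augmenter never splits a token, so subdivide presumably just returns the augmented token as a single result.

```rust
/// Simplified, hypothetical stand-in for `SegmentedToken<'a>`.
#[derive(Debug, Clone, PartialEq)]
struct Token<'a> {
    text: &'a str,
    kind: Option<&'static str>,
}

/// Stand-in for `UseOrSubdivide`; the variant names are assumptions.
#[derive(Debug)]
enum UseOrSubdivide<T, I> {
    /// Keep a single token.
    Use(T),
    /// Replace the token with several subtokens.
    Subdivide(I),
}

trait Augmenter {
    fn augment<'a>(&self, token: Token<'a>) -> Token<'a>;
}

trait Segmenter {
    type SubdivisionIter<'a>: Iterator<Item = Token<'a>>;

    fn subdivide<'a>(
        &self,
        token: Token<'a>,
    ) -> UseOrSubdivide<Token<'a>, Self::SubdivisionIter<'a>>;
}

/// Blanket implementation: every augmenter is also a segmenter
/// that always yields exactly one (augmented) token.
impl<T: Augmenter> Segmenter for T {
    type SubdivisionIter<'a> = std::vec::IntoIter<Token<'a>>;

    fn subdivide<'a>(
        &self,
        token: Token<'a>,
    ) -> UseOrSubdivide<Token<'a>, Self::SubdivisionIter<'a>> {
        UseOrSubdivide::Use(self.augment(token))
    }
}

/// Hypothetical augmenter that tags every token as a word.
struct MarkAsWord;

impl Augmenter for MarkAsWord {
    fn augment<'a>(&self, mut token: Token<'a>) -> Token<'a> {
        token.kind = Some("word");
        token
    }
}
```

With this in place, MarkAsWord.subdivide(token) can be called without MarkAsWord implementing Segmenter itself.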

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.