Enum rust_tokenizers::preprocessing::tokenizer::base_tokenizer::Mask

pub enum Mask {
    None,
    Whitespace,
    Punctuation,
    CJK,
    Special,
    Begin,
    Continuation,
    Unfinished,
    Unknown,
}

Variants

None

The token has no particular mask. This is the default situation. It may indicate that further processing can be done on a token.

Whitespace

The token represents whitespace (in any shape or form).

Punctuation

The token represents punctuation (in any shape or form).

CJK

The token represents a single Chinese/Japanese/Korean character (including kana and hangul).

Special

The token is a special marker (such as a separator marker, a class marker, etc.).

Begin

The token is the first in a series of subtokens; the offset refers specifically to this subtoken. Subsequent subtokens in the series carry the 'Continuation' mask.

Continuation

The token is the continuation of the previous token; the offset refers specifically to this subtoken. All but the first subtoken in a sequence carry this mask (the first carries 'Begin'). This is the reverse of Mask::Unfinished.

Unfinished

The token is the start of a token that is not finished yet. All but the last subtoken in a token sequence carry this mask. This is the reverse of Mask::Continuation.

Unknown

The token is out of vocabulary: it is unknown to the tokenizer and will decode to unknown. Tokens that can be decoded properly (but may still be out of vocabulary) should not set this.
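The two subtoken conventions described for Begin/Continuation and Unfinished can be illustrated with a minimal sketch. It uses a local mirror of the enum so that it compiles without the crate. The helpers `group_by_continuation` and `group_by_unfinished` are hypothetical and not part of rust_tokenizers, and the sketch assumes that the last subtoken under the Unfinished convention carries `Mask::None`.

```rust
/// Local mirror of the variants documented above (an assumption made so the
/// sketch is self-contained; it is not the crate's own type).
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum Mask {
    None,
    Whitespace,
    Punctuation,
    CJK,
    Special,
    Begin,
    Continuation,
    Unfinished,
    Unknown,
}

/// Hypothetical helper: rebuilds words under the Begin/Continuation
/// convention. A `Continuation` piece is appended to the previous word;
/// a piece with any other mask starts a new word.
fn group_by_continuation(pieces: &[(&str, Mask)]) -> Vec<String> {
    let mut words: Vec<String> = Vec::new();
    for &(text, mask) in pieces {
        if mask == Mask::Continuation {
            if let Some(last) = words.last_mut() {
                last.push_str(text);
                continue;
            }
        }
        words.push(text.to_string());
    }
    words
}

/// Hypothetical helper: rebuilds words under the Unfinished convention.
/// Pieces accumulate while they carry `Unfinished`; the first piece with
/// any other mask closes the current word.
fn group_by_unfinished(pieces: &[(&str, Mask)]) -> Vec<String> {
    let mut words: Vec<String> = Vec::new();
    let mut current = String::new();
    for &(text, mask) in pieces {
        current.push_str(text);
        if mask != Mask::Unfinished {
            words.push(std::mem::take(&mut current));
        }
    }
    // Keep a trailing word that was never closed.
    if !current.is_empty() {
        words.push(current);
    }
    words
}
```

Under these assumptions, the pieces `un`, `aff`, `able` tagged Begin, Continuation, Continuation and the same pieces tagged Unfinished, Unfinished, None both regroup into the single word `unaffable`. The first convention marks where a word starts, the second marks where it is still open.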

Trait Implementations

impl Clone for Mask

impl Copy for Mask

impl Debug for Mask

impl Default for Mask

impl<'de> Deserialize<'de> for Mask

impl PartialEq<Mask> for Mask

impl PartialOrd<Mask> for Mask

impl Serialize for Mask

impl StructuralPartialEq for Mask
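A few of the traits listed above (Clone, Copy, Debug, Default, PartialEq) can be exercised in a short sketch. It again uses a local mirror of the enum rather than the crate itself. That the real implementations come from `#[derive]` is an assumption. The `Default` value of `None` follows the variant description, and `Piece` and `with_mask` are hypothetical names introduced only for illustration.

```rust
/// Local mirror of the documented enum (assumption: the listed trait impls
/// behave like the derives used here).
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum Mask {
    None,
    Whitespace,
    Punctuation,
    CJK,
    Special,
    Begin,
    Continuation,
    Unfinished,
    Unknown,
}

/// `None` is described in the variant list as the default situation.
impl Default for Mask {
    fn default() -> Self {
        Mask::None
    }
}

/// Hypothetical token record, not a type from the crate. Because `Mask`
/// implements `Default`, the whole struct can derive it too.
#[derive(Debug, Clone, Default, PartialEq)]
struct Piece {
    text: String,
    mask: Mask,
}

/// Hypothetical helper: returns a copy of `piece` tagged with `mask`.
/// `Mask` is `Copy`, so it is passed by value and the caller keeps its own.
fn with_mask(piece: &Piece, mask: Mask) -> Piece {
    Piece {
        text: piece.text.clone(),
        mask,
    }
}
```

In this sketch, `Piece::default()` starts out with `Mask::None`. A mask value stays usable after being passed to `with_mask`, and masks can be compared directly with `==`.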

Auto Trait Implementations

impl RefUnwindSafe for Mask

impl Send for Mask

impl Sync for Mask

impl Unpin for Mask

impl UnwindSafe for Mask

Blanket Implementations

impl<T> Any for T where
    T: 'static + ?Sized

impl<T> Borrow<T> for T where
    T: ?Sized

impl<T> BorrowMut<T> for T where
    T: ?Sized

impl<T> DeserializeOwned for T where
    T: for<'de> Deserialize<'de>,

impl<T> From<T> for T

impl<T, U> Into<U> for T where
    U: From<T>,

impl<T> ToOwned for T where
    T: Clone

type Owned = T

The resulting type after obtaining ownership.

impl<T, U> TryFrom<U> for T where
    U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

impl<T, U> TryInto<U> for T where
    U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.