Struct rust_tokenizers::TokenizedInput

pub struct TokenizedInput {
    pub token_ids: Vec<i64>,
    pub segment_ids: Vec<i8>,
    pub special_tokens_mask: Vec<i8>,
    pub overflowing_tokens: Vec<i64>,
    pub num_truncated_tokens: usize,
    pub token_offsets: Vec<Option<Offset>>,
    pub reference_offsets: Vec<Vec<OffsetSize>>,
    pub mask: Vec<Mask>,
}

Tokenized input, ready for processing by language models.

This represents the final output of the encoding process: a tokenized sentence together with its encoded values.

Fields

token_ids: Vec<i64>

Vector of token IDs

segment_ids: Vec<i8>

Vector of segment IDs. For example, in BERT, segments are separated by a [SEP] marker, and each new segment increments the segment ID. This vector has the same length as token_ids.
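To make the segment numbering concrete, here is a minimal sketch for a BERT-style sentence pair laid out as `[CLS] a... [SEP] b... [SEP]`. The helper `bert_pair_segment_ids` is hypothetical and not part of rust_tokenizers; it assumes the common BERT convention in which `[CLS]`, the first sequence, and the first `[SEP]` share segment 0, and the second sequence with its closing `[SEP]` takes segment 1.

```rust
/// Hypothetical helper (not part of rust_tokenizers): builds segment IDs for a
/// BERT-style pair laid out as `[CLS] a... [SEP] b... [SEP]`.
/// `len_a` and `len_b` are the token counts of the two sequences,
/// excluding special tokens.
fn bert_pair_segment_ids(len_a: usize, len_b: usize) -> Vec<i8> {
    // [CLS] + first sequence + first [SEP] are assigned segment 0.
    let mut segment_ids = vec![0i8; len_a + 2];
    // Second sequence + closing [SEP] are assigned segment 1.
    segment_ids.extend(std::iter::repeat(1i8).take(len_b + 1));
    segment_ids
}
```

The result has one entry per token of the assembled pair, special tokens included, which matches the stated requirement that segment_ids and token_ids have the same length.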

special_tokens_mask: Vec<i8>

Flags tokens as special tokens (1) or not (0). This vector has the same length as token_ids.
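One typical use of this mask is dropping special tokens before decoding. The sketch below works on plain slices with the same element types as the token_ids and special_tokens_mask fields; `strip_special_tokens` is a hypothetical helper, not a function provided by the crate.

```rust
/// Hypothetical helper (not part of rust_tokenizers): returns the token IDs
/// whose mask entry is 0, i.e. the tokens that are not flagged as special.
fn strip_special_tokens(token_ids: &[i64], special_tokens_mask: &[i8]) -> Vec<i64> {
    // The field documentation states that both vectors have the same length.
    assert_eq!(token_ids.len(), special_tokens_mask.len());
    token_ids
        .iter()
        .zip(special_tokens_mask)
        // Keep only the positions flagged 0 (not special).
        .filter(|(_, flag)| **flag == 0)
        .map(|(&id, _)| id)
        .collect()
}
```

With a TokenizedInput value named `input`, the call would look like `strip_special_tokens(&input.token_ids, &input.special_tokens_mask)`.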

overflowing_tokens: Vec<i64>

Vector containing overflowing tokens, populated following a truncation step

num_truncated_tokens: usize

Number of overflowing tokens following a truncation step. This equals the length of overflowing_tokens.
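The relationship between overflowing_tokens and num_truncated_tokens can be illustrated with a simple tail truncation. This is only a sketch: `truncate_tokens` is a hypothetical helper, and the truncation strategy the crate actually applies (for example, when two sequences are encoded together) may differ.

```rust
/// Hypothetical helper (not part of rust_tokenizers): keeps at most `max_len`
/// tokens and returns `(kept, overflowing_tokens, num_truncated_tokens)`.
/// Assumes plain truncation from the end of a single sequence.
fn truncate_tokens(mut token_ids: Vec<i64>, max_len: usize) -> (Vec<i64>, Vec<i64>, usize) {
    let overflowing_tokens = if token_ids.len() > max_len {
        // `split_off` leaves the first `max_len` tokens in place and
        // returns everything after them.
        token_ids.split_off(max_len)
    } else {
        // Nothing to truncate.
        Vec::new()
    };
    // By construction, the count equals the length of the overflow vector.
    let num_truncated_tokens = overflowing_tokens.len();
    (token_ids, overflowing_tokens, num_truncated_tokens)
}
```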

token_offsets: Vec<Option<Offset>>

Offset information (as start and end positions) in relation to the original text. Tokens that cannot be related to the original source are registered as None.
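As an illustration of how start and end positions map tokens back to the source, the sketch below recovers the text covered by each token. It rests on several assumptions that this page does not confirm: the local `Offset` struct is a stand-in for the crate's type with hypothetical field names `begin` and `end`, positions are assumed to count characters rather than bytes, and the end position is assumed to be exclusive.

```rust
/// Stand-in for the crate's `Offset` type. The field names and the `u32`
/// position type are assumptions made for this sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Offset {
    begin: u32,
    end: u32,
}

/// Hypothetical helper: returns the source text covered by each token, or
/// `None` for tokens without an offset (or with an out-of-range offset).
/// Assumes character-based positions with an exclusive end.
fn token_texts(text: &str, token_offsets: &[Option<Offset>]) -> Vec<Option<String>> {
    // Index by character so multi-byte characters count as one position.
    let chars: Vec<char> = text.chars().collect();
    token_offsets
        .iter()
        .map(|offset| {
            offset
                .as_ref()
                // `get` returns None instead of panicking on a bad range.
                .and_then(|o| chars.get(o.begin as usize..o.end as usize))
                .map(|slice| slice.iter().collect::<String>())
        })
        .collect()
}
```

Indexing into a `Vec<char>` keeps the sketch simple; for long inputs, precomputing byte positions with `char_indices` would avoid the extra allocation.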

reference_offsets: Vec<Vec<OffsetSize>>

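Offset information (as a sequence of positions) in relation to the original text. This vector holds one inner vector of positions per token. Because the element type is Vec<OffsetSize> rather than an Option, tokens that cannot be related to the original source cannot be registered as None here; they are presumably represented by an empty vector.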

mask: Vec<Mask>

Mask values giving the type of each token. This vector has the same length as token_ids.

Trait Implementations

impl Clone for TokenizedInput

impl Debug for TokenizedInput

impl PartialEq<TokenizedInput> for TokenizedInput

impl PartialOrd<TokenizedInput> for TokenizedInput

impl StructuralPartialEq for TokenizedInput

Blanket Implementations

impl<T> Any for T where
    T: 'static + ?Sized

impl<T> Borrow<T> for T where
    T: ?Sized

impl<T> BorrowMut<T> for T where
    T: ?Sized

impl<T> From<T> for T

impl<T, U> Into<U> for T where
    U: From<T>,

impl<T> Pointable for T

type Init = T

The type for initializers.

impl<T> ToOwned for T where
    T: Clone

type Owned = T

The resulting type after obtaining ownership.

impl<T, U> TryFrom<U> for T where
    U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

impl<T, U> TryInto<U> for T where
    U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.