pub struct Encoded<'a> { /* private fields */ }
Output produced by a Tokenizer::encode implementation.
Use Encoded::token_ids to get the token IDs to feed to a model, and
Encoded::text_for_token_range to map token ID ranges back to the
corresponding input text.
Implementations
impl<'a> Encoded<'a>
pub fn token_ids(&self) -> &[TokenId]
Return the sequence of token IDs that the input was tokenized into.
pub fn into_token_ids(self) -> Vec<TokenId>
Consume self and return a list of token IDs.
This is a convenient way to discard other information from the encoded output and get the token IDs as an owned vector.
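The difference between the borrowing and consuming accessors can be sketched with a stand-in type. `EncodedSketch`, its fields, and the `input` helper are hypothetical and only mirror the two signatures above; the real struct's fields are private, and `TokenId` is assumed to be `u32` here.

```rust
// Assumption: `TokenId` is modeled as `u32` for this sketch.
type TokenId = u32;

// Hypothetical stand-in for `Encoded<'a>`; not the crate's definition.
struct EncodedSketch<'a> {
    text: &'a str,
    token_ids: Vec<TokenId>,
}

impl<'a> EncodedSketch<'a> {
    // Hypothetical helper returning the borrowed input text.
    fn input(&self) -> &'a str {
        self.text
    }

    // Borrowing accessor: the IDs stay owned by `self`.
    fn token_ids(&self) -> &[TokenId] {
        &self.token_ids
    }

    // Consuming accessor: moves the vector out without cloning it.
    fn into_token_ids(self) -> Vec<TokenId> {
        self.token_ids
    }
}
```

After `into_token_ids` is called, the value has been moved and can no longer be used, so any other information needed from it should be read first.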
pub fn token_offsets(&self) -> &[usize]
Return the byte offsets of the start of each token in the input sequence. If the input contained two sequences, the offsets are assigned as if the two sequences were concatenated.
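As an illustration of how start offsets relate to the input text, the sketch below slices the text between consecutive offsets. `token_spans` is a hypothetical helper, not part of this crate. It assumes that the offsets are sorted, that they fall on UTF-8 character boundaries, and that each token's text runs up to the start of the next token.

```rust
// Hypothetical helper: split `text` into per-token slices using start offsets.
// Panics if an offset is out of range, out of order, or not on a character
// boundary.
fn token_spans<'a>(text: &'a str, offsets: &[usize]) -> Vec<&'a str> {
    offsets
        .iter()
        .enumerate()
        .map(|(i, &start)| {
            // A token ends where the next one starts; the last token ends
            // at the end of the input.
            let end = offsets.get(i + 1).copied().unwrap_or(text.len());
            &text[start..end]
        })
        .collect()
}
```

For two input sequences, the same slicing would apply to the concatenation of both sequences, since the offsets are assigned as if the sequences were joined.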
pub fn token_type_ids(&self) -> impl Iterator<Item = usize>
Return an iterator over the values to pass as the model's
token_type_ids input, if the model has one.
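BERT-style models conventionally use token type 0 for tokens of the first sequence and 1 for tokens of the second. The sketch below illustrates that convention only. `segment_ids` is a hypothetical helper, and whether this method yields exactly these values for a given tokenizer is an assumption.

```rust
// Hypothetical helper: yield 0 for each token of the first sequence,
// then 1 for each token of the second sequence.
fn segment_ids(first_len: usize, second_len: usize) -> impl Iterator<Item = usize> {
    std::iter::repeat(0)
        .take(first_len)
        .chain(std::iter::repeat(1).take(second_len))
}
```

With a single input sequence (`second_len` of 0), every value is 0.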
Auto Trait Implementations
impl<'a> Freeze for Encoded<'a>
impl<'a> RefUnwindSafe for Encoded<'a>
impl<'a> Send for Encoded<'a>
impl<'a> Sync for Encoded<'a>
impl<'a> Unpin for Encoded<'a>
impl<'a> UnwindSafe for Encoded<'a>
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.