TokenMapper

Struct TokenMapper 

pub struct TokenMapper {
    pub token_map: HashMap<TokenVector, TokenId>,
    pub reverse_token_map: HashMap<TokenId, TokenVector>,
    /* private fields */
}

A struct to map tokens to unique identifiers and vice versa.

This structure is responsible for maintaining a bidirectional mapping between tokens (represented as character vectors) and their unique IDs. It also provides utility methods to query and manage these mappings.

Fields§

§token_map: HashMap<TokenVector, TokenId>

A map of token character vectors to their unique IDs.

§reverse_token_map: HashMap<TokenId, TokenVector>

A reverse map of unique IDs back to their token character vectors.
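Because both maps are public fields, code outside the struct can read them directly, and could in principle change one without the other. The sketch below checks that two such maps mirror each other. It assumes TokenVector is Vec<char> and TokenId is usize, since neither alias is defined on this page, and maps_are_consistent is a hypothetical helper rather than part of TokenMapper.

```rust
use std::collections::HashMap;

// Assumed aliases: this page does not show how the crate defines them.
type TokenId = usize;
type TokenVector = Vec<char>;

/// Hypothetical helper: returns true when the two maps describe the same
/// one-to-one pairing of tokens and IDs.
fn maps_are_consistent(
    token_map: &HashMap<TokenVector, TokenId>,
    reverse_token_map: &HashMap<TokenId, TokenVector>,
) -> bool {
    // Equal sizes plus a matching reverse entry for every forward entry
    // imply that the reverse map holds exactly the mirrored pairs.
    token_map.len() == reverse_token_map.len()
        && token_map
            .iter()
            .all(|(token, id)| reverse_token_map.get(id) == Some(token))
}
```

A check like this may be useful in tests or debug assertions after any code path that touches the public fields directly.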

Implementations§

impl TokenMapper

pub fn new() -> Self

Creates a new instance of TokenMapper.

Initializes empty forward and reverse maps and sets the private next_id counter to 0.

pub fn upsert_token(&mut self, token: &str) -> TokenId

Adds a token to the map if it doesn’t already exist, and returns its unique ID.

If the token is already present in the token_map, its existing ID is returned. Otherwise, a new ID is generated, stored, and returned.

§Arguments
  • token - A reference to the token string to add or look up.
§Returns
  • A unique ID for the token.
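
The upsert behavior described above can be sketched with a small stand-in type. This is not the crate's implementation: MapperSketch is hypothetical, the aliases TokenVector = Vec<char> and TokenId = usize are assumptions, and so is the choice to hand out IDs by incrementing next_id by one.

```rust
use std::collections::HashMap;

// Assumed aliases: this page does not show how the crate defines them.
type TokenId = usize;
type TokenVector = Vec<char>;

// Hypothetical stand-in that mirrors the documented fields of TokenMapper.
struct MapperSketch {
    token_map: HashMap<TokenVector, TokenId>,
    reverse_token_map: HashMap<TokenId, TokenVector>,
    next_id: TokenId,
}

impl MapperSketch {
    fn new() -> Self {
        MapperSketch {
            token_map: HashMap::new(),
            reverse_token_map: HashMap::new(),
            next_id: 0,
        }
    }

    fn upsert_token(&mut self, token: &str) -> TokenId {
        let key: TokenVector = token.chars().collect();
        // Existing token: return the ID it already has.
        if let Some(&id) = self.token_map.get(&key) {
            return id;
        }
        // New token: take the next ID (assumed sequential) and record
        // the pair in both directions.
        let id = self.next_id;
        self.next_id += 1;
        self.token_map.insert(key.clone(), id);
        self.reverse_token_map.insert(id, key);
        id
    }
}
```

Under these assumptions, calling upsert_token twice with the same string returns the same ID and leaves both maps unchanged on the second call.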

pub fn get_token_id(&self, token: &TokenRef) -> Option<TokenId>

Gets the unique ID for a token if it exists in the map.

§Arguments
  • token - A borrowed token (&TokenRef) to look up.
§Returns
  • Some(TokenId) if the token is present, or None if it is not found.
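
A minimal sketch of this lookup, written as a free function over the forward map. It assumes TokenRef is the slice type [char], TokenVector is Vec<char>, and TokenId is usize; none of these aliases are defined on this page, so the real types may differ.

```rust
use std::collections::HashMap;

// Assumed aliases: this page does not show how the crate defines them.
type TokenId = usize;
type TokenVector = Vec<char>;
type TokenRef = [char];

// Hypothetical free-function version of the documented lookup.
fn get_token_id(
    token_map: &HashMap<TokenVector, TokenId>,
    token: &TokenRef,
) -> Option<TokenId> {
    // Vec<char> implements Borrow<[char]>, so a slice can be used as the
    // lookup key without allocating a new Vec.
    token_map.get(token).copied()
}
```

Returning Option<TokenId> rather than a sentinel value lets callers handle a missing token with if let or match.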

pub fn get_filtered_tokens<'a>(&'a self, tokens: Vec<&'a TokenRef>) -> Vec<&'a TokenRef>

Filters and returns tokens that are present in the map.

§Arguments
  • tokens - A vector of borrowed token references.
§Returns
  • A vector of borrowed token references that exist in the map.

pub fn get_filtered_token_ids<'a>(&'a self, tokens: Vec<&'a TokenRef>) -> Vec<TokenId>

Filters and returns token IDs for tokens that exist in the map.

§Arguments
  • tokens - A vector of borrowed token references.
§Returns
  • A vector of token IDs corresponding to the tokens found in the map.
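
The two filtering methods above differ only in what they return for each known token: the borrowed token itself, or its ID. The sketch below shows both as free functions over the forward map. It assumes TokenRef is [char], TokenVector is Vec<char>, and TokenId is usize, and it also assumes that input order is preserved and duplicates are kept, which this page does not state.

```rust
use std::collections::HashMap;

// Assumed aliases: this page does not show how the crate defines them.
type TokenId = usize;
type TokenVector = Vec<char>;
type TokenRef = [char];

// Keeps only the tokens that have an entry in the map.
fn get_filtered_tokens<'a>(
    token_map: &HashMap<TokenVector, TokenId>,
    tokens: Vec<&'a TokenRef>,
) -> Vec<&'a TokenRef> {
    tokens
        .into_iter()
        .filter(|token| token_map.contains_key(*token))
        .collect()
}

// Maps each known token to its ID and silently drops unknown tokens.
fn get_filtered_token_ids(
    token_map: &HashMap<TokenVector, TokenId>,
    tokens: Vec<&TokenRef>,
) -> Vec<TokenId> {
    tokens
        .into_iter()
        .filter_map(|token| token_map.get(token).copied())
        .collect()
}
```

Because unknown tokens are dropped, the output of either function may be shorter than the input, so positions in the result do not line up with positions in the original vector.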

pub fn get_token_by_id(&self, token_id: TokenId) -> Option<String>

Retrieves the token string for a given unique ID.

§Arguments
  • token_id - The unique ID of the token to look up.
§Returns
  • Some(String) containing the token if the ID is found, or None otherwise.

pub fn get_tokens_by_ids(&self, token_ids: &[TokenId]) -> Vec<Option<String>>

Retrieves token strings for a list of token IDs.

§Arguments
  • token_ids - A slice of token IDs to look up.
§Returns
  • A vector of Option<String> where each entry corresponds to the token string for the given ID, or None if the ID is not found.
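
The two reverse lookups above can be sketched as free functions over the reverse map. The aliases TokenVector = Vec<char> and TokenId = usize are assumptions, as is building the batch lookup on top of the single lookup; the crate's actual implementation is not shown on this page.

```rust
use std::collections::HashMap;

// Assumed aliases: this page does not show how the crate defines them.
type TokenId = usize;
type TokenVector = Vec<char>;

// Looks up one ID and rebuilds the token's String from its characters.
fn get_token_by_id(
    reverse_token_map: &HashMap<TokenId, TokenVector>,
    token_id: TokenId,
) -> Option<String> {
    reverse_token_map
        .get(&token_id)
        .map(|chars| chars.iter().collect())
}

// Looks up each ID in turn; unknown IDs produce None at the same position.
fn get_tokens_by_ids(
    reverse_token_map: &HashMap<TokenId, TokenVector>,
    token_ids: &[TokenId],
) -> Vec<Option<String>> {
    token_ids
        .iter()
        .map(|&id| get_token_by_id(reverse_token_map, id))
        .collect()
}
```

In this sketch the result has one entry per input ID, so a caller can zip it with token_ids to see which IDs were missing.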

pub fn get_token_count(&self) -> usize

Gets the total number of unique tokens in the map.

§Returns
  • The number of unique tokens as a usize.

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.