pub struct BertTokenizer { /* private fields */ }
Wrapper around HuggingFace tokenizer configured for BERT-style encoding.
Implementations
impl BertTokenizer
pub fn from_dir(dir: &Path, max_length: usize) -> Result<BertTokenizer, Error>
Load tokenizer from a model directory containing:
- tokenizer.json
- config.json
- special_tokens_map.json
- tokenizer_config.json
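Because from_dir expects all four files to be present, a pre-flight check can make a missing file easier to diagnose. The sketch below uses only the standard library; `missing_tokenizer_files` and `TOKENIZER_FILES` are hypothetical names introduced for illustration and are not part of this crate.

```rust
use std::path::Path;

/// File names listed in the `from_dir` documentation.
const TOKENIZER_FILES: [&str; 4] = [
    "tokenizer.json",
    "config.json",
    "special_tokens_map.json",
    "tokenizer_config.json",
];

/// Hypothetical helper (not part of the crate): returns the required
/// file names that are not present as regular files in `dir`,
/// in the order they are listed above.
fn missing_tokenizer_files(dir: &Path) -> Vec<&'static str> {
    TOKENIZER_FILES
        .iter()
        .copied()
        .filter(|name| !dir.join(name).is_file())
        .collect()
}
```

An empty result means the directory has every file named above; whether from_dir performs further validation on their contents is not stated in these docs.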
pub fn from_bytes(
    tokenizer_json: &[u8],
    config_json: &[u8],
    special_tokens_map_json: &[u8],
    tokenizer_config_json: &[u8],
    max_length: usize,
) -> Result<BertTokenizer, Error>
Load tokenizer from raw file bytes.
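One way to obtain the four buffers is to read them from disk yourself, for example when the files live somewhere other than a single model directory layout. The sketch below is an illustration under assumptions: `TokenizerFiles` and `read_tokenizer_files` are hypothetical names, and the mapping from each file name to the matching from_bytes parameter is inferred from the parameter names rather than confirmed by these docs.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical container (not part of the crate) holding the four
/// buffers in the same order as the `from_bytes` parameters.
struct TokenizerFiles {
    tokenizer_json: Vec<u8>,
    config_json: Vec<u8>,
    special_tokens_map_json: Vec<u8>,
    tokenizer_config_json: Vec<u8>,
}

/// Reads each file fully into memory; fails on the first file
/// that cannot be read.
fn read_tokenizer_files(dir: &Path) -> io::Result<TokenizerFiles> {
    Ok(TokenizerFiles {
        tokenizer_json: fs::read(dir.join("tokenizer.json"))?,
        config_json: fs::read(dir.join("config.json"))?,
        special_tokens_map_json: fs::read(dir.join("special_tokens_map.json"))?,
        tokenizer_config_json: fs::read(dir.join("tokenizer_config.json"))?,
    })
}
```

The fields can then be passed as slices (for example `&files.tokenizer_json`) together with a max_length of your choice.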
pub fn encode_batch(&self, texts: &[&str]) -> Result<TokenizedBatch, Error>
Tokenize a batch of texts.
Returns input_ids, attention_mask, and token_type_ids for each text, all padded to the length of the longest sequence in the batch.
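To make "padded to the longest in batch" concrete, here is a minimal, self-contained sketch of that padding scheme. It is not this crate's implementation: `PaddedBatch` is a hypothetical stand-in for TokenizedBatch (whose fields are not shown here), `pad_to_longest` is a hypothetical function, and the sketch assumes single-segment inputs, so every token type id is 0.

```rust
/// Hypothetical stand-in for the crate's `TokenizedBatch`;
/// the real type's layout is not documented on this page.
#[derive(Debug, PartialEq)]
struct PaddedBatch {
    input_ids: Vec<Vec<u32>>,
    attention_mask: Vec<Vec<u32>>,
    token_type_ids: Vec<Vec<u32>>,
}

/// Pads every sequence of token ids to the length of the longest one.
/// Real tokens get attention mask 1, padding positions get 0.
fn pad_to_longest(sequences: &[Vec<u32>], pad_id: u32) -> PaddedBatch {
    // Length of the longest sequence; 0 for an empty batch.
    let max_len = sequences.iter().map(Vec::len).max().unwrap_or(0);

    let mut input_ids = Vec::with_capacity(sequences.len());
    let mut attention_mask = Vec::with_capacity(sequences.len());
    let mut token_type_ids = Vec::with_capacity(sequences.len());

    for seq in sequences {
        // Copy the ids and extend with the padding id.
        let mut ids = seq.clone();
        ids.resize(max_len, pad_id);

        // 1 for each real token, then 0 for each padding position.
        let mut mask = vec![1u32; seq.len()];
        mask.resize(max_len, 0);

        input_ids.push(ids);
        attention_mask.push(mask);
        // Assumption: single-segment input, so all type ids are 0.
        token_type_ids.push(vec![0u32; max_len]);
    }

    PaddedBatch { input_ids, attention_mask, token_type_ids }
}
```

With a batch of a 3-token and a 5-token sequence, all three outputs have rows of length 5, and the shorter row's mask ends in two zeros.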
Auto Trait Implementations
impl !Freeze for BertTokenizer
impl RefUnwindSafe for BertTokenizer
impl Send for BertTokenizer
impl Sync for BertTokenizer
impl Unpin for BertTokenizer
impl UnsafeUnpin for BertTokenizer
impl UnwindSafe for BertTokenizer
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<ST, DT> CastableFrom<ST, Initialized, Initialized> for DT
impl<ST, DT> CastableFrom<ST, Uninit, Uninit> for DT
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.