pub struct TokenizerConfig {
pub max_length: usize,
pub padding: bool,
pub truncation: bool,
pub add_special_tokens: bool,
}
Configuration for the tokenizer
§Configuration Options
max_length: Maximum sequence length (default: 512)
padding: Enable padding to max_length (default: true)
truncation: Enable truncation at max_length (default: true)
add_special_tokens: Add special tokens like [CLS], [SEP] (default: true)
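The defaults listed above can be written out as a `Default` implementation. The following is a minimal sketch, not the crate's actual code: the struct is repeated so the example compiles on its own, the `Default` body simply encodes the documented values, and `with_max_length` is a hypothetical helper introduced only for illustration.

```rust
/// Copy of the documented struct, repeated so this sketch is self-contained.
#[derive(Clone, Debug)]
pub struct TokenizerConfig {
    pub max_length: usize,
    pub padding: bool,
    pub truncation: bool,
    pub add_special_tokens: bool,
}

// Assumption: the real `Default` impl is not shown on this page.
// This version only mirrors the defaults stated in the documentation.
impl Default for TokenizerConfig {
    fn default() -> Self {
        Self {
            max_length: 512,
            padding: true,
            truncation: true,
            add_special_tokens: true,
        }
    }
}

/// Hypothetical helper: override `max_length` and keep every other
/// field at its default via struct update syntax.
pub fn with_max_length(max_length: usize) -> TokenizerConfig {
    TokenizerConfig {
        max_length,
        ..TokenizerConfig::default()
    }
}
```

Struct update syntax (`..TokenizerConfig::default()`) keeps call sites short when only one or two options differ from the defaults.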
§Recommended Settings
Production (default):
let config = TokenizerConfig::default();
assert_eq!(config.max_length, 512);
assert!(config.padding);
assert!(config.truncation);
Memory-constrained:
let config = TokenizerConfig {
max_length: 256,
padding: false,
truncation: true,
add_special_tokens: true,
};
Fields§
max_length: usize
Maximum sequence length (tokens)
padding: bool
Enable padding to max_length
truncation: bool
Enable truncation at max_length
add_special_tokens: bool
Add model-specific special tokens ([CLS], [SEP], etc.)
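To make the interaction between `max_length`, `padding`, and `truncation` concrete, here is a rough sketch of how a sequence of token IDs might be fitted to the configured length. The function `fit_to_length` and the `pad_id` parameter are hypothetical and not part of the documented API; the tokenizer itself may order or implement these steps differently.

```rust
/// Copy of the documented struct, repeated so this sketch is self-contained.
#[derive(Clone, Debug)]
pub struct TokenizerConfig {
    pub max_length: usize,
    pub padding: bool,
    pub truncation: bool,
    pub add_special_tokens: bool,
}

/// Hypothetical helper (not part of the documented API): apply the
/// truncation and padding flags to a sequence of token IDs.
/// `pad_id` is an assumed padding token ID supplied by the caller.
pub fn fit_to_length(mut ids: Vec<u32>, config: &TokenizerConfig, pad_id: u32) -> Vec<u32> {
    // Truncation: drop everything past `max_length` when enabled.
    if config.truncation && ids.len() > config.max_length {
        ids.truncate(config.max_length);
    }
    // Padding: extend short sequences up to `max_length` when enabled.
    if config.padding && ids.len() < config.max_length {
        ids.resize(config.max_length, pad_id);
    }
    ids
}
```

With both flags enabled, every output has exactly `max_length` IDs. With `padding: false`, as in the memory-constrained settings, short sequences keep their original length, and with `truncation: false` long sequences pass through unchanged.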
Trait Implementations§
impl Clone for TokenizerConfig
fn clone(&self) -> TokenizerConfig
Returns a duplicate of the value.
1.0.0 · fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for TokenizerConfig
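The `Clone` and `Debug` implementations can be used as in the sketch below. It assumes both traits are derived, which this page does not confirm, and the helpers `shortened` and `describe` are hypothetical names introduced only for illustration.

```rust
/// Copy of the documented struct, repeated so this sketch is self-contained.
/// Assumption: `Clone` and `Debug` are derived rather than hand-written.
#[derive(Clone, Debug)]
pub struct TokenizerConfig {
    pub max_length: usize,
    pub padding: bool,
    pub truncation: bool,
    pub add_special_tokens: bool,
}

/// Hypothetical helper: build a variant with a different `max_length`
/// from a base configuration without mutating the original.
pub fn shortened(base: &TokenizerConfig, max_length: usize) -> TokenizerConfig {
    let mut copy = base.clone();
    copy.max_length = max_length;
    copy
}

/// Hypothetical helper: render the configuration for a log line via `Debug`.
pub fn describe(config: &TokenizerConfig) -> String {
    format!("{:?}", config)
}
```

Cloning keeps a shared base configuration intact while individual call sites adjust their own copy.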
Auto Trait Implementations§
impl Freeze for TokenizerConfig
impl RefUnwindSafe for TokenizerConfig
impl Send for TokenizerConfig
impl Sync for TokenizerConfig
impl Unpin for TokenizerConfig
impl UnwindSafe for TokenizerConfig
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.