pub struct SimpleTokenizer {
pub lowercase: bool,
pub remove_punctuation: bool,
}
Simple tokenizer: lowercases the input, then splits on whitespace and punctuation.
This is the default tokenizer, matching VecStore’s original behavior. It is fast and works for most Latin-script languages.
Fields
lowercase: bool
Whether to convert to lowercase (default: true)
remove_punctuation: bool
Whether to remove punctuation (default: true)
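As a rough illustration of how the two flags might interact, the sketch below uses a local stand-in struct that mirrors the documented fields and defaults. The `tokenize` method is hypothetical: this page does not show the tokenizing method or its signature, and treating "punctuation" as ASCII punctuation is an assumption made only for the example.

```rust
// Local stand-in mirroring the two documented fields. This is NOT the
// library's type; `tokenize` is a hypothetical helper for illustration.
#[derive(Clone, Debug)]
pub struct SimpleTokenizer {
    pub lowercase: bool,
    pub remove_punctuation: bool,
}

impl Default for SimpleTokenizer {
    // Both flags default to true, as stated in the field docs.
    fn default() -> Self {
        Self { lowercase: true, remove_punctuation: true }
    }
}

impl SimpleTokenizer {
    // One plausible reading of "lowercase + split on whitespace and
    // punctuation". Assumption: punctuation means ASCII punctuation.
    pub fn tokenize(&self, text: &str) -> Vec<String> {
        let text = if self.lowercase {
            text.to_lowercase()
        } else {
            text.to_string()
        };
        // Whitespace always separates tokens; punctuation only does so
        // when `remove_punctuation` is set.
        let is_separator = |c: char| {
            c.is_whitespace() || (self.remove_punctuation && c.is_ascii_punctuation())
        };
        text.split(is_separator)
            .filter(|token| !token.is_empty())
            .map(str::to_string)
            .collect()
    }
}

fn main() {
    let default_tok = SimpleTokenizer::default();
    println!("{:?}", default_tok.tokenize("Hello, World!"));

    // The fields are public, so a struct literal can override one flag.
    let keep_punct = SimpleTokenizer { remove_punctuation: false, ..Default::default() };
    println!("{:?}", keep_punct.tokenize("Hello, World!"));
}
```

Because both fields are public, struct-update syntax with `..Default::default()` is one way to change a single flag while keeping the other at its documented default.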
Implementations
impl SimpleTokenizer
pub fn with_case_preserved() -> Self
Creates a tokenizer that preserves case.
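A minimal sketch of what this constructor presumably does, again using a local stand-in rather than the library's type. The body is an assumption: the page only says that case is preserved, so leaving `remove_punctuation` at its documented default of true is a guess.

```rust
// Local stand-in mirroring the documented struct; not the library's type.
#[derive(Clone, Debug)]
pub struct SimpleTokenizer {
    pub lowercase: bool,
    pub remove_punctuation: bool,
}

impl SimpleTokenizer {
    // Assumed body: only `lowercase` is switched off, while
    // `remove_punctuation` keeps its documented default of true.
    pub fn with_case_preserved() -> Self {
        Self { lowercase: false, remove_punctuation: true }
    }
}

fn main() {
    let tok = SimpleTokenizer::with_case_preserved();
    println!("{:?}", tok);

    // Under the assumption above, this struct literal is equivalent.
    let manual = SimpleTokenizer { lowercase: false, remove_punctuation: true };
    println!("{:?}", manual);
}
```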
Trait Implementations
impl Clone for SimpleTokenizer
fn clone(&self) -> SimpleTokenizer
Returns a duplicate of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for SimpleTokenizer
impl Default for SimpleTokenizer
Auto Trait Implementations
impl Freeze for SimpleTokenizer
impl RefUnwindSafe for SimpleTokenizer
impl Send for SimpleTokenizer
impl Sync for SimpleTokenizer
impl Unpin for SimpleTokenizer
impl UnwindSafe for SimpleTokenizer
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.