pub struct PreprocessConfig {
pub strip_html: bool,
pub handle_urls: UrlHandling,
pub handle_emails: EmailHandling,
pub handle_mentions: MentionHandling,
pub normalize_numbers: bool,
pub number_token: String,
pub expand_contractions: bool,
pub spell_check: bool,
pub max_edit_distance: usize,
pub remove_diacritics: bool,
pub unicode_normalize: bool,
pub lowercase: bool,
pub normalize_whitespace: bool,
pub remove_punctuation: bool,
}
Configuration for the text preprocessing pipeline.
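No `Default` implementation is listed on this page, so a configuration presumably has to be built by naming every field. The sketch below shows what that might look like. It uses local stand-in types so that it compiles on its own: the variants `Keep`, `Remove`, and `Replace` are hypothetical (the real variants of `UrlHandling`, `EmailHandling`, and `MentionHandling` are not shown here), and the helper `example_config` and the token strings are illustrative choices, not values from the crate.

```rust
// Stand-in enums for illustration only. The real variants are not shown on
// this page, so `Keep`, `Remove`, and `Replace` are hypothetical.
#[allow(dead_code)]
#[derive(Clone, Debug)]
enum UrlHandling { Keep, Remove, Replace(String) }

#[allow(dead_code)]
#[derive(Clone, Debug)]
enum EmailHandling { Keep, Remove, Replace(String) }

#[allow(dead_code)]
#[derive(Clone, Debug)]
enum MentionHandling { Keep, Remove, Replace(String) }

// Local mirror of the 14 fields documented on this page.
#[allow(dead_code)]
#[derive(Clone, Debug)]
struct PreprocessConfig {
    strip_html: bool,
    handle_urls: UrlHandling,
    handle_emails: EmailHandling,
    handle_mentions: MentionHandling,
    normalize_numbers: bool,
    number_token: String,
    expand_contractions: bool,
    spell_check: bool,
    max_edit_distance: usize,
    remove_diacritics: bool,
    unicode_normalize: bool,
    lowercase: bool,
    normalize_whitespace: bool,
    remove_punctuation: bool,
}

// Hypothetical helper: builds one possible configuration by naming every field.
fn example_config() -> PreprocessConfig {
    PreprocessConfig {
        strip_html: true,
        handle_urls: UrlHandling::Replace("<URL>".to_string()),
        handle_emails: EmailHandling::Remove,
        handle_mentions: MentionHandling::Keep,
        normalize_numbers: true,
        number_token: "<NUM>".to_string(),
        expand_contractions: false,
        spell_check: false,
        max_edit_distance: 2,
        remove_diacritics: false,
        unicode_normalize: true,
        lowercase: true,
        normalize_whitespace: true,
        remove_punctuation: false,
    }
}

fn main() {
    let config = example_config();
    // `Debug` is implemented for the real struct as well (see the trait list).
    println!("{:#?}", config);
}
```

With the real crate, the stand-in definitions would be replaced by imports of the actual types, and the enum variants would need to be checked against their own documentation.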
Fields
strip_html: bool
Remove HTML/XML tags.
handle_urls: UrlHandling
Remove or replace URLs.
handle_emails: EmailHandling
Remove or replace email addresses.
handle_mentions: MentionHandling
Remove or replace @mentions.
normalize_numbers: bool
Normalize numbers.
number_token: String
Number replacement token.
expand_contractions: bool
Expand contractions.
spell_check: bool
Enable spell checking.
max_edit_distance: usize
Maximum edit distance for spell checking.
remove_diacritics: bool
Remove diacritics/accents.
unicode_normalize: bool
Perform Unicode normalization (NFC).
lowercase: bool
Convert to lowercase.
normalize_whitespace: bool
Remove extra whitespace.
remove_punctuation: bool
Remove punctuation.
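To make the boolean flags more concrete, here is a minimal sketch of how three of them (`remove_punctuation`, `lowercase`, `normalize_whitespace`) could gate steps in a pipeline. It is not the crate's implementation: the `Flags` struct and the `apply` function are hypothetical, the step order is chosen for this sketch, and punctuation removal is limited to ASCII here, which the real step may not be.

```rust
// Hypothetical subset of the flags documented above.
struct Flags {
    lowercase: bool,
    normalize_whitespace: bool,
    remove_punctuation: bool,
}

// Applies each enabled step in a fixed order chosen for this sketch.
fn apply(text: &str, flags: &Flags) -> String {
    let mut out = text.to_string();
    if flags.remove_punctuation {
        // ASCII punctuation only; an assumption made to stay within std.
        out = out.chars().filter(|c| !c.is_ascii_punctuation()).collect();
    }
    if flags.lowercase {
        out = out.to_lowercase();
    }
    if flags.normalize_whitespace {
        // Collapse runs of whitespace into single spaces and trim both ends.
        out = out.split_whitespace().collect::<Vec<_>>().join(" ");
    }
    out
}

fn main() {
    let flags = Flags {
        lowercase: true,
        normalize_whitespace: true,
        remove_punctuation: true,
    };
    println!("{}", apply("  Hello,   World!  ", &flags)); // prints "hello world"
}
```

Running whitespace normalization last means any gaps left behind by removed punctuation are collapsed as well; a different order would be equally plausible.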
Trait Implementations
impl Clone for PreprocessConfig
fn clone(&self) -> PreprocessConfig
Returns a duplicate of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source. Stable since 1.0.0 (const: unstable).
impl Debug for PreprocessConfig
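Because the struct implements `Clone` and `Debug`, one common pattern is to keep a base configuration and derive variants from it with struct update syntax, printing them with `{:?}` while debugging. The sketch below illustrates this on a trimmed stand-in struct with only three of the documented fields; the helper `without_lowercasing` is hypothetical.

```rust
// Trimmed stand-in with three of the documented fields, for illustration only.
#[derive(Clone, Debug)]
struct PreprocessConfig {
    lowercase: bool,
    remove_punctuation: bool,
    number_token: String,
}

// Hypothetical helper: copies `base` and overrides a single field.
fn without_lowercasing(base: &PreprocessConfig) -> PreprocessConfig {
    PreprocessConfig {
        lowercase: false,
        // Remaining fields are taken from a clone, so `base` stays usable.
        ..base.clone()
    }
}

fn main() {
    let base = PreprocessConfig {
        lowercase: true,
        remove_punctuation: true,
        number_token: "<NUM>".to_string(),
    };
    let variant = without_lowercasing(&base);
    println!("{:?}", base);
    println!("{:?}", variant);
}
```

The clone is needed because `number_token` is a `String`; moving it out of `base` would otherwise make `base` unusable afterwards.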
Auto Trait Implementations
impl Freeze for PreprocessConfig
impl RefUnwindSafe for PreprocessConfig
impl Send for PreprocessConfig
impl Sync for PreprocessConfig
impl Unpin for PreprocessConfig
impl UnsafeUnpin for PreprocessConfig
impl UnwindSafe for PreprocessConfig
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
impl<T> Pointable for T
impl<SS, SP> SupersetOf<SS> for SP
where
    SS: SubsetOf<SP>,
fn to_subset(&self) -> Option<SS>
The inverse inclusion map: attempts to construct self from the equivalent element of its superset.
fn is_in_subset(&self) -> bool
Checks if self is actually part of its subset T (and can be converted to it).
fn to_subset_unchecked(&self) -> SS
Use with care! Same as self.to_subset but without any property checks. Always succeeds.
fn from_subset(element: &SS) -> SP
The inclusion map: converts self to the equivalent element of its superset.