pub struct CountVectorizer { /* private fields */ }
Count-based document-feature matrix builder with N-gram support.
Implementations
impl CountVectorizer
pub fn with_max_features(self, n: usize) -> Self
Limit the vocabulary to the n most frequent features.
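As an illustration of what "the n most frequent features" can mean, here is a minimal sketch that keeps the top `n` terms from a term-count map and assigns each a column index. `top_n_vocabulary` is a hypothetical helper, not part of this crate; the alphabetical tie-breaking and the frequency-ordered indices are assumptions, and the crate may order its vocabulary differently.

```rust
use std::collections::HashMap;

/// Keep the `n` most frequent terms and assign them column indices.
/// Hypothetical helper for illustration; ties are broken alphabetically
/// (an assumption, not documented crate behavior).
fn top_n_vocabulary(counts: &HashMap<String, usize>, n: usize) -> HashMap<String, usize> {
    let mut terms: Vec<(&String, &usize)> = counts.iter().collect();
    // Sort by descending count, then by term for a deterministic order.
    terms.sort_by(|a, b| b.1.cmp(a.1).then_with(|| a.0.cmp(b.0)));
    terms
        .into_iter()
        .take(n)
        .enumerate()
        .map(|(idx, (term, _))| (term.clone(), idx))
        .collect()
}
```

If `n` is at least the number of distinct terms, every term is kept.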
pub fn with_ngram_range(self, min: usize, max: usize) -> Self
Set the N-gram range (min, max).
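To make the range concrete, the sketch below generates every n-gram whose length lies between `min_n` and `max_n`. `word_ngrams` is a hypothetical helper, not part of this crate, and it assumes word-level n-grams joined by a single space with both bounds inclusive; the crate's tokenization and joining rules may differ.

```rust
/// Build all word-level n-grams of length `min_n..=max_n` from a token slice.
/// Hypothetical helper for illustration; not part of the CountVectorizer API.
fn word_ngrams(tokens: &[&str], min_n: usize, max_n: usize) -> Vec<String> {
    let mut out = Vec::new();
    // `windows(0)` panics, so the lower bound is clamped to 1.
    for n in min_n.max(1)..=max_n {
        // `windows` yields nothing when `n` exceeds the token count.
        for window in tokens.windows(n) {
            out.push(window.join(" "));
        }
    }
    out
}
```

With tokens `["the", "quick", "fox"]` and a range of (1, 2), this produces the three unigrams followed by the bigrams "the quick" and "quick fox".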
pub fn with_min_df(self, min_df: usize) -> Self
Set minimum document frequency (number of documents a token must appear in).
pub fn with_max_df_ratio(self, ratio: f64) -> Self
Set maximum document frequency as a fraction of the corpus (0.0–1.0).
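The two document-frequency bounds can be pictured with a small standalone sketch. `filter_by_df` is a hypothetical helper, not part of this crate; whitespace tokenization and inclusive comparisons on both bounds are assumptions, so the crate's edge-case behavior may differ.

```rust
use std::collections::{HashMap, HashSet};

/// Keep tokens whose document frequency is at least `min_df` and at most
/// `max_df_ratio` of the corpus. Hypothetical helper for illustration;
/// whitespace tokenization and inclusive bounds are assumptions.
fn filter_by_df(docs: &[&str], min_df: usize, max_df_ratio: f64) -> Vec<String> {
    let mut df: HashMap<&str, usize> = HashMap::new();
    for doc in docs {
        // Count each token at most once per document.
        let unique: HashSet<&str> = doc.split_whitespace().collect();
        for tok in unique {
            *df.entry(tok).or_insert(0) += 1;
        }
    }
    let n_docs = docs.len() as f64;
    let mut kept: Vec<String> = df
        .into_iter()
        .filter(|&(_, count)| count >= min_df && (count as f64) / n_docs <= max_df_ratio)
        .map(|(tok, _)| tok.to_string())
        .collect();
    // Sort so the result does not depend on hash-map iteration order.
    kept.sort();
    kept
}
```

For the corpus "the cat sat", "the dog sat", "the bird flew" with `min_df = 2` and `max_df_ratio = 0.9`, only "sat" survives: "the" appears in every document and exceeds the ratio, while the remaining tokens appear in a single document.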
pub fn transform(&self, texts: &[String]) -> Result<Vec<Vec<f64>>>
Transform texts into a count matrix.
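The shape of the result (one row per text, one column per vocabulary entry) can be sketched as follows. `count_matrix` is a hypothetical stand-in, not this crate's implementation; it assumes whitespace tokenization, unigrams only, and that the vocabulary values are column indices in `0..vocab.len()`.

```rust
use std::collections::HashMap;

/// Count occurrences of each vocabulary term per document.
/// Hypothetical stand-in for `transform`; assumes whitespace tokenization,
/// unigrams only, and vocabulary values that are valid column indices.
fn count_matrix(vocab: &HashMap<String, usize>, texts: &[String]) -> Vec<Vec<f64>> {
    texts
        .iter()
        .map(|text| {
            let mut row = vec![0.0; vocab.len()];
            for tok in text.split_whitespace() {
                // Tokens outside the vocabulary are ignored.
                if let Some(&col) = vocab.get(tok) {
                    row[col] += 1.0;
                }
            }
            row
        })
        .collect()
}
```

A text containing no vocabulary terms yields an all-zero row in this sketch.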
pub fn fit_transform(&mut self, corpus: &[String]) -> Result<Vec<Vec<f64>>>
Fit then transform in one step.
pub fn vocabulary_size(&self) -> usize
Return the current vocabulary size.
pub fn vocabulary(&self) -> &HashMap<String, usize>
Borrow the vocabulary map.
Trait Implementations
impl Clone for CountVectorizer
fn clone(&self) -> CountVectorizer
Returns a duplicate of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for CountVectorizer
Auto Trait Implementations
impl Freeze for CountVectorizer
impl RefUnwindSafe for CountVectorizer
impl Send for CountVectorizer
impl Sync for CountVectorizer
impl Unpin for CountVectorizer
impl UnsafeUnpin for CountVectorizer
impl UnwindSafe for CountVectorizer
Blanket Implementations
impl<T> BorrowMut<T> for T where T: ?Sized
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> CloneToUninit for T where T: Clone
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
impl<T> Pointable for T
impl<SS, SP> SupersetOf<SS> for SP where SS: SubsetOf<SP>
fn to_subset(&self) -> Option<SS>
The inverse inclusion map: attempts to construct self from the equivalent element of its superset.
fn is_in_subset(&self) -> bool
Checks if self is actually part of its subset T (and can be converted to it).
fn to_subset_unchecked(&self) -> SS
Use with care! Same as self.to_subset but without any property checks. Always succeeds.
fn from_subset(element: &SS) -> SP
The inclusion map: converts self to the equivalent element of its superset.