Struct linfa_preprocessing::CountVectorizer
pub struct CountVectorizer { /* private fields */ }
Counts the occurrences of each vocabulary entry, learned during fitting, in a sequence of documents. Each vocabulary entry is mapped to an integer value that is used to index the count in the result.
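The mapping from vocabulary entries to integer indices can be pictured with a small standard-library sketch. It is not the crate's implementation: `learn_vocabulary` is a hypothetical helper, and the whitespace tokenization, lowercasing, and sorted index order are assumptions made only for illustration.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper (not part of linfa_preprocessing): learns a vocabulary
/// from the documents and maps every distinct token to an integer index.
/// Assumptions of this sketch: tokens are split on whitespace and lowercased,
/// and indices follow the sorted order of the tokens.
fn learn_vocabulary(documents: &[&str]) -> BTreeMap<String, usize> {
    let mut vocabulary: BTreeMap<String, usize> = BTreeMap::new();

    // First pass: collect every distinct token with a placeholder index.
    for doc in documents {
        for token in doc.split_whitespace() {
            vocabulary.entry(token.to_lowercase()).or_insert(0);
        }
    }

    // Second pass: a BTreeMap iterates in key order, so this assigns
    // indices 0, 1, 2, ... to the tokens in sorted order.
    for (index, slot) in vocabulary.values_mut().enumerate() {
        *slot = index;
    }

    vocabulary
}
```

The resulting index is what selects the column holding a given entry's count in the transformed output.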
Implementations
impl CountVectorizer
pub fn params() -> CountVectorizerParams
Construct a new set of parameters
pub fn transform<T: ToString, D: Data<Elem = T>>(
&self,
x: &ArrayBase<D, Ix1>
) -> CsMat<usize>
Given a sequence of n documents, produces a sparse array of size (n, vocabulary_entries) where column j of row i
is the number of occurrences of vocabulary entry j in the document of index i. Vocabulary entry j is the string
at the j-th position in the vocabulary. If a vocabulary entry does not occur in a document, the corresponding
cell in the sparse matrix is not stored, so looking it up yields None.
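The shape and the treatment of missing entries described above can be sketched with the standard library alone. `SparseCounts` and `count_documents` are hypothetical stand-ins, not the `CsMat<usize>` type or the crate's code, and whitespace tokenization is an assumption of the sketch.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a sparse count matrix: one map per document
/// (row), holding only the columns whose count is non-zero.
struct SparseCounts {
    rows: Vec<BTreeMap<usize, usize>>,
    n_cols: usize,
}

impl SparseCounts {
    /// (number of documents, number of vocabulary entries)
    fn shape(&self) -> (usize, usize) {
        (self.rows.len(), self.n_cols)
    }

    /// Returns the stored count, or None when entry `col` never occurred
    /// in document `row` (or when `row` is out of range).
    fn get(&self, row: usize, col: usize) -> Option<&usize> {
        self.rows.get(row)?.get(&col)
    }
}

/// Counts, for every document, how often each vocabulary entry occurs.
/// Entry j of `vocabulary` owns column j; tokens outside the vocabulary
/// are skipped.
fn count_documents(vocabulary: &[&str], documents: &[&str]) -> SparseCounts {
    let rows = documents
        .iter()
        .map(|doc| {
            let mut row: BTreeMap<usize, usize> = BTreeMap::new();
            for token in doc.split_whitespace() {
                if let Some(col) = vocabulary.iter().position(|entry| *entry == token) {
                    *row.entry(col).or_insert(0) += 1;
                }
            }
            row
        })
        .collect();

    SparseCounts { rows, n_cols: vocabulary.len() }
}
```

With the vocabulary `["one", "three", "two"]` and the documents `"one two two"` and `"three"`, the sketch stores a count of 2 at row 0, column 2, and stores nothing at row 0, column 1.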
pub fn transform_files<P: AsRef<Path>>(
&self,
input: &[P],
encoding: EncodingRef,
trap: DecoderTrap
) -> CsMat<usize>
Given a sequence of n file names, produces a sparse array of size (n, vocabulary_entries) where column j of row i
is the number of occurrences of vocabulary entry j in the document contained in the file of index i. Vocabulary entry j is the string
at the j-th position in the vocabulary. If a vocabulary entry does not occur in a document, the corresponding
cell in the sparse matrix is not stored, so looking it up yields None.
The files are read using the specified encoding, and any byte sequence that the encoding does not recognize is handled
according to trap.
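What "handled according to trap" means in practice can be illustrated with a UTF-8-only analogue built on the standard library. The `Trap` enum and `decode_utf8` function below are hypothetical and are not the `DecoderTrap` or `EncodingRef` types from the signature; the sketch only contrasts failing on invalid input with substituting a replacement character.

```rust
/// Hypothetical analogue of a decoder trap, limited to UTF-8 input.
enum Trap {
    /// Fail as soon as an invalid byte sequence is found.
    Strict,
    /// Substitute U+FFFD for every invalid byte sequence.
    Replace,
}

/// Decodes `bytes` as UTF-8, handling invalid sequences as `trap` dictates.
fn decode_utf8(bytes: &[u8], trap: Trap) -> Result<String, std::str::Utf8Error> {
    match trap {
        Trap::Strict => std::str::from_utf8(bytes).map(str::to_owned),
        Trap::Replace => Ok(String::from_utf8_lossy(bytes).into_owned()),
    }
}
```

In this sketch the bytes would typically come from `std::fs::read(path)`; a strict trap surfaces corrupt files as errors, while a replacing trap keeps going at the cost of altered tokens.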
Trait Implementations
impl Clone for CountVectorizer
fn clone(&self) -> CountVectorizer
Returns a copy of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
Auto Trait Implementations
impl !RefUnwindSafe for CountVectorizer
impl Send for CountVectorizer
impl !Sync for CountVectorizer
impl Unpin for CountVectorizer
impl UnwindSafe for CountVectorizer
Blanket Implementations
impl<T> BorrowMut<T> for T where
T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.