Struct linfa_preprocessing::CountVectorizer
pub struct CountVectorizer { /* private fields */ }
Counts the occurrences of each vocabulary entry, learned during fitting, in a sequence of documents. Each vocabulary entry is mapped to an integer value that is used to index the count in the result.
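The mapping from vocabulary entries to integer indices can be illustrated with a small standard-library sketch. It is not the crate's implementation: the helper name `learn_vocabulary`, the whitespace tokenization, the lowercasing, and the choice to index entries in sorted order are all assumptions made for illustration.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper (not part of linfa): learns a vocabulary from
/// whitespace-separated tokens and maps each distinct token to an index.
/// Assumption of this sketch: indices follow the sorted order of the tokens.
fn learn_vocabulary(documents: &[&str]) -> BTreeMap<String, usize> {
    let mut vocabulary: BTreeMap<String, usize> = BTreeMap::new();

    // First pass: collect every distinct (lowercased) token.
    for document in documents {
        for token in document.split_whitespace() {
            vocabulary.entry(token.to_lowercase()).or_insert(0);
        }
    }

    // Second pass: replace the placeholder values with each token's position
    // in key order, so that the value can be used as a column index.
    for (index, slot) in vocabulary.values_mut().enumerate() {
        *slot = index;
    }
    vocabulary
}
```

With the documents `"one two"` and `"two three two"`, this sketch yields three entries, and the index stored for each entry is the column in which its count would be placed.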
Implementations
impl CountVectorizer
pub fn params() -> CountVectorizerParams
Construct a new set of parameters
pub fn transform<T: ToString, D: Data<Elem = T>>(
&self,
x: &ArrayBase<D, Ix1>
) -> CsMat<usize>
Given a sequence of n documents, produces a sparse array of size (n, vocabulary_entries) where column j of row i is the number of occurrences of vocabulary entry j in the document of index i. Vocabulary entry j is the string at the j-th position in the vocabulary. If a vocabulary entry was not encountered in a document, then the corresponding cell in the sparse matrix will be set to None.
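The row and column layout described above can be sketched without any dependencies. This is only an illustration under stated assumptions: `count_sparse` and `SparseRow` are hypothetical names, tokens are split on whitespace, and a map per row stands in for the `CsMat<usize>` that the real method returns.

```rust
use std::collections::BTreeMap;

/// One sparse row: column index -> count. Columns whose vocabulary entry
/// does not occur in the document are simply absent from the map.
type SparseRow = BTreeMap<usize, usize>;

/// Hypothetical stand-in for `transform` (not linfa's implementation).
/// `vocabulary[j]` is the entry counted in column j; row i of the result
/// corresponds to `documents[i]`.
fn count_sparse(vocabulary: &[&str], documents: &[&str]) -> Vec<SparseRow> {
    documents
        .iter()
        .map(|document| {
            let mut row = SparseRow::new();
            for token in document.split_whitespace() {
                // Tokens that are not in the vocabulary are ignored.
                if let Some(j) = vocabulary.iter().position(|entry| *entry == token) {
                    *row.entry(j).or_insert(0) += 1;
                }
            }
            row
        })
        .collect()
}
```

In this sketch, looking up a column with `row.get(&j)` returns `None` when entry j never occurred in the document, which mirrors the behaviour described for cells of the sparse matrix.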
pub fn transform_files<P: AsRef<Path>>(
&self,
input: &[P],
encoding: EncodingRef,
trap: DecoderTrap
) -> CsMat<usize>
Given a sequence of n file names, produces a sparse array of size (n, vocabulary_entries) where column j of row i is the number of occurrences of vocabulary entry j in the document contained in the file of index i. Vocabulary entry j is the string at the j-th position in the vocabulary. If a vocabulary entry was not encountered in a document, then the corresponding cell in the sparse matrix will be set to None.
The files will be read using the specified encoding, and any sequence unrecognized by the encoding will be handled according to trap.
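The role of the trap argument can be illustrated with a dependency-free sketch of decoding raw file contents. Everything here is an assumption for illustration: the `Trap` enum and `decode_utf8` are hypothetical names, only UTF-8 is handled, and only a strict and a replacing policy are shown. It does not reproduce the types that `transform_files` actually accepts.

```rust
/// Hypothetical stand-in for a decoder trap: the policy applied when the
/// input contains a byte sequence the encoding does not recognize.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Trap {
    /// Fail on the first invalid sequence.
    Strict,
    /// Substitute U+FFFD (the replacement character) for invalid sequences.
    Replace,
}

/// Decodes raw file contents as UTF-8, the only encoding in this sketch.
fn decode_utf8(bytes: &[u8], trap: Trap) -> Result<String, std::str::Utf8Error> {
    match trap {
        // Strict: surface the decoding error to the caller.
        Trap::Strict => std::str::from_utf8(bytes).map(str::to_owned),
        // Replace: never fails, invalid bytes become U+FFFD.
        Trap::Replace => Ok(String::from_utf8_lossy(bytes).into_owned()),
    }
}
```

The bytes for each file could be obtained with `std::fs::read` and the decoded text then counted like any other document. The choice of policy matters because a replaced character can change which tokens a document appears to contain.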
Trait Implementations
impl Clone for CountVectorizer
fn clone(&self) -> CountVectorizer
Returns a copy of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
Auto Trait Implementations
impl !RefUnwindSafe for CountVectorizer
impl Send for CountVectorizer
impl !Sync for CountVectorizer
impl Unpin for CountVectorizer
impl UnwindSafe for CountVectorizer
Blanket Implementations
impl<T> BorrowMut<T> for T where
T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.