Struct linfa_preprocessing::count_vectorization::FittedCountVectorizer
pub struct FittedCountVectorizer { /* fields omitted */ }
Counts the occurrences of each vocabulary entry, learned during fitting, in a sequence of documents. Each vocabulary entry is mapped to an integer value that is used to index the count in the result.
Implementations
impl FittedCountVectorizer
pub fn nentries(&self) -> usize
Number of vocabulary entries learned during fitting
pub fn transform<T: ToString, D: Data<Elem = T>>(
&self,
x: &ArrayBase<D, Ix1>
) -> CsMat<usize>
Given a sequence of n documents, produces a sparse matrix of shape (n, vocabulary_entries) in which the entry at row i, column j
is the number of occurrences of vocabulary entry j in the document at index i. Vocabulary entry j is the string
at the j-th position in the vocabulary. If a vocabulary entry does not occur in a document, the corresponding
cell is not stored in the sparse matrix, so looking it up yields None rather than an explicit zero.
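The layout described above can be illustrated with a small standard-library sketch. It is not the linfa implementation: `count_rows` and `SparseRow` are hypothetical names, tokenization is assumed to be a plain whitespace split, and each row is stored as a map from column index to count instead of a `CsMat`.

```rust
use std::collections::BTreeMap;

/// One sparse row: column index -> count. Only non-zero counts are stored.
type SparseRow = BTreeMap<usize, usize>;

/// Hypothetical helper, not part of linfa: counts vocabulary entries per document.
/// Tokenization is a plain whitespace split, which is an assumption of this sketch.
fn count_rows(vocabulary: &[&str], documents: &[&str]) -> Vec<SparseRow> {
    documents
        .iter()
        .map(|doc| {
            let mut row = SparseRow::new();
            for token in doc.split_whitespace() {
                // Column j is the position of the token in the vocabulary.
                if let Some(j) = vocabulary.iter().position(|entry| *entry == token) {
                    *row.entry(j).or_insert(0) += 1;
                }
            }
            row
        })
        .collect()
}
```

With the vocabulary `["apple", "banana", "cherry"]` and the document `"apple banana apple"`, row 0 stores a count of 2 in column 0 and 1 in column 1, and has no stored entry for column 2, so a lookup there returns `None`.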
pub fn transform_files<P: AsRef<Path>>(
&self,
input: &[P],
encoding: EncodingRef,
trap: DecoderTrap
) -> CsMat<usize>
Given a sequence of n file names, produces a sparse matrix of shape (n, vocabulary_entries) in which the entry at row i, column j
is the number of occurrences of vocabulary entry j in the document contained in the file at index i. Vocabulary entry j is the string
at the j-th position in the vocabulary. If a vocabulary entry does not occur in a document, the corresponding
cell is not stored in the sparse matrix, so looking it up yields None rather than an explicit zero.
The files are read using the specified encoding, and any byte sequence that the encoding cannot decode is handled
according to trap.
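To make the role of a decoder trap concrete, here is a rough standard-library analogy for UTF-8 input only. The `Trap` enum, `decode_utf8`, and `read_documents` are hypothetical names for this sketch and do not reproduce the `EncodingRef` or `DecoderTrap` types. The sketch only shows the difference between failing on an undecodable sequence and replacing it.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical stand-in for a decoder trap; only two policies are sketched.
#[derive(Clone, Copy)]
enum Trap {
    /// Fail on the first invalid byte sequence.
    Strict,
    /// Substitute U+FFFD for each invalid sequence.
    Replace,
}

/// Decodes bytes as UTF-8, handling invalid sequences according to `trap`.
fn decode_utf8(bytes: &[u8], trap: Trap) -> io::Result<String> {
    match trap {
        Trap::Strict => String::from_utf8(bytes.to_vec())
            .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e)),
        Trap::Replace => Ok(String::from_utf8_lossy(bytes).into_owned()),
    }
}

/// Reads every file into a string, one document per file, in input order.
fn read_documents<P: AsRef<Path>>(input: &[P], trap: Trap) -> io::Result<Vec<String>> {
    input
        .iter()
        .map(|path| decode_utf8(&fs::read(path)?, trap))
        .collect()
}
```

Under the strict policy a single bad byte rejects the whole document, while the replacing policy keeps it and marks the damaged spot, which may matter when vectorizing a large corpus of mixed quality.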
pub fn vocabulary(&self) -> &Vec<String>
Contains all vocabulary entries, in the same order used by the transform methods.
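Because the vocabulary order matches the column order of the transform output, a slice such as the one returned by `vocabulary()` can be inverted to find the column of a given term. The helper below is a hypothetical sketch using only the standard library; `column_lookup` is not a linfa function.

```rust
use std::collections::HashMap;

/// Hypothetical helper: maps each vocabulary entry to its column index.
fn column_lookup(vocabulary: &[String]) -> HashMap<&str, usize> {
    vocabulary
        .iter()
        .enumerate()
        .map(|(j, entry)| (entry.as_str(), j))
        .collect()
}
```

A term that is absent from the vocabulary simply has no entry in the map, which mirrors the fact that it has no column in the count matrix.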
Auto Trait Implementations
impl RefUnwindSafe for FittedCountVectorizer
impl Send for FittedCountVectorizer
impl Sync for FittedCountVectorizer
impl Unpin for FittedCountVectorizer
impl UnwindSafe for FittedCountVectorizer