Count vectorizer: convert text documents to a term-count matrix.
Tokenizes documents by splitting on non-alphanumeric characters, builds a
vocabulary, and produces a term-count matrix of shape (n_docs, n_vocab).
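The pipeline described above can be sketched in a few standalone functions. This is a minimal illustration, not the crate's implementation: the names `tokenize`, `build_vocab`, and `count_matrix` are hypothetical, the alphabetical column ordering is an assumption, and no case folding is applied because the description does not mention any.

```rust
use std::collections::BTreeMap;

/// Split a document on non-alphanumeric characters, dropping empty tokens.
fn tokenize(doc: &str) -> Vec<String> {
    doc.split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(|t| t.to_string())
        .collect()
}

/// Build a vocabulary mapping each distinct token to a column index.
/// Assumption: columns are ordered alphabetically (BTreeMap key order).
fn build_vocab(docs: &[&str]) -> BTreeMap<String, usize> {
    let mut vocab = BTreeMap::new();
    for doc in docs {
        for tok in tokenize(doc) {
            vocab.entry(tok).or_insert(0usize);
        }
    }
    // Assign consecutive indices in sorted key order.
    for (i, idx) in vocab.values_mut().enumerate() {
        *idx = i;
    }
    vocab
}

/// Produce a dense term-count matrix of shape (n_docs, n_vocab).
/// Tokens missing from the vocabulary are ignored.
fn count_matrix(docs: &[&str], vocab: &BTreeMap<String, usize>) -> Vec<Vec<u32>> {
    docs.iter()
        .map(|doc| {
            let mut row = vec![0u32; vocab.len()];
            for tok in tokenize(doc) {
                if let Some(&j) = vocab.get(&tok) {
                    row[j] += 1;
                }
            }
            row
        })
        .collect()
}
```

Building the vocabulary and counting are kept as separate steps here, which mirrors the unfitted/fitted split in the structs listed below: the vocabulary is learned once and can then be applied to any set of documents.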
Structs
- CountVectorizer - An unfitted count vectorizer.
- FittedCountVectorizer - A fitted count vectorizer holding the learned vocabulary.