Struct rammer::BagOfWords
pub struct BagOfWords { /* fields omitted */ }
A BagOfWords, also referred to as a bow, is a frequency map of words. Read more about the BagOfWords model here: BagOfWords Wikipedia. BagOfWords works with Unicode words, where a word is the text between two UAX#29 word boundaries. BagOfWords is serializable using one of the serde serialization crates.
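As a rough illustration of what a frequency map of words looks like, the sketch below counts words with a plain HashMap. The function name count_words is hypothetical, the split on non-alphanumeric characters is only an approximation of UAX#29 word boundaries, and the uppercasing mirrors the keys shown in the example comments on this page. None of it is taken from rammer's source.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a bag of words: word -> occurrence count.
/// This is not rammer's actual internal representation.
fn count_words(text: &str) -> HashMap<String, u32> {
    let mut counts = HashMap::new();
    // Assumption: splitting on non-alphanumeric characters roughly
    // approximates UAX#29 word boundaries for simple input.
    for word in text
        .split(|c: char| !c.is_alphanumeric())
        .filter(|w| !w.is_empty())
    {
        // Uppercase each word so that counting is case-insensitive.
        *counts.entry(word.to_uppercase()).or_insert(0) += 1;
    }
    counts
}
```

Under these assumptions, count_words("hello world WOrLD") yields a map with HELLO counted once and WORLD counted twice.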
use rammer::BagOfWords;
use serde_json;
let singly_trained_bow = BagOfWords::from_file("test_resources/test_data/unicode_and_ascii.txt").expect("File not found");
let big_bow = BagOfWords::from_folder("data/train/ham").expect("Folder not found");
let com_bow = singly_trained_bow.combine(big_bow);
Implementations
Return a new BagOfWords with an empty Frequency Map.
let empty_bow = BagOfWords::new();
Create a BagOfWords from a text file. This file should already be known to be ham or spam. The text file will be the basis of a new HSModel’s Ham/Spam BagOfWords.
let spam_bow = BagOfWords::from_file("test_resources/test_data/unicode_and_ascii.txt").unwrap();
Create a BagOfWords from a folder containing either spam training text files or ham training text files.
let spam_bow = BagOfWords::from_folder("data/train/spam").unwrap();
Combines two BagOfWords into a new BagOfWords. Frequencies of words found in both bags are additive. This operation is commutative and associative. These properties can be used to dynamically grow your training BagOfWords.
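The additive merge described above can be sketched over a plain HashMap. The Counts alias and the helpers combine and counts_of are hypothetical names for this illustration only; rammer's fields are private and its implementation may differ.

```rust
use std::collections::HashMap;

/// Hypothetical frequency map type; rammer's real fields are private.
type Counts = HashMap<String, u32>;

/// Merge two maps by adding the counts of words found in both.
fn combine(mut left: Counts, right: Counts) -> Counts {
    for (word, count) in right {
        *left.entry(word).or_insert(0) += count;
    }
    left
}

/// Hypothetical helper for building small maps by hand.
fn counts_of(pairs: &[(&str, u32)]) -> Counts {
    pairs.iter().map(|(w, c)| (w.to_string(), *c)).collect()
}
```

Because each count is a plain integer sum, the order of the operands and the grouping of repeated merges do not change the result, which is what makes it safe to fold new training data in incrementally.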
let ham_bow_1 = BagOfWords::from("Hello there world"); // Creates: {HELLO: 1, THERE: 1, WORLD: 1}
let ham_bow_2 = BagOfWords::from("howdy there guy"); // Creates: {HOWDY: 1, THERE: 1, GUY: 1}
let com_bow = ham_bow_1.combine(ham_bow_2); // Combines to: {HELLO: 1, THERE: 2, HOWDY: 1, ...}
Get the sum of all the Counts in a BagOfWords. Used internally for frequency calculations.
ham_bow.total_word_count(); // returns the sum of all Counts
Calculates the Frequency of a word in the BagOfWords as count_of_a_word / total_word_count. Returns None if the word slice passed contains multiple words.
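The calculation above can be sketched as follows. This is a hypothetical free function over a plain HashMap, not rammer's method. The f64 return type, the whitespace check for "multiple words", and the Some(0.0) result for an unseen word or an empty map are all assumptions of this sketch.

```rust
use std::collections::HashMap;

/// Hypothetical relative-frequency lookup over a word -> count map.
/// Returns None unless `word` is exactly one whitespace-separated word.
fn word_frequency(counts: &HashMap<String, u32>, word: &str) -> Option<f64> {
    // Assumption: "multiple words" is detected by whitespace splitting.
    if word.split_whitespace().count() != 1 {
        return None;
    }
    // The total word count is the sum of every stored count.
    let total: u32 = counts.values().sum();
    if total == 0 {
        // Assumption: an empty map reports a frequency of zero.
        return Some(0.0);
    }
    // Keys are assumed to be stored uppercased, as in the example comments.
    let count = counts
        .get(&word.trim().to_uppercase())
        .copied()
        .unwrap_or(0);
    Some(f64::from(count) / f64::from(total))
}
```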
let ham_bow = BagOfWords::from("hello there how are you");
ham_bow.word_frequency("hello"); // returns 0.2
ham_bow.word_frequency("hello there"); // returns None
Trait Implementations
fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error> where
__D: Deserializer<'de>,
Deserialize this value from the given Serde deserializer.
Converts a &str to a bag of words. To create BagOfWords models, consider using from_file or from_folder instead.
let bow = BagOfWords::from("hello world WOrLD"); // creates {HELLO: 1, WORLD: 2}
Performs the conversion.
Use .collect() over an iterator of BagOfWords to additively combine them with combine.
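One way such a collect could work is a FromIterator implementation that folds every item into an empty bag. The Bag type and its from_words and count helpers below are hypothetical stand-ins written for this sketch; they are not rammer's definitions.

```rust
use std::collections::HashMap;
use std::iter::FromIterator;

/// Hypothetical stand-in type; not rammer's real definition.
#[derive(Debug, Default, PartialEq)]
struct Bag {
    counts: HashMap<String, u32>,
}

impl Bag {
    /// Build a bag from already-separated words, uppercasing each one.
    fn from_words(words: &[&str]) -> Bag {
        let mut bag = Bag::default();
        for w in words {
            *bag.counts.entry(w.to_uppercase()).or_insert(0) += 1;
        }
        bag
    }

    /// Additively merge another bag into this one.
    fn combine(mut self, other: Bag) -> Bag {
        for (word, count) in other.counts {
            *self.counts.entry(word).or_insert(0) += count;
        }
        self
    }

    /// Count stored for `word`, or 0 if it was never seen.
    fn count(&self, word: &str) -> u32 {
        self.counts.get(word).copied().unwrap_or(0)
    }
}

impl FromIterator<Bag> for Bag {
    fn from_iter<I: IntoIterator<Item = Bag>>(iter: I) -> Self {
        // Start from an empty bag and fold every item in with combine.
        iter.into_iter().fold(Bag::default(), Bag::combine)
    }
}
```

With this in place, a Vec of bags can be collected into a single bag, and empty bags in the sequence leave the result unchanged.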
let bow: BagOfWords = vec![
    BagOfWords::from("hi"),
    BagOfWords::new(),
    BagOfWords::from("Big sale!"),
]
.into_iter()
.collect();
Creates a value from an iterator.
Use .collect() over a parallel iterator of BagOfWords to additively combine them with combine. Use the rayon crate to make .into_par_iter() available.
use rayon::prelude::*;
let bow: BagOfWords = vec![
    BagOfWords::from("hi"),
    BagOfWords::new(),
    BagOfWords::from("Big sale!"),
]
.into_par_iter()
.collect();
Creates an instance of the collection from the parallel iterator par_iter.
This method tests for self and other values to be equal, and is used by ==.
This method tests for !=.
Auto Trait Implementations
impl RefUnwindSafe for BagOfWords
impl Send for BagOfWords
impl Sync for BagOfWords
impl Unpin for BagOfWords
impl UnwindSafe for BagOfWords
Blanket Implementations
Mutably borrows from an owned value.