bayespam/lib.rs
//! # bayespam
//!
//! A simple Bayesian spam classifier.
//!
//! ## About
//!
//! Bayespam is inspired by [Naive Bayes classifiers](https://en.wikipedia.org/wiki/Naive_Bayes_spam_filtering), a popular statistical technique for e-mail filtering.
//!
//! The message to be identified is split into individual words, also called tokens.
//! These tokens are compared against the whole corpus of messages (spam or not) to determine how frequently each token appears in both categories.
//!
//! A probabilistic formula then gives the probability that the message is spam.
//! When that probability is high enough, the classifier categorizes the message as likely spam; otherwise, as likely ham.
//! The probability threshold is set to 0.8 by default.
//!
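//! As a rough illustration of the steps described above, the sketch below tokenizes a message,
//! rates each token by how often it was seen in spam versus ham, and combines the ratings into one probability.
//! It is not the crate's actual implementation: the `Counts` type, the helper names, the neutral rating of 0.5
//! for unknown tokens, and the 0.01 to 0.99 clamp are all assumptions made for this example.
//! Only the 0.8 threshold comes from the description above.
//!
//! ```rust
//! use std::collections::HashMap;
//!
//! /// Hypothetical model layout: token -> (occurrences in spam, occurrences in ham).
//! type Counts = HashMap<String, (u32, u32)>;
//!
//! /// Default threshold stated in this documentation.
//! const SPAM_THRESHOLD: f64 = 0.8;
//! /// Assumed neutral rating for tokens the model has never seen.
//! const UNKNOWN_RATING: f64 = 0.5;
//!
//! /// Split a message into lowercase alphanumeric tokens.
//! fn tokenize(msg: &str) -> Vec<String> {
//!     msg.split(|c: char| !c.is_alphanumeric())
//!         .filter(|t| !t.is_empty())
//!         .map(|t| t.to_lowercase())
//!         .collect()
//! }
//!
//! /// Rate one token as the share of its occurrences seen in spam,
//! /// clamped so a single token can never force the result to exactly 0 or 1.
//! fn rate(counts: &Counts, token: &str) -> f64 {
//!     match counts.get(token) {
//!         Some(&(spam, ham)) if spam + ham > 0 => {
//!             (f64::from(spam) / f64::from(spam + ham)).clamp(0.01, 0.99)
//!         }
//!         _ => UNKNOWN_RATING,
//!     }
//! }
//!
//! /// Combine the token ratings r into p = prod(r) / (prod(r) + prod(1 - r)).
//! fn score(counts: &Counts, msg: &str) -> f64 {
//!     let (mut p, mut q) = (1.0_f64, 1.0_f64);
//!     for token in tokenize(msg) {
//!         let r = rate(counts, &token);
//!         p *= r;
//!         q *= 1.0 - r;
//!     }
//!     p / (p + q)
//! }
//!
//! /// A message is flagged as spam when its score exceeds the threshold.
//! fn is_spam(counts: &Counts, msg: &str) -> bool {
//!     score(counts, msg) > SPAM_THRESHOLD
//! }
//! ```
//!
//! In this sketch, a message made only of unknown tokens scores 0.5 and is therefore treated as ham.
//! Multiplying many small ratings can underflow on very long messages; summing logarithms is a common way to avoid that.
//!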
//! ## Usage
//!
//! Add to your `Cargo.toml` manifest:
//!
//! ```toml
//! [dependencies]
//! bayespam = "1.1.0"
//! ```
//!
//! ### Use a pre-trained model
//!
//! Add a `model.json` file to your **package root**.
//! Then, you can use it to **score** and **identify** messages:
//!
//! ```
//! extern crate bayespam;
//!
//! use bayespam::classifier;
//!
//! fn main() -> Result<(), std::io::Error> {
//!     // Identify a typical spam message
//!     let spam = "Lose up to 19% weight. Special promotion on our new weightloss.";
//!     let is_spam = classifier::identify(spam)?;
//!     assert!(is_spam);
//!
//!     // Identify a typical ham message
//!     let ham = "Hi Bob, can you send me your machine learning homework?";
//!     let is_spam = classifier::identify(ham)?;
//!     assert!(!is_spam);
//!
//!     Ok(())
//! }
//! ```
//!
//! ### Train your own model
//!
//! You can train a new model from scratch:
//!
//! ```
//! extern crate bayespam;
//!
//! use bayespam::classifier::Classifier;
//!
//! fn main() {
//!     // Create a new classifier with an empty model
//!     let mut classifier = Classifier::new();
//!
//!     // Train the classifier with a new spam example
//!     let spam = "Don't forget our special promotion: -30% on men shoes, only today!";
//!     classifier.train_spam(spam);
//!
//!     // Train the classifier with a new ham example
//!     let ham = "Hi Bob, don't forget our meeting today at 4pm.";
//!     classifier.train_ham(ham);
//!
//!     // Identify a typical spam message
//!     let spam = "Lose up to 19% weight. Special promotion on our new weightloss.";
//!     let is_spam = classifier.identify(spam);
//!     assert!(is_spam);
//!
//!     // Identify a typical ham message
//!     let ham = "Hi Bob, can you send me your machine learning homework?";
//!     let is_spam = classifier.identify(ham);
//!     assert!(!is_spam);
//! }
//! ```

pub mod classifier;