//! Metrics for use in various ML/DL fields.
//! For now, only a ROUGE-N score based on `split_whitespace` tokenization is provided.
//!
use std::collections::HashMap;
use ;
use ;
pub use crate;
/// Creates n-grams from a list of tokens.
///
/// Given a list of tokens and the desired size `n`, this function generates n-grams,
/// which are contiguous sequences of `n` tokens from the input list.
///
/// ### Arguments
///
/// * `tokens` - A vector of string slices representing individual tokens.
/// * `n` - The size of the n-grams to be created.
///
/// ### Returns
///
/// A `HashMap` where keys are n-grams (represented as vectors of string slices) and values
/// are the counts of each n-gram in the input sequence.
///
/// ### Examples
///
/// ```
/// use std::collections::HashMap;
/// use text_score::rouge::create_ngrams;
///
/// let tokens = vec!["this", "is", "an", "example"];
/// let n = 2;
///
/// let ngrams = create_ngrams(tokens, n);
///
/// // The result may look like: {["this", "is"]: 1, ["is", "an"]: 1, ["an", "example"]: 1}
/// ```
///
/// ### Note
///
/// - The function uses a sliding window approach to iterate through the input `tokens`
/// and create n-grams of the specified size `n`.
/// - The resulting n-grams are stored in a `HashMap`, where each key is an n-gram,
/// and the corresponding value is the count of occurrences of that n-gram in the input sequence.
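
// A minimal sketch of the sliding-window counting described in the note above.
// The signature, the `usize` count type, and the handling of `n == 0` are assumptions
// inferred from the doc comment, not the crate's confirmed implementation.

```rust
use std::collections::HashMap;

/// Hypothetical sketch: counts every contiguous window of `n` tokens.
pub fn create_ngrams<'a>(tokens: Vec<&'a str>, n: usize) -> HashMap<Vec<&'a str>, usize> {
    let mut counts = HashMap::new();
    // `slice::windows` panics on a window size of 0, so return an empty map instead
    // (an assumption; the original behaviour for `n == 0` is not documented here).
    if n == 0 {
        return counts;
    }
    // `windows(n)` yields nothing when there are fewer than `n` tokens.
    for window in tokens.windows(n) {
        *counts.entry(window.to_vec()).or_insert(0) += 1;
    }
    counts
}
```

// With the tokens from the example above and `n = 2`, this sketch produces three
// bigrams, each with a count of 1.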
/// Computes precision, recall, and F1 score based on n-grams.
///
/// Given two HashMaps representing the n-grams of predicted and target sequences,
/// this function calculates precision, recall, and F1 score for the prediction.
///
/// ### Arguments
///
/// * `predicted_ngrams` - A HashMap containing n-grams and their counts for the predicted sequence.
/// * `target_ngrams` - A HashMap containing n-grams and their counts for the target (reference) sequence.
///
/// ### Returns
///
/// A `Score` struct containing precision, recall, and F1 score for the prediction based on n-grams.
///
/// ### Examples
///
/// ```
/// use std::collections::HashMap;
/// use text_score::rouge::{ngram_based_score, Score};
///
/// let predicted_ngrams = HashMap::from([(vec!["this", "is"], 2), (vec!["is", "an"], 1)]);
/// let target_ngrams = HashMap::from([(vec!["this", "is"], 3), (vec!["is", "an"], 2)]);
///
/// let score = ngram_based_score(predicted_ngrams, target_ngrams);
/// println!("Precision: {}", score.precision); // Accessing precision field
/// println!("Recall: {}", score.recall); // Accessing recall field
/// println!("F1 Score: {}", score.f1); // Accessing f1 field
/// ```
///
/// ### Note
///
/// - The function iterates through the target n-grams and computes the intersection count
/// with the predicted n-grams to calculate precision, recall, and F1 score.
/// - Precision and recall are calculated using the standard formulas, and F1 score is computed
/// using the `f1` function defined in the module.
/// - The resulting scores are returned in a `Score` struct.
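
// A minimal sketch of the scoring steps listed in the note above, assuming the usual
// ROUGE-N definitions: the overlap is the clipped count min(target, predicted) summed
// over the target n-grams, precision divides it by the predicted total, and recall
// divides it by the target total. The `Score` field types, the `f1` helper body, and
// the zero-denominator handling are assumptions, not the crate's confirmed code.

```rust
use std::collections::HashMap;

/// Assumed shape of `Score`; the field names follow the doc example above.
#[derive(Debug, Clone, PartialEq)]
pub struct Score {
    pub precision: f64,
    pub recall: f64,
    pub f1: f64,
}

/// Harmonic mean of precision and recall; 0.0 when both are zero.
fn f1(precision: f64, recall: f64) -> f64 {
    if precision + recall == 0.0 {
        0.0
    } else {
        2.0 * precision * recall / (precision + recall)
    }
}

/// Hypothetical sketch of the n-gram overlap scoring.
pub fn ngram_based_score<'a>(
    predicted: HashMap<Vec<&'a str>, usize>,
    target: HashMap<Vec<&'a str>, usize>,
) -> Score {
    // Clipped overlap: an n-gram counts at most as often as it occurs on either side.
    let overlap: usize = target
        .iter()
        .map(|(ngram, &t)| t.min(predicted.get(ngram).copied().unwrap_or(0)))
        .sum();
    let predicted_total: usize = predicted.values().sum();
    let target_total: usize = target.values().sum();

    // Guard against division by zero when a sequence has no n-grams.
    let ratio = |num: usize, den: usize| if den == 0 { 0.0 } else { num as f64 / den as f64 };
    let precision = ratio(overlap, predicted_total);
    let recall = ratio(overlap, target_total);
    Score { precision, recall, f1: f1(precision, recall) }
}
```

// For the maps in the example above, the clipped overlap is min(3, 2) + min(2, 1) = 3,
// so this sketch gives precision 3/3 and recall 3/5.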
/// Computes ROUGE scores based on n-grams for a given input and reference text.
///
/// ROUGE (Recall-Oriented Understudy for Gisting Evaluation) is a metric commonly used
/// in natural language processing to evaluate the quality of text summaries or translations.
/// This function calculates precision, recall, and F1 score based on n-grams for the provided input
/// text and reference text.
///
/// ### Arguments
///
/// * `input` - The input text to be evaluated.
/// * `reference` - The reference text, considered as the ground truth or gold standard.
/// * `n` - The size of n-grams to be used in the evaluation.
///
/// ### Returns
///
/// A `Result` containing a `Score` struct if successful, or an error message if `n` is less than 1.
///
/// ### Examples
///
/// ```
/// use text_score::rouge::{rouge_n, Score};
///
/// let input_text = "This is a sample sentence for evaluation.";
/// let reference_text = "This is a sample sentence for testing.";
/// let n = 2;
///
/// match rouge_n(input_text, reference_text, n) {
///     Ok(score) => {
///         println!("Precision: {}", score.precision); // Accessing precision field
///         println!("Recall: {}", score.recall); // Accessing recall field
///         println!("F1 Score: {}", score.f1); // Accessing f1 field
///     }
///     Err(err) => println!("Error: {}", err),
/// }
/// ```
///
/// ### Note
///
/// - The function checks if the specified `n` is greater than or equal to 1. If not, it returns an error.
/// - The input and reference texts are tokenized into words, and n-grams are created using the `create_ngrams` function.
/// - The n-gram based scores are then calculated using the `ngram_based_score` function.
/// - The resulting scores are returned in a `Score` struct if the operation is successful.
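
// A compact, self-contained sketch of the pipeline in the note above: validate `n`,
// tokenize with `split_whitespace`, count n-grams, then score the overlap. The
// `String` error type, the `usize` type of `n`, the error text, and the private
// helper `count_ngrams` are all assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Assumed shape of `Score`; the field names follow the doc example above.
#[derive(Debug, Clone, PartialEq)]
pub struct Score {
    pub precision: f64,
    pub recall: f64,
    pub f1: f64,
}

/// Hypothetical helper: whitespace tokenization followed by n-gram counting.
fn count_ngrams<'a>(text: &'a str, n: usize) -> HashMap<Vec<&'a str>, usize> {
    let tokens: Vec<&str> = text.split_whitespace().collect();
    let mut counts = HashMap::new();
    for window in tokens.windows(n) {
        *counts.entry(window.to_vec()).or_insert(0) += 1;
    }
    counts
}

/// Hypothetical sketch of ROUGE-N over whitespace tokens.
pub fn rouge_n<'a>(input: &'a str, reference: &'a str, n: usize) -> Result<Score, String> {
    // Reject n < 1 before calling `windows`, which panics on a size of 0.
    if n < 1 {
        return Err("n must be greater than or equal to 1".to_string());
    }
    let predicted = count_ngrams(input, n);
    let target = count_ngrams(reference, n);

    // Clipped overlap between the reference and the input n-grams.
    let overlap: usize = target
        .iter()
        .map(|(ngram, &t)| t.min(predicted.get(ngram).copied().unwrap_or(0)))
        .sum();
    let ratio = |num: usize, den: usize| if den == 0 { 0.0 } else { num as f64 / den as f64 };
    let precision = ratio(overlap, predicted.values().sum());
    let recall = ratio(overlap, target.values().sum());
    let f1 = if precision + recall == 0.0 {
        0.0
    } else {
        2.0 * precision * recall / (precision + recall)
    };
    Ok(Score { precision, recall, f1 })
}
```

// Because tokenization is whitespace-only, punctuation stays attached to words: in the
// example above "evaluation." and "testing." are distinct tokens, so five of the six
// bigrams on each side overlap.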