//! Utility functions for converting between note events and points, merging TECs,
//! computing compression statistics, and reconstructing original data.
use std::collections::{HashMap, HashSet};
use crate::TranslationalEquivalence;
const PITCH_BITS: u32 = 14;
const OFFSET: i16 = 1200;
/// Converts a list of MIDI‑like note events into 2D points for compression,
/// along with a mapping from each point back to **all** original note data
/// that share that exact point (i.e., a multi‑set).
///
/// The y‑coordinate is encoded as:
/// `(instrument << PITCH_BITS) | (pitch_in_cents + OFFSET)`
/// This packs instrument ID and pitch offset into a single `u32` value.
/// The x‑coordinate is simply the tick (time) value.
///
/// # Arguments
/// * `notes` - A slice of tuples representing note events:
/// `(tick, layer, instrument, key, velocity, panning, pitch)`
///
/// # Returns
/// A tuple `(points, mapping)` where:
/// - `points`: `Vec<(u32, u32)>` – the generated 2D points `(tick, encoded_y)`,
/// with possible duplicates.
/// - `mapping`: `HashMap<(u32, u32), Vec<NoteData>>` mapping a point to all
/// original notes that produced it, in the order encountered.
///
/// # Examples
/// ```
/// use nbslim::utils::notes_to_points;
///
/// let notes = vec![(0, 0, 1, 60, 100, 0, 0)];
/// let (points, mapping) = notes_to_points(&notes);
/// assert_eq!(points, vec![(0, 23584)]);
/// assert_eq!(mapping.get(&(0, 23584)).unwrap()[0], (0, 0, 1, 60, 100, 0, 0));
/// ```
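// A minimal sketch of the conversion, reconstructed from the documented
// behavior above; it is not the canonical implementation. The `NoteData`
// alias, its concrete integer widths, and the key-to-cents conversion
// (`key * 100 + pitch`) are assumptions chosen so the doc-test example holds
// (instrument 1, key 60, pitch 0 -> encoded y 23584).
pub type NoteData = (u32, i16, i16, i16, i16, i16, i16);

pub fn notes_to_points(
    notes: &[NoteData],
) -> (Vec<(u32, u32)>, HashMap<(u32, u32), Vec<NoteData>>) {
    let mut points = Vec::with_capacity(notes.len());
    let mut mapping: HashMap<(u32, u32), Vec<NoteData>> = HashMap::new();
    for &note in notes {
        let (tick, _layer, instrument, key, _velocity, _panning, pitch) = note;
        // Pitch in cents, shifted by OFFSET so the value is non-negative.
        let cents = key as i32 * 100 + pitch as i32 + OFFSET as i32;
        // Pack instrument and shifted pitch into a single y value.
        let y = ((instrument as u32) << PITCH_BITS) | cents as u32;
        let point = (tick, y);
        points.push(point);
        // Remember every note that produced this point, in encounter order.
        mapping.entry(point).or_default().push(note);
    }
    (points, mapping)
}
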
/// Merges all TECs that satisfy a predicate into a single TEC containing all their coverage points.
///
/// The merged TEC has no translators (i.e., it is not a true TEC) and is used only for
/// reconstruction. It helps avoid many tiny TECs that would each occupy a small layer block.
///
/// # Arguments
/// * `tecs` - A vector of TECs to process.
/// * `filter` - A function that returns `true` for TECs that should be merged.
///
/// # Returns
/// A new vector of TECs where the filtered ones have been replaced by a single merged TEC
/// (if any were merged), and all others are kept unchanged.
///
/// # Examples
/// ```
/// # use nbslim::TranslationalEquivalence;
/// # use nbslim::utils::merge_tecs;
/// # use std::collections::HashSet;
/// let pattern = vec![(0, 0)];
/// let tec = TranslationalEquivalence::new(pattern, HashSet::new(), None);
/// let merged = merge_tecs(vec![tec], |t| t.coverage().len() == 1);
/// assert_eq!(merged.len(), 1);
/// ```
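// A hedged sketch of the merge, based only on the behavior documented above.
// It assumes `coverage()` returns an owned collection of the `(u32, u32)`
// points a TEC covers (the doc-test calls `.len()` on it) and that
// `TranslationalEquivalence::new` takes the same `(pattern, translators, sub)`
// arguments used in the doc-test.
pub fn merge_tecs<F>(
    tecs: Vec<TranslationalEquivalence>,
    filter: F,
) -> Vec<TranslationalEquivalence>
where
    F: Fn(&TranslationalEquivalence) -> bool,
{
    let mut kept = Vec::new();
    let mut merged_points: Vec<(u32, u32)> = Vec::new();
    for tec in tecs {
        if filter(&tec) {
            // Fold every point this TEC covers into one shared pattern.
            merged_points.extend(tec.coverage());
        } else {
            kept.push(tec);
        }
    }
    if !merged_points.is_empty() {
        // The merged TEC carries no translators: it exists purely so the
        // reconstruction step sees these points as a single block.
        kept.push(TranslationalEquivalence::new(
            merged_points,
            HashSet::new(),
            None,
        ));
    }
    kept
}
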
/// Calculates compression statistics for a list of TECs.
///
/// # Returns
/// A tuple `(original_count, encoded_units, compression_ratio)` where:
/// - `original_count` is the total number of points in the original dataset.
/// - `encoded_units` is the number of units used in the compressed representation
/// (pattern points + translators, or recursively the sum of sub‑TEC units).
/// - `compression_ratio` is `original_count / encoded_units`. A ratio greater than 1
/// indicates compression.
///
/// # Examples
/// ```
/// # use nbslim::TranslationalEquivalence;
/// # use nbslim::utils::compression_stats;
/// # use std::collections::HashSet;
/// let points = vec![(0, 0), (1, 0)];
/// let tecs = vec![TranslationalEquivalence::new(vec![(0, 0)], HashSet::new(), None)];
/// let (orig, enc, ratio) = compression_stats(&tecs, &points);
/// assert_eq!(orig, 2);
/// ```
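// A simplified sketch of the statistics, for illustration only. The accessor
// names `pattern()` and `translators()` are hypothetical (they mirror the
// constructor arguments shown in the doc-test), and the recursive sub-TEC
// case described above is deliberately left out.
pub fn compression_stats(
    tecs: &[TranslationalEquivalence],
    points: &[(u32, u32)],
) -> (usize, usize, f64) {
    let original_count = points.len();
    // Each TEC costs its pattern points plus one unit per translator.
    let encoded_units: usize = tecs
        .iter()
        .map(|t| t.pattern().len() + t.translators().len())
        .sum();
    let ratio = if encoded_units == 0 {
        0.0
    } else {
        original_count as f64 / encoded_units as f64
    };
    (original_count, encoded_units, ratio)
}
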
/// Reconstructs the original set of points from a list of TECs, then maps
/// each point back to its original note data using a pre‑built mapping.
///
/// The reconstruction first computes the coverage (pattern + all translated copies)
/// of each TEC, merges them, and returns the **sorted unique** points.
/// Then, for each such point, it retrieves all notes associated with it
/// from the mapping (in the order they were originally inserted) and flattens
/// them into the result vector.
///
/// **Important:** This function assumes that the total number of times a point
/// appears in the compressed representation is exactly the number of notes
/// stored in the mapping for that point. Because the compression process does
/// not preserve duplicate points, the final note order is deterministic but
/// may not match the original order across different points. For exact order
/// preservation, you must maintain an auxiliary index sequence.
///
/// # Arguments
/// * `tecs` - A vector of `TranslationalEquivalence` objects, typically produced
/// by a compression algorithm.
/// * `mapping` - The mapping from a point to all original notes that produced it,
/// as returned by `notes_to_points`.
///
/// # Returns
/// A `Vec` of original note tuples, in the order: first all notes belonging to
/// the smallest `(tick, y)` point (sorted), then the next point, etc. Notes that
/// shared the exact same point are returned in the insertion order recorded in
/// the mapping.
///
/// # Examples
/// ```
/// # use nbslim::TranslationalEquivalence;
/// # use nbslim::utils::{notes_to_points, points_to_notes};
/// # use std::collections::HashSet;
/// let notes = vec![(0, 0, 1, 60, 100, 0, 0)];
/// let (points, mapping) = notes_to_points(&notes);
/// let tecs = vec![TranslationalEquivalence::new(points.clone(), HashSet::new(), None)];
/// let reconstructed = points_to_notes(&tecs, &mapping);
/// assert_eq!(reconstructed, notes);
/// ```
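// A hedged sketch of the reconstruction described above. As in `merge_tecs`,
// it assumes `coverage()` returns the full set of points (pattern plus all
// translated copies) that a TEC covers.
pub fn points_to_notes(
    tecs: &[TranslationalEquivalence],
    mapping: &HashMap<(u32, u32), Vec<NoteData>>,
) -> Vec<NoteData> {
    // Gather every covered point, then sort and deduplicate so the output
    // order is deterministic: smallest (tick, y) first.
    let mut points: Vec<(u32, u32)> = tecs.iter().flat_map(|t| t.coverage()).collect();
    points.sort_unstable();
    points.dedup();
    // For each unique point, emit all notes recorded for it, in the insertion
    // order preserved by `notes_to_points`.
    points
        .into_iter()
        .flat_map(|p| mapping.get(&p).cloned().unwrap_or_default())
        .collect()
}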