1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
//! COSIATEC: greedy iterative compression using translational equivalence classes (TECs).
use HashSet;
use cratebuild_tecs_from_mtps;
use crate;
/// COSIATEC: a greedy, iterative compression algorithm based on translational equivalence classes (TECs).
///
/// The algorithm repeatedly runs SIATEC on the **remaining uncovered points**, selects the
/// “best” TEC according to the multi‑level comparison rules (compression ratio, compactness,
/// coverage size, pattern size, temporal width, bounding box area), adds it to the result list,
/// and removes its covered points from the remaining set. This process continues until all
/// points are covered. When no pattern can be found, the remaining points are encoded as
/// trivial single‑point TECs (empty translator set).
///
/// # Arguments
/// * `dataset` - A reference to the full set of points `(tick, pitch)`. The algorithm works on
/// a copy of this set and does not modify the original data.
/// * `restrict_dpitch_zero` - If `true`, only translation vectors with zero pitch difference
/// are considered (purely temporal shifts). This restricts patterns to horizontal repetition.
///
/// # Returns
/// A `Vec` of `TranslationalEquivalence` objects that partition the input points (each point
/// belongs to exactly one TEC). Each TEC may have `sub_tecs` if recursion was applied later
/// (not used in this base function).
///
/// # Comparison details
/// When selecting the best TEC on a given iteration, the algorithm uses `tec_sort_key` with
/// the **remaining point set** (not the full dataset) to compute compactness and coverage.
/// This ensures that patterns are evaluated against the points that still need to be encoded.
///
/// # Complexity
/// In the worst case, building all TECs from the remaining points takes `O(k·n³)` time per
/// iteration (`n` = size of remaining set). The total number of iterations is at most the
/// number of points. Practical performance is acceptable for typical music datasets (up to a
/// few thousand points).
///
/// # Examples
/// ```
/// use nbslim::cosiatec_compress;
///
/// let points = vec![(0, 10), (1, 20), (2, 10), (3, 20)];
/// let tecs = cosiatec_compress(&points, true);
/// for tec in tecs {
/// println!("{}", tec.coverage().len());
/// }
/// ```