//! The shared LID inference-result struct, ported from
//! [`mlx_audio.lid.models.{wav2vec2,ecapa_tdnn}.predict`][lid-predict-wav2vec2]
//! [(ecapa)][lid-predict-ecapa].
//!
//! Both LID architectures that mlx-audio ships return the same shape
//! from `Model.predict(audio, top_k=…)`: a
//! `List[Tuple[str, float]]` of `(language_code, probability)` pairs,
//! sorted by probability descending. mlxrs models that shape with the
//! typed [`LidPrediction`] and [`LidOutput`], so every
//! per-architecture LID model can return a [`LidOutput`] that
//! downstream consumers read uniformly.
//!
//! [lid-predict-wav2vec2]: https://github.com/Blaizzy/mlx-audio/blob/main/mlx_audio/lid/models/wav2vec2/wav2vec_lid.py#L101-L148
//! [lid-predict-ecapa]: https://github.com/Blaizzy/mlx-audio/blob/main/mlx_audio/lid/models/ecapa_tdnn/ecapa_tdnn.py#L135-L163
/// One `(language_code, probability)` prediction in a [`LidOutput`]:
/// a port of the `Tuple[str, float]` element returned by mlx-audio's
/// `predict(…) -> List[Tuple[str, float]]`
/// ([wav2vec_lid.py:101-148][lid-predict-wav2vec2],
/// [ecapa_tdnn.py:135-163][lid-predict-ecapa]).
///
/// `language_code` is the model's `id2label[idx]` lookup result (e.g.
/// `"eng"`, `"fra"`), or the `"LABEL_<idx>"` fallback when the model
/// config has no `id2label` entry for that index, for example because
/// it carries no `id2label` map at all (this mirrors mlx-audio's
/// `id2label.get(str(idx), f"LABEL_{idx}")`). `probability` is a softmax
/// score in `[0, 1]`.
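///
/// The fallback can be sketched as follows. This is a minimal
/// illustration, not the crate's implementation: `label_for` is a
/// hypothetical helper, and the `HashMap<String, String>` keyed by the
/// stringified index is an assumption modeled on the Python expression
/// above.
///
/// ```rust
/// use std::collections::HashMap;
///
/// // Hypothetical helper (not part of mlxrs): resolve a class index to
/// // a language code, falling back to "LABEL_<idx>" when the map has
/// // no entry for that index.
/// fn label_for(id2label: &HashMap<String, String>, idx: usize) -> String {
///     id2label
///         .get(&idx.to_string())
///         .cloned()
///         .unwrap_or_else(|| format!("LABEL_{}", idx))
/// }
/// ```
///
/// With a map that holds only `"0" -> "eng"`, index 0 resolves to
/// `"eng"` and index 7 resolves to `"LABEL_7"`.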
///
/// [lid-predict-wav2vec2]: https://github.com/Blaizzy/mlx-audio/blob/main/mlx_audio/lid/models/wav2vec2/wav2vec_lid.py#L101-L148
/// [lid-predict-ecapa]: https://github.com/Blaizzy/mlx-audio/blob/main/mlx_audio/lid/models/ecapa_tdnn/ecapa_tdnn.py#L135-L163
/// The result of one LID inference pass — port of mlx-audio's
/// `Model.predict(audio, top_k=…)` return shape.
///
/// mlx-audio returns a raw `List[Tuple[str, float]]`; mlxrs wraps the
/// list in a struct so that a richer envelope (e.g. the input sample
/// rate, model id, …) can be added later without breaking call sites.
/// The list is **already sorted by probability descending**, matching
/// mlx-audio's
/// `sorted(enumerate(probs_list), key=lambda x: x[1], reverse=True)`,
/// so the top-1 prediction is at `predictions[0]`.
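///
/// A minimal sketch of that ordering, assuming the probabilities
/// arrive as a plain `&[f32]` and labels are resolved separately.
/// `top_k_indices` is a hypothetical helper, not part of mlxrs:
///
/// ```rust
/// // Hypothetical helper: pair each class index with its probability,
/// // sort by probability descending, and keep the first `top_k`.
/// fn top_k_indices(probs: &[f32], top_k: usize) -> Vec<(usize, f32)> {
///     let mut ranked: Vec<(usize, f32)> =
///         probs.iter().copied().enumerate().collect();
///     // `sort_by` is stable, so ties keep their index order, as with
///     // Python's `sorted(..., reverse=True)`.
///     ranked.sort_by(|a, b| b.1.total_cmp(&a.1));
///     ranked.truncate(top_k);
///     ranked
/// }
/// ```
///
/// Sorting before truncating is what puts the most likely class first
/// in the result, whatever `top_k` is.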
///
/// mlx-audio defaults to `top_k=5`, but the list length here is
/// governed by the caller: mlxrs imposes no top-k cap on this type, and
/// the per-architecture loader decides how many entries to return.
///
/// Both the typed [`LidPrediction`] entries and the wrapping
/// [`LidOutput`] derive serde's `Serialize` and `Deserialize`, so a
/// result can be persisted to disk (a common use is saving the top-k
/// as a JSON sidecar).