//! SILK speech codec: LPC-based frame encode/decode for Opus's speech mode.
//!
//! ## What
//!
//! Implements one encode/decode cycle for a single Opus SILK audio frame using
//! the primitives from [`crate::codecs::opus::lpc`]:
//!
//! 1. **Encode** — LPC analysis → prediction residual → gain normalisation →
//! 16-bit residual quantisation.
//! 2. **Decode** — 16-bit dequantisation → scale by gain → LPC synthesis.
//!
//! ## Why
//!
//! SILK exploits the predictability of voiced speech: the LPC predictor removes
//! the spectral envelope, leaving a spectrally flat residual that is far smaller
//! in energy than the original. Quantising the residual at high resolution
//! (16 bits) achieves very high SNR for speech at low bitrates.
//!
//! For generic audio (music, noise) the LPC provides little prediction gain and
//! [`crate::codecs::opus::celt`] should be preferred via the auto-detection in
//! [`crate::codecs::opus::mode::detect_mode`].
//!
//! ## Sketch limitations
//!
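//! - [`silk_encode_frame`] and [`silk_decode_frame`] process each frame
//! independently with **zero initial LPC state**, which can cause artefacts at
//! frame boundaries. Use [`silk_encode_frame_stateful`] and
//! [`silk_decode_frame_stateful`] to carry the LPC filter state across frames.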
//! - LPC coefficients are stored as `f32` values. Real SILK transmits Line
//! Spectral Frequency (LSF) parameters quantised to a codebook; the IO layer
//! (`audio_samples_io`) is responsible for that packing.
use crate::;
use ;
// ── Constants ─────────────────────────────────────────────────────────────────
/// Scale factor for 16-bit residual quantisation.
///
/// The normalised residual (in `[−1, 1]`) is multiplied by this constant before
/// rounding to `i16`, and divided by the same constant during dequantisation.
const SILK_RESIDUAL_SCALE: f32 = 32_767.0;
/// Minimum allowed gain value, used to prevent division by zero for silent frames.
///
/// Corresponds to a residual peak amplitude of 10⁻⁸ (linear), which is well below
/// the noise floor of any practical audio system.
const MIN_GAIN_THRESHOLD: f32 = 1e-8;
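// Illustrative sketch (not the crate's actual code): how the two constants
// above could interact during gain normalisation and 16-bit quantisation.
// `quantise_residual` and `dequantise_residual` are hypothetical helper names;
// the constants are repeated so the snippet stands alone.

```rust
const SILK_RESIDUAL_SCALE: f32 = 32_767.0;
const MIN_GAIN_THRESHOLD: f32 = 1e-8;

/// Peak-normalises `residual` and rounds it to `i16` codes.
/// Returns `(gain, codes)`.
fn quantise_residual(residual: &[f32]) -> (f32, Vec<i16>) {
    // Peak magnitude of the residual.
    let peak = residual.iter().fold(0.0_f32, |m, &e| m.max(e.abs()));
    // Clamp the gain so a silent frame never divides by zero.
    let gain = peak.max(MIN_GAIN_THRESHOLD);
    let codes = residual
        .iter()
        .map(|&e| (e / gain * SILK_RESIDUAL_SCALE).round() as i16)
        .collect();
    (gain, codes)
}

/// Inverse mapping: `e_hat[n] = q[n] / 32767 × gain`.
fn dequantise_residual(gain: f32, codes: &[i16]) -> Vec<f32> {
    codes
        .iter()
        .map(|&q| f32::from(q) / SILK_RESIDUAL_SCALE * gain)
        .collect()
}
```

// Because |e[n]| never exceeds the gain, the normalised value stays inside
// [-1, 1] and the scaled value fits in an `i16` without saturation.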
// ── SilkState ─────────────────────────────────────────────────────────────────
/// Cross-frame LPC filter memory for the SILK codec.
///
/// Initialise with [`SilkState::default`] (all zeros) at the start of a
/// continuous signal. Pass the same `SilkState` to every successive
/// [`silk_encode_frame_stateful`] / [`silk_decode_frame_stateful`] call to
/// eliminate boundary artefacts between frames.
///
/// The encoder and decoder each maintain independent history buffers; the
/// encoder's history tracks the **input** samples, the decoder's history tracks
/// the **reconstructed output** samples.
// ── SilkEncodedFrame ──────────────────────────────────────────────────────────
/// One SILK-encoded audio frame.
///
/// Stores the LPC coefficients, a 16-bit quantised prediction residual, and the
/// gain used to normalise the residual before quantisation. This struct is
/// self-contained: the decoder needs no external side-channel information.
///
/// The decoded frame length equals `residual_quantized.len()`.
///
/// ## Round-trip quality
///
/// The only error in the encode/decode round-trip is residual quantisation.
/// With `gain = max(|e[n]|)`, the per-sample error in the residual is bounded by
/// `gain / 32767` (round-to-nearest keeps it within half of that step).
/// For a 440 Hz sine at amplitude 0.5:
///
/// - The LPC residual amplitude is close to floating-point noise (≈ 10⁻⁵).
/// - Gain ≈ 10⁻⁵, so the error bound is ≈ 3 × 10⁻¹⁰.
/// - Expected SNR > 50 dB.
///
/// For white noise the LPC provides no prediction gain (residual ≈ input),
/// but 16-bit quantisation still gives ≈ 90 dB dynamic range.
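// Illustrative sketch: one way the SNR figures quoted above could be checked
// in a test. `snr_db` is a hypothetical helper, not part of this module. It
// compares a reference frame against its decoded reconstruction.

```rust
/// Signal-to-noise ratio in dB between `reference` and `reconstructed`.
/// Returns `f64::INFINITY` when the two slices are identical.
fn snr_db(reference: &[f32], reconstructed: &[f32]) -> f64 {
    assert_eq!(reference.len(), reconstructed.len());
    let mut signal = 0.0_f64;
    let mut noise = 0.0_f64;
    for (&r, &y) in reference.iter().zip(reconstructed) {
        let r = f64::from(r);
        let d = r - f64::from(y);
        signal += r * r; // energy of the reference
        noise += d * d; // energy of the reconstruction error
    }
    if noise == 0.0 {
        f64::INFINITY
    } else {
        10.0 * (signal / noise).log10()
    }
}
```

// Accumulating in `f64` keeps the measurement itself from being limited by
// `f32` rounding when the error is many orders of magnitude below the signal.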
// ── silk_encode_frame ─────────────────────────────────────────────────────────
/// Encodes a single SILK audio frame.
///
/// Steps:
/// 1. Compute an LPC predictor of order [`SILK_LPC_ORDER`] (or lower for short frames).
/// 2. Apply the analysis filter to obtain the prediction residual.
/// 3. Compute `gain = max(|e[n]|)` and normalise: `e_norm[n] = e[n] / gain`.
/// 4. Quantise the normalised residual to 16-bit integers:
/// `q[n] = round(e_norm[n] × 32767)`.
///
/// # Arguments
/// - `samples` – PCM samples for the frame (f32, any amplitude range).
///
/// # Errors
/// Returns [`AudioSampleError::Parameter`] if `samples` is empty.
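// Illustrative sketch of steps 1 and 2 above. It assumes the autocorrelation
// method with a Levinson-Durbin recursion, which may differ from what
// `crate::codecs::opus::lpc` actually does. All function names here are
// hypothetical. The sign convention follows the decoder formula
// `y[n] = e[n] − Σ a[k]·y[n−1−k]`, so the analysis filter is
// `e[n] = x[n] + Σ a[k]·x[n−1−k]`.

```rust
/// Autocorrelation of `x` for lags `0..=max_lag` (requires `max_lag <= x.len()`).
fn autocorrelation(x: &[f32], max_lag: usize) -> Vec<f64> {
    (0..=max_lag)
        .map(|lag| {
            x.iter()
                .zip(&x[lag..])
                .map(|(&a, &b)| f64::from(a) * f64::from(b))
                .sum::<f64>()
        })
        .collect()
}

/// Levinson-Durbin recursion. The order is reduced for short frames.
fn lpc_coefficients(x: &[f32], requested_order: usize) -> Vec<f32> {
    let order = requested_order.min(x.len().saturating_sub(1));
    let r = autocorrelation(x, order);
    let mut a = vec![0.0_f64; order]; // a[k] multiplies x[n-1-k]
    let mut err = r[0]; // prediction error energy
    for i in 0..order {
        if err <= 0.0 {
            break; // silent or degenerate frame: keep remaining taps at zero
        }
        let mut acc = r[i + 1];
        for j in 0..i {
            acc += a[j] * r[i - j];
        }
        let k = -acc / err; // reflection coefficient
        let prev = a.clone();
        for j in 0..i {
            a[j] = prev[j] + k * prev[i - 1 - j];
        }
        a[i] = k;
        err *= 1.0 - k * k;
    }
    a.into_iter().map(|c| c as f32).collect()
}

/// Analysis filter with zero initial state.
fn lpc_analysis(x: &[f32], a: &[f32]) -> Vec<f32> {
    (0..x.len())
        .map(|n| {
            let mut e = x[n];
            for (k, &ak) in a.iter().enumerate() {
                if n > k {
                    e += ak * x[n - 1 - k];
                }
            }
            e
        })
        .collect()
}
```

// For a decaying exponential `x[n] = 0.9^n`, a first-order predictor gives
// `a[0]` close to -0.9 and a residual that is near zero after the first sample.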
// ── silk_decode_frame ─────────────────────────────────────────────────────────
/// Decodes a SILK-encoded audio frame.
///
/// Steps:
/// 1. Dequantise: `e_hat[n] = q[n] / 32767 × gain`.
/// 2. If `frame.pitch_lag` is `Some(lag)`, apply LTP synthesis:
/// `e_st[n] = e_hat[n] + ltp_gain × e_st[n − lag]`; otherwise `e_st[n] = e_hat[n]`.
/// 3. Apply LPC synthesis: `y[n] = e_st[n] − Σ a[k]·y[n−1−k]`.
///
/// Both encoder and decoder use zero initial state, so the round-trip is exact
/// up to quantisation error (see [`SilkEncodedFrame`] for quality details).
///
/// For cross-frame continuity use [`silk_decode_frame_stateful`] instead.
///
/// # Arguments
/// - `frame` – A SILK frame produced by [`silk_encode_frame`] or
/// [`silk_encode_frame_stateful`].
///
/// # Returns
/// A `Vec<f32>` of reconstructed PCM samples.
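// Illustrative sketch of the LTP and LPC synthesis steps listed above, with
// zero initial state. `ltp_synthesis` and `lpc_synthesis` are hypothetical
// names; the real decoder may organise these loops differently.

```rust
/// LTP synthesis: adds `ltp_gain` times the output from `lag` samples ago.
/// Without a pitch lag the residual passes through unchanged.
fn ltp_synthesis(e_hat: &[f32], pitch_lag: Option<usize>, ltp_gain: f32) -> Vec<f32> {
    let lag = match pitch_lag {
        Some(lag) if lag > 0 => lag,
        _ => return e_hat.to_vec(),
    };
    let mut e_st: Vec<f32> = Vec::with_capacity(e_hat.len());
    for (n, &e) in e_hat.iter().enumerate() {
        // Samples before the start of the frame are treated as zero.
        let past = if n >= lag { e_st[n - lag] } else { 0.0 };
        e_st.push(e + ltp_gain * past);
    }
    e_st
}

/// LPC synthesis: `y[n] = e_st[n] − Σ a[k]·y[n−1−k]`.
fn lpc_synthesis(e_st: &[f32], a: &[f32]) -> Vec<f32> {
    let mut y: Vec<f32> = Vec::with_capacity(e_st.len());
    for (n, &e) in e_st.iter().enumerate() {
        let mut acc = e;
        for (k, &ak) in a.iter().enumerate() {
            if n > k {
                acc -= ak * y[n - 1 - k];
            }
        }
        y.push(acc);
    }
    y
}
```

// Feeding a unit impulse through `lpc_synthesis` with `a = [-0.9]` yields the
// decaying sequence 1, 0.9, 0.81, ..., which is the filter's impulse response.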
// ── silk_encode_frame_stateful ────────────────────────────────────────────────
/// Encodes a SILK frame with cross-frame LPC state and long-term prediction.
///
/// Extends [`silk_encode_frame`] in two ways:
///
/// 1. **Cross-frame LPC state** — the analysis filter uses `state.encoder_lpc_history`
/// to carry context from the previous frame, eliminating boundary artefacts for
/// consecutive frames.
///
/// 2. **Long-term prediction (LTP)** — after computing the short-term LPC
/// residual, [`estimate_pitch`] searches for a pitch period. When one is found,
/// a single-tap LTP filter (`d[n] = e[n] − ltp_gain × e[n − T]`) further
/// reduces the residual energy before quantisation.
///
/// # Arguments
/// - `samples` – PCM samples for the frame (f32).
/// - `sample_rate` – Signal sample rate in Hz (used for pitch lag bounds).
/// - `state` – Cross-frame state updated in place.
///
/// # Errors
/// Returns [`AudioSampleError::Parameter`] if `samples` is empty.
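// Illustrative sketch of the single-tap LTP analysis filter described above,
// `d[n] = e[n] − ltp_gain × e[n − T]`. The least-squares gain estimate is an
// assumption made for this example; the source does not say how `ltp_gain` is
// chosen. Both function names are hypothetical, and the pitch lag `T` is taken
// as given (the crate obtains it from `estimate_pitch`).

```rust
/// Least-squares gain for a single-tap predictor at the given lag:
/// `Σ e[n]·e[n−T] / Σ e[n−T]²`. Returns 0.0 when no estimate is possible.
fn optimal_ltp_gain(e: &[f32], lag: usize) -> f32 {
    if lag == 0 || lag >= e.len() {
        return 0.0;
    }
    let mut num = 0.0_f64;
    let mut den = 0.0_f64;
    for n in lag..e.len() {
        let past = f64::from(e[n - lag]);
        num += f64::from(e[n]) * past;
        den += past * past;
    }
    if den > 0.0 { (num / den) as f32 } else { 0.0 }
}

/// Single-tap LTP analysis with zero history before the frame start.
fn ltp_analysis(e: &[f32], lag: usize, ltp_gain: f32) -> Vec<f32> {
    (0..e.len())
        .map(|n| {
            let past = if lag > 0 && n >= lag { e[n - lag] } else { 0.0 };
            e[n] - ltp_gain * past
        })
        .collect()
}
```

// For a residual that repeats exactly every `T` samples, the estimated gain
// is 1 and everything after the first period is cancelled.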
// ── silk_decode_frame_stateful ────────────────────────────────────────────────
/// Decodes a SILK frame with cross-frame LPC state and LTP synthesis.
///
/// Mirror of [`silk_encode_frame_stateful`]. Uses `state.decoder_lpc_history`
/// to carry synthesis filter context across frame boundaries. Must be paired
/// with [`silk_encode_frame_stateful`] (same state sequence) for correct output.
///
/// # Arguments
/// - `frame` – A SILK frame produced by [`silk_encode_frame_stateful`].
/// - `state` – Cross-frame state updated in place.
///
/// # Returns
/// A `Vec<f32>` of reconstructed PCM samples.
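// Illustrative sketch of cross-frame LPC state. The field names
// `encoder_lpc_history` and `decoder_lpc_history` come from the docs above;
// their `Vec<f32>` type, the struct name `SilkStateSketch`, and both helper
// functions are assumptions made for this example. For simplicity the same
// coefficients are used for every frame.

```rust
#[derive(Default)]
struct SilkStateSketch {
    encoder_lpc_history: Vec<f32>, // most recent input samples, newest last
    decoder_lpc_history: Vec<f32>, // most recent output samples, newest last
}

/// Analysis filter `e[n] = x[n] + Σ a[k]·x[n−1−k]` that reads samples from
/// the previous frame out of `history`, then stores the new tail.
fn analysis_with_history(x: &[f32], a: &[f32], history: &mut Vec<f32>) -> Vec<f32> {
    let off = history.len();
    let mut ext = history.clone();
    ext.extend_from_slice(x);
    let e: Vec<f32> = (0..x.len())
        .map(|n| {
            let pos = off + n;
            let mut acc = ext[pos];
            for (k, &ak) in a.iter().enumerate() {
                if pos > k {
                    acc += ak * ext[pos - 1 - k];
                }
            }
            acc
        })
        .collect();
    let keep = a.len().min(ext.len());
    *history = ext[ext.len() - keep..].to_vec();
    e
}

/// Synthesis filter `y[n] = e[n] − Σ a[k]·y[n−1−k]` with the same history rule.
fn synthesis_with_history(e: &[f32], a: &[f32], history: &mut Vec<f32>) -> Vec<f32> {
    let off = history.len();
    let mut ext = history.clone();
    for (n, &en) in e.iter().enumerate() {
        let pos = off + n;
        let mut acc = en;
        for (k, &ak) in a.iter().enumerate() {
            if pos > k {
                acc -= ak * ext[pos - 1 - k];
            }
        }
        ext.push(acc);
    }
    let y = ext[off..].to_vec();
    let keep = a.len().min(ext.len());
    *history = ext[ext.len() - keep..].to_vec();
    y
}
```

// With the history carried over, filtering a signal in two frames produces the
// same residual as filtering it in one pass, which is why the stateful
// variants avoid discontinuities at frame boundaries.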