//! The [`CodecBackend`] trait every neural audio codec implements.
//!
//! Codecs are PCM <-> discrete-token translators. Concrete backends
//! (EnCodec, DAC, SNAC, ...) implement this single trait and are then
//! consumable through [`crate::CodecBackendHandle`] (typed) or
//! [`crate::DynCodecProvider`] (erased, for binding layers).
use async_trait::async_trait;
use AudioBackend;
use crate::CodecError;
/// A neural audio codec.
///
/// Implementors translate mono `f32` PCM samples to and from discrete
/// codebook tokens. Both methods are async, so backends can move
/// blocking model loads and GPU dispatch to a background thread instead
/// of stalling the caller's runtime.
///
/// ## Trait shape
///
/// - Extends [`AudioBackend`] so codec backends share the same lifecycle
///   surface (`load` / `unload` / `is_loaded`) as TTS / STT / music
///   backends.
/// - `provider_kind()` should return `"codec"` for plain codecs;
///   multi-capability backends MAY return a hyphenated combination.
///
/// ## Token layout
///
/// `encode_pcm` returns, and `decode_tokens` consumes, a flat row-major
/// `[codebook_0_t0, codebook_0_t1, ..., codebook_1_t0, ...]` vector of
/// `u32`: every frame of codebook 0 comes first, then every frame of
/// codebook 1, and so on. The caller is expected to know its codebook
/// count (e.g. 8 codebooks for the 24 kHz EnCodec model at 6 kbps) and
/// reshape accordingly.
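///
/// ## Example: indexing the flat layout
///
/// A minimal sketch of how a caller might index and reshape the flat
/// vector, assuming every codebook holds the same number of frames.
/// `token_at` and `split_codebooks` are hypothetical helpers written for
/// this illustration; they are not part of this crate.
///
/// ```rust
/// // Hypothetical helper: token for `codebook` at frame `t`, or `None`
/// // when the length is not a multiple of `num_codebooks` or an index is
/// // out of range.
/// fn token_at(tokens: &[u32], num_codebooks: usize, codebook: usize, t: usize) -> Option<u32> {
///     if num_codebooks == 0 || tokens.len() % num_codebooks != 0 {
///         return None;
///     }
///     let frames = tokens.len() / num_codebooks;
///     if codebook >= num_codebooks || t >= frames {
///         return None;
///     }
///     // Row-major: codebook `c` occupies `[c * frames, (c + 1) * frames)`.
///     Some(tokens[codebook * frames + t])
/// }
///
/// // Hypothetical helper: one borrowed slice per codebook, in codebook order.
/// fn split_codebooks(tokens: &[u32], num_codebooks: usize) -> Option<Vec<&[u32]>> {
///     if num_codebooks == 0 || tokens.len() % num_codebooks != 0 {
///         return None;
///     }
///     let frames = tokens.len() / num_codebooks;
///     // Slicing by range (rather than `chunks`) also handles `frames == 0`.
///     Some(
///         (0..num_codebooks)
///             .map(|c| &tokens[c * frames..(c + 1) * frames])
///             .collect(),
///     )
/// }
/// ```
///
/// With `tokens = [10, 11, 12, 20, 21, 22]` and two codebooks,
/// `token_at(&tokens, 2, 1, 0)` is `Some(20)` and `split_codebooks`
/// yields `[10, 11, 12]` and `[20, 21, 22]`. Both helpers return `None`
/// when the length is not divisible by the codebook count, rather than
/// guessing a shape.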