blazen_audio_codec/
lib.rs

//! # blazen-audio-codec
//!
//! Neural-audio-codec backends for Blazen. Codecs translate raw PCM into
//! discrete codebook tokens (and back) so generative models can operate in
//! a low-rate, GPU-friendly token space instead of 24-48 kHz waveform
//! samples.
//!
//! This crate plays the same role for *codecs* that
//! [`blazen-audio-tts`](../blazen_audio_tts/index.html) plays for TTS and
//! [`blazen-audio-music`](../blazen_audio_music/index.html) plays for music:
//! a single capability-typed [`CodecBackend`] trait plus a
//! monomorphizable [`CodecBackendHandle<B>`] and an erased
//! [`DynCodecProvider`] for binding layers.
//!
//! ## Backends
//!
//! | Backend | Feature flag | Status |
//! |---|---|---|
//! | [`backends::encodec`] | `encodec` | **Functional**: Meta's EnCodec 24 kHz / 4-codebook neural codec via `candle-transformers`. |
//! | [`backends::dac`] | `dac` | **Functional decode**: Descript Audio Codec (`descript/dac_44khz`, 9 codebooks at 8 kbps) via `candle-transformers`. Encode short-circuits until candle exposes a public RVQ encode path. |
//! | [`backends::snac`] | `snac` | **Functional**: Multi-Scale Neural Audio Codec (`hubertsiuzdak/snac_24khz`, 3 multi-scale codebooks with `vq_strides` `[4, 2, 1]`, 24 kHz, ~1 kbps) via `candle-transformers`. Both encode and decode are wired. |
//!
//! ## Why a dedicated codec trait?
//!
//! Codecs are pure functions over PCM and tokens: no prompts, no
//! sampling temperature, no voices. Trying to thread them through the
//! generative [`AudioBackend`](blazen_audio::AudioBackend) surface would
//! force every codec to invent prompt semantics it doesn't have, so the
//! [`CodecBackend`] trait adds **only** `encode_pcm` / `decode_tokens`
//! and inherits the lifecycle methods (`load` / `unload` /
//! `is_loaded`) from [`AudioBackend`].
//!
//! See `PR_AUDIO_PLAN.md` §3 + §5 W5 for the full restructure rationale.

#![cfg_attr(docsrs, feature(doc_cfg))]
// Crate prose mentions product names (EnCodec, MusicGen, AudioGen, DAC,
// SNAC, Hugging Face, ...) frequently. Backticking every mention is
// noisy; opt out at the crate level like `blazen-audio-candle` does.
#![allow(clippy::doc_markdown)]

pub mod backends;
pub mod error;
pub mod provider;
pub mod traits;

pub use error::CodecError;
pub use provider::{CodecBackendHandle, DynCodecProvider};
pub use traits::CodecBackend;
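
The trait split described in the crate docs (lifecycle methods inherited from `AudioBackend`, plus only `encode_pcm` / `decode_tokens` on `CodecBackend`) can be pictured with the standalone sketch below. Everything in it is an assumption made for illustration: the real trait signatures live in `traits.rs` and `blazen_audio` and are not shown in this file, so they may well be async, use `CodecError` instead of `String`, and represent PCM and tokens differently. `ToyCodec` is a hypothetical single-codebook 8-bit quantizer, not one of the neural backends.

```rust
// Hypothetical stand-ins for illustration only. The real
// `blazen_audio::AudioBackend` and `CodecBackend` signatures are not shown
// in this file and may differ (for example async methods or `CodecError`).

/// Lifecycle surface shared by every audio backend (assumed shape).
pub trait AudioBackend {
    fn load(&mut self) -> Result<(), String>;
    fn unload(&mut self);
    fn is_loaded(&self) -> bool;
}

/// Codec-only surface: two methods on top of the inherited lifecycle.
pub trait CodecBackend: AudioBackend {
    /// Mono PCM in [-1.0, 1.0] -> one token stream per codebook.
    fn encode_pcm(&self, pcm: &[f32]) -> Result<Vec<Vec<u32>>, String>;
    /// Token streams (one per codebook) -> mono PCM.
    fn decode_tokens(&self, tokens: &[Vec<u32>]) -> Result<Vec<f32>, String>;
}

/// Toy codec: one 256-entry uniform quantizer, not a neural model.
#[derive(Default)]
pub struct ToyCodec {
    loaded: bool,
}

impl AudioBackend for ToyCodec {
    fn load(&mut self) -> Result<(), String> {
        self.loaded = true;
        Ok(())
    }
    fn unload(&mut self) {
        self.loaded = false;
    }
    fn is_loaded(&self) -> bool {
        self.loaded
    }
}

impl CodecBackend for ToyCodec {
    fn encode_pcm(&self, pcm: &[f32]) -> Result<Vec<Vec<u32>>, String> {
        if !self.loaded {
            return Err("codec not loaded".to_string());
        }
        // Map each sample from [-1.0, 1.0] onto an integer token in 0..=255.
        let tokens = pcm
            .iter()
            .map(|s| ((s.clamp(-1.0, 1.0) + 1.0) * 127.5).round() as u32)
            .collect();
        Ok(vec![tokens])
    }

    fn decode_tokens(&self, tokens: &[Vec<u32>]) -> Result<Vec<f32>, String> {
        if !self.loaded {
            return Err("codec not loaded".to_string());
        }
        let book = tokens
            .first()
            .ok_or_else(|| "expected exactly one codebook".to_string())?;
        // Invert the quantizer; out-of-range tokens are clamped to 255.
        Ok(book.iter().map(|&t| t.min(255) as f32 / 127.5 - 1.0).collect())
    }
}
```

Under these assumptions a caller would construct `ToyCodec::default()`, call `load()`, and then round-trip samples through `encode_pcm` and `decode_tokens`. Neither codec method takes a prompt, voice, or sampling parameter, which is the point the crate docs make about keeping codecs off the generative surface.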