pub struct ModelFiles {
    pub config: Option<ModelAsset>,
    pub tokenizer: Option<ModelAsset>,
    pub weights: Vec<ModelAsset>,
    pub voices_dir: Option<ModelAssetDir>,
    pub speech_tokenizer_weights: Vec<ModelAsset>,
    pub speech_tokenizer_config: Option<ModelAsset>,
    pub generation_config: Option<ModelAsset>,
    pub preprocessor_config: Option<ModelAsset>,
}
Resolved model assets for loading.
Each model type requires a specific set of files. You can provide them
individually using the builder methods on TtsConfig, set
TtsConfig::model_path to a directory that contains all of them, or
rely on automatic HuggingFace Hub download (if the download feature
is enabled).
§File resolution order (per file)
- Explicit path: set via with_*_file() / with_*_dir() on TtsConfig. Use this when your project has its own download manager (e.g. flow-like hash-based local caching).
- Auto-discovery: if model_path is set, the library looks for well-known filenames inside that directory.
- HuggingFace Hub download: if the download feature is enabled and the file is still missing, it is fetched from the Hub. This is the convenient fallback for quick prototyping.
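The resolution order above can be sketched as a small standalone function. This is only an illustration under stated assumptions: the `Resolved` enum and `resolve_file` function are hypothetical names that do not appear in the crate, and the sketch uses plain `Path` values rather than the crate's `ModelAsset` type.

```rust
use std::path::{Path, PathBuf};

/// Outcome of resolving one file (hypothetical type, for illustration only).
#[derive(Debug, PartialEq)]
pub enum Resolved {
    /// The caller supplied the path explicitly.
    Explicit(PathBuf),
    /// The file was found under `model_path` by its well-known name.
    Discovered(PathBuf),
    /// The file is absent locally and would be fetched from the Hub.
    NeedsDownload,
    /// The file is absent and no download fallback is available.
    Missing,
}

/// Resolve a single file using the three-step order described above.
/// `download_enabled` stands in for the `download` feature flag (assumption).
pub fn resolve_file(
    explicit: Option<&Path>,
    model_path: Option<&Path>,
    well_known_name: &str,
    download_enabled: bool,
) -> Resolved {
    // 1. An explicit path always takes precedence.
    if let Some(path) = explicit {
        return Resolved::Explicit(path.to_path_buf());
    }
    // 2. Otherwise look for the well-known filename inside `model_path`.
    if let Some(dir) = model_path {
        let candidate = dir.join(well_known_name);
        if candidate.is_file() {
            return Resolved::Discovered(candidate);
        }
    }
    // 3. Fall back to a Hub download only when that is available.
    if download_enabled {
        Resolved::NeedsDownload
    } else {
        Resolved::Missing
    }
}
```

In this sketch an explicit path is returned without checking that it exists, which mirrors the idea that a project with its own download manager is responsible for its own files; whether the library itself behaves that way is not stated in the documentation.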
Fields§
config: Option<ModelAsset>
Path to config.json, the model architecture configuration.
Expected format: JSON object describing the neural-network
hyperparameters (hidden size, number of layers, vocab size, …).
This is the standard HuggingFace config.json format.
Each backend stores its architecture metadata here, such as
transformer dimensions, tokenizer sizes, sample rates, or
auxiliary decoder configuration.
tokenizer: Option<ModelAsset>
Path to tokenizer.json, the BPE text tokenizer definition.
Expected format: HuggingFace Tokenizers
self-contained JSON file. Contains the full vocabulary, merge rules,
special tokens, and pre/post-processing steps. No separate
vocab.json or merges.txt required when this file is present.
Used by both models to convert input text into token IDs before feeding them to the transformer backbone.
weights: Vec<ModelAsset>
Paths to model weight files (.safetensors).
Expected format: One or more SafeTensors files containing the neural-network parameters.
- Single file: model.safetensors (for models < ~5 GB).
- Sharded: model-00001-of-00004.safetensors, … When sharded, the library also expects model.safetensors.index.json in the same directory (auto-discovered or downloaded).
- Other formats: some backends use consolidated.safetensors or .pth files instead of the standard filename.
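As a rough illustration of the single-file and sharded naming patterns listed above, the following sketch generates the filenames one would expect for a given shard count. The function name `shard_filenames` is hypothetical, and the five-digit zero padding is inferred from the example filename shown above rather than from the library's source.

```rust
/// Build the expected weight filenames for a checkpoint split into `total`
/// shards. A count of 0 or 1 is treated as the single-file layout.
/// (Hypothetical helper; the padding width is taken from the documented example.)
pub fn shard_filenames(total: usize) -> Vec<String> {
    if total <= 1 {
        return vec!["model.safetensors".to_string()];
    }
    (1..=total)
        .map(|index| format!("model-{index:05}-of-{total:05}.safetensors"))
        .collect()
}
```

For example, `shard_filenames(4)` yields four names starting with `model-00001-of-00004.safetensors`. A real loader would normally take the shard list from `model.safetensors.index.json` instead of generating it.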
voices_dir: Option<ModelAssetDir>
Path to a voice asset directory for backends that ship preset voices.
Supported layouts include:
voices/ ← Kokoro preset voices (`*.pt`)
voice_embedding/ ← Voxtral preset voices (`*.pt`)
The exact file format depends on the backend.
speech_tokenizer_weights: Vec<ModelAsset>
Paths to the speech/audio tokenizer decoder weight files.
Expected format: SafeTensors files for the auxiliary decoder used by models that emit discrete audio codec tokens.
Contains:
- Residual VQ codebooks (16 groups × 2048 codes × dim)
- Pre-conv + pre-transformer layers
- Upsampling layers (transposed convolutions + SnakeBeta)
- Final decoder convolution

- Qwen3-TTS uses the separate Qwen/Qwen3-TTS-Tokenizer-12Hz repository.
- OmniVoice uses the audio_tokenizer/ subdirectory inside the main model snapshot.
speech_tokenizer_config: Option<ModelAsset>
Path to config.json of the speech/audio tokenizer.
Expected format: JSON config for the speech tokenizer decoder model, including codebook dimensions, upsampling ratios, and activation parameters.
If not provided, it is auto-discovered from a nested
audio_tokenizer/ directory or downloaded from HuggingFace.
generation_config: Option<ModelAsset>
Path to generation_config.json (optional).
Expected format: Standard HuggingFace generation configuration
with fields like max_new_tokens, top_p, temperature,
do_sample, repetition_penalty, etc.
If not provided, sensible per-model defaults are used.
preprocessor_config: Option<ModelAsset>
Path to preprocessor_config.json (optional).
Used by backends such as VibeVoice that publish prompt-building and
audio-normalization defaults separately from config.json.
Implementations§
impl ModelFiles
pub fn fill_from_directory(&mut self, dir: &Path)
Scan a directory for well-known model files and fill any that are
still None / empty.
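The "fill only what is still unset" behaviour can be sketched for a single optional slot as follows. This is a minimal illustration, not the crate's implementation: `fill_if_unset` is a hypothetical helper and it works on `PathBuf` rather than `ModelAsset`.

```rust
use std::path::{Path, PathBuf};

/// Fill `slot` with `dir/name` only when the slot is still empty and the
/// file actually exists. Values that are already set are never overwritten.
/// (Hypothetical helper, for illustration only.)
pub fn fill_if_unset(slot: &mut Option<PathBuf>, dir: &Path, name: &str) {
    if slot.is_some() {
        // An explicit value was provided earlier; leave it untouched.
        return;
    }
    let candidate = dir.join(name);
    if candidate.is_file() {
        *slot = Some(candidate);
    }
}
```

Leaving populated slots untouched is what lets explicit paths take priority over auto-discovery.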
pub fn fill_from_asset_bundle(&mut self, bundle: &ModelAssetBundle)
Scan an in-memory asset bundle for well-known model files.
pub fn load_safetensors_vb(
    assets: &[ModelAsset],
    dtype: DType,
    device: &Device,
) -> Result<VarBuilder<'static>, TtsError>
Build a VarBuilder by reading safetensors files fully into memory.
This is the safe alternative to VarBuilder::from_mmaped_safetensors,
which requires unsafe due to memory-mapping. The trade-off is a brief
peak in memory while the raw bytes and parsed tensors coexist, but for
model loading this is negligible compared to the final tensor footprint.
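The first half of that approach, reading each file into an owned buffer instead of memory-mapping it, needs nothing beyond the standard library. The sketch below shows only that step; `read_all_into_memory` is a hypothetical name, it takes plain paths rather than `ModelAsset` values, and it does not parse the safetensors format or build a `VarBuilder`.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;

/// Read every file fully into an owned byte buffer.
/// No memory mapping is involved, so no `unsafe` is required; the cost is
/// that each file's raw bytes are held in memory until the buffers are dropped.
/// (Hypothetical helper, for illustration only.)
pub fn read_all_into_memory(paths: &[PathBuf]) -> io::Result<Vec<Vec<u8>>> {
    // Collecting into `io::Result` stops at the first file that fails to read.
    paths.iter().map(|path| fs::read(path)).collect()
}
```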
pub fn fill_from_hf(
    &mut self,
    model_id: &str,
    model_type: ModelType,
    bearer_token: Option<&str>,
) -> Result<(), TtsError>
Download missing files from HuggingFace Hub.
model_type determines which files are required.
pub fn validate(&self, model_type: ModelType) -> Result<(), TtsError>
Check whether all required files for the given model type are present.
pub fn missing_files(&self, model_type: ModelType) -> Vec<&'static str>
Return the list of files that are required but not yet set.
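The relationship between validate and missing_files can be sketched with a simplified stand-in type. Everything here is an assumption made for illustration: `FilesSketch` is hypothetical, it ignores `ModelType` and treats three entries as always required, and it reports errors as a `String` rather than `TtsError`.

```rust
use std::path::PathBuf;

/// Simplified, hypothetical stand-in for `ModelFiles` with three entries.
#[derive(Default)]
pub struct FilesSketch {
    pub config: Option<PathBuf>,
    pub tokenizer: Option<PathBuf>,
    pub weights: Vec<PathBuf>,
}

impl FilesSketch {
    /// List the required entries that are still unset, in a fixed order.
    /// (Which entries count as required is an assumption of this sketch.)
    pub fn missing_files(&self) -> Vec<&'static str> {
        let mut missing = Vec::new();
        if self.config.is_none() {
            missing.push("config.json");
        }
        if self.tokenizer.is_none() {
            missing.push("tokenizer.json");
        }
        if self.weights.is_empty() {
            missing.push("model.safetensors");
        }
        missing
    }

    /// Succeed only when nothing is missing; otherwise name every gap.
    pub fn validate(&self) -> Result<(), String> {
        let missing = self.missing_files();
        if missing.is_empty() {
            Ok(())
        } else {
            Err(format!("missing required files: {}", missing.join(", ")))
        }
    }
}
```

Building validate on top of missing_files keeps the two methods consistent: the error message names exactly the entries that the listing method would return.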
Trait Implementations§
impl Clone for ModelFiles
fn clone(&self) -> ModelFiles
fn clone_from(&mut self, source: &Self)
impl Debug for ModelFiles
impl Default for ModelFiles
fn default() -> ModelFiles
Auto Trait Implementations§
impl Freeze for ModelFiles
impl RefUnwindSafe for ModelFiles
impl Send for ModelFiles
impl Sync for ModelFiles
impl Unpin for ModelFiles
impl UnsafeUnpin for ModelFiles
impl UnwindSafe for ModelFiles
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left is true.
Converts self into a Right variant of Either<Self, Self>
otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self>
otherwise.