KittenTTS — lightweight ONNX text-to-speech for RLX.
§Backends
| Feature | RLX runtime | ONNX Runtime EP |
|---|---|---|
| `onnx` | none | CPU (default) |
| `native` | RLX runtime | Decomposed `kitten_tts_mini_rlx` graph (no ORT) |
| `rlx` | RLX runtime | ORT inference + RLX path deps for future parity |
| `metal` | Metal | CoreML (macOS / iOS) |
| `mlx` | MLX | CoreML (Apple GPU) |
| `cuda` | CUDA | CUDA |
| `rocm` | ROCm | ROCm |
| `gpu` | wgpu | DirectML / CUDA / CoreML |
| `full` | all above | all ORT EPs + RLX path deps |
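The table above can be restated as a small lookup, which may help when reading the feature names in build scripts or logs. This is only an illustrative sketch: `backend_row` is a hypothetical helper, not part of this crate, and it is not how `execution_providers_for` is implemented. The strings are copied from the table.

```rust
/// Hypothetical helper (not a crate API): restates the Backends table.
/// Returns `(RLX runtime, ONNX Runtime EP)` for a Cargo feature name,
/// or `None` if the feature is not listed in the table.
pub fn backend_row(feature: &str) -> Option<(Option<&'static str>, &'static str)> {
    let row: (Option<&'static str>, &'static str) = match feature {
        // `onnx` has no RLX runtime column entry.
        "onnx" => (None, "CPU (default)"),
        "native" => (
            Some("RLX runtime"),
            "Decomposed kitten_tts_mini_rlx graph (no ORT)",
        ),
        "rlx" => (
            Some("RLX runtime"),
            "ORT inference + RLX path deps for future parity",
        ),
        "metal" => (Some("Metal"), "CoreML (macOS / iOS)"),
        "mlx" => (Some("MLX"), "CoreML (Apple GPU)"),
        "cuda" => (Some("CUDA"), "CUDA"),
        "rocm" => (Some("ROCm"), "ROCm"),
        "gpu" => (Some("wgpu"), "DirectML / CUDA / CoreML"),
        "full" => (Some("all above"), "all ORT EPs + RLX path deps"),
        _ => return None,
    };
    Some(row)
}
```

For example, `backend_row("metal")` yields the Metal runtime paired with the CoreML execution provider, while an unlisted name returns `None`.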
Re-exports§
pub use assets::DEFAULT_LOCAL_DIR;
pub use assets::ModelLayout;
pub use assets::default_model_dir;
pub use assets::default_native_weights_dir;
pub use assets::find_native_weights;
pub use assets::find_rlx_bundle;
pub use backend::OrtSession;
pub use backend::build_onnx_session;
pub use backend::execution_providers_for;
pub use backend::validate_device;
pub use config::DEFAULT_HF_REPO;
pub use config::ModelConfig;
pub use download::fetch_default;
pub use download::fetch_to_local_dir;
pub use features::cuda_feature_enabled;
pub use features::enabled_backend_labels;
pub use features::espeak_feature_enabled;
pub use features::gpu_feature_enabled;
pub use features::metal_feature_enabled;
pub use features::mlx_feature_enabled;
pub use features::native_feature_enabled;
pub use features::onnx_feature_enabled;
pub use features::rlx_feature_enabled;
pub use features::rocm_feature_enabled;
pub use infer_opts::SAMPLES_PER_DURATION_UNIT;
pub use infer_opts::recommended_native_compile_opts;
pub use model::KittenTTS;
pub use model::MIN_AUDIBLE_PEAK;
pub use model::SAMPLE_RATE;
pub use model::peak_amplitude;
pub use npz::NpyArray;
pub use npz::load_npz;
pub use npz::parse_npy;
pub use phonemize::DEFAULT_LANG;
pub use phonemize::is_espeak_available;
pub use phonemize::phonemize;
pub use phonemize::phonemize_lang;
pub use phonemize::set_data_path;
pub use tokenize::ipa_content_len;
pub use tokenize::ipa_style_index;
pub use tokenize::ipa_text_style_index;
pub use tokenize::ipa_to_ids;
pub use tokenize::warn_unknown_ipa_chars;
Modules§
- assets
- Model directory discovery and path layout.
- backend
- Map `rlx_runtime::Device` to ONNX Runtime execution providers.
- backend_kind
- Inference backend selection (ONNX Runtime vs native RLX).
- cli
- CLI for IPA → WAV synthesis (ONNX Runtime or native RLX graph).
- config
- `config.json` schema from KittenTTS Hugging Face repos.
- download
- Hugging Face Hub download for KittenTTS checkpoints.
- features
- Compile-time feature probes and backend labels.
- infer_opts
- Native compile sizing and ONNX duration → waveform length mapping.
- model
- ONNX model runner; mirrors Python’s `KittenTTS_1_Onnx`.
- npz
- Minimal NPZ / NPY loader.
- phonemize
- Text → IPA via the pure-Rust `espeak-ng` crate.
- tokenize
- Character-level tokeniser; mirrors Python’s `TextCleaner`.
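To illustrate what a character-level tokeniser of the kind the `tokenize` module describes might look like, here is a minimal sketch. It assumes each symbol's ID is its position in a symbol table and that unknown characters are skipped and reported. The `CharTokenizer` type and the symbol table passed to it are hypothetical; the crate's actual vocabulary and the behaviour of `ipa_to_ids` are not shown on this page.

```rust
use std::collections::HashMap;

/// Hypothetical character-level tokeniser (not the crate's implementation).
/// Each character's ID is its index in the symbol table given to `new`.
pub struct CharTokenizer {
    ids: HashMap<char, i64>,
}

impl CharTokenizer {
    /// Build the lookup from a symbol table string, one symbol per `char`.
    pub fn new(symbols: &str) -> Self {
        let ids = symbols
            .chars()
            .enumerate()
            .map(|(i, c)| (c, i as i64))
            .collect();
        Self { ids }
    }

    /// Map each character to its ID. Characters missing from the table are
    /// skipped and returned separately so the caller can warn about them.
    pub fn encode(&self, text: &str) -> (Vec<i64>, Vec<char>) {
        let mut out = Vec::new();
        let mut unknown = Vec::new();
        for c in text.chars() {
            match self.ids.get(&c) {
                Some(&id) => out.push(id),
                None => unknown.push(c),
            }
        }
        (out, unknown)
    }
}
```

Returning the unknown characters alongside the IDs keeps the encoder free of logging concerns; a caller could pass them to a warning routine of its choice.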
Enums§
- Device
- Target device for graph execution.
Functions§
- fastest_device
- Highest-priority backend that is compiled in and live on this host.
- is_available
- parse_device
- Lower-case Cargo feature names and common aliases → `Device`.
- parse_device_list
- Parse comma/semicolon/whitespace-separated device lists (`RLX_DEVICES=cpu,metal`).
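The splitting rule described for `parse_device_list` can be sketched as follows. This is an assumption-laden illustration, not the crate's code: `split_device_list` is a hypothetical helper that returns lower-cased names rather than `Device` values, because the `Device` variants and the alias table are not listed on this page.

```rust
/// Hypothetical helper (not a crate API): split a device list on commas,
/// semicolons, or whitespace, lower-case each entry, and drop empty entries.
pub fn split_device_list(input: &str) -> Vec<String> {
    input
        .split(|c: char| c == ',' || c == ';' || c.is_whitespace())
        .filter(|s| !s.is_empty())
        .map(|s| s.to_ascii_lowercase())
        .collect()
}
```

With this sketch, a value such as `RLX_DEVICES=cpu,metal` would split into the two names `cpu` and `metal`; mapping each name to a `Device` would be a separate step.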