kokoro-micro
A minimal, embeddable Text-to-Speech (TTS) library for Rust using the Kokoro 82M parameter model.
This is a reduced version of kokoro-tiny, created by 8b-is.
Features
- Minimal dependencies - Only essential crates for TTS synthesis
- Auto-downloading - Model files (310MB + 27MB) download automatically to ~/.cache/k/
- Multiple voices - Support for various voice styles with mixing capability
- Speed & gain control - Adjust speech speed and volume
- WAV export - Save synthesized audio to WAV files
- Long text support - Automatic chunking and crossfading for longer texts
- Silent by default - No output unless KOKORO_DEBUG=1 is set
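To illustrate the long text support mentioned above, here is a minimal sketch of sentence-based chunking followed by a linear crossfade between consecutive audio chunks. It is not taken from the kokoro-micro source: the helper names `chunk_text` and `append_with_crossfade`, the character limit, and the linear fade shape are all assumptions chosen for illustration.

```rust
/// Hypothetical helper: group sentences into chunks of roughly `max_chars`
/// characters. A single sentence longer than the limit is kept whole.
fn chunk_text(text: &str, max_chars: usize) -> Vec<String> {
    let mut chunks = Vec::new();
    let mut current = String::new();
    // Split after sentence-ending punctuation, keeping the punctuation.
    for sentence in text.split_inclusive(|c| matches!(c, '.' | '!' | '?')) {
        let sentence = sentence.trim();
        if sentence.is_empty() {
            continue;
        }
        // Start a new chunk when adding this sentence would exceed the limit.
        let projected = current.chars().count() + 1 + sentence.chars().count();
        if !current.is_empty() && projected > max_chars {
            chunks.push(std::mem::take(&mut current));
        }
        if !current.is_empty() {
            current.push(' ');
        }
        current.push_str(sentence);
    }
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}

/// Hypothetical helper: append `next` to `out`, blending the last `overlap`
/// samples of `out` with the first `overlap` samples of `next`.
fn append_with_crossfade(out: &mut Vec<f32>, next: &[f32], overlap: usize) {
    // Never overlap more samples than either buffer holds.
    let n = overlap.min(out.len()).min(next.len());
    let start = out.len() - n;
    for i in 0..n {
        // Fade weight rises linearly from near 0 to near 1 across the overlap.
        let t = (i as f32 + 1.0) / (n as f32 + 1.0);
        out[start + i] = out[start + i] * (1.0 - t) + next[i] * t;
    }
    out.extend_from_slice(&next[n..]);
}
```

In this sketch each text chunk would be synthesized separately and the resulting sample buffers joined with `append_with_crossfade`, so that chunk boundaries do not produce audible clicks.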
Installation
Add to your Cargo.toml:
[dependencies]
kokoro-micro = "0.2.0"
tokio = { version = "1", features = ["rt", "macros"] }
Quick Start
use kokoro_micro::TtsEngine;
async
API Reference
TtsEngine
Main struct for text-to-speech synthesis.
Methods
- new() -> Result<Self, String>
  Create a new TTS engine. Downloads model files to ~/.cache/k/ on first run.
- with_paths(model_path: &str, voices_path: &str) -> Result<Self, String>
  Create an engine with custom model file paths.
- voices() -> Vec<String>
  List all available voice names.
- synthesize_with_options(text: &str, voice: Option<&str>, speed: f32, gain: f32, lang: Option<&str>) -> Result<Vec<f32>, String>
  Synthesize text to audio samples.
  - text - Text to synthesize
  - voice - Voice name (e.g., "af_sky", "af_nicole", "am_adam") or None for the default
  - speed - Speech speed (0.5 = slower, 1.0 = normal, 2.0 = faster)
  - gain - Volume multiplier (0.5 = quieter, 1.0 = normal, 2.0 = louder)
  - lang - Language code (e.g., "en", "es", "fr") or None for the default "en"
- save_wav(path: &str, audio: &[f32]) -> Result<(), String>
  Save audio samples to a WAV file.
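To make the gain parameter and the `Vec<f32>` sample format more concrete, the following sketch shows one common way to scale float samples and encode them as a 16-bit PCM mono WAV image using only the standard library. It is an illustration, not the crate's implementation: `apply_gain` and `encode_wav_pcm16` are hypothetical names, and the output format actually written by `save_wav` (bit depth, sample rate) is not stated in this README, so the caller supplies the sample rate.

```rust
/// Hypothetical helper: multiply each sample by `gain`, clamping to [-1.0, 1.0]
/// so that loud settings saturate instead of overflowing.
fn apply_gain(samples: &[f32], gain: f32) -> Vec<f32> {
    samples.iter().map(|s| (s * gain).clamp(-1.0, 1.0)).collect()
}

/// Hypothetical helper: encode mono f32 samples as a 16-bit PCM WAV file image.
fn encode_wav_pcm16(samples: &[f32], sample_rate: u32) -> Vec<u8> {
    let data_len = (samples.len() * 2) as u32; // 2 bytes per sample
    let mut out = Vec::with_capacity(44 + samples.len() * 2);
    out.extend_from_slice(b"RIFF");
    out.extend_from_slice(&(36 + data_len).to_le_bytes()); // remaining file size
    out.extend_from_slice(b"WAVE");
    out.extend_from_slice(b"fmt ");
    out.extend_from_slice(&16u32.to_le_bytes()); // fmt chunk size
    out.extend_from_slice(&1u16.to_le_bytes()); // format tag: PCM
    out.extend_from_slice(&1u16.to_le_bytes()); // channels: mono
    out.extend_from_slice(&sample_rate.to_le_bytes());
    out.extend_from_slice(&(sample_rate * 2).to_le_bytes()); // byte rate
    out.extend_from_slice(&2u16.to_le_bytes()); // block align
    out.extend_from_slice(&16u16.to_le_bytes()); // bits per sample
    out.extend_from_slice(b"data");
    out.extend_from_slice(&data_len.to_le_bytes());
    for &s in samples {
        // Map [-1.0, 1.0] onto the signed 16-bit range.
        let v = (s.clamp(-1.0, 1.0) * i16::MAX as f32) as i16;
        out.extend_from_slice(&v.to_le_bytes());
    }
    out
}
```

The returned bytes could be written to disk with `std::fs::write`. In normal use `save_wav` handles this step, so the sketch is mainly useful for understanding what the sample buffer contains.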
Voice Mixing
You can mix multiple voices by using weighted combinations:
// Mix 40% af_sky + 50% af_nicole
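// `text` and `voice_mix` are placeholders; the weighted voice specification is passed as the voice argument
let audio = tts.synthesize_with_options(text, Some(voice_mix), 1.0, 1.0, None)?;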
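Conceptually, a weighted voice mix can be thought of as a weighted sum of the voices' embedding vectors. The sketch below shows that arithmetic under stated assumptions: `blend_voices` is a hypothetical helper, the tiny two-element vectors stand in for real voice embeddings, and whether kokoro-micro normalizes the weights (40% + 50% does not sum to 100%) is not specified here, so the sketch leaves them unnormalized.

```rust
/// Hypothetical helper: blend embedding vectors as a weighted sum,
/// out[i] = sum over voices of weight * vector[i].
/// Returns None if no voices are given or the vectors differ in length.
fn blend_voices(voices: &[(&[f32], f32)]) -> Option<Vec<f32>> {
    let dim = voices.first()?.0.len();
    if voices.iter().any(|(v, _)| v.len() != dim) {
        return None;
    }
    let mut out = vec![0.0f32; dim];
    for (vector, weight) in voices {
        // Accumulate this voice's contribution element by element.
        for (o, x) in out.iter_mut().zip(vector.iter()) {
            *o += weight * x;
        }
    }
    Some(out)
}
```

For example, blending `[1.0, 0.0]` at weight 0.4 with `[0.0, 2.0]` at weight 0.5 yields approximately `[0.4, 1.0]`.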
Available Voices
Common voices include:
- af_sky (default) - Female, gentle
- af_nicole - Female
- af_bella - Female
- am_adam - Male
- am_michael - Male
Use tts.voices() to list all available voices.
Debug Logging
By default, kokoro-micro runs silently with no console output. To enable debug logging (model download progress, synthesis details, etc.), set the KOKORO_DEBUG environment variable:
# Enable debug logging
KOKORO_DEBUG=1 cargo run
# Or in your code
std::env::set_var("KOKORO_DEBUG", "1");
Debug logging shows:
- Model download progress
- Long-form synthesis chunking information
- Phoneme conversion details
- Audio generation statistics
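The silent-by-default behavior can be pictured as a small gate around every log call. The following is a sketch of that pattern, not the crate's actual code: the function names and the `[kokoro]` prefix are hypothetical, and it assumes that only the exact value `1` enables logging, since the README documents no other values.

```rust
use std::env;

/// Hypothetical helper: decide whether debug output is enabled from the
/// raw variable value. Only the exact value "1" counts as enabled here.
fn debug_enabled_from(value: Option<&str>) -> bool {
    value == Some("1")
}

/// Read KOKORO_DEBUG from the process environment.
fn debug_enabled() -> bool {
    debug_enabled_from(env::var("KOKORO_DEBUG").ok().as_deref())
}

/// Print a message to stderr only when debugging is enabled.
fn debug_log(message: &str) {
    if debug_enabled() {
        eprintln!("[kokoro] {message}");
    }
}
```

Separating the decision (`debug_enabled_from`) from the environment lookup keeps the rule easy to test without modifying process-wide state.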
Example
See examples/simple.rs:
# Run without debug output
cargo run --example simple
# Run with debug output
KOKORO_DEBUG=1 cargo run --example simple
Features
Optional Features
- cuda - Enable CUDA acceleration for ONNX Runtime
[dependencies]
kokoro-micro = { version = "0.2.0", features = ["cuda"] }
Model Files
Model files are automatically downloaded on first use to $HOME/.cache/k/:
- $HOME/.cache/k/0.onnx (310MB) - Kokoro ONNX model
- $HOME/.cache/k/0.bin (27MB) - Voice embeddings
The same cache directory is used on all platforms (Linux, macOS, Windows):
- Linux/macOS: $HOME/.cache/k/ (e.g., /home/user/.cache/k/)
- Windows: %USERPROFILE%/.cache/k/ (e.g., C:\Users\Username\.cache\k\)
Files are cached and shared across all applications using kokoro-micro.
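If you want to locate the cached files yourself (for example, to pass them to with_paths), the lookup described above can be sketched as follows. This is an illustration only: `cache_dir_from` and `model_paths` are hypothetical helpers, and the order of preference (HOME first, then USERPROFILE) is an assumption rather than something the README states.

```rust
use std::path::PathBuf;

/// Hypothetical helper: build the cache directory from candidate home values.
/// Prefers `home` (HOME) and falls back to `user_profile` (USERPROFILE).
fn cache_dir_from(home: Option<&str>, user_profile: Option<&str>) -> Option<PathBuf> {
    let base = home.or(user_profile)?;
    Some(PathBuf::from(base).join(".cache").join("k"))
}

/// Resolve the expected locations of the ONNX model and the voice embeddings
/// from the current process environment.
fn model_paths() -> Option<(PathBuf, PathBuf)> {
    let home = std::env::var("HOME").ok();
    let profile = std::env::var("USERPROFILE").ok();
    let dir = cache_dir_from(home.as_deref(), profile.as_deref())?;
    Some((dir.join("0.onnx"), dir.join("0.bin")))
}
```

`model_paths` returns `None` when neither variable is set, which a caller can treat as a signal to supply explicit paths instead.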
License
Apache-2.0
Credits
Built with the Kokoro 82M parameter TTS model. Reduced version from kokoro-tiny by 8b-is.