Crate wavekat_tts

Unified text-to-speech for voice pipelines.

Provides a clean abstraction over TTS engines, both local models and cloud APIs, behind common Rust traits. It follows the same pattern as wavekat-vad and wavekat-turn.

All backends produce AudioFrame<'static> from wavekat-core, keeping audio abstract across the WaveKat ecosystem.

§Architecture

wavekat-vad   →  "is someone speaking?"
wavekat-turn  →  "are they done speaking?"
wavekat-tts   →  "synthesize the response"
     │                   │                     │
     └───────────────────┴─────────────────────┘
                         │
               AudioFrame (wavekat-core)

§Feature flags

§Backends

Feature      Backend            Multilingual    Requires
qwen3-tts    Qwen3-TTS (ONNX)   10 languages    ONNX model download
cosyvoice    CosyVoice (ONNX)   Yes             ONNX model download

§Execution providers

Execution provider features can be combined with any backend feature. They select the inference hardware at build time.

Feature     Provider           Platform
cuda        NVIDIA CUDA        Linux / Windows
tensorrt    NVIDIA TensorRT    Linux / Windows
coreml      Apple CoreML       macOS / iOS
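
To illustrate what "selected at build time" means for Cargo features, here is a minimal sketch using the standard `cfg!` macro. The function `selected_provider`, the priority order between providers, and the `"cpu"` fallback are assumptions made for illustration; they are not taken from wavekat-tts, and the feature names would need to be declared in the enclosing crate's Cargo.toml.

```rust
/// Hypothetical helper (not part of wavekat-tts): reports which execution
/// provider feature was enabled when this crate was compiled.
///
/// `cfg!` is evaluated at compile time, so the branch taken is fixed in the
/// binary; nothing is probed at runtime.
pub fn selected_provider() -> &'static str {
    // Assumed priority order when several features are enabled at once.
    if cfg!(feature = "tensorrt") {
        "tensorrt"
    } else if cfg!(feature = "cuda") {
        "cuda"
    } else if cfg!(feature = "coreml") {
        "coreml"
    } else {
        // Assumed fallback when no execution provider feature is enabled.
        "cpu"
    }
}
```

Built with no provider feature enabled, `selected_provider()` takes the final branch. Building with `--features cuda` would compile in the `"cuda"` branch instead.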

§Quick start

[dependencies]
wavekat-tts = { version = "0.0.1", features = ["qwen3-tts"] }

use wavekat_tts::{TtsBackend, SynthesizeRequest};
use wavekat_tts::backends::qwen3_tts::Qwen3Tts;

let tts = Qwen3Tts::new("path/to/model.onnx")?;
// Chinese sample text, roughly "I think this plan"
let request = SynthesizeRequest::new("我觉得这个方案");
let audio = tts.synthesize(&request)?;
// audio: AudioFrame<'static> at 24kHz

Modules§

backends
Backend implementations for various TTS engines.

Structs§

AudioFrame
A frame of audio samples with associated sample rate.
SynthesizeRequest
A TTS synthesis request.
VoiceInfo
Metadata about a voice available in a backend.

Enums§

Gender
Voice gender hint.
TtsError
Errors produced by TTS backends.

Traits§

StreamingTtsBackend
Streaming TTS backend: text in, AudioFrame<'static> chunks out.
TtsBackend
Batch TTS backend: text in, AudioFrame<'static> out.
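
The difference between the batch and streaming traits can be sketched with stand-in types. Everything below is hypothetical: `Frame`, `SynthError`, `BatchTts`, `StreamingTts`, `Silence`, and all method signatures are illustrative assumptions, deliberately named differently from the real `AudioFrame`, `TtsError`, `TtsBackend`, and `StreamingTtsBackend`, whose exact signatures this page does not show.

```rust
// Hypothetical stand-ins; the real types live in wavekat-tts / wavekat-core.
#[derive(Debug, Clone, PartialEq)]
pub struct Frame {
    pub samples: Vec<f32>,
    pub sample_rate: u32,
}

#[derive(Debug)]
pub struct SynthError(pub String);

/// Batch shape: the whole utterance is returned as one frame.
pub trait BatchTts {
    fn synthesize(&self, text: &str) -> Result<Frame, SynthError>;
}

/// Streaming shape: audio is delivered chunk by chunk through a callback.
/// (A callback is an assumption; an iterator or async stream would also fit.)
pub trait StreamingTts {
    fn synthesize_stream(
        &self,
        text: &str,
        on_chunk: &mut dyn FnMut(Frame),
    ) -> Result<(), SynthError>;
}

/// Toy backend that emits silence, with a fixed duration per input character.
pub struct Silence {
    pub sample_rate: u32,
    pub ms_per_char: u32,
}

impl BatchTts for Silence {
    fn synthesize(&self, text: &str) -> Result<Frame, SynthError> {
        if text.is_empty() {
            return Err(SynthError("empty text".to_string()));
        }
        let samples_per_char =
            self.sample_rate as usize * self.ms_per_char as usize / 1000;
        let n = text.chars().count() * samples_per_char;
        Ok(Frame { samples: vec![0.0; n], sample_rate: self.sample_rate })
    }
}

impl StreamingTts for Silence {
    fn synthesize_stream(
        &self,
        text: &str,
        on_chunk: &mut dyn FnMut(Frame),
    ) -> Result<(), SynthError> {
        // A real streaming backend would produce audio incrementally; this toy
        // version synthesizes everything first and then slices it.
        let full = self.synthesize(text)?;
        let chunk_len = (self.sample_rate as usize / 50).max(1); // 20 ms chunks
        for chunk in full.samples.chunks(chunk_len) {
            on_chunk(Frame { samples: chunk.to_vec(), sample_rate: full.sample_rate });
        }
        Ok(())
    }
}
```

In a voice pipeline, the streaming shape lets playback begin as soon as the first chunk arrives, while the batch shape is simpler when the full utterance is needed before anything else happens.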