§wavekat-turn
Unified turn detection with multiple backends.
Provides a clean abstraction over turn-detection models that predict whether a user has finished speaking. Two trait families cover the two fundamental input modalities:
- AudioTurnDetector: operates on raw audio frames (e.g. Pipecat Smart Turn)
- TextTurnDetector: operates on ASR transcript text (e.g. LiveKit EOU)
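To make the two input modalities concrete, here is a minimal, self-contained sketch. The trait names (SketchAudioDetector, SketchTextDetector), their methods, and the toy heuristics are all hypothetical and written only for illustration; they are not the signatures of this crate's AudioTurnDetector or TextTurnDetector, and real backends run ONNX models rather than hand-written rules.

```rust
/// Hypothetical stand-in for an audio-based detector (NOT the crate's trait).
trait SketchAudioDetector {
    /// Buffer one chunk of mono PCM samples in the range [-1.0, 1.0].
    fn push_samples(&mut self, samples: &[f32]);
    /// Estimate the probability that the speaker has finished.
    fn finished_probability(&self) -> f32;
}

/// Hypothetical stand-in for a text-based detector (NOT the crate's trait).
trait SketchTextDetector {
    /// Estimate the probability that the transcript is a complete turn.
    fn finished_probability(&self, transcript: &str) -> f32;
}

/// Toy audio heuristic: a quiet tail suggests the turn is over.
struct TailEnergy {
    samples: Vec<f32>,
    tail_len: usize,
}

impl SketchAudioDetector for TailEnergy {
    fn push_samples(&mut self, samples: &[f32]) {
        self.samples.extend_from_slice(samples);
    }

    fn finished_probability(&self) -> f32 {
        if self.samples.is_empty() {
            return 0.0;
        }
        // Look only at the most recent `tail_len` samples (at least one).
        let start = self.samples.len().saturating_sub(self.tail_len.max(1));
        let tail = &self.samples[start..];
        let mean_abs = tail.iter().map(|s| s.abs()).sum::<f32>() / tail.len() as f32;
        // Quiet tails map to values near 1.0, loud tails to values near 0.0.
        (1.0 - mean_abs * 10.0).clamp(0.0, 1.0)
    }
}

/// Toy text heuristic: terminal punctuation suggests a complete turn.
struct TerminalPunctuation;

impl SketchTextDetector for TerminalPunctuation {
    fn finished_probability(&self, transcript: &str) -> f32 {
        match transcript.trim_end().chars().last() {
            Some('.') | Some('?') | Some('!') => 0.9,
            Some(_) => 0.2,
            None => 0.0,
        }
    }
}
```

The audio side accumulates state across frames, while the text side scores a transcript in a single call. That difference is one plausible reason to model the two modalities as separate trait families.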
For most use cases, wrap a detector in TurnController to get
automatic state tracking and soft-reset logic for VAD integration.
See controller for details.
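The sketch below illustrates one way state tracking and a soft reset could interact with VAD events. Everything in it is an assumption made for illustration: SketchDetector, SketchController, SketchState, the on_* method names, and the rule "keep buffered audio if the previous turn was judged unfinished" are hypothetical and do not describe TurnController's actual API or behavior. See the controller module for the real semantics.

```rust
/// Hypothetical detector interface for this sketch (NOT the crate's API).
trait SketchDetector {
    fn push_samples(&mut self, samples: &[f32]);
    fn is_finished(&self) -> bool;
    fn clear(&mut self);
}

/// States tracked by the sketch controller.
#[derive(Debug, Clone, Copy, PartialEq)]
enum SketchState {
    Idle,
    Speaking,
    Unfinished,
    Finished,
}

/// Wraps a detector and reacts to VAD speech-start / speech-end events.
struct SketchController<D: SketchDetector> {
    detector: D,
    state: SketchState,
}

impl<D: SketchDetector> SketchController<D> {
    fn new(detector: D) -> Self {
        Self { detector, state: SketchState::Idle }
    }

    /// VAD reported that speech started.
    fn on_speech_start(&mut self) {
        // Assumed soft-reset rule: if the last turn was judged unfinished,
        // keep the buffered audio so resumed speech is scored with context.
        // In every other state, clear the buffer (hard reset).
        if self.state != SketchState::Unfinished {
            self.detector.clear();
        }
        self.state = SketchState::Speaking;
    }

    /// Forward audio to the detector only while the user is speaking.
    fn on_audio(&mut self, samples: &[f32]) {
        if self.state == SketchState::Speaking {
            self.detector.push_samples(samples);
        }
    }

    /// VAD reported that speech stopped; ask the detector for a verdict.
    fn on_speech_end(&mut self) -> SketchState {
        self.state = if self.detector.is_finished() {
            SketchState::Finished
        } else {
            SketchState::Unfinished
        };
        self.state
    }
}

/// Toy detector: "finished" once enough samples have been buffered.
struct CountingDetector {
    buffered: usize,
    needed: usize,
}

impl SketchDetector for CountingDetector {
    fn push_samples(&mut self, samples: &[f32]) {
        self.buffered += samples.len();
    }
    fn is_finished(&self) -> bool {
        self.buffered >= self.needed
    }
    fn clear(&mut self) {
        self.buffered = 0;
    }
}
```

With a CountingDetector that needs six samples, a four-sample utterance ends as Unfinished. A second four-sample utterance then ends as Finished because the soft reset kept the first four samples. The next speech start after that clears the buffer.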
§Feature flags
| Feature | Backend | Input |
|---|---|---|
| pipecat | Pipecat Smart Turn v3 (ONNX, embedded) | Audio (16 kHz) |
| wavekat-smart-turn | WaveKat language-specialized fine-tunes (ONNX, runtime download) | Audio (16 kHz) |
| livekit | LiveKit Turn Detector (ONNX) | Text |
wavekat-smart-turn implies pipecat and adds an hf-hub runtime
dependency. Weights live in
wavekat/smart-turn-ONNX
and are cached under $HF_HOME/hub/. Set WAVEKAT_TURN_MODEL_DIR to a
directory containing <lang>/smart-turn-cpu.onnx to skip the download.
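As a rough illustration of the directory layout described above, the following sketch builds the expected override path for a language and checks WAVEKAT_TURN_MODEL_DIR using only the standard library. The function names are hypothetical, and the existence check is an assumption about how an override could be validated. The crate's own lookup logic may differ.

```rust
use std::path::{Path, PathBuf};

/// Path the override directory is expected to contain for `lang`,
/// i.e. <model_dir>/<lang>/smart-turn-cpu.onnx.
fn override_model_path(model_dir: &Path, lang: &str) -> PathBuf {
    model_dir.join(lang).join("smart-turn-cpu.onnx")
}

/// Return the local model path if WAVEKAT_TURN_MODEL_DIR is set and the
/// expected file exists; `None` means a download would still be needed.
/// (Hypothetical helper, not part of the crate.)
fn local_model_from_env(lang: &str) -> Option<PathBuf> {
    let dir = std::env::var_os("WAVEKAT_TURN_MODEL_DIR")?;
    let path = override_model_path(Path::new(&dir), lang);
    path.is_file().then_some(path)
}
```

For example, with the variable set to /models and a language directory named en (an illustrative value), the expected file is /models/en/smart-turn-cpu.onnx.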
Re-exports§
pub use controller::TurnController;
pub use error::TurnError;
Modules§
- audio
- Audio-based turn detection backends.
- controller
- error
- text
- Text-based turn detection backends.
Structs§
- AudioFrame
- A frame of audio samples with associated sample rate.
- ConversationTurn
- A single turn in the conversation, for context-aware text detectors.
- StageTiming
- Per-stage timing entry.
- TurnPrediction
- A turn detection prediction with confidence and timing metadata.
Enums§
Traits§
- AudioTurnDetector
- Turn detector that operates on raw audio.
- TextTurnDetector
- Turn detector that operates on ASR transcript text.