Crate wavekat_turn

§wavekat-turn

Unified turn detection with multiple backends.

Provides a clean abstraction over turn-detection models that predict whether a user has finished speaking. Two trait families cover the two fundamental input modalities:

AudioTurnDetector: operates on raw audio frames (e.g. Pipecat Smart Turn)
TextTurnDetector: operates on ASR transcript text (e.g. LiveKit EOU)
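
The difference between the two modalities can be illustrated with a small self-contained sketch. Everything in it is hypothetical: the `Sketch*` types and traits, their method names, and the two toy heuristics are stand-ins invented for illustration and are not this crate's actual traits or signatures (see the trait pages for those). The heuristics are deliberately trivial and are not how the ONNX backends decide.

```rust
/// Hypothetical stand-in for an audio frame: samples plus their sample rate.
pub struct SketchFrame<'a> {
    pub samples: &'a [f32],
    pub sample_rate: u32,
}

/// Hypothetical stand-in for a predicted turn state.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SketchState {
    Finished,
    Unfinished,
}

/// Audio modality: the detector is fed frames and predicts from what it has seen.
pub trait SketchAudioDetector {
    fn push_frame(&mut self, frame: &SketchFrame<'_>);
    fn predict(&self) -> SketchState;
}

/// Text modality: the detector scores a transcript directly.
pub trait SketchTextDetector {
    fn predict(&self, transcript: &str) -> SketchState;
}

/// Toy audio heuristic (not a real model): "finished" when the latest frame is near-silent.
#[derive(Default)]
pub struct SilenceHeuristic {
    last_rms: Option<f32>,
}

impl SketchAudioDetector for SilenceHeuristic {
    fn push_frame(&mut self, frame: &SketchFrame<'_>) {
        if frame.samples.is_empty() {
            return; // ignore empty frames, keep the previous measurement
        }
        let energy: f32 = frame.samples.iter().map(|s| s * s).sum();
        self.last_rms = Some((energy / frame.samples.len() as f32).sqrt());
    }

    fn predict(&self) -> SketchState {
        match self.last_rms {
            Some(rms) if rms < 0.01 => SketchState::Finished,
            _ => SketchState::Unfinished,
        }
    }
}

/// Toy text heuristic (not a real model): "finished" when the transcript
/// ends with terminal punctuation.
pub struct PunctuationHeuristic;

impl SketchTextDetector for PunctuationHeuristic {
    fn predict(&self, transcript: &str) -> SketchState {
        match transcript.trim_end().chars().last() {
            Some('.' | '?' | '!') => SketchState::Finished,
            _ => SketchState::Unfinished,
        }
    }
}
```

The point of the sketch is the shape of the two interfaces: an audio detector accumulates state across frames, while a text detector can be queried with whatever transcript the ASR stage has produced so far.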

For most use cases, wrap a detector in TurnController to get automatic state tracking and soft-reset logic for VAD integration. See the controller module for details.

§Feature flags

Feature	Backend	Input
`pipecat`	Pipecat Smart Turn v3 (ONNX, embedded)	Audio (16 kHz)
`wavekat-smart-turn`	WaveKat language-specialized fine-tunes (ONNX, runtime download)	Audio (16 kHz)
`livekit`	LiveKit Turn Detector (ONNX)	Text

wavekat-smart-turn implies pipecat and adds hf-hub as a runtime dependency. Model weights are downloaded from the wavekat/smart-turn-ONNX repository on Hugging Face and cached under $HF_HOME/hub/. To skip the download, set WAVEKAT_TURN_MODEL_DIR to a directory containing <lang>/smart-turn-cpu.onnx.
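
When setting up WAVEKAT_TURN_MODEL_DIR, it can help to check the directory layout before starting the application. The following is a minimal sketch using only the standard library. Both helper functions are hypothetical and are not part of this crate, the language code `"en"` used in the usage note is an assumed example, and the crate's own lookup logic may differ in details.

```rust
use std::env;
use std::path::{Path, PathBuf};

/// Hypothetical helper (not part of wavekat-turn): the path at which a local
/// model file is expected for `lang`, following the documented
/// `<lang>/smart-turn-cpu.onnx` layout.
pub fn expected_model_path(model_dir: &Path, lang: &str) -> PathBuf {
    model_dir.join(lang).join("smart-turn-cpu.onnx")
}

/// Hypothetical helper: returns the local model path if
/// `WAVEKAT_TURN_MODEL_DIR` is set and the expected file exists there,
/// otherwise `None`.
pub fn local_model_override(lang: &str) -> Option<PathBuf> {
    let dir = env::var_os("WAVEKAT_TURN_MODEL_DIR")?;
    let path = expected_model_path(Path::new(&dir), lang);
    path.is_file().then_some(path)
}
```

Calling `local_model_override("en")` at startup and logging the result is one way to confirm that the file is where it is expected to be.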

Re-exports§

pub use controller::TurnController;
pub use error::TurnError;

Modules§

audio: Audio-based turn detection backends.
controller
error
text: Text-based turn detection backends.

Structs§

AudioFrame: A frame of audio samples with associated sample rate.
ConversationTurn: A single turn in the conversation, for context-aware text detectors.
StageTiming: Per-stage timing entry.
TurnPrediction: A turn detection prediction with confidence and timing metadata.

Enums§

Role: Speaker role in a conversation turn.
TurnState: The predicted turn state.

Traits§

AudioTurnDetector: Turn detector that operates on raw audio.
TextTurnDetector: Turn detector that operates on ASR transcript text.