pub enum RealtimeTurnDetection {
    ServerVAD {
        create_response: Option<bool>,
        idle_timeout_ms: Option<u32>,
        interrupt_response: Option<bool>,
        prefix_padding_ms: u32,
        silence_duration_ms: u32,
        threshold: f32,
    },
    SemanticVAD {
        create_response: Option<bool>,
        eagerness: String,
        interrupt_response: Option<bool>,
    },
}
Available on crate feature realtime-types only.
Variants
ServerVAD
Server-side voice activity detection (VAD) which flips on when user speech is detected and off after a period of silence.
Fields
create_response: Option<bool>
Whether or not to automatically generate a response when a VAD stop event occurs. If interrupt_response is set to false, this may fail to create a response if the model is already responding.
If both create_response and interrupt_response are set to false, the model will never respond automatically, but VAD events will still be emitted.
idle_timeout_ms: Option<u32>
Optional timeout after which a model response will be triggered automatically. This is useful in situations where a long pause from the user is unexpected, such as a phone call. The model will effectively prompt the user to continue the conversation based on the current context.
The timeout value is applied after the last model response’s audio has finished playing, i.e. it is set to the response.done time plus the audio playback duration.
An input_audio_buffer.timeout_triggered event (plus events associated with the Response) will be emitted when the timeout is reached. Idle timeout is currently only supported for server_vad mode.
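The timing rule for idle_timeout_ms can be sketched as simple arithmetic. This is only an illustration under one reading of the description (the timer starts at response.done plus playback duration, then runs for idle_timeout_ms); the helper idle_deadline and its parameters are hypothetical and not part of the crate or the server API.

```rust
use std::time::Duration;

/// Hypothetical helper: the moment the idle timeout would fire, measured on
/// the same clock as `response_done`. Not part of the crate.
fn idle_deadline(
    response_done: Duration,
    playback: Duration,
    idle_timeout_ms: Option<u32>,
) -> Option<Duration> {
    // `None` means no idle timeout was configured, so there is no deadline.
    let timeout = Duration::from_millis(u64::from(idle_timeout_ms?));
    // Assumed reading: the timer starts once the last response's audio has
    // finished playing, i.e. at `response_done + playback`.
    Some(response_done + playback + timeout)
}
```

With a response that finished at 10 s, 2 s of playback, and idle_timeout_ms set to Some(5_000), this sketch places the deadline at 17 s.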
interrupt_response: Option<bool>
Whether or not to automatically interrupt (cancel) any ongoing response with output to the default conversation (i.e. conversation of auto) when a VAD start event occurs. If true, the response will be cancelled; otherwise it will continue until complete.
If both create_response and interrupt_response are set to false, the model will never respond automatically, but VAD events will still be emitted.
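The interaction between create_response and interrupt_response on a VAD stop event can be summarized as a small decision function. This is a sketch of the rules as described in the field documentation, not the server's implementation; StopOutcome and on_vad_stop are hypothetical names, and the flags are taken as plain bools because the behaviour of an unset (None) value is not stated here.

```rust
/// What a VAD stop event leads to, following the field descriptions.
/// Hypothetical type for illustration only.
#[derive(Debug, PartialEq)]
enum StopOutcome {
    /// `create_response` is false: nothing is generated automatically.
    NoAutoResponse,
    /// A response is requested and nothing stands in its way.
    ResponseCreated,
    /// A response is requested, but the model is still responding and
    /// `interrupt_response` is false, so creation may fail.
    MayFail,
}

fn on_vad_stop(
    create_response: bool,
    interrupt_response: bool,
    model_is_responding: bool,
) -> StopOutcome {
    if !create_response {
        StopOutcome::NoAutoResponse
    } else if model_is_responding && !interrupt_response {
        StopOutcome::MayFail
    } else {
        // Assumption: with `interrupt_response` true, any ongoing response
        // was already cancelled at the preceding VAD start event.
        StopOutcome::ResponseCreated
    }
}
```

When both flags are false the function always returns NoAutoResponse, which matches the note that the model never responds automatically while VAD events are still emitted.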
prefix_padding_ms: u32
Used only for server_vad mode. Amount of audio, in milliseconds, to include before the speech detected by VAD. Defaults to 300 ms.
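To make the effect of prefix_padding_ms concrete, the sketch below computes where the retained audio would begin for a given detection point. The helper padded_start_ms is hypothetical, and clamping at zero is an assumption about what happens when speech is detected near the start of the buffer.

```rust
/// Hypothetical helper: first millisecond of audio to keep, given where VAD
/// detected speech and how much audio to include before it.
fn padded_start_ms(speech_start_ms: u64, prefix_padding_ms: u32) -> u64 {
    // Saturate at 0 so speech near the start of the buffer does not underflow.
    speech_start_ms.saturating_sub(u64::from(prefix_padding_ms))
}
```

With the 300 ms default, speech detected at 1000 ms would keep audio from 700 ms onward under this sketch.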
SemanticVAD
Server-side semantic turn detection which uses a model to determine when the user has finished speaking.
Fields
create_response: Option<bool>
Whether or not to automatically generate a response when a VAD stop event occurs.
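Since both variants carry create_response, code that handles either mode can read it with a single match. The sketch below uses a local mirror of the enum as declared at the top of this page so that it compiles without the crate; auto_create and example_semantic are hypothetical helpers, and the eagerness string "auto" is a placeholder rather than a value documented here.

```rust
/// Local mirror of the enum declared on this page, so the sketch compiles
/// on its own. It is not the crate's definition.
#[allow(dead_code)]
#[derive(Clone, Debug)]
enum RealtimeTurnDetection {
    ServerVAD {
        create_response: Option<bool>,
        idle_timeout_ms: Option<u32>,
        interrupt_response: Option<bool>,
        prefix_padding_ms: u32,
        silence_duration_ms: u32,
        threshold: f32,
    },
    SemanticVAD {
        create_response: Option<bool>,
        eagerness: String,
        interrupt_response: Option<bool>,
    },
}

/// Hypothetical helper: read `create_response` regardless of the variant.
fn auto_create(td: &RealtimeTurnDetection) -> Option<bool> {
    match td {
        RealtimeTurnDetection::ServerVAD { create_response, .. }
        | RealtimeTurnDetection::SemanticVAD { create_response, .. } => *create_response,
    }
}

/// Builds a SemanticVAD value. "auto" is a placeholder for `eagerness`.
fn example_semantic() -> RealtimeTurnDetection {
    RealtimeTurnDetection::SemanticVAD {
        create_response: Some(true),
        eagerness: String::from("auto"),
        interrupt_response: None,
    }
}
```

A None result from auto_create means the field was left unset; how the server treats an unset value is not covered by the documentation above.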
Trait Implementations
impl Clone for RealtimeTurnDetection
fn clone(&self) -> RealtimeTurnDetection
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.