pub enum RealtimeServerEvent {
Error(RealtimeServerEventError),
SessionCreated(RealtimeServerEventSessionCreated),
SessionUpdated(RealtimeServerEventSessionUpdated),
ConversationItemAdded(RealtimeServerEventConversationItemAdded),
ConversationItemDone(RealtimeServerEventConversationItemDone),
ConversationItemRetrieved(RealtimeServerEventConversationItemRetrieved),
ConversationItemInputAudioTranscriptionCompleted(RealtimeServerEventConversationItemInputAudioTranscriptionCompleted),
ConversationItemInputAudioTranscriptionDelta(RealtimeServerEventConversationItemInputAudioTranscriptionDelta),
ConversationItemInputAudioTranscriptionSegment(RealtimeServerEventConversationItemInputAudioTranscriptionSegment),
ConversationItemInputAudioTranscriptionFailed(RealtimeServerEventConversationItemInputAudioTranscriptionFailed),
ConversationItemTruncated(RealtimeServerEventConversationItemTruncated),
ConversationItemDeleted(RealtimeServerEventConversationItemDeleted),
InputAudioBufferCommitted(RealtimeServerEventInputAudioBufferCommitted),
InputAudioBufferCleared(RealtimeServerEventInputAudioBufferCleared),
InputAudioBufferSpeechStarted(RealtimeServerEventInputAudioBufferSpeechStarted),
InputAudioBufferSpeechStopped(RealtimeServerEventInputAudioBufferSpeechStopped),
InputAudioBufferTimeoutTriggered(RealtimeServerEventInputAudioBufferTimeoutTriggered),
OutputAudioBufferStarted(RealtimeServerEventOutputAudioBufferStarted),
OutputAudioBufferStopped(RealtimeServerEventOutputAudioBufferStopped),
OutputAudioBufferCleared(RealtimeServerEventOutputAudioBufferCleared),
ResponseCreated(RealtimeServerEventResponseCreated),
ResponseDone(RealtimeServerEventResponseDone),
ResponseOutputItemAdded(RealtimeServerEventResponseOutputItemAdded),
ResponseOutputItemDone(RealtimeServerEventResponseOutputItemDone),
ResponseContentPartAdded(RealtimeServerEventResponseContentPartAdded),
ResponseContentPartDone(RealtimeServerEventResponseContentPartDone),
ResponseOutputTextDelta(RealtimeServerEventResponseTextDelta),
ResponseOutputTextDone(RealtimeServerEventResponseTextDone),
ResponseOutputAudioTranscriptDelta(RealtimeServerEventResponseAudioTranscriptDelta),
ResponseOutputAudioTranscriptDone(RealtimeServerEventResponseAudioTranscriptDone),
ResponseOutputAudioDelta(RealtimeServerEventResponseAudioDelta),
ResponseOutputAudioDone(RealtimeServerEventResponseAudioDone),
ResponseFunctionCallArgumentsDelta(RealtimeServerEventResponseFunctionCallArgumentsDelta),
ResponseFunctionCallArgumentsDone(RealtimeServerEventResponseFunctionCallArgumentsDone),
ResponseMCPCallArgumentsDelta(RealtimeServerEventResponseMCPCallArgumentsDelta),
ResponseMCPCallArgumentsDone(RealtimeServerEventResponseMCPCallArgumentsDone),
ResponseMCPCallInProgress(RealtimeServerEventResponseMCPCallInProgress),
ResponseMCPCallCompleted(RealtimeServerEventResponseMCPCallCompleted),
ResponseMCPCallFailed(RealtimeServerEventResponseMCPCallFailed),
MCPListToolsInProgress(RealtimeServerEventMCPListToolsInProgress),
MCPListToolsCompleted(RealtimeServerEventMCPListToolsCompleted),
MCPListToolsFailed(RealtimeServerEventMCPListToolsFailed),
RateLimitsUpdated(RealtimeServerEventRateLimitsUpdated),
}
Available on crate feature realtime only.
These are events emitted from the OpenAI Realtime WebSocket server to the client.
Variants§
Error(RealtimeServerEventError)
Returned when an error occurs, which could be a client problem or a server problem. Most errors are recoverable and the session will stay open; we recommend that implementors monitor and log error messages by default.
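The recommended default above can be sketched as a small dispatch step: log the error and keep the session open. The `ServerEvent` enum below is a simplified stand-in for illustration, not this crate's actual variants or payload types.

```rust
// Simplified stand-in for the server-event enum; the real variants
// carry full payload structs (RealtimeServerEventError, etc.).
#[derive(Debug)]
enum ServerEvent {
    Error { message: String },
    Other,
}

/// Handle one server event. Errors are logged but do not close the
/// session, since most are recoverable; returns whether to keep the
/// session open.
fn handle_event(event: &ServerEvent) -> bool {
    match event {
        ServerEvent::Error { message } => {
            eprintln!("realtime error (session stays open): {message}");
            true
        }
        ServerEvent::Other => true,
    }
}
```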
SessionCreated(RealtimeServerEventSessionCreated)
Returned when a Session is created. Emitted automatically when a new connection is established as the first server event. This event will contain the default Session configuration.
SessionUpdated(RealtimeServerEventSessionUpdated)
Returned when a session is updated with a session.update event, unless there is an error.
ConversationItemAdded(RealtimeServerEventConversationItemAdded)
Sent by the server when an Item is added to the default Conversation. This can happen in several cases:
- When the client sends a conversation.item.create event
- When the input audio buffer is committed. In this case the item will be a user message containing the audio from the buffer.
- When the model is generating a Response. In this case the
conversation.item.added event will be sent when the model starts generating a specific Item, and thus it will not yet have any content (and status will be in_progress).
The event will include the full content of the Item except for audio data (and except when the model is generating a Response, per the case above),
which can be retrieved separately with a conversation.item.retrieve event if necessary.
ConversationItemDone(RealtimeServerEventConversationItemDone)
Returned when a conversation item is finalized.
The event will include the full content of the Item except for audio data, which can be retrieved
separately with a conversation.item.retrieve event if needed.
ConversationItemRetrieved(RealtimeServerEventConversationItemRetrieved)
Returned when a conversation item is retrieved with conversation.item.retrieve.
This is provided as a way to fetch the server’s representation of an item, for example to get access
to the post-processed audio data after noise cancellation and VAD.
It includes the full content of the Item, including audio data.
ConversationItemInputAudioTranscriptionCompleted(RealtimeServerEventConversationItemInputAudioTranscriptionCompleted)
This event is the output of audio transcription for user audio written to the user audio buffer. Transcription begins when the input audio buffer is committed by the client or server (when VAD is enabled). Transcription runs asynchronously with Response creation, so this event may come before or after the Response events.
Realtime API models accept audio natively, and thus input transcription is a separate process run on a separate ASR (Automatic Speech Recognition) model. The transcript may diverge somewhat from the model’s interpretation, and should be treated as a rough guide.
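Because transcription runs asynchronously with Response creation, clients typically accumulate deltas per item and treat the completed event as authoritative. A minimal sketch, assuming delta and completed events each carry an item id and text (field names here are illustrative):

```rust
use std::collections::HashMap;

/// Accumulates input-audio transcription deltas per item id. The
/// `...transcription.completed` event carries the final transcript,
/// which supersedes anything accumulated from deltas.
#[derive(Default)]
struct TranscriptTracker {
    partial: HashMap<String, String>,
}

impl TranscriptTracker {
    /// On `...transcription.delta`, append the incremental text.
    fn on_delta(&mut self, item_id: &str, delta: &str) {
        self.partial
            .entry(item_id.to_string())
            .or_default()
            .push_str(delta);
    }

    /// On `...transcription.completed`, drop the partial buffer and
    /// return the authoritative transcript from the event.
    fn on_completed(&mut self, item_id: &str, transcript: String) -> String {
        self.partial.remove(item_id);
        transcript
    }
}
```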
ConversationItemInputAudioTranscriptionDelta(RealtimeServerEventConversationItemInputAudioTranscriptionDelta)
Returned when the text value of an input audio transcription content part is updated with incremental transcription results.
ConversationItemInputAudioTranscriptionSegment(RealtimeServerEventConversationItemInputAudioTranscriptionSegment)
Returned when an input audio transcription segment is identified for an item.
ConversationItemInputAudioTranscriptionFailed(RealtimeServerEventConversationItemInputAudioTranscriptionFailed)
Returned when input audio transcription is configured, and a transcription request for a user message failed.
These events are separate from other error events so that the client can identify the related Item.
ConversationItemTruncated(RealtimeServerEventConversationItemTruncated)
Returned when an earlier assistant audio message item is truncated by the client with a conversation.item.truncate event.
This event is used to synchronize the server’s understanding of the audio with the client’s playback.
This action will truncate the audio and remove the server-side text transcript to ensure there is no text in the context that hasn’t been heard by the user.
ConversationItemDeleted(RealtimeServerEventConversationItemDeleted)
Returned when an item in the conversation is deleted by the client with a conversation.item.delete event.
This event is used to synchronize the server’s understanding of the conversation history with the client’s view.
InputAudioBufferCommitted(RealtimeServerEventInputAudioBufferCommitted)
Returned when an input audio buffer is committed, either by the client or automatically in server VAD mode.
The item_id property is the ID of the user message item that will be created,
thus a conversation.item.created event will also be sent to the client.
InputAudioBufferCleared(RealtimeServerEventInputAudioBufferCleared)
Returned when the input audio buffer is cleared by the client with an input_audio_buffer.clear event.
InputAudioBufferSpeechStarted(RealtimeServerEventInputAudioBufferSpeechStarted)
Sent by the server when in server_vad mode to indicate that speech has been detected in the audio buffer.
This can happen any time audio is added to the buffer (unless speech is already detected).
The client may want to use this event to interrupt audio playback or provide visual feedback to the user.
The client should expect to receive an input_audio_buffer.speech_stopped event when speech stops.
The item_id property is the ID of the user message item that will be created when speech stops and will
also be included in the input_audio_buffer.speech_stopped event (unless the client manually commits the
audio buffer during VAD activation).
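A common client-side pattern for these two VAD events is barge-in handling: interrupt playback on speech_started, then wait for the user item on speech_stopped. A sketch under the assumption that the client dispatches on the event's type string and item_id (the item id value below is hypothetical):

```rust
/// What the client should do in response to a server VAD event
/// (illustrative; real clients would act on audio playback directly).
#[derive(Debug, PartialEq)]
enum PlaybackAction {
    /// speech_started: stop local playback so the model isn't talking
    /// over the user.
    InterruptPlayback { pending_item_id: String },
    /// speech_stopped: a user message item with this id will follow.
    AwaitUserItem { item_id: String },
}

/// Map a VAD event type string to a client action; other event types
/// require no playback change.
fn on_vad_event(event_type: &str, item_id: &str) -> Option<PlaybackAction> {
    match event_type {
        "input_audio_buffer.speech_started" => Some(PlaybackAction::InterruptPlayback {
            pending_item_id: item_id.to_string(),
        }),
        "input_audio_buffer.speech_stopped" => Some(PlaybackAction::AwaitUserItem {
            item_id: item_id.to_string(),
        }),
        _ => None,
    }
}
```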
InputAudioBufferSpeechStopped(RealtimeServerEventInputAudioBufferSpeechStopped)
Returned in server_vad mode when the server detects the end of speech in the audio buffer.
The server will also send a conversation.item.created event with the user message item that is created from the audio buffer.
InputAudioBufferTimeoutTriggered(RealtimeServerEventInputAudioBufferTimeoutTriggered)
Returned when the Server VAD timeout is triggered for the input audio buffer. This is
configured with idle_timeout_ms in the turn_detection settings of the session, and
it indicates that there hasn’t been any speech detected for the configured duration.
The audio_start_ms and audio_end_ms fields indicate the segment of audio after the
last model response up to the triggering time, as an offset from the beginning of audio
written to the input audio buffer. This means it demarcates the segment of audio that
was silent and the difference between the start and end values will roughly match the configured timeout.
The empty audio will be committed to the conversation as an input_audio item (there
will be an input_audio_buffer.committed event) and a model response will be generated.
There may be speech that didn’t trigger VAD but is still detected by the model, so the model may respond
with something relevant to the conversation or a prompt to continue speaking.
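Since both fields are offsets from the start of buffered audio, the silent segment's length falls out of a simple subtraction and should roughly match the configured idle_timeout_ms. A sketch (struct shape is illustrative, not the crate's payload type):

```rust
/// Fields of interest from input_audio_buffer.timeout_triggered; both
/// are millisecond offsets from the beginning of audio written to the
/// input audio buffer.
struct TimeoutTriggered {
    audio_start_ms: u64,
    audio_end_ms: u64,
}

/// Duration of the silent segment; roughly equal to the configured
/// idle_timeout_ms in the session's turn_detection settings.
fn silence_duration_ms(ev: &TimeoutTriggered) -> u64 {
    ev.audio_end_ms.saturating_sub(ev.audio_start_ms)
}
```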
OutputAudioBufferStarted(RealtimeServerEventOutputAudioBufferStarted)
WebRTC Only: Emitted when the server begins streaming audio to the client. This
event is emitted after an audio content part has been added (response.content_part.added) to the response.
OutputAudioBufferStopped(RealtimeServerEventOutputAudioBufferStopped)
WebRTC Only: Emitted when the output audio buffer has been completely drained on
the server, and no more audio is forthcoming. This event is emitted after the full response data has been sent
to the client (response.done).
OutputAudioBufferCleared(RealtimeServerEventOutputAudioBufferCleared)
WebRTC Only: Emitted when the output audio buffer is cleared. This happens either in
VAD mode when the user has interrupted (input_audio_buffer.speech_started), or when the client has
emitted the output_audio_buffer.clear event to manually cut off the current audio response.
ResponseCreated(RealtimeServerEventResponseCreated)
Returned when a new Response is created. The first event of response creation,
where the response is in an initial state of in_progress.
ResponseDone(RealtimeServerEventResponseDone)
Returned when a Response is done streaming. Always emitted, no matter the final state.
The Response object included in the response.done event will include all output Items in the Response
but will omit the raw audio data.
Clients should check the status field of the Response to determine if it was successful
(completed) or if there was another outcome: cancelled, failed, or incomplete.
A response will contain all output items that were generated during the response, excluding any audio content.
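Because response.done fires for every outcome, success must be checked via the status field rather than inferred from receiving the event. A sketch with a simplified stand-in for the status type:

```rust
/// Terminal states of a Response, per the status field on the
/// response.done payload (simplified stand-in for the crate's type).
#[derive(Debug, PartialEq)]
enum ResponseStatus {
    Completed,
    Cancelled,
    Failed,
    Incomplete,
}

/// Only `completed` counts as success; the other three are distinct
/// non-success outcomes that still emit response.done.
fn response_succeeded(status: &ResponseStatus) -> bool {
    matches!(status, ResponseStatus::Completed)
}
```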
ResponseOutputItemAdded(RealtimeServerEventResponseOutputItemAdded)
Returned when a new Item is created during Response generation.
ResponseOutputItemDone(RealtimeServerEventResponseOutputItemDone)
Returned when an Item is done streaming. Also emitted when a Response is interrupted, incomplete, or cancelled.
ResponseContentPartAdded(RealtimeServerEventResponseContentPartAdded)
Returned when a new content part is added to an assistant message item during response generation.
ResponseContentPartDone(RealtimeServerEventResponseContentPartDone)
Returned when a content part is done streaming in an assistant message item. Also emitted when a Response is interrupted, incomplete, or cancelled.
ResponseOutputTextDelta(RealtimeServerEventResponseTextDelta)
Returned when the text value of an “output_text” content part is updated.
ResponseOutputTextDone(RealtimeServerEventResponseTextDone)
Returned when the text value of an “output_text” content part is done streaming. Also emitted when a Response is interrupted, incomplete, or cancelled.
ResponseOutputAudioTranscriptDelta(RealtimeServerEventResponseAudioTranscriptDelta)
Returned when the model-generated transcription of audio output is updated.
ResponseOutputAudioTranscriptDone(RealtimeServerEventResponseAudioTranscriptDone)
Returned when the model-generated transcription of audio output is done streaming. Also emitted when a Response is interrupted, incomplete, or cancelled.
ResponseOutputAudioDelta(RealtimeServerEventResponseAudioDelta)
Returned when the model-generated audio is updated.
ResponseOutputAudioDone(RealtimeServerEventResponseAudioDone)
Returned when the model-generated audio is done. Also emitted when a Response is interrupted, incomplete, or cancelled.
ResponseFunctionCallArgumentsDelta(RealtimeServerEventResponseFunctionCallArgumentsDelta)
Returned when the model-generated function call arguments are updated.
ResponseFunctionCallArgumentsDone(RealtimeServerEventResponseFunctionCallArgumentsDone)
Returned when the model-generated function call arguments are done streaming. Also emitted when a Response is interrupted, incomplete, or cancelled.
ResponseMCPCallArgumentsDelta(RealtimeServerEventResponseMCPCallArgumentsDelta)
Returned when MCP tool call arguments are updated.
ResponseMCPCallArgumentsDone(RealtimeServerEventResponseMCPCallArgumentsDone)
Returned when MCP tool call arguments are finalized during response generation.
ResponseMCPCallInProgress(RealtimeServerEventResponseMCPCallInProgress)
Returned when an MCP tool call is in progress.
ResponseMCPCallCompleted(RealtimeServerEventResponseMCPCallCompleted)
Returned when an MCP tool call has completed successfully.
ResponseMCPCallFailed(RealtimeServerEventResponseMCPCallFailed)
Returned when an MCP tool call has failed.
MCPListToolsInProgress(RealtimeServerEventMCPListToolsInProgress)
Returned when listing MCP tools is in progress for an item.
MCPListToolsCompleted(RealtimeServerEventMCPListToolsCompleted)
Returned when listing MCP tools has completed for an item.
MCPListToolsFailed(RealtimeServerEventMCPListToolsFailed)
Returned when listing MCP tools has failed for an item.
RateLimitsUpdated(RealtimeServerEventRateLimitsUpdated)
Emitted at the beginning of a Response to indicate the updated rate limits. When a Response is created some tokens will be “reserved” for the output tokens, the rate limits shown here reflect that reservation, which is then adjusted accordingly once the Response is completed.
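A client can use these updates for admission control: before requesting another Response, check that the remaining token budget covers the expected reservation. A minimal sketch, assuming each entry names a limit ("tokens" here) with limit/remaining/reset fields (field names are illustrative):

```rust
/// One entry from rate_limits_updated (illustrative shape).
#[allow(dead_code)]
struct RateLimit {
    name: String,
    limit: u64,
    remaining: u64,
    reset_seconds: f64,
}

/// Whether enough token budget remains to start another response,
/// given the tokens that will be reserved for its output. If no
/// token limit is reported, optimistically allow the request.
fn can_start_response(limits: &[RateLimit], reserved_tokens: u64) -> bool {
    limits
        .iter()
        .find(|l| l.name == "tokens")
        .map(|l| l.remaining >= reserved_tokens)
        .unwrap_or(true)
}
```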
Trait Implementations§
impl Clone for RealtimeServerEvent
fn clone(&self) -> RealtimeServerEvent
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.