RealtimeServerEvent

Enum RealtimeServerEvent

pub enum RealtimeServerEvent {
    Error(RealtimeServerEventError),
    SessionCreated(RealtimeServerEventSessionCreated),
    SessionUpdated(RealtimeServerEventSessionUpdated),
    ConversationItemAdded(RealtimeServerEventConversationItemAdded),
    ConversationItemDone(RealtimeServerEventConversationItemDone),
    ConversationItemRetrieved(RealtimeServerEventConversationItemRetrieved),
    ConversationItemInputAudioTranscriptionCompleted(RealtimeServerEventConversationItemInputAudioTranscriptionCompleted),
    ConversationItemInputAudioTranscriptionDelta(RealtimeServerEventConversationItemInputAudioTranscriptionDelta),
    ConversationItemInputAudioTranscriptionSegment(RealtimeServerEventConversationItemInputAudioTranscriptionSegment),
    ConversationItemInputAudioTranscriptionFailed(RealtimeServerEventConversationItemInputAudioTranscriptionFailed),
    ConversationItemTruncated(RealtimeServerEventConversationItemTruncated),
    ConversationItemDeleted(RealtimeServerEventConversationItemDeleted),
    InputAudioBufferCommitted(RealtimeServerEventInputAudioBufferCommitted),
    InputAudioBufferCleared(RealtimeServerEventInputAudioBufferCleared),
    InputAudioBufferSpeechStarted(RealtimeServerEventInputAudioBufferSpeechStarted),
    InputAudioBufferSpeechStopped(RealtimeServerEventInputAudioBufferSpeechStopped),
    InputAudioBufferTimeoutTriggered(RealtimeServerEventInputAudioBufferTimeoutTriggered),
    OutputAudioBufferStarted(RealtimeServerEventOutputAudioBufferStarted),
    OutputAudioBufferStopped(RealtimeServerEventOutputAudioBufferStopped),
    OutputAudioBufferCleared(RealtimeServerEventOutputAudioBufferCleared),
    ResponseCreated(RealtimeServerEventResponseCreated),
    ResponseDone(RealtimeServerEventResponseDone),
    ResponseOutputItemAdded(RealtimeServerEventResponseOutputItemAdded),
    ResponseOutputItemDone(RealtimeServerEventResponseOutputItemDone),
    ResponseContentPartAdded(RealtimeServerEventResponseContentPartAdded),
    ResponseContentPartDone(RealtimeServerEventResponseContentPartDone),
    ResponseOutputTextDelta(RealtimeServerEventResponseTextDelta),
    ResponseOutputTextDone(RealtimeServerEventResponseTextDone),
    ResponseOutputAudioTranscriptDelta(RealtimeServerEventResponseAudioTranscriptDelta),
    ResponseOutputAudioTranscriptDone(RealtimeServerEventResponseAudioTranscriptDone),
    ResponseOutputAudioDelta(RealtimeServerEventResponseAudioDelta),
    ResponseOutputAudioDone(RealtimeServerEventResponseAudioDone),
    ResponseFunctionCallArgumentsDelta(RealtimeServerEventResponseFunctionCallArgumentsDelta),
    ResponseFunctionCallArgumentsDone(RealtimeServerEventResponseFunctionCallArgumentsDone),
    ResponseMCPCallArgumentsDelta(RealtimeServerEventResponseMCPCallArgumentsDelta),
    ResponseMCPCallArgumentsDone(RealtimeServerEventResponseMCPCallArgumentsDone),
    ResponseMCPCallInProgress(RealtimeServerEventResponseMCPCallInProgress),
    ResponseMCPCallCompleted(RealtimeServerEventResponseMCPCallCompleted),
    ResponseMCPCallFailed(RealtimeServerEventResponseMCPCallFailed),
    MCPListToolsInProgress(RealtimeServerEventMCPListToolsInProgress),
    MCPListToolsCompleted(RealtimeServerEventMCPListToolsCompleted),
    MCPListToolsFailed(RealtimeServerEventMCPListToolsFailed),
    RateLimitsUpdated(RealtimeServerEventRateLimitsUpdated),
}

Available on crate feature realtime only.

These are events emitted from the OpenAI Realtime WebSocket server to the client.
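
On the wire, each server event is a JSON object whose type field is a dotted name (for example session.created or response.done), and a client typically deserializes every incoming WebSocket text frame into this enum and dispatches it with a single match. The sketch below uses MiniServerEvent, a hypothetical scaled-down stand-in for the real enum (whose variants carry full payload structs):

```rust
// Hypothetical, scaled-down stand-in for RealtimeServerEvent: the real enum
// has 43 variants, each carrying a payload struct produced by deserialization.
#[derive(Debug)]
enum MiniServerEvent {
    Error { message: String },
    SessionCreated { session_id: String },
    ResponseDone { status: String },
}

// Dispatch one server event; returns a short description of the action taken.
fn handle(event: &MiniServerEvent) -> String {
    match event {
        // Most errors are recoverable and the session stays open: log and continue.
        MiniServerEvent::Error { message } => format!("logged error: {message}"),
        MiniServerEvent::SessionCreated { session_id } => {
            format!("session ready: {session_id}")
        }
        MiniServerEvent::ResponseDone { status } => format!("response finished: {status}"),
    }
}

fn main() {
    let events = [
        MiniServerEvent::SessionCreated { session_id: "sess_1".into() },
        MiniServerEvent::Error { message: "rate limited".into() },
        MiniServerEvent::ResponseDone { status: "completed".into() },
    ];
    for e in &events {
        println!("{}", handle(e));
    }
}
```

An exhaustive match over the real enum has the useful property that adding a variant in a future crate version becomes a compile error, prompting the client to decide how to handle the new event.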

Variants

Error(RealtimeServerEventError)

Returned when an error occurs, which could be a client problem or a server problem. Most errors are recoverable and the session will stay open; we recommend that implementations monitor and log error messages by default.

SessionCreated(RealtimeServerEventSessionCreated)

Returned when a Session is created. Emitted automatically when a new connection is established as the first server event. This event will contain the default Session configuration.

SessionUpdated(RealtimeServerEventSessionUpdated)

Returned when a session is updated with a session.update event, unless there is an error.

ConversationItemAdded(RealtimeServerEventConversationItemAdded)

Sent by the server when an Item is added to the default Conversation. This can happen in several cases:

  • When the client sends a conversation.item.create event
  • When the input audio buffer is committed. In this case the item will be a user message containing the audio from the buffer.
  • When the model is generating a Response. In this case the conversation.item.added event will be sent when the model starts generating a specific Item, and thus it will not yet have any content (and status will be in_progress).

The event will include the full content of the Item (except when the model is still generating a Response, as described above), excluding audio data, which can be retrieved separately with a conversation.item.retrieve event if necessary.

ConversationItemDone(RealtimeServerEventConversationItemDone)

Returned when a conversation item is finalized.

The event will include the full content of the Item except for audio data, which can be retrieved separately with a conversation.item.retrieve event if needed.
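
Since audio data is omitted from this event, a client that needs it can follow up with a conversation.item.retrieve client event. Below is a minimal sketch of building that event's JSON by hand; the item_id field name is an assumption here and should be checked against the client-event reference:

```rust
// Build the JSON text frame for a `conversation.item.retrieve` client event.
// NOTE: the `item_id` field name is assumed; verify against the client-event docs.
fn retrieve_item_event(item_id: &str) -> String {
    format!(r#"{{"type":"conversation.item.retrieve","item_id":"{item_id}"}}"#)
}

fn main() {
    // This string would be sent as a WebSocket text frame; the server replies
    // with a ConversationItemRetrieved event that includes the audio data.
    println!("{}", retrieve_item_event("item_abc123"));
}
```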

ConversationItemRetrieved(RealtimeServerEventConversationItemRetrieved)

Returned when a conversation item is retrieved with conversation.item.retrieve. This is provided as a way to fetch the server’s representation of an item, for example to get access to the post-processed audio data after noise cancellation and VAD. It includes the full content of the Item, including audio data.

ConversationItemInputAudioTranscriptionCompleted(RealtimeServerEventConversationItemInputAudioTranscriptionCompleted)

This event is the output of audio transcription for user audio written to the user audio buffer. Transcription begins when the input audio buffer is committed by the client or server (when VAD is enabled). Transcription runs asynchronously with Response creation, so this event may come before or after the Response events.

Realtime API models accept audio natively, and thus input transcription is a separate process run on a separate ASR (Automatic Speech Recognition) model. The transcript may diverge somewhat from the model’s interpretation, and should be treated as a rough guide.
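
Because transcription runs asynchronously relative to Response events, a workable client pattern is to accumulate transcription deltas per item_id and replace them with the server's final transcript on completion. A sketch with simplified stand-in payloads (the real events carry more fields):

```rust
use std::collections::HashMap;

// Accumulates in-progress input transcripts keyed by item_id. The payloads
// here are simplified stand-ins for the real transcription event structs.
#[derive(Default)]
struct Transcripts {
    partial: HashMap<String, String>,
    completed: HashMap<String, String>,
}

impl Transcripts {
    // Append an incremental transcription result for an item.
    fn on_delta(&mut self, item_id: &str, delta: &str) {
        self.partial.entry(item_id.to_string()).or_default().push_str(delta);
    }

    // On completion, the server's final transcript replaces accumulated deltas,
    // since the final text may differ from the concatenated partials.
    fn on_completed(&mut self, item_id: &str, transcript: &str) {
        self.partial.remove(item_id);
        self.completed.insert(item_id.to_string(), transcript.to_string());
    }
}

fn main() {
    let mut t = Transcripts::default();
    t.on_delta("item_1", "Hello ");
    t.on_delta("item_1", "world");
    t.on_completed("item_1", "Hello world.");
    assert_eq!(t.completed["item_1"], "Hello world.");
    assert!(t.partial.is_empty());
}
```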

ConversationItemInputAudioTranscriptionDelta(RealtimeServerEventConversationItemInputAudioTranscriptionDelta)

Returned when the text value of an input audio transcription content part is updated with incremental transcription results.

ConversationItemInputAudioTranscriptionSegment(RealtimeServerEventConversationItemInputAudioTranscriptionSegment)

Returned when an input audio transcription segment is identified for an item.

ConversationItemInputAudioTranscriptionFailed(RealtimeServerEventConversationItemInputAudioTranscriptionFailed)

Returned when input audio transcription is configured, and a transcription request for a user message failed. These events are separate from other error events so that the client can identify the related Item.

ConversationItemTruncated(RealtimeServerEventConversationItemTruncated)

Returned when an earlier assistant audio message item is truncated by the client with a conversation.item.truncate event. This event is used to synchronize the server’s understanding of the audio with the client’s playback.

This action will truncate the audio and remove the server-side text transcript to ensure there is no text in the context that hasn’t been heard by the user.

ConversationItemDeleted(RealtimeServerEventConversationItemDeleted)

Returned when an item in the conversation is deleted by the client with a conversation.item.delete event. This event is used to synchronize the server’s understanding of the conversation history with the client’s view.

InputAudioBufferCommitted(RealtimeServerEventInputAudioBufferCommitted)

Returned when an input audio buffer is committed, either by the client or automatically in server VAD mode. The item_id property is the ID of the user message item that will be created, thus a conversation.item.created event will also be sent to the client.

InputAudioBufferCleared(RealtimeServerEventInputAudioBufferCleared)

Returned when the input audio buffer is cleared by the client with an input_audio_buffer.clear event.

InputAudioBufferSpeechStarted(RealtimeServerEventInputAudioBufferSpeechStarted)

Sent by the server when in server_vad mode to indicate that speech has been detected in the audio buffer. This can happen any time audio is added to the buffer (unless speech is already detected). The client may want to use this event to interrupt audio playback or provide visual feedback to the user.

The client should expect to receive an input_audio_buffer.speech_stopped event when speech stops. The item_id property is the ID of the user message item that will be created when speech stops, and it will also be included in the input_audio_buffer.speech_stopped event (unless the client manually commits the audio buffer during VAD activation).
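
A common use of this event is barge-in: cut assistant playback as soon as speech_started arrives, then wait for the next response audio to resume. Sketched below as a minimal state transition (the Playback state names are illustrative):

```rust
#[derive(Debug, PartialEq)]
enum Playback {
    Playing,
    Interrupted,
}

// React to the two server VAD events by event type string: on speech_started,
// cut assistant playback (barge-in); on speech_stopped, the user turn is over
// and playback resumes only when the next response starts streaming audio.
fn on_vad_event(state: Playback, event_type: &str) -> Playback {
    match event_type {
        "input_audio_buffer.speech_started" => Playback::Interrupted,
        "input_audio_buffer.speech_stopped" => state, // await next response audio
        _ => state,
    }
}

fn main() {
    let s = on_vad_event(Playback::Playing, "input_audio_buffer.speech_started");
    assert_eq!(s, Playback::Interrupted);
}
```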

InputAudioBufferSpeechStopped(RealtimeServerEventInputAudioBufferSpeechStopped)

Returned in server_vad mode when the server detects the end of speech in the audio buffer. The server will also send a conversation.item.created event with the user message item that is created from the audio buffer.

InputAudioBufferTimeoutTriggered(RealtimeServerEventInputAudioBufferTimeoutTriggered)

Returned when the Server VAD timeout is triggered for the input audio buffer. This is configured with idle_timeout_ms in the turn_detection settings of the session, and it indicates that there hasn’t been any speech detected for the configured duration.

The audio_start_ms and audio_end_ms fields indicate the segment of audio after the last model response up to the triggering time, as an offset from the beginning of audio written to the input audio buffer. This means it demarcates the segment of audio that was silent and the difference between the start and end values will roughly match the configured timeout.

The empty audio will be committed to the conversation as an input_audio item (there will be an input_audio_buffer.committed event) and a model response will be generated. There may be speech that didn’t trigger VAD but is still detected by the model, so the model may respond with something relevant to the conversation or a prompt to continue speaking.
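
The relationship between the two offsets is simple arithmetic, so a client can sanity-check that the silent segment roughly matches its configured idle_timeout_ms:

```rust
// The timeout event demarcates the silent audio segment; its length should
// roughly match the session's configured turn_detection.idle_timeout_ms.
fn silent_segment_ms(audio_start_ms: u64, audio_end_ms: u64) -> u64 {
    audio_end_ms.saturating_sub(audio_start_ms)
}

fn main() {
    // e.g. with idle_timeout_ms = 5000, the segment is about five seconds long
    assert_eq!(silent_segment_ms(12_000, 17_000), 5_000);
}
```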

OutputAudioBufferStarted(RealtimeServerEventOutputAudioBufferStarted)

WebRTC Only: Emitted when the server begins streaming audio to the client. This event is emitted after an audio content part has been added (response.content_part.added) to the response.

OutputAudioBufferStopped(RealtimeServerEventOutputAudioBufferStopped)

WebRTC Only: Emitted when the output audio buffer has been completely drained on the server, and no more audio is forthcoming. This event is emitted after the full response data has been sent to the client (response.done).

OutputAudioBufferCleared(RealtimeServerEventOutputAudioBufferCleared)

WebRTC Only: Emitted when the output audio buffer is cleared. This happens either in VAD mode when the user has interrupted (input_audio_buffer.speech_started), or when the client has emitted the output_audio_buffer.clear event to manually cut off the current audio response.

ResponseCreated(RealtimeServerEventResponseCreated)

Returned when a new Response is created. The first event of response creation, where the response is in an initial state of in_progress.

ResponseDone(RealtimeServerEventResponseDone)

Returned when a Response is done streaming. Always emitted, no matter the final state. The Response object included in the response.done event will contain all output Items generated during the Response but will omit the raw audio data.

Clients should check the status field of the Response to determine whether it was successful (completed) or had another outcome: cancelled, failed, or incomplete.
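
Branching on the status values named above can be sketched as follows (the Outcome type is illustrative, not part of the crate):

```rust
// Interpret the `status` field of a finished Response. The string values
// ("completed", "cancelled", "failed", "incomplete") come from the
// response.done event; this helper type is purely illustrative.
#[derive(Debug, PartialEq)]
enum Outcome {
    Success,
    Cancelled,
    Failed,
    Incomplete,
    Unknown,
}

fn classify_status(status: &str) -> Outcome {
    match status {
        "completed" => Outcome::Success,
        "cancelled" => Outcome::Cancelled,
        "failed" => Outcome::Failed,
        "incomplete" => Outcome::Incomplete,
        _ => Outcome::Unknown,
    }
}

fn main() {
    assert_eq!(classify_status("completed"), Outcome::Success);
    assert_eq!(classify_status("failed"), Outcome::Failed);
}
```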

ResponseOutputItemAdded(RealtimeServerEventResponseOutputItemAdded)

Returned when a new Item is created during Response generation.

ResponseOutputItemDone(RealtimeServerEventResponseOutputItemDone)

Returned when an Item is done streaming. Also emitted when a Response is interrupted, incomplete, or cancelled.

ResponseContentPartAdded(RealtimeServerEventResponseContentPartAdded)

Returned when a new content part is added to an assistant message item during response generation.

ResponseContentPartDone(RealtimeServerEventResponseContentPartDone)

Returned when a content part is done streaming in an assistant message item. Also emitted when a Response is interrupted, incomplete, or cancelled.

ResponseOutputTextDelta(RealtimeServerEventResponseTextDelta)

Returned when the text value of an “output_text” content part is updated.

ResponseOutputTextDone(RealtimeServerEventResponseTextDone)

Returned when the text value of an “output_text” content part is done streaming. Also emitted when a Response is interrupted, incomplete, or cancelled.

ResponseOutputAudioTranscriptDelta(RealtimeServerEventResponseAudioTranscriptDelta)

Returned when the model-generated transcription of audio output is updated.

ResponseOutputAudioTranscriptDone(RealtimeServerEventResponseAudioTranscriptDone)

Returned when the model-generated transcription of audio output is done streaming. Also emitted when a Response is interrupted, incomplete, or cancelled.

ResponseOutputAudioDelta(RealtimeServerEventResponseAudioDelta)

Returned when the model-generated audio is updated.

ResponseOutputAudioDone(RealtimeServerEventResponseAudioDone)

Returned when the model-generated audio is done. Also emitted when a Response is interrupted, incomplete, or cancelled.

ResponseFunctionCallArgumentsDelta(RealtimeServerEventResponseFunctionCallArgumentsDelta)

Returned when the model-generated function call arguments are updated.

ResponseFunctionCallArgumentsDone(RealtimeServerEventResponseFunctionCallArgumentsDone)

Returned when the model-generated function call arguments are done streaming. Also emitted when a Response is interrupted, incomplete, or cancelled.

ResponseMCPCallArgumentsDelta(RealtimeServerEventResponseMCPCallArgumentsDelta)

Returned when MCP tool call arguments are updated.

ResponseMCPCallArgumentsDone(RealtimeServerEventResponseMCPCallArgumentsDone)

Returned when MCP tool call arguments are finalized during response generation.

ResponseMCPCallInProgress(RealtimeServerEventResponseMCPCallInProgress)

Returned when an MCP tool call is in progress.

ResponseMCPCallCompleted(RealtimeServerEventResponseMCPCallCompleted)

Returned when an MCP tool call has completed successfully.

ResponseMCPCallFailed(RealtimeServerEventResponseMCPCallFailed)

Returned when an MCP tool call has failed.

MCPListToolsInProgress(RealtimeServerEventMCPListToolsInProgress)

Returned when listing MCP tools is in progress for an item.

MCPListToolsCompleted(RealtimeServerEventMCPListToolsCompleted)

Returned when listing MCP tools has completed for an item.

MCPListToolsFailed(RealtimeServerEventMCPListToolsFailed)

Returned when listing MCP tools has failed for an item.

RateLimitsUpdated(RealtimeServerEventRateLimitsUpdated)

Emitted at the beginning of a Response to indicate the updated rate limits. When a Response is created, some tokens will be “reserved” for the output; the rate limits shown here reflect that reservation, which is then adjusted accordingly once the Response is completed.
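
A client usually just replaces its cached snapshot on each rate_limits.updated event, since the values already reflect the reservation; the field shape below (name/limit/remaining) is an assumption modeled on typical rate-limit entries, not the crate's actual struct:

```rust
// Hypothetical rate-limit entry; the real event payload's field names should
// be checked against RealtimeServerEventRateLimitsUpdated.
#[derive(Debug, Clone)]
struct RateLimit {
    name: String,
    limit: u64,
    remaining: u64,
}

// rate_limits.updated arrives at the start of a Response with output tokens
// already "reserved". Replacing the whole snapshot keeps the client's view
// conservative until the post-Response adjustment arrives.
fn apply_update(current: &mut Vec<RateLimit>, update: Vec<RateLimit>) {
    *current = update;
}

fn main() {
    let mut snapshot: Vec<RateLimit> = Vec::new();
    apply_update(&mut snapshot, vec![RateLimit {
        name: "tokens".into(),
        limit: 100_000,
        remaining: 95_500, // reflects the reservation for the pending Response
    }]);
    assert_eq!(snapshot[0].remaining, 95_500);
}
```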

Trait Implementations

impl Clone for RealtimeServerEvent

fn clone(&self) -> RealtimeServerEvent
Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.

impl Debug for RealtimeServerEvent

fn fmt(&self, f: &mut Formatter<'_>) -> Result
Formats the value using the given formatter.

impl<'de> Deserialize<'de> for RealtimeServerEvent

fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where __D: Deserializer<'de>
Deserialize this value from the given Serde deserializer.

impl Serialize for RealtimeServerEvent

fn serialize<__S>(&self, __serializer: __S) -> Result<__S::Ok, __S::Error>
where __S: Serializer
Serialize this value into the given Serde serializer.

Auto Trait Implementations

Blanket Implementations

impl<T> Any for T where T: 'static + ?Sized
fn type_id(&self) -> TypeId
Gets the TypeId of self.

impl<T> Borrow<T> for T where T: ?Sized
fn borrow(&self) -> &T
Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T where T: ?Sized
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.

impl<T> CloneToUninit for T where T: Clone
unsafe fn clone_to_uninit(&self, dest: *mut u8)
Nightly-only experimental API (clone_to_uninit). Performs copy-assignment from self to dest.

impl<T> From<T> for T
fn from(t: T) -> T
Returns the argument unchanged.

impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
Instruments this type with the provided Span, returning an Instrumented wrapper.
fn in_current_span(self) -> Instrumented<Self>
Instruments this type with the current Span, returning an Instrumented wrapper.

impl<T, U> Into<U> for T where U: From<T>
fn into(self) -> U
Calls U::from(self). That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> PolicyExt for T where T: ?Sized
fn and<P, B, E>(self, other: P) -> And<T, P> where T: Policy<B, E>, P: Policy<B, E>
Creates a new Policy that returns Action::Follow only if self and other return Action::Follow.
fn or<P, B, E>(self, other: P) -> Or<T, P> where T: Policy<B, E>, P: Policy<B, E>
Creates a new Policy that returns Action::Follow if either self or other returns Action::Follow.

impl<T> Same for T
type Output = T
Should always be Self.

impl<T> ToOwned for T where T: Clone
type Owned = T
The resulting type after obtaining ownership.
fn to_owned(&self) -> T
Creates owned data from borrowed data, usually by cloning.
fn clone_into(&self, target: &mut T)
Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T where U: Into<T>
type Error = Infallible
The type returned in the event of a conversion error.
fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>
Performs the conversion.

impl<T, U> TryInto<U> for T where U: TryFrom<T>
type Error = <U as TryFrom<T>>::Error
The type returned in the event of a conversion error.
fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>
Performs the conversion.

impl<V, T> VZip<V> for T where V: MultiLane<T>
fn vzip(self) -> V

impl<T> WithSubscriber for T
fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self> where S: Into<Dispatch>
Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.
fn with_current_subscriber(self) -> WithDispatch<Self>
Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.

impl<T> DeserializeOwned for T where T: for<'de> Deserialize<'de>