Crate adk_realtime

§adk-realtime

Real-time bidirectional audio/video streaming for ADK agents.

This crate provides a unified interface for building voice-enabled AI agents on top of real-time streaming APIs from multiple providers, such as the OpenAI Realtime API and the Gemini Live API.

§Architecture

adk-realtime follows the same pattern as OpenAI’s Agents SDK, providing both a low-level session interface and a high-level RealtimeAgent that implements the standard ADK Agent trait.

                    ┌─────────────────────────────────────────┐
                    │              Agent Trait                │
                    │  (name, description, run, sub_agents)   │
                    └────────────────┬────────────────────────┘
                                     │
             ┌───────────────────────┼───────────────────────┐
             │                       │                       │
    ┌────────▼────────┐    ┌─────────▼─────────┐   ┌─────────▼─────────┐
    │    LlmAgent     │    │  RealtimeAgent    │   │  SequentialAgent  │
    │  (text-based)   │    │  (voice-based)    │   │   (workflow)      │
    └─────────────────┘    └───────────────────┘   └───────────────────┘
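
The diagram can be read as one trait with several interchangeable implementations. The sketch below illustrates that shape only: the `Agent` trait here is a simplified, synchronous stand-in (the real `adk_core::Agent` also has `run` and is not reproduced), and `TextAgent`, `VoiceAgent`, `Workflow`, and `agent_tree` are hypothetical names introduced for illustration.

```rust
use std::sync::Arc;

/// Simplified stand-in for the ADK agent trait (hypothetical; `run` is omitted).
trait Agent {
    fn name(&self) -> &str;
    fn description(&self) -> &str;
    fn sub_agents(&self) -> &[Arc<dyn Agent>];
}

// Hypothetical implementations mirroring the three boxes in the diagram.
struct TextAgent { name: String }
struct VoiceAgent { name: String }
struct Workflow { name: String, steps: Vec<Arc<dyn Agent>> }

impl Agent for TextAgent {
    fn name(&self) -> &str { &self.name }
    fn description(&self) -> &str { "text-based" }
    fn sub_agents(&self) -> &[Arc<dyn Agent>] { &[] }
}

impl Agent for VoiceAgent {
    fn name(&self) -> &str { &self.name }
    fn description(&self) -> &str { "voice-based" }
    fn sub_agents(&self) -> &[Arc<dyn Agent>] { &[] }
}

impl Agent for Workflow {
    fn name(&self) -> &str { &self.name }
    fn description(&self) -> &str { "workflow" }
    fn sub_agents(&self) -> &[Arc<dyn Agent>] { &self.steps }
}

/// Walk an agent and its sub-agents depth-first, one indented line per agent.
fn agent_tree(agent: &dyn Agent, depth: usize, out: &mut Vec<String>) {
    out.push(format!("{}{} ({})", "  ".repeat(depth), agent.name(), agent.description()));
    for child in agent.sub_agents() {
        agent_tree(&**child, depth + 1, out);
    }
}
```

Because every implementation sits behind `Arc<dyn Agent>`, a workflow can hold text and voice agents side by side without knowing which kind each one is.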

§Features

  • RealtimeAgent: Implements adk_core::Agent with support for callbacks, tools, and instructions
  • Multiple Providers: Supports the OpenAI Realtime API and the Gemini Live API
  • Audio Streaming: Bidirectional audio in several formats (PCM16, G.711)
  • Voice Activity Detection: Server-side VAD for natural conversations
  • Tool Calling: Real-time function execution during voice sessions
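
To make the PCM16 format concrete, here is a minimal sketch of converting between floating-point samples and 16-bit PCM bytes. It does not use this crate's `AudioFormat` or `AudioEncoding` types. The function names are hypothetical, and little-endian byte order is an assumption to check against the provider's documentation.

```rust
/// Encode samples in [-1.0, 1.0] as signed 16-bit PCM bytes.
/// Assumption: little-endian byte order. Out-of-range input is clamped.
fn f32_to_pcm16_le(samples: &[f32]) -> Vec<u8> {
    let mut out = Vec::with_capacity(samples.len() * 2);
    for &sample in samples {
        let clamped = sample.clamp(-1.0, 1.0);
        let value = (clamped * i16::MAX as f32).round() as i16;
        out.extend_from_slice(&value.to_le_bytes());
    }
    out
}

/// Decode little-endian 16-bit PCM bytes back to samples in [-1.0, 1.0].
/// A trailing odd byte, if any, is ignored.
fn pcm16_le_to_f32(bytes: &[u8]) -> Vec<f32> {
    bytes
        .chunks_exact(2)
        .map(|pair| i16::from_le_bytes([pair[0], pair[1]]) as f32 / i16::MAX as f32)
        .collect()
}
```

Scaling by `i16::MAX` on both sides keeps the mapping symmetric, so full-scale samples of 1.0 and -1.0 survive a round trip exactly.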

§Example - Using RealtimeAgent

use std::sync::Arc;

use adk_realtime::{RealtimeAgent, openai::OpenAIRealtimeModel};
use adk_runner::Runner;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // `api_key`, `weather_tool`, `session`, and `content` are assumed to be
    // defined elsewhere; they are not part of this snippet.
    let model = OpenAIRealtimeModel::new(api_key, "gpt-4o-realtime-preview-2024-12-17");

    // Configure the voice agent: model, instructions, voice, VAD, and tools.
    let agent = RealtimeAgent::builder("voice_assistant")
        .model(Arc::new(model))
        .instruction("You are a helpful voice assistant.")
        .voice("alloy")
        .server_vad()
        .tool(Arc::new(weather_tool))
        .build()?;

    // Use with standard ADK runner
    let runner = Runner::new(Arc::new(agent));
    runner.run(session, content).await?;

    Ok(())
}

§Example - Using Low-Level Session API

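use adk_realtime::{RealtimeModel, RealtimeConfig, ServerEvent};
use adk_realtime::openai::OpenAIRealtimeModel;

// `api_key` and `config` (a RealtimeConfig) are assumed to be defined elsewhere.
// This snippet must run inside an async function that returns a Result.
let model = OpenAIRealtimeModel::new(api_key, "gpt-4o-realtime-preview-2024-12-17");
let session = model.connect(config).await?;

// Handle server events as they arrive; the loop ends when the stream closes.
while let Some(event) = session.next_event().await {
    match event? {
        ServerEvent::AudioDelta { delta, .. } => { /* play audio */ }
        ServerEvent::TextDelta { delta, .. } => println!("{}", delta),
        // Ignore event types this example does not handle.
        _ => {}
    }
}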
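
The event loop above matches on each event, propagates errors with `?`, and ignores variants it does not handle. The following self-contained sketch shows the same dispatch pattern without a network connection. The `Event` enum is a hypothetical, simplified stand-in for `ServerEvent` (the real variants carry more fields, and their payload types may differ), and `Collected` and `collect_events` are names introduced only for illustration.

```rust
/// Hypothetical, simplified stand-in for the crate's `ServerEvent`.
#[derive(Debug)]
enum Event {
    AudioDelta { delta: Vec<u8> },
    TextDelta { delta: String },
    Other,
}

/// Everything accumulated from one stream of events.
#[derive(Debug, Default, PartialEq)]
struct Collected {
    audio: Vec<u8>,
    text: String,
}

/// Drain a stream of events, appending audio and text deltas in arrival order.
/// The first `Err` item aborts the loop and is returned to the caller.
fn collect_events<I>(events: I) -> Result<Collected, String>
where
    I: IntoIterator<Item = Result<Event, String>>,
{
    let mut out = Collected::default();
    for event in events {
        match event? {
            Event::AudioDelta { delta } => out.audio.extend_from_slice(&delta),
            Event::TextDelta { delta } => out.text.push_str(&delta),
            // Variants the caller does not care about are skipped.
            Event::Other => {}
        }
    }
    Ok(out)
}
```

A real session would typically play or forward audio deltas as they arrive rather than buffer them. Buffering here only makes the behavior easy to inspect.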

Re-exports§

pub use agent::RealtimeAgent;
pub use agent::RealtimeAgentBuilder;
pub use audio::AudioEncoding;
pub use audio::AudioFormat;
pub use config::RealtimeConfig;
pub use config::RealtimeConfigBuilder;
pub use config::VadConfig;
pub use config::VadMode;
pub use error::RealtimeError;
pub use error::Result;
pub use events::ClientEvent;
pub use events::ServerEvent;
pub use events::ToolCall;
pub use events::ToolResponse;
pub use model::BoxedModel;
pub use model::RealtimeModel;
pub use runner::RealtimeRunner;
pub use session::BoxedSession;
pub use session::RealtimeSession;
pub use session::RealtimeSessionExt;

Modules§

agent
    RealtimeAgent - an Agent implementation for real-time voice interactions.
audio
    Audio format definitions and utilities.
config
    Configuration types for realtime sessions.
error
    Error types for the realtime module.
events
    Event types for realtime communication.
model
    Core RealtimeModel trait definition.
runner
    RealtimeRunner for integrating realtime sessions with agents.
session
    Core RealtimeSession trait definition.