LiveSpeech SDK for Rust
A Rust SDK for real-time speech-to-speech AI conversations.
Features
- Real-time Voice Conversations - Natural, low-latency voice interactions
- Multi-language Support - Korean, English, Japanese, Chinese, and more
- Streaming Audio - Send and receive audio in real time
- Barge-in Support - Interrupt the AI mid-speech by talking or programmatically
- Auto-reconnection - Automatic recovery from network issues
Installation
Add to your Cargo.toml:
[dependencies]
= "0.1"
tokio = { version = "1.35", features = ["full"] }
Quick Start (5 minutes)
use ;
async
Core API
Everything you need for basic voice conversations.
Methods
| Method | Description |
|---|---|
| connect() | Establish connection |
| disconnect() | Close connection |
| start_session(config) | Start conversation with system prompt |
| end_session() | End conversation |
| send_audio_chunk(data) | Send PCM16 audio (16 kHz) |
Events
| Event | Description | Action Required |
|---|---|---|
| Audio | AI's audio output | Play audio (PCM16 @ 24 kHz) |
| TurnComplete | AI finished speaking | Ready for next input |
| Interrupted | User barged in | Clear audio buffer! |
| Error | Error occurred | Handle/log error |
Critical: Handle Interrupted
When the user speaks while AI is responding, you must clear your audio buffer:
Interrupted =>
Without this, 2-3 seconds of buffered audio continues playing after the user interrupts.
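The buffer-clearing rule can be sketched without the SDK. The `Event` enum and `PlaybackBuffer` type below are hypothetical stand-ins written for this example (the SDK's real event type and payloads may differ); the point is only that `Interrupted` discards audio that has been received but not yet played.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for the SDK's event type; only the variants
/// named in the Events table are modeled here.
#[derive(Debug)]
pub enum Event {
    Audio(Vec<i16>),
    TurnComplete,
    Interrupted,
    Error(String),
}

/// Samples received from the AI that have not been played yet.
#[derive(Default)]
pub struct PlaybackBuffer {
    samples: VecDeque<i16>,
}

impl PlaybackBuffer {
    /// Number of samples still waiting to be played.
    pub fn len(&self) -> usize {
        self.samples.len()
    }

    /// Apply one event to the buffer.
    pub fn handle(&mut self, event: Event) {
        match event {
            // Queue AI audio for playback.
            Event::Audio(chunk) => self.samples.extend(chunk),
            // The user barged in: drop everything not yet played.
            Event::Interrupted => self.samples.clear(),
            // The turn is over, but already queued audio still plays out.
            Event::TurnComplete => {}
            Event::Error(message) => eprintln!("error: {message}"),
        }
    }
}
```

In a real player the same `clear()` would also need to reach whatever queue the audio output device reads from, otherwise the device-side buffer keeps playing.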
Audio Format
| Direction | Format | Sample Rate |
|---|---|---|
| Input (mic) | PCM16 | 16,000 Hz |
| Output (AI) | PCM16 | 24,000 Hz |
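To make the table concrete, the helper below computes how many bytes a given duration of audio occupies. It is a sketch that assumes mono audio, which the table does not state; the function and constant names are invented for this example and are not part of the SDK.

```rust
/// Bytes per PCM16 sample (16 bits).
const BYTES_PER_SAMPLE: usize = 2;

/// Sample rates from the Audio Format table.
pub const INPUT_SAMPLE_RATE: usize = 16_000;
pub const OUTPUT_SAMPLE_RATE: usize = 24_000;

/// Size in bytes of `millis` milliseconds of PCM16 audio.
/// Assumption: a single (mono) channel.
pub fn pcm16_bytes(sample_rate: usize, millis: usize) -> usize {
    sample_rate * millis / 1000 * BYTES_PER_SAMPLE
}
```

Under the mono assumption, a 20 ms microphone chunk is `pcm16_bytes(INPUT_SAMPLE_RATE, 20)` = 640 bytes, and the 2-3 seconds of output audio mentioned above correspond to 96,000-144,000 bytes.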
Configuration
let config = builder
.region // Required
.api_key // Required
.build()?;
let session = new
.with_language; // Optional: ko-KR, en-US, ja-JP, etc.
Advanced API
Optional features for power users.
Additional Methods
| Method | Description |
|---|---|
| audio_start() / audio_end() | Manual audio stream control |
| interrupt() | Explicitly stop AI response (for Stop button) |
| send_system_message(text) | Inject context during conversation |
| send_tool_response(id, result) | Reply to function calls |
| update_user_id(user_id) | Migrate guest to authenticated user |
Additional Events
| Event | Description |
|---|---|
| Connected / Disconnected | Connection lifecycle |
| SessionStarted / SessionEnded | Session lifecycle |
| Ready | Session ready for audio |
| UserTranscript | User's speech transcribed |
| Response | AI's response text |
| ToolCall | AI wants to call a function |
| UserIdUpdated | Guest-to-user migration complete |
Explicit Interrupt (Stop Button)
For UI "Stop" buttons or programmatic control:
// User clicks Stop button
client.interrupt().await?;
Note: Voice barge-in works automatically via Gemini's voice activity detection (VAD). Use this method only when you need explicit control.
System Messages
Inject text context during live sessions (game events, app state, etc.):
// AI responds immediately
client.send_system_message(text).await?;
// Context only, no response
client.send_system_message_with_options.await?;
Requires an active live session (audio_start() called). Maximum 500 characters.
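Because messages over the limit are not allowed, it can help to check the length before sending. The helper below is a minimal sketch, not part of the SDK: its name is hypothetical, and it assumes the 500-character limit counts Unicode scalar values rather than bytes, which this README does not specify.

```rust
/// Limit stated above for system messages.
pub const MAX_SYSTEM_MESSAGE_CHARS: usize = 500;

/// Returns `Ok(())` if `text` fits the limit, otherwise `Err` with its length.
/// Assumption: the limit counts Unicode scalar values (`char`s), not bytes.
pub fn check_system_message(text: &str) -> Result<(), usize> {
    let len = text.chars().count();
    if len <= MAX_SYSTEM_MESSAGE_CHARS {
        Ok(())
    } else {
        Err(len)
    }
}
```

Counting `char`s rather than bytes matters for languages such as Korean, where one character takes three bytes in UTF-8.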
Function Calling (Tool Use)
Let AI call functions in your app:
1. Define Tools
let tools = vec!;
let session = new
.with_tools;
2. Handle ToolCall Events
ToolCall =>
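One way to structure the handler is to dispatch on the tool name and return the `(id, result)` pair that `send_tool_response(id, result)` takes. The sketch below is self-contained and makes several assumptions: the `ToolCall` struct, its fields, the `get_weather` tool, and the JSON-string result format are all hypothetical and may not match the SDK's actual types.

```rust
/// Hypothetical shape of a tool call; the SDK's real payload may differ.
pub struct ToolCall {
    pub id: String,
    pub name: String,
    pub args: Vec<(String, String)>,
}

/// Run the matching local function and return `(id, result)`.
pub fn dispatch(call: &ToolCall) -> (String, String) {
    let result = match call.name.as_str() {
        // Hypothetical tool; the forecast is a hard-coded stub.
        "get_weather" => {
            let city = call
                .args
                .iter()
                .find(|(key, _)| key.as_str() == "city")
                .map(|(_, value)| value.as_str())
                .unwrap_or("unknown");
            format!("{{\"city\":\"{city}\",\"forecast\":\"sunny\"}}")
        }
        // Always answer, even for unknown tools, so the AI is not left waiting.
        other => format!("{{\"error\":\"unknown tool: {other}\"}}"),
    };
    (call.id.clone(), result)
}
```

The returned pair would then be passed to `send_tool_response` from inside the `ToolCall` event arm.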
Conversation Memory
Enable persistent memory across sessions:
let config = builder
.region
.api_key
.user_id // Enables memory
.build()?;
| Mode | Memory |
|---|---|
| With user_id | Permanent (entities, summaries) |
| Without user_id | Session only (guest) |
Guest-to-User Migration
// User logs in during session
client.update_user_id(user_id).await?;
// Listen for confirmation
UserIdUpdated =>
AI Speaks First
AI initiates the conversation:
let session = new
.with_ai_speaks_first;
client.start_session(session).await?;
client.audio_start().await?; // AI speaks immediately
Session Options
| Option | Default | Description |
|---|---|---|
| prePrompt | - | System prompt |
| language | "en-US" | Language code |
| pipeline_mode | Live | Live (~300ms) or Composed (~1-2s) |
| ai_speaks_first | false | AI initiates (Live mode only) |
| allow_harm_category | false | Disable safety filters |
| tools | [] | Function definitions |
Audio Utilities
use ;
let pcm = float32_to_int16;
let bytes = int16_to_bytes;
let wav = wrap_pcm_in_wav;
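The arguments of these calls are not shown above, so as an illustration of what such helpers typically do, here is a standard-library-only sketch. The function names match the ones above, but the signatures and bodies are assumptions made for this example, not the SDK's actual implementation.

```rust
/// Convert float samples in [-1.0, 1.0] to 16-bit PCM, clamping out-of-range input.
pub fn float32_to_int16(samples: &[f32]) -> Vec<i16> {
    samples
        .iter()
        .map(|&s| (s.clamp(-1.0, 1.0) * i16::MAX as f32).round() as i16)
        .collect()
}

/// Serialize samples as little-endian bytes.
pub fn int16_to_bytes(samples: &[i16]) -> Vec<u8> {
    samples.iter().flat_map(|s| s.to_le_bytes()).collect()
}

/// Prepend a 44-byte canonical WAV (RIFF, PCM) header to raw PCM16 data.
pub fn wrap_pcm_in_wav(pcm: &[u8], sample_rate: u32, channels: u16) -> Vec<u8> {
    let bits_per_sample: u16 = 16;
    let block_align = channels * bits_per_sample / 8;
    let byte_rate = sample_rate * u32::from(block_align);
    let data_len = pcm.len() as u32;

    let mut wav = Vec::with_capacity(44 + pcm.len());
    wav.extend_from_slice(b"RIFF");
    wav.extend_from_slice(&(36 + data_len).to_le_bytes()); // size of the rest of the file
    wav.extend_from_slice(b"WAVE");
    wav.extend_from_slice(b"fmt ");
    wav.extend_from_slice(&16u32.to_le_bytes()); // fmt chunk size
    wav.extend_from_slice(&1u16.to_le_bytes()); // audio format 1 = PCM
    wav.extend_from_slice(&channels.to_le_bytes());
    wav.extend_from_slice(&sample_rate.to_le_bytes());
    wav.extend_from_slice(&byte_rate.to_le_bytes());
    wav.extend_from_slice(&block_align.to_le_bytes());
    wav.extend_from_slice(&bits_per_sample.to_le_bytes());
    wav.extend_from_slice(b"data");
    wav.extend_from_slice(&data_len.to_le_bytes());
    wav.extend_from_slice(pcm);
    wav
}
```

With these signatures, AI output could be saved for debugging by wrapping the received bytes with a sample rate of 24,000 Hz; the channel count of 1 is again an assumption.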
Error Handling
match client.connect().await
Regions
| Region | Code |
|---|---|
| Seoul (Korea) | Region::ApNortheast2 |
License
MIT