# LiveSpeech SDK for Rust

A Rust SDK for real-time speech-to-speech AI conversations.
## Features

- 🎙️ **Real-time Voice Conversations** - Natural, low-latency voice interactions
- 🌐 **Multi-language Support** - Korean, English, Japanese, Chinese, and more
- 🔊 **Streaming Audio** - Send and receive audio in real time
- 📝 **Live Transcription** - Get transcriptions of both user and AI speech
- 🔄 **Auto-reconnection** - Automatic recovery from network issues
## Installation

Add to your `Cargo.toml`:

```toml
[dependencies]
# Crate name assumed from the SDK name; check crates.io for the exact name.
livespeech = "0.1"
tokio = { version = "1.35", features = ["full"] }
```
## Quick Start

A minimal end-to-end session. Type names (`LiveSpeechClient`, `ClientConfig`, `SessionConfig`) are illustrative; the method sequence follows the Audio Flow below:

```rust
use livespeech::{ClientConfig, LiveSpeechClient, Region, SessionConfig};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let config = ClientConfig::builder()
        .region(Region::ApNortheast2)
        .api_key("YOUR_API_KEY")
        .build()?;

    let mut client = LiveSpeechClient::new(config);
    client.connect().await?;
    client.start_session(SessionConfig::new("You are a helpful assistant.")).await?;

    client.audio_start().await?;
    // ... call send_audio_chunk() with PCM16 data from your microphone ...
    client.audio_end().await?; // triggers the AI response

    client.end_session().await?;
    client.disconnect().await?;
    Ok(())
}
```
Audio Flow
connect() → start_session() → audio_start() → send_audio_chunk()* → audio_end() → end_session()
↓
send_system_message() (optional, during live session)
send_tool_response() (when toolCall received)
| Step | Description |
|---|---|
| `connect()` | Establish connection |
| `start_session(config)` | Start conversation with optional system prompt |
| `audio_start()` | Begin audio streaming |
| `send_audio_chunk(data)` | Send PCM16 audio (call multiple times) |
| `send_system_message(msg)` | Inject context or trigger AI response (optional) |
| `send_tool_response(id, result)` | Send function result back to AI (after a `toolCall`) |
| `audio_end()` | End streaming; triggers AI response |
| `end_session()` | End conversation |
| `disconnect()` | Close connection |
## Configuration

Struct names (`ClientConfig`, `SessionConfig`) and argument values below are illustrative:

```rust
let config = ClientConfig::builder()
    .region(Region::ApNortheast2)  // Required: Seoul region
    .api_key("YOUR_API_KEY")       // Required: Your API key
    .user_id("user-123")           // Optional: Enable conversation memory
    .auto_reconnect(true)          // Auto-reconnect on disconnect
    .debug(true)                   // Enable debug logging
    .build()?;

let session = SessionConfig::new("You are a helpful assistant.")
    .with_language("ko-KR")                  // Language: ko-KR, en-US, ja-JP, etc.
    .with_pipeline_mode(PipelineMode::Live)  // Live (default) or Composed
    .with_ai_speaks_first(true)              // AI speaks first (live mode only)
    .with_allow_harm_category(true)          // Disable safety filtering (use with caution)
    .with_tools(tools);                      // Function calling
```
### Session Options

| Option | Type | Default | Description |
|---|---|---|---|
| `prePrompt` | `&str` | - | System prompt for the AI assistant |
| `language` | `&str` | `"en-US"` | Language code (e.g., `ko-KR`, `ja-JP`) |
| `pipeline_mode` | `PipelineMode` | `Live` | Audio processing mode |
| `ai_speaks_first` | `bool` | `false` | AI initiates conversation (live mode only) |
| `allow_harm_category` | `bool` | `false` | Disable content safety filtering |
| `tools` | `Vec<Tool>` | `vec![]` | Function definitions for the AI to call |
### Pipeline Modes

| Mode | Latency | Description |
|---|---|---|
| `Live` | Lower (~300 ms) | Direct audio-to-audio via Live API |
| `Composed` | Higher (~1-2 s) | Separate STT → LLM → TTS pipeline |
## AI Speaks First

With `ai_speaks_first(true)`, the AI immediately speaks a greeting based on your `prePrompt`:

```rust
let session = SessionConfig::new("Greet the user warmly.") // illustrative prompt
    .with_ai_speaks_first(true);

client.start_session(session).await?;
client.audio_start().await?; // AI greeting plays immediately
```

> ⚠️ **Note:** Only works with `PipelineMode::Live`.
## Content Safety

By default, the LLM applies content safety filtering. Set `allow_harm_category(true)` to disable it:

```rust
let session = SessionConfig::new("You are a helpful assistant.")
    .with_allow_harm_category(true); // ⚠️ Disables all safety filters
```

> ⚠️ **Warning:** Only use this in controlled environments where content moderation is handled by other means.
## Function Calling (Tool Use)

Define functions that the AI can call during conversation. When the AI decides to call a function, you receive a `ToolCall` event and must respond with `send_tool_response()`.

### Define Tools

The `Tool` field layout shown here is illustrative (a JSON-Schema-style parameter spec is assumed):

```rust
use livespeech::{SessionConfig, Tool};

let tools = vec![Tool {
    name: "get_weather".into(),
    description: "Get the current weather for a city".into(),
    parameters: serde_json::json!({
        "type": "object",
        "properties": { "city": { "type": "string" } },
        "required": ["city"]
    }),
}];

let session = SessionConfig::new("You are a helpful assistant.")
    .with_tools(tools);
client.start_session(session).await?;
```

### Handle Tool Calls

```rust
let mut events = client.subscribe();
tokio::spawn(async move {
    while let Ok(event) = events.recv().await {
        if let Event::ToolCall { id, name, args } = event {
            // Run the requested function, then return its result to the AI.
            let result = run_tool(&name, &args); // your own dispatch function
            client.send_tool_response(&id, result).await.ok();
        }
    }
});
```

> ⚠️ **Note:** Function calling only works with `PipelineMode::Live`.
## System Messages

During an active live session, you can inject text messages to the AI using `send_system_message()`. This is useful for:
- Game events ("User completed level 5, congratulate them!")
- App state changes ("User opened the cart with 3 items")
- Timer/engagement triggers ("User has been quiet, engage them")
- External data updates ("Weather changed to rainy")
### Usage

```rust
// Simple usage - AI responds immediately
client.send_system_message("User completed level 5, congratulate them!").await?;

// With trigger_response = false - context only, no immediate response
client.send_system_message_with_options("User opened the cart with 3 items", false).await?;
```
### Methods

| Method | Description |
|---|---|
| `send_system_message(text)` | Send message; AI responds immediately |
| `send_system_message_with_options(text, trigger_response)` | Send with explicit trigger option |
### Parameters

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `text` | `&str` | Yes | - | Message text (max 500 chars) |
| `trigger_response` | `bool` | No | `true` | AI responds immediately if `true` |

> ⚠️ **Note:** Requires an active live session (`audio_start()` must have been called). Only works with `PipelineMode::Live`.
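Since message text is capped at 500 characters, a small guard can trim input before sending. This helper is hypothetical, not part of the SDK; it truncates on a `char` boundary so multi-byte text stays valid:

```rust
/// Truncate a system message to the 500-character limit (char-safe).
fn clamp_system_message(text: &str, max_chars: usize) -> String {
    text.chars().take(max_chars).collect()
}

fn main() {
    let long = "x".repeat(600);
    let msg = clamp_system_message(&long, 500);
    assert_eq!(msg.chars().count(), 500); // trimmed to the limit

    // Short messages pass through unchanged.
    assert_eq!(clamp_system_message("hello", 500), "hello");
}
```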
## Conversation Memory

When you provide a `user_id`, the SDK enables persistent conversation memory:

- **Entity Memory**: AI remembers facts shared in previous sessions (names, preferences, relationships)
- **Session Summaries**: Recent conversation summaries are available to the AI
- **Cross-Session**: Memory persists across sessions for the same `user_id`
```rust
// With memory (authenticated user)
let config = ClientConfig::builder()
    .region(Region::ApNortheast2)
    .api_key("YOUR_API_KEY")
    .user_id("user-123") // Enables conversation memory
    .build()?;

// Without memory (guest)
let config = ClientConfig::builder()
    .region(Region::ApNortheast2)
    .api_key("YOUR_API_KEY")
    // No user_id = guest mode, no persistent memory
    .build()?;
```
| Mode | Memory Persistence | Use Case |
|---|---|---|
| With `user_id` | Permanent | Authenticated users |
| Without `user_id` | Session only | Guests, anonymous users |
## Events

| Event | Description | Key Fields |
|---|---|---|
| `Connected` | Connection established | `connection_id` |
| `Disconnected` | Connection closed | `reason` |
| `SessionStarted` | Session created | `session_id` |
| `Ready` | Ready for audio input | - |
| `UserTranscript` | Your speech transcribed | `text` |
| `Response` | AI's response text | `text`, `is_final` |
| `Audio` | AI's audio output | `data`, `sample_rate` |
| `TurnComplete` | AI finished speaking | - |
| `ToolCall` | AI wants to call a function | `id`, `name`, `args` |
| `Error` | Error occurred | `code`, `message` |
### Event Subscription

```rust
let mut events = client.subscribe();
tokio::spawn(async move {
    while let Ok(event) = events.recv().await {
        match event {
            Event::Response { text, is_final } => println!("AI: {text} (final: {is_final})"),
            Event::Audio { data, sample_rate } => play_audio(&data, sample_rate), // your playback
            Event::Error { code, message } => eprintln!("[{code}] {message}"),
            _ => {}
        }
    }
});
```
### Convenience Handlers

Closure signatures below are illustrative:

```rust
// AI's text response
client.on_response(|text| println!("AI: {text}")).await;

// AI's audio output
client.on_audio(|data| play_audio(&data)).await;

// Error handling
client.on_error(|err| eprintln!("Error: {err}")).await;
```
## Audio Format

### Input (Your Microphone)
| Property | Value |
|---|---|
| Format | PCM16 (16-bit signed, little-endian) |
| Sample Rate | 16,000 Hz |
| Channels | 1 (Mono) |
| Chunk Size | ~3200 bytes (100ms) |
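The ~3200-byte chunk size follows directly from the format: 16,000 samples/s × 2 bytes/sample × 0.1 s = 3,200 bytes. A minimal sketch of splitting a capture buffer into 100 ms chunks before passing them to `send_audio_chunk()` (the zero-filled buffer stands in for real microphone data):

```rust
/// Bytes per chunk for mono PCM16 at a given sample rate and chunk duration.
fn chunk_size_bytes(sample_rate: u32, chunk_ms: u32) -> usize {
    (sample_rate as usize * 2 * chunk_ms as usize) / 1000
}

fn main() {
    let chunk = chunk_size_bytes(16_000, 100);
    assert_eq!(chunk, 3200); // matches the table above

    // Split a capture buffer; the last chunk may be shorter.
    let buffer = vec![0u8; 10_000];
    let chunks: Vec<&[u8]> = buffer.chunks(chunk).collect();
    assert_eq!(chunks.len(), 4);          // 3 full chunks + 1 partial
    assert_eq!(chunks[3].len(), 400);     // 10,000 - 3 × 3,200
    println!("chunk size: {chunk} bytes");
}
```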
### Output (AI Response)
| Property | Value |
|---|---|
| Format | PCM16 (16-bit signed, little-endian) |
| Sample Rate | 24,000 Hz |
| Channels | 1 (Mono) |
## Audio Utilities

The module path below is assumed:

```rust
use livespeech::audio::{float32_to_int16, int16_to_bytes, wrap_pcm_in_wav};

// Convert f32 samples to PCM16 bytes
let pcm_samples = float32_to_int16(&samples);
let pcm_bytes = int16_to_bytes(&pcm_samples);

// Create a WAV file (24 kHz mono, matching the AI output format)
let wav = wrap_pcm_in_wav(&pcm_bytes, 24_000, 1);
```
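If you need the conversion outside the SDK, a typical clamp-and-scale implementation looks like the sketch below. This shows the usual approach, not necessarily the SDK's exact rounding behavior:

```rust
/// Convert f32 samples in [-1.0, 1.0] to i16 PCM, clamping out-of-range input.
fn float32_to_int16_sketch(samples: &[f32]) -> Vec<i16> {
    samples
        .iter()
        .map(|&s| (s.clamp(-1.0, 1.0) * i16::MAX as f32) as i16)
        .collect()
}

/// Serialize i16 samples as little-endian bytes (the PCM16 wire format).
fn int16_to_bytes_sketch(samples: &[i16]) -> Vec<u8> {
    samples.iter().flat_map(|s| s.to_le_bytes()).collect()
}

fn main() {
    let pcm = float32_to_int16_sketch(&[0.0, 1.0, -1.0, 2.0]);
    assert_eq!(pcm, [0, 32767, -32767, 32767]); // 2.0 is clamped to full scale

    let bytes = int16_to_bytes_sketch(&pcm);
    assert_eq!(bytes.len(), 8);              // 2 bytes per sample
    assert_eq!(&bytes[2..4], &[0xFF, 0x7F]); // 32767 in little-endian
}
```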
## Error Handling

```rust
match client.connect().await {
    Ok(()) => println!("Connected"),
    Err(e) => eprintln!("Connection failed: {e}"),
}
```
## Regions

| Region | Code | Location |
|---|---|---|
| Asia Pacific (Seoul) | `Region::ApNortheast2` | Korea |
## License

MIT