# LiveSpeech SDK for Rust
A Rust SDK for real-time speech-to-speech AI conversations.
## Installation

Add this to your `Cargo.toml`:
```toml
[dependencies]
livespeech = "0.1"
tokio = { version = "1.35", features = ["full"] }
```
## Quick Start
```rust
use livespeech::{Config, LiveSpeechClient, Region, SessionConfig};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let config = Config::builder()
        .region(Region::ApNortheast2)
        .api_key("your-api-key")
        .build()?;

    let mut client = LiveSpeechClient::new(config);
    client.connect().await?;
    client.start_session(SessionConfig::new("You are a helpful assistant.")).await?;

    // ... send audio and receive events ...

    client.end_session().await?;
    client.disconnect().await?;
    Ok(())
}
```
## Pipeline Modes
The SDK supports two pipeline modes for audio processing:
### Live Mode (Default)
Uses Gemini 2.5 Flash Live API for end-to-end audio conversation:
- **Lower latency** - Direct audio-to-audio processing
- **Natural conversation** - Built-in voice activity detection
- **Real-time transcription** - Both user and AI speech transcribed
```rust
let session_config = SessionConfig::new("You are a helpful assistant.")
    .with_pipeline_mode(PipelineMode::Live); // Default, can be omitted
```
### Composed Mode
Uses separate STT + LLM + TTS services for more customization:
- **More control** - Separate services for each step
- **Custom voices** - Use different TTS voices
- **Text responses** - Access to intermediate text responses
```rust
let session_config = SessionConfig::new("You are a helpful assistant.")
    .with_pipeline_mode(PipelineMode::Composed);
```
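Conceptually, composed mode pipes each utterance through three stages: STT, then LLM, then TTS. A toy sketch of that data flow with stub stages (these functions are illustrative placeholders, not SDK APIs):

```rust
// Toy sketch of the composed pipeline. Each stage below is a stub
// standing in for a real STT / LLM / TTS service call.
fn speech_to_text(_audio: &[i16]) -> String {
    "hello".to_string() // stub: a real STT service would transcribe here
}

fn generate_reply(prompt: &str) -> String {
    format!("echo: {}", prompt) // stub: a real LLM would respond here
}

fn text_to_speech(text: &str) -> Vec<i16> {
    text.bytes().map(|b| b as i16).collect() // stub: a real TTS would synthesize here
}

// Runs the three stages in order; the intermediate text reply is
// accessible, which is what composed mode exposes via `on_response`.
fn composed_pipeline(audio: &[i16]) -> (String, Vec<i16>) {
    let transcript = speech_to_text(audio);
    let reply = generate_reply(&transcript);
    let reply_audio = text_to_speech(&reply);
    (reply, reply_audio)
}

fn main() {
    let (text, audio) = composed_pipeline(&[0, 1, 2]);
    println!("reply text: {text}, audio samples: {}", audio.len());
}
```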
## API Reference

### Regions
The SDK provides built-in region support, so you don't need to remember endpoint URLs:
| Region | Variant | Location |
|---|---|---|
| `ap-northeast-2` | `Region::ApNortheast2` | Asia Pacific (Seoul) |
| `us-west-2` | `Region::UsWest2` | US West (Oregon) - Coming soon |
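Conceptually, each variant corresponds to a region identifier that the SDK resolves to an endpoint internally. A hypothetical sketch of that mapping (not the SDK's actual implementation):

```rust
// Hypothetical sketch of how a Region variant maps to its region
// identifier string; the real SDK resolves endpoint URLs internally.
#[derive(Clone, Copy, Debug)]
enum Region {
    ApNortheast2, // Asia Pacific (Seoul)
    UsWest2,      // US West (Oregon) - coming soon
}

fn region_id(region: Region) -> &'static str {
    match region {
        Region::ApNortheast2 => "ap-northeast-2",
        Region::UsWest2 => "us-west-2",
    }
}

fn main() {
    println!("{}", region_id(Region::ApNortheast2)); // prints "ap-northeast-2"
}
```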
### Config
Use the builder pattern to create configuration:
```rust
use std::time::Duration;

let config = Config::builder()
    .region(Region::ApNortheast2) // Required
    .api_key("your-api-key")      // Required
    .connection_timeout(Duration::from_secs(10))
    .auto_reconnect(true)
    .max_reconnect_attempts(5)
    .reconnect_delay(Duration::from_secs(2))
    .debug(false)
    .build()?;
```
### LiveSpeechClient

#### Methods
| Method | Description |
|---|---|
| `connect()` | Connect to the server |
| `disconnect()` | Disconnect from the server |
| `start_session(config)` | Start a conversation session |
| `end_session()` | End the current session |
| `send_audio(data, format)` | Send audio data to be transcribed |
| `connection_state()` | Get the current connection state |
| `is_connected()` | Check if connected |
| `has_active_session()` | Check if a session is active |
#### Event Handlers
```rust
// User transcript handler (user's speech)
client.on_user_transcript(|text| println!("User: {text}")).await;

// Transcript handler (AI's speech transcription in live mode)
client.on_transcript(|text| println!("AI: {text}")).await;

// Response handler (AI text response in composed mode)
client.on_response(|text| println!("Response: {text}")).await;

// Audio handler (audio bytes)
client.on_audio(|audio| println!("Received {} bytes", audio.len())).await;

// Error handler (ErrorEvent)
client.on_error(|error| eprintln!("Error: {error:?}")).await;
```
### SessionConfig
```rust
// Simple creation (uses Live mode by default)
let config = SessionConfig::new("You are a helpful assistant.");

// Builder pattern for more options
let config = SessionConfig::new("You are a helpful assistant.")
    .with_language("ko-KR")
    .with_pipeline_mode(PipelineMode::Composed);

// Empty config (uses defaults)
let config = SessionConfig::empty();
```
| Option | Type | Default | Description |
|---|---|---|---|
| `pre_prompt` | `Option<String>` | `None` | System prompt for the AI |
| `language` | `Option<String>` | `None` | Language code (e.g., `"ko-KR"`) |
| `pipeline_mode` | `PipelineMode` | `Live` | Audio processing mode |
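The shape implied by this table can be sketched as follows (a hypothetical definition for illustration, not the SDK's actual source):

```rust
// Hypothetical sketch of SessionConfig, matching the option table:
// pre_prompt and language default to None, pipeline_mode to Live.
#[derive(Debug, Clone, Copy, PartialEq)]
enum PipelineMode {
    Live,
    Composed,
}

#[derive(Debug, Clone)]
struct SessionConfig {
    pre_prompt: Option<String>,
    language: Option<String>,
    pipeline_mode: PipelineMode,
}

impl SessionConfig {
    fn new(pre_prompt: impl Into<String>) -> Self {
        Self {
            pre_prompt: Some(pre_prompt.into()),
            language: None,
            pipeline_mode: PipelineMode::Live, // Live is the default mode
        }
    }

    fn with_language(mut self, language: impl Into<String>) -> Self {
        self.language = Some(language.into());
        self
    }

    fn with_pipeline_mode(mut self, mode: PipelineMode) -> Self {
        self.pipeline_mode = mode;
        self
    }
}

fn main() {
    let config = SessionConfig::new("You are a helpful assistant.")
        .with_language("ko-KR")
        .with_pipeline_mode(PipelineMode::Composed);
    println!("{config:?}");
}
```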
### AudioFormat

### Audio Utilities
```rust
use livespeech::audio::{float32_to_int16, wrap_pcm_in_wav, AudioEncoder};

// Convert f32 samples to i16 PCM
let pcm = float32_to_int16(&samples);

// Create WAV from PCM (sample rate, channel count)
let wav = wrap_pcm_in_wav(&pcm, 16000, 1);

// Use AudioEncoder for convenience
let encoder = AudioEncoder::new(format);
let base64 = encoder.encode(&pcm);
let decoded = encoder.decode(&base64)?;
```
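The float-to-int conversion is standard PCM scaling. As a self-contained illustration of what a helper like `float32_to_int16` typically does (assuming samples are clamped to [-1.0, 1.0] and scaled by `i16::MAX`):

```rust
// Sketch of a typical f32 -> i16 PCM conversion: clamp each sample
// to [-1.0, 1.0], then scale into the i16 range.
fn float32_to_int16(samples: &[f32]) -> Vec<i16> {
    samples
        .iter()
        .map(|&s| (s.clamp(-1.0, 1.0) * i16::MAX as f32) as i16)
        .collect()
}

fn main() {
    // 2.0 is out of range and gets clamped to 1.0 before scaling
    let pcm = float32_to_int16(&[0.0, 1.0, -1.0, 2.0]);
    println!("{pcm:?}");
}
```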
### Events
All events can be received through the event channel:
```rust
let mut events = client.events().await;
while let Some(event) = events.recv().await {
    match event {
        // handle each event variant here
        _ => {}
    }
}
```
### Error Handling
The SDK uses a custom error type:
```rust
use livespeech::Error;

match client.connect().await {
    Ok(()) => println!("Connected"),
    Err(e) => eprintln!("Connection failed: {e}"),
}
```
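For illustration, a custom error type of this kind typically looks like the following sketch (the variant names here are hypothetical, not the SDK's):

```rust
use std::fmt;

// Illustrative custom error enum; variant names are hypothetical.
#[derive(Debug)]
enum Error {
    ConnectionFailed(String),
    SessionNotActive,
}

// Display gives the human-readable message shown by `{}` formatting.
impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::ConnectionFailed(msg) => write!(f, "connection failed: {msg}"),
            Error::SessionNotActive => write!(f, "no active session"),
        }
    }
}

impl std::error::Error for Error {}

fn main() {
    // Box<dyn Error> lets callers propagate SDK errors with `?`
    let err: Box<dyn std::error::Error> = Box::new(Error::SessionNotActive);
    println!("{err}");
}
```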
## License
MIT