# LiveSpeech SDK for Rust
[crates.io](https://crates.io/crates/livespeech-sdk)
[docs.rs](https://docs.rs/livespeech-sdk)
[License: MIT](https://opensource.org/licenses/MIT)
A Rust SDK for real-time speech-to-speech AI conversations.
## Features
- **Real-time Voice Conversations** - Natural, low-latency voice interactions
- **Multi-language Support** - Korean, English, Japanese, Chinese, and more
- **Streaming Audio** - Send and receive audio in real time
- **Barge-in Support** - Interrupt the AI mid-speech by talking or programmatically
- **Auto-reconnection** - Automatic recovery from network issues
## Installation
Add to your `Cargo.toml`:
```toml
[dependencies]
livespeech-sdk = "0.1"
tokio = { version = "1.35", features = ["full"] }
```
## Quick Start (5 minutes)
```rust
use livespeech_sdk::{Config, LiveSpeechClient, LiveSpeechEvent, Region, SessionConfig};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // 1. Create client
    let config = Config::builder()
        .region(Region::ApNortheast2)
        .api_key("your-api-key")
        .build()?;
    let client = LiveSpeechClient::new(config);

    // 2. Handle events (only 4 essential events!)
    // `audio_player` is a placeholder for your app's audio playback.
    let mut events = client.subscribe();
    tokio::spawn(async move {
        while let Ok(event) = events.recv().await {
            match event {
                // Play AI audio
                LiveSpeechEvent::Audio(e) => {
                    audio_player.queue(&e.data); // PCM16 @ 24kHz
                }
                // User interrupted - CLEAR BUFFER!
                LiveSpeechEvent::Interrupted(_) => {
                    audio_player.clear();
                }
                // AI finished speaking
                LiveSpeechEvent::TurnComplete(_) => {
                    println!("AI finished");
                }
                // Handle errors
                LiveSpeechEvent::Error(e) => {
                    eprintln!("Error: {}", e.message);
                }
                _ => {}
            }
        }
    });

    // 3. Connect and start
    client.connect().await?;
    client.start_session(Some(SessionConfig::new("You are a helpful assistant."))).await?;

    // 4. Send audio (`audio_chunks` is a placeholder for your captured mic audio)
    client.audio_start().await?;
    for chunk in audio_chunks {
        client.send_audio_chunk(&chunk).await?; // PCM16 @ 16kHz
    }
    client.audio_end().await?;

    // 5. Cleanup
    client.end_session().await?;
    client.disconnect().await;
    Ok(())
}
```
---
# Core API
Everything you need for basic voice conversations.
## Methods
| Method | Description |
|---|---|
| `connect()` | Establish connection |
| `disconnect()` | Close connection |
| `start_session(config)` | Start conversation with system prompt |
| `end_session()` | End conversation |
| `send_audio_chunk(data)` | Send PCM16 audio (16kHz) |
## Events
| Event | Meaning | Your action |
|---|---|---|
| `Audio` | AI's audio output | Play audio (PCM16 @ 24kHz) |
| `TurnComplete` | AI finished speaking | Ready for next input |
| `Interrupted` | User barged in | **Clear audio buffer!** |
| `Error` | Error occurred | Handle/log error |
### ⚠️ Critical: Handle `Interrupted`
When the user speaks while AI is responding, **you must clear your audio buffer**:
```rust
LiveSpeechEvent::Interrupted(_) => {
    audio_player.clear(); // Stop buffered audio immediately
    audio_player.stop();
}
```
Without this, 2-3 seconds of buffered audio continues playing after the user interrupts.
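The SDK does not ship an audio player, so the buffer is yours to manage. A minimal sketch of a clearable playback queue (the `AudioPlayer` type is hypothetical; actual playback is left to your audio backend, e.g. cpal or rodio):

```rust
use std::collections::VecDeque;

/// Hypothetical playback buffer: a real player would drain these chunks
/// into an audio backend on its own thread.
struct AudioPlayer {
    buffer: VecDeque<Vec<u8>>, // PCM16 @ 24kHz chunks from `Audio` events
}

impl AudioPlayer {
    fn new() -> Self {
        Self { buffer: VecDeque::new() }
    }

    /// Buffer a chunk of AI audio for playback.
    fn queue(&mut self, data: &[u8]) {
        self.buffer.push_back(data.to_vec());
    }

    /// Drop everything not yet played (call this on `Interrupted`).
    fn clear(&mut self) {
        self.buffer.clear();
    }
}
```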
## Audio Format
| Direction | Format | Sample rate |
|---|---|---|
| Input (mic) | PCM16 | 16,000 Hz |
| Output (AI) | PCM16 | 24,000 Hz |
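Most capture libraries deliver `f32` samples, so the input usually needs one conversion step. A sketch using the helpers from the Audio Utilities section below, assuming they return owned buffers (`Vec<i16>` / `Vec<u8>`) and that the capture is already mono at 16 kHz (resample first if it is not):

```rust
use livespeech_sdk::{float32_to_int16, int16_to_bytes};

// `mic_samples`: mono f32 audio already captured at 16 kHz.
fn to_input_chunk(mic_samples: &[f32]) -> Vec<u8> {
    let pcm = float32_to_int16(mic_samples); // f32 [-1.0, 1.0] -> i16 PCM
    int16_to_bytes(&pcm)                     // raw bytes for send_audio_chunk()
}
```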
## Configuration
```rust
let config = Config::builder()
    .region(Region::ApNortheast2) // Required
    .api_key("your-api-key")      // Required
    .build()?;

let session = SessionConfig::new("You are a helpful assistant.")
    .with_language("ko-KR"); // Optional: ko-KR, en-US, ja-JP, etc.
```
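Hard-coding the key is fine for a quick test; in a real app you would typically read it from the environment. A sketch (the `LIVESPEECH_API_KEY` variable name is just an example):

```rust
use livespeech_sdk::{Config, Region};

// Read the API key from the environment instead of committing it to source.
let api_key = std::env::var("LIVESPEECH_API_KEY")
    .expect("set LIVESPEECH_API_KEY before running");

let config = Config::builder()
    .region(Region::ApNortheast2)
    .api_key(api_key.as_str())
    .build()?;
```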
---
# Advanced API
Optional features for power users.
## Additional Methods
| Method | Description |
|---|---|
| `audio_start()` / `audio_end()` | Manual audio stream control |
| `interrupt()` | Explicitly stop AI response (for Stop button) |
| `send_system_message(text)` | Inject context during conversation |
| `send_tool_response(id, result)` | Reply to function calls |
| `update_user_id(user_id)` | Migrate guest to authenticated user |
## Additional Events
| Event | Description |
|---|---|
| `Connected` / `Disconnected` | Connection lifecycle |
| `SessionStarted` / `SessionEnded` | Session lifecycle |
| `Ready` | Session ready for audio |
| `UserTranscript` | User's speech transcribed |
| `Response` | AI's response text |
| `ToolCall` | AI wants to call a function |
| `UserIdUpdated` | Guest-to-user migration complete |
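These slot into the same `match` as the Quick Start's event loop. A sketch of a few extra arms; the payload shapes here (`e.text`, a payload on `Ready`) are assumptions, so check the docs.rs API for the exact fields:

```rust
// Extra arms for the Quick Start event loop (payload field names are assumptions).
LiveSpeechEvent::Ready(_) => {
    println!("Session ready - audio can be sent now");
}
LiveSpeechEvent::UserTranscript(e) => {
    println!("You said: {}", e.text); // assumed field name
}
LiveSpeechEvent::Response(e) => {
    println!("AI replied: {}", e.text); // assumed field name
}
```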
---
## Explicit Interrupt (Stop Button)
For UI "Stop" buttons or programmatic control:
```rust
// User clicks Stop button
client.interrupt().await?;
```
Note: Voice barge-in works automatically via Gemini's voice activity detection (VAD). This method is for explicit control.
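In a UI handler you would typically pair the call with clearing your local playback buffer, exactly as for voice barge-in. A sketch of a hypothetical Stop-button handler (`AudioPlayer` is the placeholder playback buffer sketched earlier):

```rust
use livespeech_sdk::LiveSpeechClient;

// Hypothetical Stop-button handler: stop the AI server-side, then drop any
// audio the app has already buffered locally.
async fn on_stop_clicked(
    client: &LiveSpeechClient,
    audio_player: &mut AudioPlayer, // placeholder playback buffer (see above)
) -> Result<(), Box<dyn std::error::Error>> {
    client.interrupt().await?;
    audio_player.clear();
    Ok(())
}
```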
---
## System Messages
Inject text context during live sessions (game events, app state, etc.):
```rust
// AI responds immediately
client.send_system_message("User completed level 5. Congratulate them!").await?;
// Context only, no response
client.send_system_message_with_options("User is browsing", false).await?;
```
> Requires an active live session (`audio_start()` already called). Max 500 characters.
---
## Function Calling (Tool Use)
Let AI call functions in your app:
### 1. Define Tools
```rust
use livespeech_sdk::{FunctionParameters, SessionConfig, Tool};

let tools = vec![Tool {
    name: "get_price".to_string(),
    description: "Gets product price by ID".to_string(),
    parameters: Some(FunctionParameters {
        r#type: "OBJECT".to_string(),
        properties: serde_json::json!({
            "productId": { "type": "string" }
        }),
        required: vec!["productId".to_string()],
    }),
}];

let session = SessionConfig::new("You are helpful.")
    .with_tools(tools);
```
### 2. Handle ToolCall Events
```rust
LiveSpeechEvent::ToolCall(e) => {
    let result = match e.name.as_str() {
        "get_price" => {
            // `lookup_price` is your app's own lookup logic
            let price = lookup_price(&e.args["productId"]);
            serde_json::json!({ "price": price })
        }
        _ => serde_json::json!({ "error": "Unknown" }),
    };
    client.send_tool_response(&e.id, result).await.ok();
}
```
---
## Conversation Memory
Enable persistent memory across sessions:
```rust
let config = Config::builder()
    .region(Region::ApNortheast2)
    .api_key("your-api-key")
    .user_id("user-123") // Enables memory
    .build()?;
```
| Configuration | Memory scope |
|---|---|
| With `user_id` | Permanent (entities, summaries) |
| Without `user_id` | Session only (guest) |
### Guest-to-User Migration
```rust
// User logs in during session
client.update_user_id("authenticated-user-123").await?;

// Listen for confirmation (an arm in your event loop)
LiveSpeechEvent::UserIdUpdated(e) => {
    println!("Migrated {} messages", e.migrated_messages);
}
```
---
## AI Speaks First
AI initiates the conversation:
```rust
let session = SessionConfig::new("Greet the customer warmly.")
    .with_ai_speaks_first(true);

client.start_session(Some(session)).await?;
client.audio_start().await?; // AI speaks immediately
```
---
## Session Options
| Option | Default | Description |
|---|---|---|
| `prePrompt` | - | System prompt |
| `language` | `"en-US"` | Language code |
| `pipeline_mode` | `Live` | `Live` (~300ms) or `Composed` (~1-2s) |
| `ai_speaks_first` | `false` | AI initiates (Live mode only) |
| `allow_harm_category` | `false` | Disable safety filters |
| `tools` | `[]` | Function definitions |
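The options compose on `SessionConfig` through the builder-style `with_*` methods shown earlier. A sketch combining the documented ones (`tools` is the `Vec<Tool>` from the Function Calling section; setters for `pipeline_mode` and `allow_harm_category` are not shown in this README, so check docs.rs for their exact names):

```rust
use livespeech_sdk::SessionConfig;

// Compose the documented builder methods on one session config.
let session = SessionConfig::new("You are a friendly support agent.") // prePrompt
    .with_language("ja-JP")        // language
    .with_ai_speaks_first(true)    // ai_speaks_first (Live mode only)
    .with_tools(tools);            // tools (see Function Calling above)
```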
---
## Audio Utilities
```rust
use livespeech_sdk::{float32_to_int16, int16_to_bytes, wrap_pcm_in_wav};

let pcm = float32_to_int16(&float_samples);      // f32 samples -> i16 PCM
let bytes = int16_to_bytes(&pcm);                // i16 PCM -> raw bytes
let wav = wrap_pcm_in_wav(&bytes, 16000, 1, 16); // 16 kHz, 1 channel, 16-bit
```
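A quick way to sanity-check what you are capturing is to dump it as a WAV file and listen to it. A sketch using the helpers above, assuming `wrap_pcm_in_wav` returns the complete file bytes:

```rust
use livespeech_sdk::{float32_to_int16, int16_to_bytes, wrap_pcm_in_wav};

// `float_samples`: mono f32 mic audio captured at 16 kHz.
let pcm = float32_to_int16(&float_samples);
let bytes = int16_to_bytes(&pcm);
let wav = wrap_pcm_in_wav(&bytes, 16000, 1, 16); // 16 kHz, 1 channel, 16-bit
std::fs::write("debug_capture.wav", &wav)?;      // listen to verify the capture
```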
---
## Error Handling
```rust
use livespeech_sdk::LiveSpeechError;

match client.connect().await {
    Ok(()) => println!("Connected"),
    Err(LiveSpeechError::ConnectionTimeout) => eprintln!("Timed out"),
    Err(LiveSpeechError::NotConnected) => eprintln!("Not connected"),
    Err(e) => eprintln!("Error: {}", e),
}
```
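Once connected, the SDK recovers from network drops on its own (see Auto-reconnection above), but the initial `connect()` can still fail. A simple retry sketch with a fixed backoff, written as if it runs inside a function returning `Result<(), Box<dyn std::error::Error>>` like the Quick Start's `main`:

```rust
use std::time::Duration;

// Retry the initial connection a few times before giving up.
let mut attempts = 0;
loop {
    match client.connect().await {
        Ok(()) => break,
        Err(e) if attempts < 3 => {
            attempts += 1;
            eprintln!("connect failed ({e}), retrying ({attempts}/3)...");
            tokio::time::sleep(Duration::from_secs(2)).await;
        }
        Err(e) => return Err(e.into()),
    }
}
```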
---
## Regions
| Region | Value |
|---|---|
| Seoul (Korea) | `Region::ApNortheast2` |
## License
MIT