
LiveSpeech SDK for Rust

A Rust SDK for real-time speech-to-speech AI conversations.

Installation

Add to your Cargo.toml:

[dependencies]
livespeech-sdk = "0.1"
tokio = { version = "1.35", features = ["full"] }

Quick Start

use livespeech_sdk::{Config, LiveSpeechClient, Region, SessionConfig, AudioFormat};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create configuration using region
    let config = Config::builder()
        .region(Region::ApNortheast2)  // Asia Pacific (Seoul)
        .api_key("your-api-key")
        .build()?;

    // Create client
    let client = LiveSpeechClient::new(config);

    // Set up event handlers
    client.on_transcript(|text, is_final| {
        println!("Transcript: {} (final: {})", text, is_final);
    }).await;

    client.on_response(|text, _is_final| {
        println!("AI Response: {}", text);
    }).await;

    client.on_audio(|_audio_data| {
        // Play the received audio bytes through speakers
    }).await;

    // Connect and start session
    client.connect().await?;
    
    let session_config = SessionConfig::new("You are a helpful assistant.");
    client.start_session(session_config).await?;

    // Send audio (placeholder buffer: 100 ms of 16 kHz mono PCM16 silence)
    let audio_data = vec![0u8; 3200];
    client.send_audio(&audio_data, AudioFormat::Pcm16).await?;

    Ok(())
}
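
In a real application you would stream captured audio in small chunks rather than sending one buffer. A minimal sketch, where capture_chunk() is a hypothetical async source yielding ~100 ms PCM16 buffers:

// Stream microphone audio until the capture source is exhausted
while let Some(chunk) = capture_chunk().await {
    client.send_audio(&chunk, AudioFormat::Pcm16).await?;
}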

API Reference

Regions

The SDK provides built-in region support, so you don't need to remember endpoint URLs:

Region           Variant                Location
ap-northeast-2   Region::ApNortheast2   Asia Pacific (Seoul)
us-west-2        Region::UsWest2        US West (Oregon) - Coming soon

Config

Use the builder pattern to create configuration:

use std::time::Duration;

let config = Config::builder()
    .region(Region::ApNortheast2)    // Required
    .api_key("...")                  // Required
    .connection_timeout(Duration::from_secs(30))
    .auto_reconnect(true)
    .max_reconnect_attempts(5)
    .reconnect_delay(Duration::from_secs(1))
    .debug(false)
    .build()?;

LiveSpeechClient

Methods

Method                     Description
connect()                  Connect to the server
disconnect()               Disconnect from the server
start_session(config)      Start a conversation session
end_session()              End the current session
send_audio(data, format)   Send audio data to be transcribed
connection_state()         Get the current connection state
is_connected()             Check if connected
has_active_session()       Check if a session is active
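
A typical lifecycle stitched together from these methods. This sketch assumes is_connected() and has_active_session() are synchronous accessors; check the generated docs for the exact signatures:

client.connect().await?;
assert!(client.is_connected());

let session = SessionConfig::new("You are a helpful assistant.");
client.start_session(session).await?;
assert!(client.has_active_session());

// ... exchange audio and events ...

client.end_session().await?;
client.disconnect().await?;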

Event Handlers

// Transcript handler (text, is_final)
client.on_transcript(|text, is_final| {
    println!("Transcript: {} (final: {})", text, is_final);
}).await;

// Response handler (text, is_final)
client.on_response(|text, _is_final| {
    println!("Response: {}", text);
}).await;

// Audio handler (audio bytes)
client.on_audio(|_audio_data| {
    // Process the received audio bytes
}).await;

// Error handler (ErrorEvent)
client.on_error(|event| {
    eprintln!("Error: {}", event.message);
}).await;

SessionConfig

// Simple creation
let config = SessionConfig::new("You are a helpful assistant.");

// Builder pattern for more options
let config = SessionConfig::builder("You are a helpful assistant.")
    .voice_id("en-US-Standard-D")
    .language_code("en-US")
    .sample_rate(16000)
    .input_format(AudioFormat::Pcm16)
    .output_format(AudioFormat::Pcm16)
    .metadata("user_id", "12345")
    .build();

AudioFormat

pub enum AudioFormat {
    Pcm16,  // 16-bit PCM (default)
    Opus,   // Opus encoded
    Wav,    // WAV format
}
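
The format is set per direction on the session and repeated on each send. A sketch using the builder fields listed above, where opus_frame is a hypothetical encoded buffer:

let session = SessionConfig::builder("You are a helpful assistant.")
    .input_format(AudioFormat::Opus)
    .output_format(AudioFormat::Pcm16)
    .build();
client.start_session(session).await?;

// The format passed to send_audio should match the session's input format
client.send_audio(&opus_frame, AudioFormat::Opus).await?;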

Audio Utilities

use livespeech_sdk::{
    encode_to_base64,
    decode_from_base64,
    float32_to_int16,
    int16_to_float32,
    wrap_pcm_in_wav,
    AudioEncoder,
};

// Convert f32 samples to i16 PCM
let pcm = float32_to_int16(&float_samples);

// Create WAV from raw PCM bytes: 16000 Hz sample rate, 1 channel, 16 bits per sample
let wav = wrap_pcm_in_wav(&pcm_bytes, 16000, 1, 16);

// Use AudioEncoder for convenience
let encoder = AudioEncoder::new();
let base64 = encoder.encode(&audio_bytes);
let decoded = encoder.decode(&base64)?;
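
Putting the helpers together: a sketch that turns captured f32 samples into PCM16 bytes and sends them. It assumes send_audio expects little-endian PCM16 bytes, which the reference above doesn't state explicitly:

// Hypothetical capture buffer: 100 ms of 16 kHz mono f32 samples in [-1.0, 1.0]
let float_samples: Vec<f32> = vec![0.0; 1600];

// f32 -> i16 PCM, then pack as little-endian bytes for transport
let pcm = float32_to_int16(&float_samples);
let pcm_bytes: Vec<u8> = pcm.iter().flat_map(|s| s.to_le_bytes()).collect();

client.send_audio(&pcm_bytes, AudioFormat::Pcm16).await?;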

Events

All events can be received through the event channel:

use livespeech_sdk::LiveSpeechEvent;

let mut events = client.events().await;

while let Some(event) = events.recv().await {
    match event {
        LiveSpeechEvent::Connected(e) => println!("Connected: {}", e.connection_id),
        LiveSpeechEvent::Transcript(e) => println!("Transcript: {}", e.text),
        LiveSpeechEvent::Response(e) => println!("Response: {}", e.text),
        LiveSpeechEvent::Audio(e) => { /* handle audio */ },
        LiveSpeechEvent::Error(e) => eprintln!("Error: {}", e.message),
        _ => {}
    }
}
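
Since the receive loop runs for the life of the connection, it's common to drive it on its own task. A sketch, assuming the receiver returned by events() is Send + 'static (typical for a tokio mpsc receiver, but not confirmed here):

let mut events = client.events().await;
tokio::spawn(async move {
    while let Some(event) = events.recv().await {
        // dispatch on the event as in the match above
        let _ = event;
    }
});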

Error Handling

The SDK uses a custom error type:

use livespeech_sdk::{LiveSpeechError, Result};

match client.connect().await {
    Ok(()) => println!("Connected!"),
    Err(LiveSpeechError::ConnectionTimeout) => eprintln!("Timeout"),
    Err(LiveSpeechError::AuthenticationFailed(msg)) => eprintln!("Auth failed: {}", msg),
    Err(e) => eprintln!("Error: {}", e),
}
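
If auto_reconnect only covers drops after the initial connection (the usual convention), the first connect can still fail and may be worth retrying manually. A sketch inside an async fn returning Result; the retry policy here is illustrative, not part of the SDK:

use std::time::Duration;

let mut attempts = 0;
loop {
    match client.connect().await {
        Ok(()) => break,
        Err(e) if attempts < 3 => {
            attempts += 1;
            eprintln!("connect failed ({}), retrying ({}/3)...", e, attempts);
            tokio::time::sleep(Duration::from_secs(1)).await;
        }
        Err(e) => return Err(e),
    }
}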

License

MIT