whispr
A general-purpose voice crate for text-to-speech, speech-to-text, and audio-to-audio transformations. It also supports realtime conversations.
Overview
Whispr provides a clean, ergonomic API for working with audio AI services. It is designed to be provider-agnostic, though only OpenAI is currently implemented.
Installation
[dependencies]
whispr = "0.1"
tokio = { version = "1", features = ["full"] }
Quick Start
use ;
async
Features
Text to Speech
Convert text to natural-sounding audio with multiple voices and customization options.
use ;
let client = from_env?;
let audio = client
.speech
.text
.voice
.model
.format
.speed
.instructions // Voice personality (gpt-4o-mini-tts only)
.generate
.await?;
write?;
Available Voices: Alloy, Ash, Ballad, Coral, Echo, Fable, Nova, Onyx, Sage, Shimmer, Verse
Available Models:
Gpt4oMiniTts: latest model, with instruction support
Tts1: optimized for speed
Tts1Hd: optimized for quality
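The model names above line up with OpenAI's speech model identifiers. The following is a standalone sketch, not whispr's actual source: the `TtsModel` enum, its methods, and `clamp_speed` are hypothetical names chosen for illustration. It shows how each variant could map to an OpenAI model ID, which model accepts instructions, and how a speed value might be kept inside the 0.25 to 4.0 range that OpenAI documents for the speech endpoint.

```rust
/// Hypothetical stand-in for the crate's model enum (names are assumptions).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TtsModel {
    Gpt4oMiniTts,
    Tts1,
    Tts1Hd,
}

impl TtsModel {
    /// Model identifier as used by the OpenAI speech endpoint.
    pub fn api_id(self) -> &'static str {
        match self {
            TtsModel::Gpt4oMiniTts => "gpt-4o-mini-tts",
            TtsModel::Tts1 => "tts-1",
            TtsModel::Tts1Hd => "tts-1-hd",
        }
    }

    /// Only gpt-4o-mini-tts supports the instructions field.
    pub fn supports_instructions(self) -> bool {
        matches!(self, TtsModel::Gpt4oMiniTts)
    }
}

/// Clamp a requested speed into the 0.25..=4.0 range documented by OpenAI.
/// (Whether whispr clamps or rejects out-of-range values is not stated.)
pub fn clamp_speed(speed: f32) -> f32 {
    speed.clamp(0.25, 4.0)
}
```

A check like `supports_instructions` could be used to warn before sending instructions to a model that ignores them.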
Speech to Text
Transcribe audio files to text with optional language hints.
let result = client
.transcription
.file.await?
.language
.transcribe
.await?;
println!;
From bytes (useful for recorded audio):
let wav_data: = record_audio;
let result = client
.transcription
.bytes
.transcribe
.await?;
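The `record_audio` helper is not defined in this README. As a rough sketch of what such a helper might hand over, the function below wraps raw 16-bit PCM samples in a minimal 44-byte RIFF/WAVE header using only the standard library. The name `pcm16_to_wav` is hypothetical, and it is an assumption that the recorder yields 16-bit PCM and that the bytes input expects a complete WAV file.

```rust
/// Wrap 16-bit PCM samples in a minimal 44-byte RIFF/WAVE header.
/// Hypothetical helper for illustration; not part of whispr.
pub fn pcm16_to_wav(samples: &[i16], sample_rate: u32, channels: u16) -> Vec<u8> {
    let bits_per_sample: u16 = 16;
    let block_align = channels * (bits_per_sample / 8); // bytes per frame
    let byte_rate = sample_rate * u32::from(block_align);
    let data_len = (samples.len() * 2) as u32; // payload size in bytes

    let mut wav = Vec::with_capacity(44 + samples.len() * 2);
    wav.extend_from_slice(b"RIFF");
    wav.extend_from_slice(&(36 + data_len).to_le_bytes()); // file size minus 8
    wav.extend_from_slice(b"WAVE");
    wav.extend_from_slice(b"fmt ");
    wav.extend_from_slice(&16u32.to_le_bytes()); // fmt chunk size
    wav.extend_from_slice(&1u16.to_le_bytes()); // format tag 1 = PCM
    wav.extend_from_slice(&channels.to_le_bytes());
    wav.extend_from_slice(&sample_rate.to_le_bytes());
    wav.extend_from_slice(&byte_rate.to_le_bytes());
    wav.extend_from_slice(&block_align.to_le_bytes());
    wav.extend_from_slice(&bits_per_sample.to_le_bytes());
    wav.extend_from_slice(b"data");
    wav.extend_from_slice(&data_len.to_le_bytes());
    for sample in samples {
        wav.extend_from_slice(&sample.to_le_bytes()); // little-endian samples
    }
    wav
}
```

The resulting `Vec<u8>` is the kind of in-memory buffer the bytes-based transcription path above is meant for.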
Audio to Audio
Transcribe audio and generate new speech in one call — useful for voice transformation, translation, or processing pipelines.
let = client.audio_to_audio.await?;
println!;
write?;
Streaming
For real-time applications, stream audio as it's generated:
use StreamExt;
let mut stream = client
.speech
.text
.generate_stream
.await?;
while let Some = stream.next.await
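The body of the loop is where each chunk gets handled. As a rough illustration of that per-chunk step, the sketch below uses a plain iterator as a stand-in for the async stream and writes every chunk to a sink as soon as it arrives. The function name `drain_chunks` and the chunk type `io::Result<Vec<u8>>` are assumptions for the example; the item type whispr's stream actually yields is not shown here.

```rust
use std::io::{self, Write};

/// Write each chunk to `sink` as it arrives and return the total number of
/// bytes written. Stops at the first error from either the source or the sink.
/// A synchronous stand-in for the async loop; not whispr's API.
pub fn drain_chunks<I, W>(chunks: I, sink: &mut W) -> io::Result<usize>
where
    I: IntoIterator<Item = io::Result<Vec<u8>>>,
    W: Write,
{
    let mut total = 0;
    for chunk in chunks {
        let bytes = chunk?; // propagate a failed chunk
        sink.write_all(&bytes)?; // forward audio without buffering it all
        total += bytes.len();
    }
    sink.flush()?;
    Ok(total)
}
```

In an async setting the same body would sit inside the `while let` loop, with the sink being a file, a socket, or an audio output device.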
Prompts
The prompts module includes pre-built voice personalities for common use cases:
use prompts;
client.speech
.text
.model
.instructions
.generate
.await?;
License
MIT License — see LICENSE for details.