elevenlabs_stt
A type-safe, async Rust client for the ElevenLabs Speech-to-Text API. Transcribe audio and video to text through a simple, ergonomic API.
Features
- Type-safe & Async: Built with Rust's type system and async/await support
- Builder Pattern: Intuitive, chainable API for configuring STT requests
- Model Support: Full support for ElevenLabs models (models::elevenlabs_models::*)
- Customizable: ElevenLabs STT APIs, custom base URLs, and enterprise support
- Tokio Ready: Works seamlessly with the Tokio runtime
- Audio & Video: Works with audio and video files up to 3.0 GB
See Also
This project is part of an effort to implement all ElevenLabs APIs in Rust.
- Elevenlabs TTS: ElevenLabs Text-to-Speech API. ✅
- Elevenlabs STT: ElevenLabs Speech-to-Text API. ✅
- Elevenlabs TTD: ElevenLabs Text-to-Dialogue API. ✅
- Elevenlabs TTV: ElevenLabs Text-to-Voice API. ✅
- Elevenlabs TTM: ElevenLabs Text-to-Music API. ✅
- Elevenlabs SFX: ElevenLabs Sound Effects API. ✅
- Elevenlabs VC: ElevenLabs Voice Changer API. ✅
- Elevenlabs AUI: ElevenLabs Audio Isolation API. ⏳
- Elevenlabs DUB: ElevenLabs Dubbing API. ⏳
Installation
Add this to your Cargo.toml:
[dependencies]
elevenlabs_stt = "0.0.5"
Quick Start
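Create a client with ElevenLabsSTTClient::new, start a request with .speech_to_text, optionally chain configuration methods such as .model, and finish with .execute().await inside a Tokio runtime.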
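As a rough picture of the chainable builder style this crate advertises, here is a self-contained stand-in. `MockSttClient`, `MockSttRequest`, `has_api_key`, and `describe` are hypothetical names invented for illustration; they only mirror a few method names from the API Overview, perform no network call, and are not the crate's actual types. The model name `example-model` is a placeholder.

```rust
/// Hypothetical stand-in for the client; it only stores the API key.
pub struct MockSttClient {
    api_key: String,
}

impl MockSttClient {
    pub fn new(api_key: String) -> Self {
        Self { api_key }
    }

    /// Illustrative helper: true when a non-empty key was supplied.
    pub fn has_api_key(&self) -> bool {
        !self.api_key.is_empty()
    }

    /// Start a request from optional file bytes, mirroring
    /// `.speech_to_text(Option<Vec<u8>>)` from the API Overview.
    pub fn speech_to_text(&self, file: Option<Vec<u8>>) -> MockSttRequest {
        MockSttRequest {
            file,
            model: None,
            tag_audio_events: None,
            num_speakers: None,
        }
    }
}

/// Hypothetical stand-in for a request under construction.
#[derive(Debug, Clone, PartialEq)]
pub struct MockSttRequest {
    file: Option<Vec<u8>>,
    model: Option<String>,
    tag_audio_events: Option<bool>,
    num_speakers: Option<u32>,
}

impl MockSttRequest {
    // Each setter takes `self` by value and returns it, which is what
    // makes the calls chainable.
    pub fn model(mut self, model: String) -> Self {
        self.model = Some(model);
        self
    }

    pub fn tag_audio_events(mut self, enabled: bool) -> Self {
        self.tag_audio_events = Some(enabled);
        self
    }

    pub fn num_speakers(mut self, count: u32) -> Self {
        self.num_speakers = Some(count);
        self
    }

    /// Instead of sending anything, summarize what has been configured.
    pub fn describe(&self) -> String {
        format!(
            "bytes={} model={:?} tag_audio_events={:?} num_speakers={:?}",
            self.file.as_ref().map_or(0, |f| f.len()),
            self.model,
            self.tag_audio_events,
            self.num_speakers
        )
    }
}
```

Calling `MockSttClient::new(key).speech_to_text(Some(bytes)).model(name).describe()` shows how each optional setting is left unset until its method is chained. With the real crate, the final call would be `.execute().await` rather than `describe()`.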
Examples
Basic Usage
The basic example reads the API key from the environment (std::env) and sends a single transcription request.
Advanced Configuration
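The advanced example also reads the API key from the environment (std::env) and configures the request through the optional builder methods listed in the API Overview.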
Running Examples
# Set your API key
# Run the basic example
# Run the advanced example
API Overview
| Method | Description |
|---|---|
| ElevenLabsSTTClient::new(String) | Create client instance (required)* |
| .speech_to_text(Option<Vec<u8>>) | Build an STT request from a file or a cloud_storage_url (required)* |
| .model(String) | Select model (optional) |
| .language_code(String) | Force language pronunciation/accent only (no translation) (optional) |
| .tag_audio_events(bool) | Tag audio events like (laughter), (footsteps), etc. (optional) |
| .num_speakers(u32) | Maximum number of speakers talking in the uploaded file (optional) |
| .timestamps_granularity(String) | Allowed values: none, word, character. Defaults to word. (optional) |
| .diarize(bool) | Annotate which speaker is currently talking in the uploaded file (optional) |
| .diarization_threshold(f32) | Can only be set when diarize=true and num_speakers=None (optional) |
| .cloud_storage_url(String) | URL of the file to transcribe; if this is None, you must provide a file (optional) |
| .webhook(bool) | Send the transcription result to configured speech-to-text webhooks (optional) |
| .webhook_id(String) | Specific webhook ID to send the transcription result to (optional) |
| .temperature(f32) | Controls the randomness of the transcription output, between 0.0 and 2.0 (optional) |
| .seed(u32) | Seed for best-effort deterministic sampling (optional) |
| .use_multi_channel(bool) | Whether the audio file contains multiple channels (optional) |
| .webhook_metadata(String) | Metadata to be included in the webhook response (optional) |
| .execute() | Run request → transcribe file (required)* |
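The API overview states a few constraints: timestamps_granularity accepts none, word, or character; temperature lies between 0.0 and 2.0; and diarization_threshold applies only when diarize is true and num_speakers is unset. A minimal sketch of checking these before sending a request follows. The `SttOptions` struct and `validate` function are hypothetical helpers, not part of the crate, and whether the crate performs such checks itself is not stated here.

```rust
/// Hypothetical options struct mirroring a few builder parameters.
#[derive(Debug, Default, Clone)]
pub struct SttOptions {
    pub timestamps_granularity: Option<String>,
    pub temperature: Option<f32>,
    pub diarize: Option<bool>,
    pub num_speakers: Option<u32>,
    pub diarization_threshold: Option<f32>,
}

/// Returns one human-readable message per violated constraint.
/// An empty vector means the options satisfy the documented rules.
pub fn validate(opts: &SttOptions) -> Vec<String> {
    let mut problems = Vec::new();

    // Allowed values: none, word, character.
    if let Some(g) = opts.timestamps_granularity.as_deref() {
        if !["none", "word", "character"].contains(&g) {
            problems.push(format!(
                "timestamps_granularity must be none, word, or character, got {g}"
            ));
        }
    }

    // Temperature must lie in the closed range 0.0..=2.0 (NaN is rejected).
    if let Some(t) = opts.temperature {
        if !(0.0..=2.0).contains(&t) {
            problems.push(format!("temperature must be between 0.0 and 2.0, got {t}"));
        }
    }

    // diarization_threshold requires diarize=true and num_speakers unset.
    if opts.diarization_threshold.is_some()
        && (opts.diarize != Some(true) || opts.num_speakers.is_some())
    {
        problems.push(
            "diarization_threshold requires diarize=true and no num_speakers".to_string(),
        );
    }

    problems
}
```

Collecting every violation, rather than stopping at the first, lets a caller report all configuration mistakes in one pass before any audio is uploaded.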
Error Handling
The crate uses standard Rust error handling patterns. All async methods return Result types, so the outcome of client.speech_to_text(...).execute().await can be handled with match or propagated with the ? operator.
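Result-based handling of this kind can be sketched without a network call or async runtime. The example below assumes a hypothetical error enum and a synchronous stand-in for the request. `MockSttError`, `mock_transcribe`, and `report` are illustrative names, the status code and messages are made up, and the crate's real error and response types may differ.

```rust
use std::fmt;

/// Hypothetical error type; the crate's real errors may look different.
#[derive(Debug, PartialEq)]
pub enum MockSttError {
    MissingInput,
    Api { status: u16, message: String },
}

impl fmt::Display for MockSttError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            MockSttError::MissingInput => write!(f, "missing input"),
            MockSttError::Api { status, message } => {
                write!(f, "API error {status}: {message}")
            }
        }
    }
}

impl std::error::Error for MockSttError {}

/// Synchronous stand-in for `client.speech_to_text(..).execute().await`.
/// It returns a fake transcript instead of contacting any service.
pub fn mock_transcribe(file: Option<&[u8]>) -> Result<String, MockSttError> {
    match file {
        None => Err(MockSttError::MissingInput),
        Some(bytes) if bytes.is_empty() => Err(MockSttError::Api {
            status: 400,
            message: "empty file".to_string(),
        }),
        Some(bytes) => Ok(format!("<transcript of {} bytes>", bytes.len())),
    }
}

/// Handle both outcomes explicitly with `match`.
pub fn report(file: Option<&[u8]>) -> String {
    match mock_transcribe(file) {
        Ok(text) => format!("ok: {text}"),
        // A specific variant can get its own user-facing message.
        Err(MockSttError::MissingInput) => "error: no file or URL provided".to_string(),
        // Everything else falls back to the Display implementation.
        Err(e) => format!("error: {e}"),
    }
}
```

In a function that itself returns a Result, the same call could be written with `?` instead of `match`, leaving the decision about how to report the error to the caller.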
Requirements
- Rust 1.70+ (the crate uses async/await)
- Tokio runtime
- Valid ElevenLabs API key
License
Licensed under either of:
at your option.
Contributing
Contributions are welcome! Please feel free to:
- Open issues for bugs or feature requests
- Submit pull requests with improvements
- Improve documentation or examples
- Add tests or benchmarks
Before contributing, please ensure your code follows Rust conventions and includes appropriate tests.
Support
If you like this project, consider supporting me on Patreon 💖
Changelog
See CHANGELOG.md for a detailed history of changes.
Note: This crate is not officially affiliated with ElevenLabs. Please refer to the ElevenLabs API documentation for the most up-to-date API information.