Crate autoagents_speech

§AutoAgents Speech

Speech (TTS/STT) provider abstractions for the AutoAgents framework.

This crate provides trait-based abstraction layers for speech providers, allowing different backends to be used interchangeably within the AutoAgents ecosystem.

§Features

§TTS (Text-to-Speech)

Speech Generation: Generate audio from text
Voice Management: Use predefined voices
Streaming Support: Optional streaming for real-time audio generation
Model Management: Support for multiple models and languages

§STT (Speech-to-Text)

Transcription: Convert audio to text
Streaming Support: Real-time audio transcription
Timestamp Support: Token-level timestamps for transcriptions
Multilingual: Support for multiple languages with auto-detection

§Architecture

The crate follows a trait-based design. Concrete backends live in the providers module and implement the following traits:

§TTS Traits

TTSProvider: Marker trait combining all TTS capabilities
TTSSpeechProvider: Speech generation capabilities
TTSModelsProvider: Model and language support

§STT Traits

STTProvider: Marker trait combining all STT capabilities
STTSpeechProvider: Transcription capabilities
STTModelsProvider: Model and language support
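
The "marker trait combining capabilities" pattern described above can be sketched as follows. This is only an illustration of the pattern, not the crate's actual definitions: the names `SpeechGen`, `ModelCatalog`, `Tts`, and `SilenceBackend` are hypothetical, the methods are synchronous for brevity, and whether the crate uses a blanket impl for its marker traits is an assumption.

```rust
// Hypothetical stand-ins for the capability traits. The real traits in this
// crate may be async and use its own request/response types.
pub trait SpeechGen {
    /// Returns mono PCM samples for `text`.
    fn generate(&self, text: &str) -> Vec<f32>;
}

pub trait ModelCatalog {
    /// Returns the language codes this backend claims to support.
    fn supported_languages(&self) -> Vec<&'static str>;
}

/// Marker trait: it has no methods of its own and only bundles supertraits.
pub trait Tts: SpeechGen + ModelCatalog {}

/// Blanket impl (an assumption about the design): any type that provides
/// both capabilities is automatically a `Tts`.
impl<T: SpeechGen + ModelCatalog> Tts for T {}

/// Toy backend that emits silence, 10 ms per input character.
pub struct SilenceBackend {
    pub sample_rate: usize,
}

impl SpeechGen for SilenceBackend {
    fn generate(&self, text: &str) -> Vec<f32> {
        vec![0.0; text.chars().count() * self.sample_rate / 100]
    }
}

impl ModelCatalog for SilenceBackend {
    fn supported_languages(&self) -> Vec<&'static str> {
        vec!["en"]
    }
}

/// Callers depend only on the marker trait, so backends are interchangeable.
pub fn describe(provider: &dyn Tts, text: &str) -> String {
    format!(
        "{} samples, languages: {:?}",
        provider.generate(text).len(),
        provider.supported_languages()
    )
}
```

Code written against `&dyn Tts` (or a generic `T: Tts`) never names a concrete backend, which is what lets providers be swapped without touching call sites.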

§Providers

Enable providers using feature flags:

pocket-tts: Pocket-TTS model support (TTS)
parakeet: Parakeet (NVIDIA) model support (STT)
vad: Silero VAD support (speech segmentation)
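
Cargo features are turned on through the `features` key of the dependency declaration in `Cargo.toml`. As a rough sketch of how compile-time gating looks from the code side, the function below reports which flags were compiled in. It assumes the crate containing it declares features with the same three names (for example, an application that forwards its own features to this crate's); `enabled_providers` is a hypothetical helper, not part of this crate.

```rust
/// Lists the provider features compiled into the current crate.
/// `cfg!` is evaluated at compile time against the features of the crate
/// that contains this function, so a disabled feature yields `false`.
pub fn enabled_providers() -> Vec<&'static str> {
    let mut enabled = Vec::new();
    if cfg!(feature = "pocket-tts") {
        enabled.push("pocket-tts"); // TTS backend
    }
    if cfg!(feature = "parakeet") {
        enabled.push("parakeet"); // STT backend
    }
    if cfg!(feature = "vad") {
        enabled.push("vad"); // speech segmentation
    }
    enabled
}
```

Provider code behind a disabled feature is not compiled at all, so unused backends add nothing to build time or binary size.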

Re-exports§

pub use error::TTSError;
pub use error::TTSResult;
pub use tts::ChunkerConfig;
pub use tts::SentenceChunker;
pub use tts::StreamingTtsPipeline;
pub use types::AudioChunk;
pub use types::AudioData;
pub use types::AudioFormat;
pub use types::ModelInfo;
pub use types::SharedAudioData;
pub use types::SpeechRequest;
pub use types::SpeechResponse;
pub use types::VoiceIdentifier;
pub use error::STTError;
pub use error::STTResult;
pub use model_source::ModelSource;
pub use types::TextChunk;
pub use types::TokenTimestamp;
pub use types::TranscriptionRequest;
pub use types::TranscriptionResponse;

Modules§

error
model_source
providers: Speech provider implementations
tts: TTS utilities for sentence chunking and streaming pipeline.
types
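
To illustrate what sentence chunking means in a streaming TTS pipeline, here is a minimal sketch that splits text at sentence-ending punctuation so each chunk can be synthesized while later text is still arriving. It is not the crate's `SentenceChunker`: the function `split_sentences` is hypothetical, and the splitting rule is deliberately naive.

```rust
/// Splits `text` into sentence-like chunks, ending a chunk at '.', '!' or '?'.
/// Naive by design: abbreviations, decimals such as "3.14", and ellipses
/// are split too, which a production chunker would need to handle.
pub fn split_sentences(text: &str) -> Vec<String> {
    let mut chunks = Vec::new();
    let mut current = String::new();

    for ch in text.chars() {
        current.push(ch);
        if matches!(ch, '.' | '!' | '?') {
            let trimmed = current.trim();
            if !trimmed.is_empty() {
                chunks.push(trimmed.to_string());
            }
            current.clear();
        }
    }

    // Keep trailing text that has no terminating punctuation.
    let tail = current.trim();
    if !tail.is_empty() {
        chunks.push(tail.to_string());
    }
    chunks
}
```

Feeding each chunk to the synthesizer as soon as it is complete lets playback start after the first sentence rather than after the whole text.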

Traits§

STTModelsProvider: Trait for STT model management capabilities
STTProvider: Marker trait for STT providers
STTSpeechProvider: Trait for STT transcription capabilities
TTSModelsProvider: Trait for TTS model management capabilities
TTSProvider: Marker trait for TTS providers
TTSSpeechProvider: Trait for TTS speech generation capabilities
