Crate autoagents_speech

§AutoAgents Speech

Speech (TTS/STT) provider abstractions for the AutoAgents framework.

This crate provides trait-based abstraction layers for speech providers, allowing different backends to be used interchangeably within the AutoAgents ecosystem.

§Features

§TTS (Text-to-Speech)

  • Speech Generation: Generate audio from text
  • Voice Management: Use predefined voices
  • Streaming Support: Optional streaming for real-time audio generation
  • Model Management: Support for multiple models and languages
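Streaming TTS usually starts from text cut into sentence-sized pieces, which is the role the `tts` module describes for its sentence-chunking utilities. The following is a minimal sketch of that idea only, not the crate's `SentenceChunker`: the function name `split_sentences` is hypothetical, and it splits naively on `.`, `!`, and `?` without handling abbreviations or ellipses.

```rust
/// Hypothetical helper (not part of autoagents_speech): split text into
/// sentence-sized chunks that could each be synthesized independently.
pub fn split_sentences(text: &str) -> Vec<String> {
    let mut chunks = Vec::new();
    let mut current = String::new();

    for ch in text.chars() {
        current.push(ch);
        // Treat '.', '!' and '?' as sentence terminators (naive rule).
        if matches!(ch, '.' | '!' | '?') {
            let trimmed = current.trim();
            if !trimmed.is_empty() {
                chunks.push(trimmed.to_string());
            }
            current.clear();
        }
    }

    // Keep any trailing text that has no terminator.
    let rest = current.trim();
    if !rest.is_empty() {
        chunks.push(rest.to_string());
    }

    chunks
}
```

Each returned chunk could then be sent to a TTS backend as soon as it is available, so audio for the first sentence can start playing while later sentences are still being generated.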

§STT (Speech-to-Text)

  • Transcription: Convert audio to text
  • Streaming Support: Real-time audio transcription
  • Timestamp Support: Token-level timestamps for transcriptions
  • Multilingual: Support for multiple languages with auto-detection

§Architecture

The crate follows a trait-based design with provider implementations in the providers module:

§TTS Traits

  • TTSProvider: Marker trait combining all TTS capabilities
  • TTSSpeechProvider: Speech generation capabilities
  • TTSModelsProvider: Model and language support

§STT Traits

  • STTProvider: Marker trait combining all STT capabilities
  • STTSpeechProvider: Transcription capabilities
  • STTModelsProvider: Model and language support
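The marker-trait pattern described above can be sketched as follows. This is an illustration under stated assumptions, not the crate's actual API: the names `SpeechGen`, `ModelCatalog`, `FullProvider`, `SilentBackend`, and `describe` are hypothetical, and the method signatures are simplified and synchronous, whereas the real provider traits may differ.

```rust
/// Hypothetical capability trait: turns text into audio samples.
pub trait SpeechGen {
    fn generate(&self, text: &str) -> Vec<f32>;
}

/// Hypothetical capability trait: reports which models a backend supports.
pub trait ModelCatalog {
    fn models(&self) -> Vec<&'static str>;
}

/// Marker trait: adds no methods, only requires both capabilities.
pub trait FullProvider: SpeechGen + ModelCatalog {}

/// Toy backend that emits one silent sample per input character.
pub struct SilentBackend;

impl SpeechGen for SilentBackend {
    fn generate(&self, text: &str) -> Vec<f32> {
        vec![0.0; text.chars().count()]
    }
}

impl ModelCatalog for SilentBackend {
    fn models(&self) -> Vec<&'static str> {
        vec!["silent-v0"]
    }
}

// Opting in to the marker trait is an empty impl.
impl FullProvider for SilentBackend {}

/// Code written against the marker trait works with any backend.
pub fn describe(provider: &dyn FullProvider, text: &str) -> String {
    let samples = provider.generate(text);
    format!("{} samples from {}", samples.len(), provider.models().join(", "))
}
```

Because `describe` depends only on the marker trait, swapping `SilentBackend` for another backend needs no changes at the call site, which is the interchangeability the trait-based design aims for.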

§Providers

Enable providers using feature flags:

  • pocket-tts: Pocket-TTS model support (TTS)
  • parakeet: Parakeet (NVIDIA) model support (STT)
  • vad: Silero VAD support (speech segmentation)

Re-exports§

pub use error::TTSError;
pub use error::TTSResult;
pub use tts::ChunkerConfig;
pub use tts::SentenceChunker;
pub use tts::StreamingTtsPipeline;
pub use types::AudioChunk;
pub use types::AudioData;
pub use types::AudioFormat;
pub use types::ModelInfo;
pub use types::SharedAudioData;
pub use types::SpeechRequest;
pub use types::SpeechResponse;
pub use types::VoiceIdentifier;
pub use error::STTError;
pub use error::STTResult;
pub use model_source::ModelSource;
pub use types::TextChunk;
pub use types::TokenTimestamp;
pub use types::TranscriptionRequest;
pub use types::TranscriptionResponse;

Modules§

  • error
  • model_source
  • providers: Speech provider implementations
  • tts: TTS utilities for sentence chunking and the streaming pipeline
  • types

Traits§

  • STTModelsProvider: Trait for STT model management capabilities
  • STTProvider: Marker trait for STT providers
  • STTSpeechProvider: Trait for STT transcription capabilities
  • TTSModelsProvider: Trait for TTS model management capabilities
  • TTSProvider: Marker trait for TTS providers
  • TTSSpeechProvider: Trait for TTS speech generation capabilities