§VoiRS SDK
Unified SDK and public API for the VoiRS speech synthesis framework.
VoiRS SDK provides a comprehensive, high-level interface for neural speech synthesis, abstracting the complexity of G2P (Grapheme-to-Phoneme), acoustic modeling, and vocoding into a simple, efficient API.
§Quick Start
use voirs_sdk::prelude::*;

#[tokio::main]
async fn main() -> Result<()> {
    // Create a pipeline with high quality and the default voice
    let pipeline = VoirsPipelineBuilder::new()
        .with_quality(QualityLevel::High)
        .with_voice("default")
        .build()
        .await?;

    // Synthesize speech
    let audio = pipeline.synthesize("Hello, world!").await?;

    // Save to file
    audio.save_wav("output.wav")?;

    Ok(())
}
§Key Features
- Simple API: High-level interface for speech synthesis
- Async/Concurrent: Built for modern async Rust applications
- Streaming: Real-time synthesis with low latency
- Plugin System: Extensible audio effects and processing
- Caching: Intelligent model and result caching
- Quality Control: Comprehensive audio quality validation
- Performance: Optimized for both speed and memory efficiency
§Architecture
The VoiRS SDK consists of several key components:
- VoirsPipeline: Main synthesis pipeline
- VoirsPipelineBuilder: Fluent API for pipeline configuration
- AudioBuffer: Audio data management and processing
- streaming: Real-time synthesis capabilities
- plugins: Extensible effects system
- cache: Intelligent caching system
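The pipeline chains the three stages named in the overview: G2P, acoustic modeling, and vocoding. The following is a minimal, std-only sketch of that data flow. The trait methods (`to_phonemes`, `to_mel`, `to_samples`), the toy implementations, and the `synthesize` helper are all hypothetical; the real traits live in `voirs_sdk::traits`, and their signatures are not shown on this page.

```rust
// Hypothetical stand-ins for the three stages. These are NOT the SDK's
// trait definitions; they only illustrate the text -> audio data flow.
trait G2p {
    fn to_phonemes(&self, text: &str) -> Vec<String>;
}

trait AcousticModel {
    fn to_mel(&self, phonemes: &[String]) -> Vec<Vec<f32>>;
}

trait Vocoder {
    fn to_samples(&self, mel: &[Vec<f32>]) -> Vec<f32>;
}

// Toy G2P: one "phoneme" per alphabetic character.
struct CharG2p;

impl G2p for CharG2p {
    fn to_phonemes(&self, text: &str) -> Vec<String> {
        text.chars()
            .filter(|c| c.is_alphabetic())
            .map(|c| c.to_lowercase().to_string())
            .collect()
    }
}

// Toy acoustic model: a fixed number of all-zero mel frames per phoneme.
struct ConstAcoustic {
    frames_per_phoneme: usize,
    mel_bins: usize,
}

impl AcousticModel for ConstAcoustic {
    fn to_mel(&self, phonemes: &[String]) -> Vec<Vec<f32>> {
        vec![vec![0.0; self.mel_bins]; phonemes.len() * self.frames_per_phoneme]
    }
}

// Toy vocoder: emits `hop` silent samples per mel frame.
struct SilenceVocoder {
    hop: usize,
}

impl Vocoder for SilenceVocoder {
    fn to_samples(&self, mel: &[Vec<f32>]) -> Vec<f32> {
        vec![0.0; mel.len() * self.hop]
    }
}

// Run the three stages in order: text -> phonemes -> mel frames -> samples.
fn synthesize(g2p: &dyn G2p, am: &dyn AcousticModel, voc: &dyn Vocoder, text: &str) -> Vec<f32> {
    let phonemes = g2p.to_phonemes(text);
    let mel = am.to_mel(&phonemes);
    voc.to_samples(&mel)
}

fn main() {
    let acoustic = ConstAcoustic { frames_per_phoneme: 2, mel_bins: 80 };
    let vocoder = SilenceVocoder { hop: 256 };
    let samples = synthesize(&CharG2p, &acoustic, &vocoder, "Hello, world!");
    println!("{} samples", samples.len());
}
```

Keeping each stage behind its own trait lets one component be swapped without touching the others, which is presumably why the crate re-exports `G2p`, `AcousticModel`, and `Vocoder` separately.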
§Examples
§Basic Synthesis
use voirs_sdk::prelude::*;

#[tokio::main]
async fn main() -> Result<()> {
    let pipeline = VoirsPipelineBuilder::new().build().await?;
    let audio = pipeline.synthesize("Hello, world!").await?;
    audio.save_wav("hello.wav")?;
    Ok(())
}
§Streaming Synthesis
use voirs_sdk::prelude::*;
use futures::StreamExt;
// Explicit import for `Arc`; harmless if the prelude already re-exports it.
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<()> {
    let pipeline = Arc::new(VoirsPipelineBuilder::new().build().await?);

    let mut stream = pipeline.synthesize_stream(
        "This is a longer text that will be synthesized in real-time."
    ).await?;

    while let Some(chunk) = stream.next().await {
        let audio_chunk = chunk?;
        // Process audio chunk in real-time
        println!("Received {} samples", audio_chunk.len());
    }

    Ok(())
}
§Voice Management
use voirs_sdk::prelude::*;

#[tokio::main]
async fn main() -> Result<()> {
    let pipeline = VoirsPipelineBuilder::new()
        .with_voice("female_voice")
        .build()
        .await?;

    // List available voices
    let voices = pipeline.list_voices().await?;
    for voice in voices {
        println!("Available voice: {} ({})", voice.name, voice.language);
    }

    // Switch voice at runtime
    pipeline.set_voice("male_voice").await?;
    let audio = pipeline.synthesize("Speaking with a different voice").await?;

    Ok(())
}
§Advanced Configuration
use voirs_sdk::prelude::*;

#[tokio::main]
async fn main() -> Result<()> {
    let pipeline = VoirsPipelineBuilder::new()
        .with_quality(QualityLevel::High)
        .with_gpu_acceleration(true)
        .with_threads(4)
        .build()
        .await?;

    let audio = pipeline.synthesize("High quality synthesis!").await?;
    audio.save_wav("quality_output.wav")?;

    Ok(())
}
§Configuration and Quality Control
use voirs_sdk::prelude::*;
use std::path::PathBuf;

#[tokio::main]
async fn main() -> Result<()> {
    let pipeline = VoirsPipelineBuilder::new()
        .with_quality(QualityLevel::High)
        .with_threads(4)
        .with_cache_dir(PathBuf::from("/tmp/voirs-cache"))
        .build()
        .await?;

    let audio = pipeline.synthesize("High quality synthesis").await?;

    // Access audio properties
    println!("Sample rate: {} Hz", audio.sample_rate());
    println!("Duration: {:.2} seconds", audio.duration());

    Ok(())
}
§Performance
The VoiRS SDK is designed for high performance:
- Initialization: ≤ 2 seconds (cold start with model download)
- Synthesis Latency: ≤ 100ms overhead per synthesis
- Memory Usage: ≤ 50MB SDK overhead
- Real-time Factor: ≤ 0.5 (synthesis takes at most half the playback duration of the audio it produces)
- Concurrent Operations: 100+ simultaneous operations supported
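The real-time factor is the wall-clock synthesis time divided by the duration of the audio produced, so values below 1.0 mean synthesis runs faster than playback. A small sketch of how it could be measured in your own benchmarks; the helper name `real_time_factor` is hypothetical, and the sample count is assumed to describe mono audio.

```rust
use std::time::Duration;

/// Real-time factor: synthesis time divided by the duration of the audio.
/// Assumes `sample_count` counts mono samples at `sample_rate` Hz.
fn real_time_factor(synthesis_time: Duration, sample_count: usize, sample_rate: u32) -> f64 {
    let audio_seconds = sample_count as f64 / sample_rate as f64;
    synthesis_time.as_secs_f64() / audio_seconds
}

fn main() {
    // 1.5 s of compute producing 66_150 samples at 22_050 Hz (3 s of audio).
    let rtf = real_time_factor(Duration::from_millis(1500), 66_150, 22_050);
    println!("RTF = {rtf:.2}"); // prints "RTF = 0.50"
}
```

In practice the `Duration` would come from `std::time::Instant::now()` taken before the synthesis call and `elapsed()` taken after it.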
§Error Handling
All operations return Result<T, VoirsError> for comprehensive error handling:
use voirs_sdk::prelude::*;

#[tokio::main]
async fn main() {
    match VoirsPipelineBuilder::new().build().await {
        Ok(pipeline) => {
            match pipeline.synthesize("Hello!").await {
                Ok(audio) => println!("Success! {} samples", audio.len()),
                Err(e) => eprintln!("Synthesis error: {}", e),
            }
        }
        Err(e) => eprintln!("Pipeline creation error: {}", e),
    }
}
§Feature Flags
- gpu: Enable GPU acceleration for models
- onnx: Enable ONNX runtime support
- default: Standard CPU-based processing
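Cargo features are compile-time switches, so application code can branch on them with `cfg!`. The sketch below assumes your own crate defines a feature named `gpu` that forwards to the SDK's `gpu` flag; that forwarding setup and the `backend_name` helper are hypothetical and not part of the SDK.

```rust
/// Reports which backend this build was compiled for.
/// `cfg!(feature = "gpu")` checks a feature of the crate being compiled,
/// so it is only true when that crate is built with `--features gpu`.
fn backend_name() -> &'static str {
    if cfg!(feature = "gpu") {
        "gpu"
    } else {
        "cpu"
    }
}

fn main() {
    println!("compiled backend: {}", backend_name());
}
```

The same boolean could plausibly be passed to `with_gpu_acceleration` so that a CPU-only build never requests GPU execution.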
§Platform Support
- Operating Systems: Linux, macOS, Windows
- Architectures: x86_64, ARM64
- Runtimes: Tokio async runtime required
Re-exports§
pub use adaptive::AdaptiveConfig;
pub use adaptive::AdaptiveController;
pub use adaptive::QualityTarget;
pub use audio::AudioBuffer;
pub use builder::VoirsPipelineBuilder;
pub use capabilities::CapabilityManager;
pub use diagnostics::ProductionReadiness;
pub use diagnostics::ReadinessConfig;
pub use diagnostics::ReadinessReport;
pub use error::VoirsError;
pub use performance::PerformanceMonitor;
pub use pipeline::VoirsPipeline;
pub use traits::AcousticModel;
pub use traits::G2p;
pub use traits::Vocoder;
pub use types::*;
Modules§
- adapters
- Trait adapters for integrating VoiRS component crates with the SDK.
- adaptive
- Adaptive quality control and optimization for VoiRS SDK.
- async
- audio
- Audio processing module with modular architecture.
- batch
- Advanced batch processing optimization for VoiRS SDK.
- builder
- Pipeline builder for fluent API construction.
- cache
- Comprehensive caching system for VoiRS SDK.
- capabilities
- Feature capability detection and negotiation system for VoiRS SDK.
- config
- Configuration management for VoiRS.
- diagnostics
- Production readiness checker and diagnostic tools for VoiRS SDK.
- error
- Enhanced error system for VoiRS SDK.
- logging
- Logging configuration and utilities for VoiRS.
- memory
- Memory management and optimization for VoiRS SDK.
- model_runtime
- Model runtime module for unified ONNX and model loading.
- performance
- Performance monitoring and metrics utilities for VoiRS SDK.
- pipeline
- VoiRS synthesis pipeline implementation.
- plugins
- Plugin system for extensible audio processing and effects.
- prelude
- Prelude module with commonly used imports.
- profiling
- Comprehensive performance profiling and analysis for VoiRS SDK.
- streaming
- Streaming synthesis module with modular architecture.
- traits
- Core traits for VoiRS components.
- types
- Core types for VoiRS speech synthesis.
- validation
- versioning
- Semantic versioning compliance and API stability guarantees for VoiRS
- voice
- Voice management module with modular architecture.
Macros§
- log_audio
- log_model
- log_synthesis
- Macros for convenient logging with context
- measure_performance
- Convenience macro for measuring operation performance.
- report_error
- Convenience macro for error reporting
- voirs_error
- Convenience macro for creating VoirsError instances
- with_recovery
- Convenience macro for error recovery
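The signatures of these macros are not shown on this page. To illustrate the general idea behind a measurement macro, here is a std-only sketch; `timed!` is a hypothetical macro written for this example and is not the SDK's `measure_performance!`.

```rust
// Hypothetical macro (NOT the SDK's `measure_performance!`): evaluates an
// expression, logs how long it took, and returns (value, elapsed).
macro_rules! timed {
    ($label:expr, $body:expr) => {{
        let start = std::time::Instant::now();
        let value = $body;
        let elapsed: std::time::Duration = start.elapsed();
        eprintln!("{} took {:?}", $label, elapsed);
        (value, elapsed)
    }};
}

fn main() {
    // Time a cheap computation; the elapsed value varies from run to run.
    let (sum, elapsed) = timed!("sum", (1..=1_000u64).sum::<u64>());
    println!("sum = {sum}, elapsed = {elapsed:?}");
}
```

A macro is used here rather than a function so that the measured expression can be any block, including one containing `?` or `.await`, without wrapping it in a closure.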
Type Aliases§
- Result
- Result type alias for VoiRS operations
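The examples above write `Result<()>` with a single type parameter, which is the usual shape of a crate-level alias that fixes the error type. A self-contained sketch of that pattern follows; `SketchError` and `check_text` are hypothetical stand-ins, since the variants of `VoirsError` are not listed on this page.

```rust
use std::fmt;

// Hypothetical error type standing in for `VoirsError`.
#[derive(Debug)]
enum SketchError {
    EmptyInput,
}

impl fmt::Display for SketchError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SketchError::EmptyInput => write!(f, "input text is empty"),
        }
    }
}

impl std::error::Error for SketchError {}

// One-parameter alias: the error type is fixed, so signatures read `Result<T>`.
type Result<T> = std::result::Result<T, SketchError>;

/// Rejects blank input and otherwise returns the character count.
fn check_text(text: &str) -> Result<usize> {
    if text.trim().is_empty() {
        return Err(SketchError::EmptyInput);
    }
    Ok(text.chars().count())
}

fn main() -> Result<()> {
    // `?` propagates the fixed error type without naming it at each call site.
    let n = check_text("Hello!")?;
    println!("{n} characters accepted");
    Ok(())
}
```

Because a glob import of the prelude brings such an alias into scope, the standard two-parameter form remains available as `std::result::Result<T, E>` when another error type is needed.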