Voice style transfer module (GH-132).
Provides voice style transfer primitives for:
- Prosody transfer (pitch, rhythm, energy)
- Timbre conversion (spectral characteristics)
- Cross-lingual style transfer
§Architecture
Source Audio → Content Encoder → Linguistic Features
                                        ↓
Reference Audio → Style Encoder → Style Vector → Decoder → Styled Audio
§Example
use aprender::voice::style::{StyleConfig, StyleVector, prosody_distance};
// Build two style vectors, each from three component vectors.
let style_a = StyleVector::new(vec![0.5, 0.3, 0.2], vec![0.1, 0.2, 0.3], vec![0.4, 0.5, 0.6]);
let style_b = StyleVector::new(vec![0.6, 0.4, 0.3], vec![0.2, 0.3, 0.4], vec![0.5, 0.6, 0.7]);
// Compare the two styles by prosody; a distance is never negative.
let distance = prosody_distance(&style_a, &style_b);
assert!(distance >= 0.0);
§References
- Qian, K., et al. (2019). AutoVC: Zero-Shot Voice Style Transfer.
- Wang, Y., et al. (2018). Style Tokens for Expressive Speech Synthesis.
- Chen, M., et al. (2021). AdaSpeech: Adaptive Text to Speech for Custom Voice.
§PMAT Compliance
- Zero unwrap() calls
- All public APIs return Result<T, E> where fallible
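The two compliance rules above can be illustrated with a small sketch. The names `SketchError`, `mean_pitch`, and `pitch_gap` are hypothetical and are not part of this module; the sketch only shows the pattern of returning `Result` from fallible functions and propagating errors with `?` instead of calling `unwrap()`.

```rust
use std::fmt;

/// Hypothetical error type for this sketch (not the crate's real error type).
#[derive(Debug, PartialEq)]
pub enum SketchError {
    /// The input slice had no frames.
    EmptyInput,
    /// A NaN or infinite value was found at the given index.
    NonFinite(usize),
}

impl fmt::Display for SketchError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SketchError::EmptyInput => write!(f, "input contains no frames"),
            SketchError::NonFinite(i) => write!(f, "non-finite value at index {}", i),
        }
    }
}

impl std::error::Error for SketchError {}

/// Mean of a pitch contour. The operation can fail, so it returns `Result`
/// rather than panicking on bad input.
pub fn mean_pitch(contour: &[f32]) -> Result<f32, SketchError> {
    if contour.is_empty() {
        return Err(SketchError::EmptyInput);
    }
    if let Some(i) = contour.iter().position(|v| !v.is_finite()) {
        return Err(SketchError::NonFinite(i));
    }
    let sum: f32 = contour.iter().sum();
    Ok(sum / contour.len() as f32)
}

/// Absolute difference between two mean pitches. Errors from `mean_pitch`
/// are propagated with `?`; no `unwrap()` is needed.
pub fn pitch_gap(a: &[f32], b: &[f32]) -> Result<f32, SketchError> {
    Ok((mean_pitch(a)? - mean_pitch(b)?).abs())
}
```

A caller would typically `match` on the result or forward it with `?`, so an empty or corrupt contour surfaces as an error value rather than a panic.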
Structs§
- AutoVcTransfer - AutoVC-based voice style transfer.
- GstEncoder - Global Style Token (GST) based style encoder.
- StyleConfig - Configuration for voice style transfer.
- StyleVector - Voice style vector capturing prosody, timbre, and rhythm.
Traits§
- StyleEncoder - Trait for style encoding from audio.
- StyleTransfer - Trait for voice style transfer.
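As a rough illustration of how an encoder trait and a transfer trait can be composed into the pipeline from the architecture diagram, here is a minimal sketch. The real signatures of StyleEncoder and StyleTransfer are not shown on this page, so the traits `SketchStyleEncoder` and `SketchStyleTransfer`, the toy types `RmsEncoder` and `GainTransfer`, and the function `run_pipeline` are all hypothetical. The "style" here is only RMS energy, and the content-encoding stage is collapsed to passing the source samples through unchanged.

```rust
/// Hypothetical stand-in for a style encoder: reference audio -> style vector.
pub trait SketchStyleEncoder {
    fn encode(&self, reference: &[f32]) -> Result<Vec<f32>, String>;
}

/// Hypothetical stand-in for style transfer: source audio + style -> styled audio.
pub trait SketchStyleTransfer {
    fn transfer(&self, source: &[f32], style: &[f32]) -> Result<Vec<f32>, String>;
}

/// Root-mean-square energy of a signal; fails on empty input.
fn rms(samples: &[f32]) -> Result<f32, String> {
    if samples.is_empty() {
        return Err("audio is empty".to_string());
    }
    let mean_sq = samples.iter().map(|x| x * x).sum::<f32>() / samples.len() as f32;
    Ok(mean_sq.sqrt())
}

/// Toy encoder: the style vector is a single value, the reference's RMS energy.
pub struct RmsEncoder;

impl SketchStyleEncoder for RmsEncoder {
    fn encode(&self, reference: &[f32]) -> Result<Vec<f32>, String> {
        Ok(vec![rms(reference)?])
    }
}

/// Toy transfer: rescale the source so its RMS energy matches the style's.
pub struct GainTransfer;

impl SketchStyleTransfer for GainTransfer {
    fn transfer(&self, source: &[f32], style: &[f32]) -> Result<Vec<f32>, String> {
        let target = *style.first().ok_or("style vector is empty")?;
        let current = rms(source)?;
        if current == 0.0 {
            return Err("source audio is silent".to_string());
        }
        let gain = target / current;
        Ok(source.iter().map(|x| x * gain).collect())
    }
}

/// Wire the stages together: encode the reference, then apply it to the source.
pub fn run_pipeline<E, T>(
    encoder: &E,
    transfer: &T,
    source: &[f32],
    reference: &[f32],
) -> Result<Vec<f32>, String>
where
    E: SketchStyleEncoder,
    T: SketchStyleTransfer,
{
    let style = encoder.encode(reference)?;
    transfer.transfer(source, &style)
}
```

Keeping the encoder and the transfer step behind separate traits means either stage can be swapped independently, for example a GST-style encoder paired with an AutoVC-style decoder.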
Functions§
- average_styles - Average multiple style vectors.
- prosody_distance - Compute prosody distance between two styles.
- style_distance - Compute total style distance (Euclidean).
- style_from_embedding - Create style from speaker embedding (approximate).
- timbre_distance - Compute timbre distance between two styles.
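To make the distance and averaging functions listed above more concrete, the following sketch shows one plausible way to compute them. It is not the crate's implementation: `SketchStyle` and the `sketch_*` functions are hypothetical, the per-component distances are assumed to be Euclidean like the total distance, and all styles are assumed to share the same dimensions (on a length mismatch, the distance helpers silently compare only the overlapping prefix).

```rust
/// Hypothetical style with the three components named in the StyleVector docs.
#[derive(Debug, Clone, PartialEq)]
pub struct SketchStyle {
    pub prosody: Vec<f32>,
    pub timbre: Vec<f32>,
    pub rhythm: Vec<f32>,
}

/// Squared Euclidean distance over the overlapping prefix of two slices.
fn sq_dist(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Euclidean distance between the prosody components only.
pub fn sketch_prosody_distance(a: &SketchStyle, b: &SketchStyle) -> f32 {
    sq_dist(&a.prosody, &b.prosody).sqrt()
}

/// Euclidean distance between the timbre components only.
pub fn sketch_timbre_distance(a: &SketchStyle, b: &SketchStyle) -> f32 {
    sq_dist(&a.timbre, &b.timbre).sqrt()
}

/// Euclidean distance over all three components taken together.
pub fn sketch_style_distance(a: &SketchStyle, b: &SketchStyle) -> f32 {
    let total = sq_dist(&a.prosody, &b.prosody)
        + sq_dist(&a.timbre, &b.timbre)
        + sq_dist(&a.rhythm, &b.rhythm);
    total.sqrt()
}

/// Element-wise mean of several vectors, sized after the first one.
fn mean_vec(vectors: &[&[f32]]) -> Vec<f32> {
    let dim = vectors.first().map_or(0, |v| v.len());
    let n = vectors.len() as f32;
    (0..dim)
        .map(|i| vectors.iter().filter_map(|v| v.get(i)).sum::<f32>() / n)
        .collect()
}

/// Element-wise average of several styles; `None` when the input is empty.
pub fn sketch_average(styles: &[SketchStyle]) -> Option<SketchStyle> {
    if styles.is_empty() {
        return None;
    }
    let prosody: Vec<&[f32]> = styles.iter().map(|s| s.prosody.as_slice()).collect();
    let timbre: Vec<&[f32]> = styles.iter().map(|s| s.timbre.as_slice()).collect();
    let rhythm: Vec<&[f32]> = styles.iter().map(|s| s.rhythm.as_slice()).collect();
    Some(SketchStyle {
        prosody: mean_vec(&prosody),
        timbre: mean_vec(&timbre),
        rhythm: mean_vec(&rhythm),
    })
}
```

Summing squared differences per component and taking a single square root at the end makes the total distance equal to the Euclidean distance over the concatenated vector, so it is never smaller than any single component's distance.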