Log-mel filterbank (fbank) feature extraction for speaker embeddings.
Typical parameters for ECAPA-TDNN (16 kHz):
- n_fft = 512
- win_length = 400 (25 ms)
- hop_length = 160 (10 ms)
- n_mels = 80
- f_min = 20.0, f_max = 7600.0
- pre_emphasis = 0.97
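The millisecond values above follow directly from the 16 kHz sample rate. The sketch below shows that arithmetic and a frame count for a given signal length. `ms_to_samples` and `num_frames` are hypothetical helpers, not part of this module, and the frame count assumes no padding at the signal edges, which may differ from what the extractor actually does.

```rust
/// Hypothetical helper: convert a duration in milliseconds to a sample count.
fn ms_to_samples(ms: u32, sample_rate: u32) -> usize {
    (sample_rate as u64 * ms as u64 / 1000) as usize
}

/// Hypothetical helper: number of full analysis frames in a signal.
/// Assumes no padding: a frame is counted only if the whole window fits.
fn num_frames(num_samples: usize, win_length: usize, hop_length: usize) -> usize {
    if hop_length == 0 || num_samples < win_length {
        return 0;
    }
    1 + (num_samples - win_length) / hop_length
}
```

With these helpers, `ms_to_samples(25, 16_000)` gives the 400-sample window and `ms_to_samples(10, 16_000)` gives the 160-sample hop. Under the no-padding assumption, one second of audio (16,000 samples) yields `num_frames(16_000, 400, 160)` frames, each with n_mels = 80 coefficients.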
Structs§
- FbankConfig - Configuration for log-mel filterbank extraction.
- FbankExtractor - Cached log-mel filterbank extractor.
Enums§
- FbankError - Error during fbank computation.
Functions§
- apply_cmvn - Apply cepstral mean normalization (CMN) to fbank features.
pub fn apply_cmvn(frames: &[Vec<f32>]) -> Vec<Vec<f32>>
The returned vector has the same number of frames as the input (ret.len() == frames.len()).
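To make the normalization concrete, here is a minimal sketch of per-utterance mean normalization with the same input and output types as `apply_cmvn`. It is an illustration only, not this crate's implementation: `cmn_sketch` is a hypothetical name, and the sketch assumes every frame has the same dimension and that the mean is taken over all frames of the input.

```rust
/// Illustrative mean normalization (hypothetical, not the crate's code).
/// Subtracts the per-dimension mean, computed over all frames, from each frame.
/// Assumes all frames share the dimension of the first frame.
fn cmn_sketch(frames: &[Vec<f32>]) -> Vec<Vec<f32>> {
    if frames.is_empty() {
        return Vec::new();
    }
    let dim = frames[0].len();

    // Accumulate the sum of each feature dimension across frames.
    let mut mean = vec![0.0f32; dim];
    for frame in frames {
        for (m, &x) in mean.iter_mut().zip(frame.iter()) {
            *m += x;
        }
    }

    // Turn the sums into means.
    let n = frames.len() as f32;
    for m in mean.iter_mut() {
        *m /= n;
    }

    // Subtract the mean from every frame; the frame count is preserved.
    frames
        .iter()
        .map(|frame| {
            frame
                .iter()
                .zip(mean.iter())
                .map(|(&x, &m)| x - m)
                .collect()
        })
        .collect()
}
```

For the two frames `[1.0, 2.0]` and `[3.0, 6.0]`, the per-dimension mean is `[2.0, 4.0]`, so the sketch returns `[-1.0, -2.0]` and `[1.0, 2.0]`. The output has one frame per input frame, matching the length guarantee stated for `apply_cmvn`.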