Log-mel filterbank (fbank) feature extraction for speaker embeddings.
Typical parameters for ECAPA-TDNN (16 kHz):
- n_fft = 512
- win_length = 400 (25 ms)
- hop_length = 160 (10 ms)
- n_mels = 80
- f_min = 20.0, f_max = 7600.0
- pre_emphasis = 0.97
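As a rough illustration of how these numbers relate, the sketch below collects them in a struct and checks the sample-to-millisecond arithmetic. `ExampleFbankParams` and its methods are hypothetical names invented for this example; they are not the crate's `FbankConfig`, whose fields are not shown here. The frame-count formula assumes no padding at the signal edges, which may differ from what the extractor actually does.

```rust
/// Hypothetical parameter holder for illustration only.
/// This is NOT the crate's `FbankConfig`; field names are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct ExampleFbankParams {
    pub sample_rate: u32,
    pub n_fft: usize,
    pub win_length: usize,
    pub hop_length: usize,
    pub n_mels: usize,
    pub f_min: f32,
    pub f_max: f32,
    pub pre_emphasis: f32,
}

impl ExampleFbankParams {
    /// The typical ECAPA-TDNN values listed above, at 16 kHz.
    pub fn ecapa_16k() -> Self {
        Self {
            sample_rate: 16_000,
            n_fft: 512,
            win_length: 400,
            hop_length: 160,
            n_mels: 80,
            f_min: 20.0,
            f_max: 7600.0,
            pre_emphasis: 0.97,
        }
    }

    /// Convert a length in samples to milliseconds at this sample rate.
    pub fn samples_to_ms(&self, samples: usize) -> f32 {
        samples as f32 * 1000.0 / self.sample_rate as f32
    }

    /// Number of full analysis frames for a signal of `num_samples` samples,
    /// assuming no edge padding (an assumption made for this sketch).
    pub fn num_frames(&self, num_samples: usize) -> usize {
        if num_samples < self.win_length {
            0
        } else {
            1 + (num_samples - self.win_length) / self.hop_length
        }
    }
}
```

Under these assumptions, a 400-sample window at 16 kHz is 25 ms, a 160-sample hop is 10 ms, and one second of audio yields 98 full frames. The 400-sample window is zero-padded up to the 512-point FFT, and f_max = 7600.0 sits below the 8 kHz Nyquist limit.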
Structs§
- FbankConfig - Configuration for log-mel filterbank extraction.
- FbankExtractor - Cached log-mel filterbank extractor.
Enums§
- FbankError - Error during fbank computation.
Functions§
- apply_cmvn - Apply cepstral mean normalization (CMN) to fbank features.
- compute_fbank (Deprecated) - Standalone log-mel filterbank computation.
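To make the mean-normalization step concrete, here is a minimal sketch of per-bin mean subtraction over time. The function name `example_mean_normalize` and the frames-by-bins `Vec<Vec<f32>>` layout are assumptions for illustration; the actual signature and data layout of `apply_cmvn` are not shown on this page and may differ.

```rust
/// Subtract the per-bin mean over time from `features`, in place.
/// Layout assumed for this sketch: outer index = frame, inner index = mel bin.
/// Hypothetical helper; not the crate's `apply_cmvn`.
pub fn example_mean_normalize(features: &mut [Vec<f32>]) {
    // Nothing to do for an empty feature matrix.
    let Some(first) = features.first() else {
        return;
    };
    let n_bins = first.len();
    let n_frames = features.len() as f32;

    // Accumulate the sum of each mel bin across all frames.
    let mut mean = vec![0.0f32; n_bins];
    for frame in features.iter() {
        for (m, v) in mean.iter_mut().zip(frame) {
            *m += *v;
        }
    }

    // Turn the sums into means.
    for m in mean.iter_mut() {
        *m /= n_frames;
    }

    // Subtract the mean of each bin from every frame.
    for frame in features.iter_mut() {
        for (v, m) in frame.iter_mut().zip(&mean) {
            *v -= *m;
        }
    }
}
```

After this step each mel bin has zero mean across the utterance, which removes constant channel offsets in the log domain. The sketch performs mean subtraction only, matching the description above; it does not scale by variance.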