Module features

Log-mel filterbank (fbank) feature extraction for speaker embeddings.

Typical parameters for ECAPA-TDNN (16 kHz):

  • n_fft = 512
  • win_length = 400 (25 ms)
  • hop_length = 160 (10 ms)
  • n_mels = 80
  • f_min = 20.0, f_max = 7600.0
  • pre_emphasis = 0.97
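
The millisecond figures above follow directly from the 16 kHz sample rate, which the short sketch below makes explicit. It is illustrative only: the helper names (ms_to_samples, num_frames, pre_emphasize) are hypothetical and not part of this module, the frame count assumes no padding or centring, and the pre-emphasis filter assumes the first sample is passed through unchanged. The extractor itself may follow different conventions.

```rust
/// Convert a duration in milliseconds to a sample count at `sample_rate` Hz.
/// (Hypothetical helper, not part of this module.)
fn ms_to_samples(ms: f32, sample_rate: u32) -> usize {
    (ms * sample_rate as f32 / 1000.0).round() as usize
}

/// Number of full frames when the signal is NOT padded (an assumption;
/// the real extractor may pad or centre frames differently).
fn num_frames(num_samples: usize, win_length: usize, hop_length: usize) -> usize {
    if num_samples < win_length {
        0
    } else {
        1 + (num_samples - win_length) / hop_length
    }
}

/// First-order pre-emphasis: y[n] = x[n] - coeff * x[n - 1].
/// Assumption: y[0] = x[0], i.e. the first sample is left untouched.
fn pre_emphasize(samples: &[f32], coeff: f32) -> Vec<f32> {
    let mut out = Vec::with_capacity(samples.len());
    let mut prev = 0.0f32;
    for (i, &x) in samples.iter().enumerate() {
        out.push(if i == 0 { x } else { x - coeff * prev });
        prev = x;
    }
    out
}
```

With these helpers, ms_to_samples(25.0, 16_000) gives the 400-sample window and ms_to_samples(10.0, 16_000) gives the 160-sample hop. Under the no-padding assumption, one second of audio (16 000 samples) yields 98 frames.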

Structs§

FbankConfig
Configuration for log-mel filterbank extraction.
FbankExtractor
Cached log-mel filterbank extractor.

Enums§

FbankError
Error during fbank computation.

Functions§

apply_cmvn
Apply cepstral mean normalization (CMN) to fbank features.
Signature: pub fn apply_cmvn(frames: &[Vec<f32>]) -> Vec<Vec<f32>>
Contract: no precondition (true); postcondition ret.len() == frames.len(), i.e. one output frame per input frame.
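
As a rough illustration of mean normalization, the sketch below subtracts the per-dimension mean, computed over all frames, from every frame. It is not the crate's implementation: cmn_sketch is a hypothetical name, all frames are assumed to have the same length, and only the mean is normalized, as the description states. Whether apply_cmvn also normalizes variance is not stated here.

```rust
/// Subtract the per-dimension mean over all frames (mean normalization only).
/// Hypothetical stand-in for illustration; assumes all frames share one length.
fn cmn_sketch(frames: &[Vec<f32>]) -> Vec<Vec<f32>> {
    if frames.is_empty() {
        return Vec::new();
    }

    // Accumulate the sum of each feature dimension across frames.
    let dim = frames[0].len();
    let mut mean = vec![0.0f32; dim];
    for frame in frames {
        for (m, &v) in mean.iter_mut().zip(frame) {
            *m += v;
        }
    }

    // Turn the sums into means.
    let n = frames.len() as f32;
    for m in mean.iter_mut() {
        *m /= n;
    }

    // Emit one normalized frame per input frame, so the output length
    // always equals the input length.
    frames
        .iter()
        .map(|f| f.iter().zip(&mean).map(|(&v, &m)| v - m).collect())
        .collect()
}
```

For the two frames [1.0, 2.0] and [3.0, 6.0], the per-dimension mean is [2.0, 4.0], so the sketch returns [-1.0, -2.0] and [1.0, 2.0]. The frame count is preserved, matching the postcondition on apply_cmvn.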