Module features

Log-mel filterbank (fbank) feature extraction for speaker embeddings.

Typical parameters for ECAPA-TDNN (16 kHz):

  • n_fft = 512
  • win_length = 400 (25 ms)
  • hop_length = 160 (10 ms)
  • n_mels = 80
  • f_min = 20.0, f_max = 7600.0
  • pre_emphasis = 0.97
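
The parameters above are related to each other: at 16 kHz, 25 ms and 10 ms correspond to 400 and 160 samples, and 512 is the next power of two at or above the window length. The sketch below illustrates that arithmetic together with a common form of pre-emphasis. The helper names (`ms_to_samples`, `num_frames`, `pre_emphasize`) are hypothetical and are not part of this module's API; the frame count assumes no edge padding, and the extractor's actual framing and pre-emphasis conventions may differ.

```rust
// Illustrative helpers only; none of these names come from this module.

/// Convert a duration in milliseconds to a sample count.
fn ms_to_samples(ms: usize, sample_rate: usize) -> usize {
    sample_rate * ms / 1000
}

/// Number of full frames that fit in `num_samples`.
/// Assumption: no padding at the edges (only complete windows are counted).
fn num_frames(num_samples: usize, win_length: usize, hop_length: usize) -> usize {
    if num_samples < win_length {
        0
    } else {
        1 + (num_samples - win_length) / hop_length
    }
}

/// Pre-emphasis filter: y[t] = x[t] - coeff * x[t - 1].
/// Assumption: the first sample is passed through unchanged.
fn pre_emphasize(signal: &[f32], coeff: f32) -> Vec<f32> {
    let mut out = Vec::with_capacity(signal.len());
    let mut prev: Option<f32> = None;
    for &x in signal {
        out.push(match prev {
            Some(p) => x - coeff * p,
            None => x,
        });
        prev = Some(x);
    }
    out
}
```

Under these assumptions, one second of 16 kHz audio (16 000 samples) yields 98 frames, and `400usize.next_power_of_two()` gives the FFT size of 512. Note also that f_max = 7600.0 stays below the 8000 Hz Nyquist frequency.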

Structs

FbankConfig
Configuration for log-mel filterbank extraction.
FbankExtractor
Cached log-mel filterbank extractor.

Enums

FbankError
Error during fbank computation.

Functions

apply_cmvn
Apply cepstral mean normalization (CMN) to fbank features.
compute_fbank (deprecated)
Standalone log-mel filterbank computation.
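
As a rough illustration of what mean normalization does to fbank features, the sketch below subtracts the per-bin mean over time from every frame. It is not the signature or implementation of `apply_cmvn`: the function name `cmn_in_place` and the frames-by-bins `Vec<Vec<f32>>` layout are assumptions made for this example, and no variance normalization is applied.

```rust
/// Hypothetical sketch of mean normalization over time.
/// Assumption: `features` is laid out as [frame][mel_bin] and every
/// frame has the same number of bins.
fn cmn_in_place(features: &mut [Vec<f32>]) {
    let n_frames = features.len();
    if n_frames == 0 {
        return;
    }
    let n_bins = features[0].len();

    // Accumulate the sum of each mel bin across all frames.
    let mut mean = vec![0.0f32; n_bins];
    for frame in features.iter() {
        for (m, &v) in mean.iter_mut().zip(frame.iter()) {
            *m += v;
        }
    }
    // Turn the sums into means.
    for m in mean.iter_mut() {
        *m /= n_frames as f32;
    }

    // Subtract the per-bin mean from every frame.
    for frame in features.iter_mut() {
        for (v, &m) in frame.iter_mut().zip(mean.iter()) {
            *v -= m;
        }
    }
}
```

After the call, each mel bin has zero mean across the utterance, which removes constant channel or gain offsets in the log domain.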