Trait Embedder

pub trait Embedder: Send + Sync {
    // Required methods
    fn dim(&self) -> usize;
    fn embed(&self, audio: &[f32]) -> Result<Vec<f32>, EmbedderError>;

    // Provided method
    fn embed_batch(
        &self,
        audios: &[&[f32]],
    ) -> Result<Vec<Vec<f32>>, EmbedderError> { ... }
}

Speaker embedding extractor — turns a slice of 16 kHz mono audio into a fixed-dimension embedding vector. Implementations are expected to L2-normalize their output so cosine similarity is a meaningful metric downstream.
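The reason for the normalization expectation can be sketched in a few lines: once both vectors have unit L2 norm, cosine similarity reduces to a plain dot product. The helper names `l2_normalize` and `cosine_unit` below are illustrative only and are not part of polyvoice.

```rust
/// Scale `v` in place to unit L2 norm. An all-zero vector is left untouched.
/// (Hypothetical helper, not a polyvoice API.)
fn l2_normalize(v: &mut [f32]) {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
}

/// Cosine similarity of two vectors that are already unit-norm:
/// the denominator |a| * |b| is 1, so only the dot product remains.
/// (Hypothetical helper, not a polyvoice API.)
fn cosine_unit(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "embeddings must share a dimension");
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}
```

Two embeddings pointing in the same direction score close to 1.0, and orthogonal ones score close to 0.0, regardless of the loudness of the original audio.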

In v1.0 (M2) the polyvoice crate introduces Embedder as the canonical trait. The legacy EmbeddingExtractor trait and its implementations (FbankOnnxExtractor, OnnxEmbeddingExtractor, DummyExtractor) remain available unchanged — M6 will deprecate them.

Required Methods

fn dim(&self) -> usize

Output dimension of this embedder. Constant per instance.

fn embed(&self, audio: &[f32]) -> Result<Vec<f32>, EmbedderError>

Compute an embedding for one audio segment.

Requires: audio is 16 kHz mono PCM. Guarantees on Ok: result.len() == self.dim() and the vector is L2-normalized (|sqrt(sum(x²)) − 1.0| < 1e-3).
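As a rough illustration of this contract, the sketch below implements the two required methods for a toy embedder and checks both guarantees. Everything in it is an assumption made for the example: the trait and `EmbedderError` are redeclared locally so the block compiles on its own, the `InvalidInput` variant is hypothetical (the real error type's variants are not shown here), and `ChunkEnergyEmbedder` and `check_contract` are invented names, not polyvoice items.

```rust
// Local stand-ins so the sketch is self-contained. The real `Embedder` and
// `EmbedderError` come from polyvoice; `InvalidInput` is a hypothetical variant.
#[derive(Debug, PartialEq)]
enum EmbedderError {
    InvalidInput,
}

trait Embedder: Send + Sync {
    fn dim(&self) -> usize;
    fn embed(&self, audio: &[f32]) -> Result<Vec<f32>, EmbedderError>;
}

/// Toy embedder (hypothetical): mean absolute amplitude over up to `dim`
/// consecutive chunks, then L2-normalized. Not a real speaker model.
struct ChunkEnergyEmbedder {
    dim: usize,
}

impl Embedder for ChunkEnergyEmbedder {
    fn dim(&self) -> usize {
        self.dim
    }

    fn embed(&self, audio: &[f32]) -> Result<Vec<f32>, EmbedderError> {
        if audio.is_empty() || self.dim == 0 {
            return Err(EmbedderError::InvalidInput);
        }
        // Ceiling division so the audio splits into at most `dim` chunks.
        let chunk = (audio.len() + self.dim - 1) / self.dim;
        let mut out = vec![0.0f32; self.dim];
        for (slot, samples) in out.iter_mut().zip(audio.chunks(chunk)) {
            *slot = samples.iter().map(|s| s.abs()).sum::<f32>() / samples.len() as f32;
        }
        // Normalize to unit L2 norm; silent audio cannot be normalized.
        let norm = out.iter().map(|x| x * x).sum::<f32>().sqrt();
        if norm == 0.0 {
            return Err(EmbedderError::InvalidInput);
        }
        for x in out.iter_mut() {
            *x /= norm;
        }
        Ok(out)
    }
}

/// Check the two documented guarantees on `Ok`: the output length equals
/// `dim()` and the L2 norm is within 1e-3 of 1.0.
fn check_contract(e: &dyn Embedder, audio: &[f32]) -> Result<(), String> {
    let v = e.embed(audio).map_err(|err| format!("{:?}", err))?;
    if v.len() != e.dim() {
        return Err(format!("length {} != dim {}", v.len(), e.dim()));
    }
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if (norm - 1.0).abs() >= 1e-3 {
        return Err(format!("norm {} is not within 1e-3 of 1.0", norm));
    }
    Ok(())
}
```

A check like `check_contract` could sit in a unit test for any implementor, since it depends only on the trait object and not on the concrete model.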