whisperforge-diarize
Speaker diarization for speech transcription via embedding clustering.
Quick Links
- Full Documentation: WhisperForge Repository
- Architecture: Overview
Features
- Speaker embedding extraction
- Cosine similarity clustering
- SPEAKER_NN label assignment
- Configurable similarity threshold
- Works with SRT and JSON output
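The clustering features listed above can be illustrated with a minimal sketch. It assumes each audio segment already has an embedding vector, and it uses a simple greedy scheme: a segment joins the most similar existing speaker centroid when the cosine similarity meets the threshold, otherwise it starts a new speaker. The names `cosine_similarity` and `assign_speakers` are hypothetical and are not part of the whisperforge-diarize API; the crate's actual clustering strategy may differ.

```rust
/// Cosine similarity between two equal-length embedding vectors.
/// Returns 0.0 if either vector has zero norm.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot / (norm_a * norm_b)
}

/// Hypothetical greedy clustering (an assumption, not the crate's method):
/// each embedding joins the most similar centroid if the similarity is at
/// least `threshold`; otherwise it opens a new speaker cluster.
fn assign_speakers(embeddings: &[Vec<f32>], threshold: f32) -> Vec<String> {
    let mut centroids: Vec<Vec<f32>> = Vec::new();
    let mut counts: Vec<usize> = Vec::new();
    let mut labels = Vec::with_capacity(embeddings.len());

    for emb in embeddings {
        // Find the closest existing centroid, if any.
        let mut best: Option<(usize, f32)> = None;
        for (i, centroid) in centroids.iter().enumerate() {
            let sim = cosine_similarity(emb, centroid);
            if best.map_or(true, |(_, s)| sim > s) {
                best = Some((i, sim));
            }
        }

        let speaker = match best {
            Some((i, sim)) if sim >= threshold => {
                // Update the centroid as a running mean of its members.
                let n = counts[i] as f32;
                for (c, x) in centroids[i].iter_mut().zip(emb) {
                    *c = (*c * n + x) / (n + 1.0);
                }
                counts[i] += 1;
                i
            }
            _ => {
                // No centroid is similar enough: start a new speaker.
                centroids.push(emb.clone());
                counts.push(1);
                centroids.len() - 1
            }
        };
        labels.push(format!("SPEAKER_{}", speaker));
    }
    labels
}
```

In this sketch a higher threshold splits the audio into more speakers, while a lower one merges similar voices into a single label.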
Usage
use DiarizationConfig;
use ;
let config = tiny_en;
let model = load?;
// With CLI: use --diarize flag
// wf -a audio.wav -m tiny_en_converted --diarize
CLI Integration
When run with the --diarize flag, the CLI applies diarization labels automatically:
wf -a audio.wav -m tiny_en_converted --diarize
The output includes speaker labels:
1
00:00:00,000 --> 00:00:05,000
SPEAKER_0: Hello, how are you?
2
00:00:05,000 --> 00:00:10,000
SPEAKER_1: I'm doing great, thanks for asking.
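As a rough illustration of how labeled segments map onto the SRT layout shown above, the following sketch renders a list of segments into cues with a speaker prefix. The `LabeledSegment` struct and the `srt_timestamp` and `to_srt` functions are hypothetical names introduced for this example; they are assumptions, not the types or functions the CLI actually uses.

```rust
/// Hypothetical segment type for illustration only.
struct LabeledSegment {
    start_ms: u64,
    end_ms: u64,
    speaker: String,
    text: String,
}

/// Format a millisecond offset as an SRT timestamp (HH:MM:SS,mmm).
fn srt_timestamp(ms: u64) -> String {
    let hours = ms / 3_600_000;
    let minutes = (ms / 60_000) % 60;
    let seconds = (ms / 1_000) % 60;
    let millis = ms % 1_000;
    format!("{:02}:{:02}:{:02},{:03}", hours, minutes, seconds, millis)
}

/// Render segments as SRT cues, prefixing each line with its speaker label.
fn to_srt(segments: &[LabeledSegment]) -> String {
    let mut out = String::new();
    for (i, seg) in segments.iter().enumerate() {
        // SRT cue numbers start at 1; a blank line separates cues.
        out.push_str(&format!(
            "{}\n{} --> {}\n{}: {}\n\n",
            i + 1,
            srt_timestamp(seg.start_ms),
            srt_timestamp(seg.end_ms),
            seg.speaker,
            seg.text
        ));
    }
    out
}
```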
See Also
- whisperforge-core: Library
- whisperforge: CLI binary (wf); wf convert ports HuggingFace safetensors
- whisperforge-align: VAD & SRT
For full documentation, visit the WhisperForge repository.