whisperforge-diarize 0.4.0

Speaker diarization for speech transcription via embedding clustering

Homepage: bevsxyz/WhisperForge
Dependencies: anyhow ^1.0

Quick Links

Full Documentation: WhisperForge Repository
Architecture: Overview

Features

Speaker embedding extraction
Cosine similarity clustering
SPEAKER_NN label assignment
Configurable similarity threshold
Works with SRT and JSON output
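
To make the clustering idea concrete, here is a minimal sketch of threshold-based cosine similarity clustering. It is not the crate's implementation: the function names `cosine_similarity` and `assign_speakers`, the greedy running-mean centroid strategy, and the threshold value are all assumptions chosen for illustration. Each embedding joins the most similar existing cluster when the similarity meets the threshold, and otherwise starts a new speaker.

```rust
/// Cosine similarity between two embeddings of equal dimension.
/// Returns 0.0 if either vector has zero norm.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "embeddings must have the same dimension");
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Hypothetical greedy clustering: returns one label per input embedding.
/// Each cluster is stored as (mean embedding, number of members).
fn assign_speakers(embeddings: &[Vec<f32>], threshold: f32) -> Vec<String> {
    let mut centroids: Vec<(Vec<f32>, usize)> = Vec::new();
    let mut labels = Vec::with_capacity(embeddings.len());

    for emb in embeddings {
        // Find the existing cluster whose centroid is most similar.
        let mut best: Option<(usize, f32)> = None;
        for (i, (centroid, _)) in centroids.iter().enumerate() {
            let sim = cosine_similarity(emb, centroid);
            if best.map_or(true, |(_, s)| sim > s) {
                best = Some((i, sim));
            }
        }

        let id = match best {
            // Similar enough: join the cluster and update its running mean.
            Some((i, sim)) if sim >= threshold => {
                let (centroid, count) = &mut centroids[i];
                for (c, x) in centroid.iter_mut().zip(emb) {
                    *c = (*c * *count as f32 + x) / (*count as f32 + 1.0);
                }
                *count += 1;
                i
            }
            // Otherwise this embedding starts a new speaker cluster.
            _ => {
                centroids.push((emb.clone(), 1));
                centroids.len() - 1
            }
        };
        labels.push(format!("SPEAKER_{id}"));
    }
    labels
}
```

Calling `assign_speakers(&embeddings, 0.8)` on per-segment embeddings yields one label per segment. A higher threshold tends to split voices into more speakers, while a lower one tends to merge them; greedy assignment also depends on segment order, which a production clusterer may handle differently.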

Usage

use std::path::Path;

use whisperforge_core::{Model, WhisperConfig};
use whisperforge_diarize::DiarizationConfig;

// The `?` below assumes an enclosing function that returns a Result.
let config = WhisperConfig::tiny_en();
let model = Model::load(Path::new("models/tiny_en_converted"))?;

// From the CLI, pass the --diarize flag instead:
// wf -a audio.wav -m tiny_en_converted --diarize

CLI Integration

The CLI applies diarization labels automatically when the --diarize flag is passed:

wf -a audio.wav -m tiny_en_converted --diarize --output-format srt -o output.srt

Output includes speaker labels:

1
00:00:00,000 --> 00:00:05,000
SPEAKER_0: Hello, how are you?

2
00:00:05,000 --> 00:00:10,000
SPEAKER_1: I'm doing great, thanks for asking.
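
Downstream tools often need to separate the speaker label from the spoken text. The following is a small sketch that assumes the `SPEAKER_<digits>: ` prefix shown in the sample output above; `split_speaker_label` is a hypothetical helper, not part of any WhisperForge crate.

```rust
/// Splits a cue line such as "SPEAKER_0: Hello" into ("SPEAKER_0", "Hello").
/// Returns None when the line does not start with a `SPEAKER_<digits>: ` prefix.
fn split_speaker_label(line: &str) -> Option<(&str, &str)> {
    // Split at the first ": " separator between label and text.
    let (label, text) = line.split_once(": ")?;
    // The label must be "SPEAKER_" followed by at least one ASCII digit.
    let id = label.strip_prefix("SPEAKER_")?;
    if id.is_empty() || !id.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    Some((label, text))
}
```

For the first cue above, `split_speaker_label("SPEAKER_0: Hello, how are you?")` returns the label `SPEAKER_0` and the text `Hello, how are you?`. Lines without a matching prefix return `None`, so unlabeled cues pass through unchanged.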

See Also

whisperforge-core — Library
whisperforge — CLI binary (wf); wf convert ports HuggingFace safetensors
whisperforge-align — VAD & SRT

For full documentation, visit the WhisperForge repository.