whisperforge-core
GPU-accelerated Whisper model inference with streaming audio, quantization, and KV-cached decoding.
Quick Links
- Full Documentation: WhisperForge Repository
- Installation: See Installation Guide
- Examples: Library Usage
Features
- All Whisper model sizes (tiny.en through large-v2/v3)
- GPU acceleration via WGPU (Vulkan/DX12/Metal)
- burn-flex backend: CPU + automatic GPU dispatch
- INT8 quantization (~4× compression)
- Streaming audio pipeline with resampling
- KV-cached decoder: O(n) attention cost per generated token
- Per-token timestamps via cross-attention
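
The "~4× compression" figure for INT8 follows from storing each weight in one byte instead of the four bytes of an `f32`. A minimal sketch of symmetric per-tensor quantization shows where that ratio comes from. The `QuantizedTensor`, `quantize`, and `dequantize` names are hypothetical and used only for illustration; they are not the whisperforge-core API, and the crate may use a different scheme (for example per-channel scales).

```rust
/// Hypothetical container for illustration only; not a whisperforge-core type.
struct QuantizedTensor {
    values: Vec<i8>, // one byte per weight
    scale: f32,      // single scale shared by the whole tensor
}

/// Symmetric per-tensor quantization: q = round(x / scale), scale = max|x| / 127.
fn quantize(weights: &[f32]) -> QuantizedTensor {
    let max_abs = weights.iter().fold(0.0_f32, |m, w| m.max(w.abs()));
    // Fall back to a scale of 1.0 so an all-zero tensor does not divide by zero.
    let scale = if max_abs > 0.0 { max_abs / 127.0 } else { 1.0 };
    let values = weights
        .iter()
        .map(|w| (w / scale).round().clamp(-127.0, 127.0) as i8)
        .collect();
    QuantizedTensor { values, scale }
}

/// Approximate reconstruction of the original weights: x ≈ q * scale.
fn dequantize(q: &QuantizedTensor) -> Vec<f32> {
    q.values.iter().map(|&v| f32::from(v) * q.scale).collect()
}

fn main() {
    let weights = vec![0.5_f32, -1.0, 0.25, 0.0];
    let q = quantize(&weights);
    let restored = dequantize(&q);

    let f32_bytes = weights.len() * std::mem::size_of::<f32>();
    let i8_bytes = q.values.len() * std::mem::size_of::<i8>();
    println!("{f32_bytes} bytes as f32, {i8_bytes} bytes as i8");

    for (w, r) in weights.iter().zip(&restored) {
        println!("{w:+.4} -> {r:+.4}");
    }
}
```

Four `f32` weights occupy 16 bytes and their INT8 form occupies 4 bytes, plus the shared scale. That per-tensor overhead is why the ratio is approximately, rather than exactly, 4×. The rounding error per weight is bounded by half the scale.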
Usage
use ;
use std::path::Path;
let config = tiny_en;
let model = load?;
let transcript = model.transcribe?;
println!;
See Also
- whisperforge-cli: Command-line binary
- whisperforge-convert: Model converter
- whisperforge-align: VAD & SRT output
- whisperforge-diarize: Speaker diarization
For full documentation, visit the WhisperForge repository.