Crate oxiwhisper

§OxiWhisper

Pure Rust Whisper speech-to-text inference engine with zero C/C++ dependencies.

OxiWhisper loads GGML-format Whisper models and transcribes audio to text, supporting quantized inference (Q4_0, Q5_0, Q8_0), streaming, beam search, word-level timestamps, and SIMD-accelerated kernels (AVX2, NEON, WASM simd128).

§Quick Start

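use oxiwhisper::{WhisperModel, TranscribeOptions};
use std::path::Path;

// Assumes the crate's error types convert into Box<dyn std::error::Error>.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load a GGML-format model and decode the WAV file into samples.
    let model = WhisperModel::from_file(Path::new("ggml-tiny.bin"))?;
    let audio = oxiwhisper::audio::load_wav(Path::new("audio.wav"))?;
    // Transcribe with the default options and print the result.
    let text = model.transcribe(&audio, &TranscribeOptions::default())?;
    println!("{text}");
    Ok(())
}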

Re-exports§

pub use types::*;

Modules§

attention
audio
Pure Rust WAV file loader for oxiwhisper.
beam_search
Beam search decoder for Whisper.
decode_utils
Helper functions for Whisper decoder: logit manipulation, sampling, n-gram blocking.
decoder
Core Whisper text decoder: forward pass, greedy/sample decoding, language detection.
dtw
Dynamic Time Warping (DTW) for word-level timestamp alignment.
encoder
fft
FFT module backed by OxiFFT. Provides Complex type and FFT functions used throughout the codebase.
hallucination
Hallucination detection via character entropy and compression ratio analysis.
linear
mel
mel_filters
Programmatic generation of the Whisper mel filter bank.
model
quantize
GGML quantization support: Q4_0, Q5_0, and Q8_0 block quantization and dequantization.
stream
Streaming transcription support for incremental audio processing.
subtitle
Subtitle export: SRT and WebVTT formats.
tensor
tokenizer
types
Public types, error handling, and validation for oxiwhisper.
vad
Voice Activity Detection (VAD) module.
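
To make the block quantization named in the quantize module concrete, here is a minimal standalone sketch of the Q8_0 scheme as GGML defines it: values are grouped into blocks of 32, each block stores one scale, and every value becomes a signed 8-bit integer. It does not use or describe oxiwhisper's own API. The names BlockQ8, quantize_q8_0, and dequantize_q8_0 are hypothetical, and the scale is kept as f32 for simplicity, whereas GGML stores it as a 16-bit float.

```rust
/// Number of values per Q8_0 block in the GGML format.
const BLOCK_SIZE: usize = 32;

/// One quantized block: a per-block scale plus 32 signed 8-bit values.
/// Simplification (assumption of this sketch): the scale is an f32 here,
/// while GGML stores it as an f16.
#[derive(Debug, Clone, PartialEq)]
struct BlockQ8 {
    scale: f32,
    quants: [i8; BLOCK_SIZE],
}

/// Quantize a slice whose length is a multiple of 32 into Q8_0-style blocks.
fn quantize_q8_0(values: &[f32]) -> Vec<BlockQ8> {
    assert!(
        values.len() % BLOCK_SIZE == 0,
        "input length must be a multiple of the block size"
    );
    values
        .chunks_exact(BLOCK_SIZE)
        .map(|chunk| {
            // The largest magnitude in the block maps to +/-127.
            let amax = chunk.iter().fold(0.0f32, |m, v| m.max(v.abs()));
            let scale = amax / 127.0;
            // Guard against division by zero for an all-zero block.
            let inv = if scale != 0.0 { 1.0 / scale } else { 0.0 };
            let mut quants = [0i8; BLOCK_SIZE];
            for (q, v) in quants.iter_mut().zip(chunk) {
                *q = (v * inv).round() as i8;
            }
            BlockQ8 { scale, quants }
        })
        .collect()
}

/// Reconstruct approximate f32 values: each quant times its block's scale.
fn dequantize_q8_0(blocks: &[BlockQ8]) -> Vec<f32> {
    blocks
        .iter()
        .flat_map(|b| b.quants.iter().map(move |&q| f32::from(q) * b.scale))
        .collect()
}
```

Round-tripping a slice through quantize_q8_0 and dequantize_q8_0 reproduces each value to within half of its block's scale, which is the trade-off behind the smaller model files. Q4_0 and Q5_0 follow the same per-block idea with fewer bits per value.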

Structs§

WhisperModel