Crate whisper_apr

§Whisper.apr

WASM-first automatic speech recognition engine implementing OpenAI’s Whisper architecture.

§Overview

Whisper.apr is designed from the outset for WASM deployment on the wasm32-unknown-unknown target, using Rust’s WASM toolchain for:

  • 30-40% smaller binary sizes through tree-shaking
  • Native WASM SIMD 128-bit intrinsics without Emscripten overhead
  • Zero-copy audio buffer handling via shared memory
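As an illustration of the SIMD point above, here is a minimal sketch of a dot product written directly against the standard library's `core::arch::wasm32` 128-bit intrinsics, with a scalar fallback for every other target. The function name `dot` is hypothetical and is not part of whisper_apr; the crate's own kernels are provided through its simd module and may look quite different.

```rust
/// SIMD path: compiled only for wasm32 builds with the `simd128` target feature,
/// e.g. when building with `-C target-feature=+simd128`.
#[cfg(all(target_arch = "wasm32", target_feature = "simd128"))]
pub fn dot(a: &[f32], b: &[f32]) -> f32 {
    use core::arch::wasm32::{
        f32x4_add, f32x4_extract_lane, f32x4_mul, f32x4_splat, v128, v128_load,
    };
    assert_eq!(a.len(), b.len());
    let chunks = a.len() / 4;
    let mut acc = f32x4_splat(0.0);
    for i in 0..chunks {
        // SAFETY: indices i*4 .. i*4+3 are in bounds for both slices, and
        // `v128_load` does not require an aligned address.
        let (va, vb) = unsafe {
            (
                v128_load(a.as_ptr().add(i * 4) as *const v128),
                v128_load(b.as_ptr().add(i * 4) as *const v128),
            )
        };
        acc = f32x4_add(acc, f32x4_mul(va, vb));
    }
    // Horizontal sum of the four accumulator lanes.
    let mut sum = f32x4_extract_lane::<0>(acc)
        + f32x4_extract_lane::<1>(acc)
        + f32x4_extract_lane::<2>(acc)
        + f32x4_extract_lane::<3>(acc);
    // Scalar tail for lengths that are not a multiple of four.
    for i in (chunks * 4)..a.len() {
        sum += a[i] * b[i];
    }
    sum
}

/// Scalar fallback for native targets and for WASM builds without `simd128`.
#[cfg(not(all(target_arch = "wasm32", target_feature = "simd128")))]
pub fn dot(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len());
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}
```

Selecting the implementation with `cfg` keeps a single call site (`dot(&x, &y)`) while letting the compiler drop the unused path entirely.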

§Quick Start

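use whisper_apr::{WhisperApr, TranscribeOptions};

// `audio_samples` must be supplied by the caller; it is not defined in this snippet.
let whisper = WhisperApr::load("base.apr")?;
let result = whisper.transcribe(&audio_samples, TranscribeOptions::default())?;
// Print the full transcribed text.
println!("{}", result.text);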
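The Quick Start leaves `audio_samples` undefined. The sketch below shows one way such a buffer might be prepared, assuming that `transcribe` accepts mono `f32` samples normalized to [-1.0, 1.0]; that input format is an assumption here and is not confirmed by this page. Whisper models themselves are trained on 16 kHz audio, so resampling may also be needed and is not shown. The helper names `pcm16_to_f32` and `downmix_stereo` are hypothetical and are not part of whisper_apr.

```rust
/// Convert signed 16-bit PCM samples to f32 in the range [-1.0, 1.0).
/// Hypothetical helper, not a whisper_apr API.
pub fn pcm16_to_f32(pcm: &[i16]) -> Vec<f32> {
    pcm.iter().map(|&s| f32::from(s) / 32768.0).collect()
}

/// Downmix interleaved stereo (L, R, L, R, ...) to mono by averaging each pair.
/// A trailing unpaired sample, if any, is ignored.
/// Hypothetical helper, not a whisper_apr API.
pub fn downmix_stereo(interleaved: &[f32]) -> Vec<f32> {
    interleaved
        .chunks_exact(2)
        .map(|pair| 0.5 * (pair[0] + pair[1]))
        .collect()
}
```

Under those assumptions, a stereo 16-bit capture could be prepared with `downmix_stereo(&pcm16_to_f32(&raw))` before being passed to `transcribe`.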

§Features

  • std (default): Standard library support
  • wasm: WASM bindings via wasm-bindgen
  • simd: SIMD acceleration via trueno
  • tracing: Performance tracing via renacer

Re-exports§

pub use error::WhisperError;
pub use error::WhisperResult;

Modules§

audio
Audio preprocessing module
backend
Backend abstraction and automatic selection (WAPR-140 to WAPR-141)
benchmark_generated
Benchmark infrastructure for multi-backend performance comparison
detection
Language detection module
diarization
Speaker diarization module: who spoke when (WAPR-150 to WAPR-153)
error
Error types for Whisper.apr
format
APR Model Format (v2)
gpu
WebGPU compute backend for accelerated inference (WAPR-120 to WAPR-143)
inference
Inference engine
memory
Memory management for Whisper inference
model
Model loading and inference
parallel
Unified parallelism abstraction for CLI and WASM (§11.3.2)
probe
Activation probing infrastructure for forward-pass debugging (WAPR-MOONSHINE-013)
progress
Progress tracking and callbacks
publish
HuggingFace Hub publishing module (WAPR-PUB-001)
realizar_inference
Re-exports production inference primitives from realizar.
simd
SIMD-accelerated operations via trueno
timestamps
Timestamp extraction and word-level alignment (WAPR-160 to WAPR-163)
tokenizer
BPE tokenizer
trace
Tracing utilities for pipeline instrumentation
vad
Voice Activity Detection (VAD)
verify
Pre-publish verification module (WAPR-PUB-001)
vocabulary
Vocabulary and hotword customization; custom vocabulary fine-tuning (WAPR-170 to WAPR-173)

Macros§

trace_enter
Enter a tracing span (no-op when tracing feature is disabled)
trace_event
Log a tracing event (no-op when tracing feature is disabled)
trace_span
Create a tracing span (no-op when tracing feature is disabled)
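The "no-op when tracing feature is disabled" behaviour described for these macros is commonly achieved by defining the same macro twice under opposite `cfg` conditions. The sketch below shows that general pattern only; `demo_trace_event!`, `EVENTS`, and `process` are hypothetical names, and the actual definitions of `trace_enter`, `trace_event`, and `trace_span` in whisper_apr may differ.

```rust
use std::sync::atomic::AtomicUsize;

/// Counts emitted events so the effect of the feature gate can be observed.
pub static EVENTS: AtomicUsize = AtomicUsize::new(0);

/// With the `tracing` feature enabled: record and print the event.
#[cfg(feature = "tracing")]
macro_rules! demo_trace_event {
    ($($arg:tt)*) => {{
        EVENTS.fetch_add(1, std::sync::atomic::Ordering::Relaxed);
        eprintln!($($arg)*);
    }};
}

/// With the feature disabled: expand to an empty block, so the arguments
/// are never evaluated and no code is generated for the call.
#[cfg(not(feature = "tracing"))]
macro_rules! demo_trace_event {
    ($($arg:tt)*) => {{}};
}

/// Example call site; it compiles identically in both configurations.
pub fn process(n: u32) -> u32 {
    demo_trace_event!("processing {}", n);
    n * 2
}
```

Because the disabled variant discards its arguments at expansion time, call sites can stay in hot loops without paying for formatting when tracing is off.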

Structs§

BatchTranscriptionResult
Result of batch transcription (WAPR-083)
PartialTranscriptionResult
Result of partial transcription during streaming (WAPR-101)
ProfilingStats
Performance profiling statistics (WAPR-PERF-004)
Segment
A timestamped segment of transcription
StreamingSession
Streaming transcription session (WAPR-101)
SummarizeOptions
Options for LFM2 summarization (Phase 3 API - Section 18.5)
TranscribeOptions
Options for transcription
TranscribeSummaryResult
Result of transcription with summarization (Phase 3 API - Section 18.5)
TranscriptionResult
Result of transcription
VadSpeechSegment
A speech segment detected by VAD (WAPR-093)
VadTranscriptionResult
Result of VAD-triggered transcription (WAPR-093)
WhisperApr
Main Whisper ASR engine

Enums§

DecodingStrategy
Decoding strategy for transcription
ModelType
Whisper model configuration
Task
Task type for transcription