scribble — a small, focused transcription library built on top of Whisper.
§Overview
Scribble provides a clean, idiomatic Rust API for audio transcription, designed to work equally well in CLI tools and long-running services.
At a high level, Scribble wires together:
- Media demuxing and audio decoding (via Symphonia)
- Audio normalization and resampling (mono, 16 kHz)
- Optional Voice Activity Detection (VAD)
- Whisper inference
- Pluggable output encoders (JSON, VTT, etc.)
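The normalization step in the list above (mono, 16 kHz) can be sketched with the standard library alone. This is an illustration under stated assumptions, not Scribble's implementation: the names `downmix_to_mono`, `resample_linear`, and `normalize` are hypothetical, and linear interpolation stands in for whatever resampler the crate actually uses.

```rust
/// Target sample rate for Whisper input, as stated in the overview.
const TARGET_RATE: u32 = 16_000;

/// Average each interleaved frame down to a single channel.
/// (Hypothetical helper; a trailing partial frame is ignored.)
fn downmix_to_mono(interleaved: &[f32], channels: usize) -> Vec<f32> {
    assert!(channels > 0, "channel count must be non-zero");
    interleaved
        .chunks_exact(channels)
        .map(|frame| frame.iter().sum::<f32>() / channels as f32)
        .collect()
}

/// Resample mono audio with linear interpolation.
/// Illustrative only: a production resampler would band-limit the signal
/// before downsampling to avoid aliasing.
fn resample_linear(input: &[f32], from_rate: u32, to_rate: u32) -> Vec<f32> {
    assert!(from_rate > 0 && to_rate > 0, "sample rates must be non-zero");
    if input.is_empty() || from_rate == to_rate {
        return input.to_vec();
    }
    let out_len = (input.len() as u64 * to_rate as u64 / from_rate as u64) as usize;
    let step = from_rate as f64 / to_rate as f64;
    let last = input.len() - 1;
    (0..out_len)
        .map(|i| {
            let pos = i as f64 * step;
            let idx = pos as usize;
            let frac = (pos - idx as f64) as f32;
            let a = input[idx.min(last)];
            let b = input[(idx + 1).min(last)];
            a + (b - a) * frac
        })
        .collect()
}

/// Normalize interleaved PCM to mono f32 at 16 kHz.
fn normalize(interleaved: &[f32], channels: usize, sample_rate: u32) -> Vec<f32> {
    let mono = downmix_to_mono(interleaved, channels);
    resample_linear(&mono, sample_rate, TARGET_RATE)
}
```

Keeping downmixing and resampling as separate functions mirrors the separation of stages described above; each step can be tested or replaced on its own.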
The library emphasizes:
- Explicit control flow
- Streaming-friendly design
- Clear separation of concerns
- Minimal surprises for callers
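As one way to picture what "explicit control flow" and "streaming-friendly" can mean in practice, the sketch below buffers samples and hands fixed-size chunks to a caller-supplied callback. The `Chunker` type and its methods are hypothetical and are not taken from Scribble's API; they only illustrate the pattern.

```rust
/// Buffers incoming samples and emits fixed-size chunks through a callback.
/// (Hypothetical type for illustration.)
struct Chunker {
    chunk_size: usize,
    pending: Vec<f32>,
}

impl Chunker {
    fn new(chunk_size: usize) -> Self {
        assert!(chunk_size > 0, "chunk size must be non-zero");
        Self {
            chunk_size,
            pending: Vec::with_capacity(chunk_size),
        }
    }

    /// Accept a slice of any length and call `on_chunk` once per full chunk.
    /// An error returned by the callback stops processing and is passed
    /// back to the caller, so the caller stays in control of the loop.
    fn push<E>(
        &mut self,
        mut samples: &[f32],
        mut on_chunk: impl FnMut(&[f32]) -> Result<(), E>,
    ) -> Result<(), E> {
        while !samples.is_empty() {
            let need = self.chunk_size - self.pending.len();
            let take = need.min(samples.len());
            self.pending.extend_from_slice(&samples[..take]);
            samples = &samples[take..];
            if self.pending.len() == self.chunk_size {
                on_chunk(&self.pending)?;
                self.pending.clear();
            }
        }
        Ok(())
    }

    /// Emit any remaining samples at end of stream as a final, shorter chunk.
    fn finish<E>(
        &mut self,
        mut on_chunk: impl FnMut(&[f32]) -> Result<(), E>,
    ) -> Result<(), E> {
        if !self.pending.is_empty() {
            on_chunk(&self.pending)?;
            self.pending.clear();
        }
        Ok(())
    }
}
```

Because the callback returns a `Result`, a long-running service can abort a transcription midway without the buffering code needing to know why.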
Most consumers should start with scribble::Scribble.
Modules§
- audio_pipeline
- Audio resampling, downmixing, and chunk emission pipeline. Audio normalization pipeline for Scribble.
- ctx
- Whisper model loading and context management.
- decode
- Codec-level decode helpers. Decoder helpers built on top of Symphonia.
- decoder
- Streaming-friendly audio decoding and normalization helpers. Stream-decode media (audio/video containers) into Whisper-friendly mono f32 at 16 kHz, emitting fixed-size chunks via a callback.
- demux
- Low-level demux helpers (container probing, packet iteration). Demux helpers for Symphonia.
- json_array_encoder
- JSON array encoder.
- logging
- Logging configuration and control.
- opts
- User-configurable transcription options.
- output_type
- Output format selection.
- scribble
- User-facing transcription entry point and orchestration logic. High-level API for running transcriptions with Scribble.
- segment_encoder
- Shared encoder trait definitions.
- segments
- Segment data structures and transcription helpers.
- vad
- Voice Activity Detection (VAD) utilities and policies.
- vtt_encoder
- WebVTT encoder.
- wav
- WAV helpers (primarily for tests and simple inputs).
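To make the relationship between a shared encoder trait and a concrete WebVTT encoder more tangible, here is a minimal sketch. Everything in it is an assumption made for illustration: the `Segment` fields, the `SegmentEncoder` trait methods, and the `VttEncoder` struct are hypothetical and may differ from the actual definitions in these modules. Only the WebVTT output layout (a `WEBVTT` header followed by `HH:MM:SS.mmm --> HH:MM:SS.mmm` cues) follows the published format.

```rust
use std::io::{self, Write};

/// Hypothetical segment type: times in seconds plus the transcribed text.
struct Segment {
    start: f64,
    end: f64,
    text: String,
}

/// Hypothetical encoder trait: open the output, write segments, close it.
trait SegmentEncoder {
    fn begin(&mut self) -> io::Result<()>;
    fn write_segment(&mut self, segment: &Segment) -> io::Result<()>;
    fn finish(&mut self) -> io::Result<()>;
}

/// Format seconds as a WebVTT timestamp (HH:MM:SS.mmm).
/// Negative inputs are clamped to zero.
fn vtt_timestamp(seconds: f64) -> String {
    let total_ms = (seconds.max(0.0) * 1000.0).round() as u64;
    let hours = total_ms / 3_600_000;
    let minutes = total_ms / 60_000 % 60;
    let secs = total_ms / 1000 % 60;
    let millis = total_ms % 1000;
    format!("{:02}:{:02}:{:02}.{:03}", hours, minutes, secs, millis)
}

/// Writes segments as WebVTT cues to any `Write` sink.
struct VttEncoder<W: Write> {
    out: W,
}

impl<W: Write> SegmentEncoder for VttEncoder<W> {
    fn begin(&mut self) -> io::Result<()> {
        // File header, followed by the required blank line.
        writeln!(self.out, "WEBVTT")?;
        writeln!(self.out)
    }

    fn write_segment(&mut self, segment: &Segment) -> io::Result<()> {
        writeln!(
            self.out,
            "{} --> {}",
            vtt_timestamp(segment.start),
            vtt_timestamp(segment.end)
        )?;
        // This sketch does not escape cue text; real cue payloads must not
        // contain "-->" or blank lines.
        writeln!(self.out, "{}", segment.text.trim())?;
        writeln!(self.out)
    }

    fn finish(&mut self) -> io::Result<()> {
        self.out.flush()
    }
}
```

Writing each segment as it arrives, rather than collecting all segments first, lets an encoder of this shape work with streamed transcription output.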