//! # whisper-guard
//!
//! The post-processing layer Whisper should have shipped with.
//!
//! [whisper.cpp](https://github.com/ggerganov/whisper.cpp) and its bindings
//! ([whisper-rs](https://crates.io/crates/whisper-rs), and many forks) hallucinate in
//! predictable, well-documented ways: looping on silence, generating phantom `[music]`
//! tags, drifting into a foreign script when the audio is too quiet, gluing voice
//! commands like *"stop recording"* onto the end of every transcript.
//!
//! whisper-guard catches the common patterns, with defaults tuned in production by
//! [Minutes](https://github.com/silverstein/minutes), an OSS meeting-memory tool that
//! processes meeting and voice-memo audio across multiple languages.
//!
//! ## Quick start
//!
//! If you already have `Vec<String>` segments from a transcription engine
//! (`whisper_state.get_segment(i).to_str()`, a parakeet sidecar, a fork - anything):
//!
//! ```
//! use whisper_guard::clean_segments;
//!
//! let raw = vec![
//!     "Thank you.".to_string(),
//!     "Thank you.".to_string(),
//!     "Thank you.".to_string(),
//!     "Thank you.".to_string(),
//!     "What's the budget for this quarter?".to_string(),
//! ];
//!
//! let (cleaned, stats) = clean_segments(&raw);
//! assert!(cleaned.iter().any(|s| s.contains("budget")));
//! println!("{}", stats.summary());
//! // → whisper-guard: 5 → 3 segments (2 removed)
//! //   (the loop collapses to first-occurrence + an annotation line; see CleanOptions)
//! ```
//!
//! That's the whole API for the common case. No builders, no setup, no engine
//! coupling. Six guards run in a fixed order; opt out individually via
//! [`CleanOptions`] when you have a good reason.
//!
//! ## Works with any whisper variant
//!
//! whisper-guard's segment/audio modules are **pure Rust with no whisper-rs dependency**.
//! If you depend on a forked or pinned `whisper-rs` (common for Metal/CUDA tuning,
//! GPU patches, or model compatibility), use:
//!
//! ```toml
//! whisper-guard = { version = "0.2", default-features = false }
//! ```
//!
//! …and the cleaning pipeline works regardless of which whisper-rs is in your tree.
//! The optional `whisper` feature only adds [`params`] presets that wrap
//! `whisper_rs::FullParams`.
//!
//! ## What it catches
//!
//! **Pre-transcription audio prep** ([`audio`]):
//! - Silence stripping with adaptive noise floor
//! - Auto-normalization for quiet microphones
//! - Windowed-sinc resampling (32-tap Hann window, anti-aliased) for `44.1 kHz → 16 kHz`
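//!
//! As a rough illustration of the auto-normalization idea, here is a minimal,
//! std-only sketch. It is not the crate's `normalize_audio`: the function name,
//! the `target_peak` parameter, and the simple peak-based gain rule are all
//! assumptions made for this example.
//!
//! ```rust
//! /// Hypothetical helper (not part of whisper-guard): scales `samples` in place
//! /// so that the loudest sample reaches `target_peak`.
//! fn normalize_peak_sketch(samples: &mut [f32], target_peak: f32) {
//!     // Largest absolute sample value in the buffer.
//!     let peak = samples.iter().fold(0.0_f32, |m, s| m.max(s.abs()));
//!     // Skip all-zero input (avoids dividing by zero) and input that is
//!     // already at or above the target level.
//!     if peak <= f32::EPSILON || peak >= target_peak {
//!         return;
//!     }
//!     let gain = target_peak / peak;
//!     for s in samples.iter_mut() {
//!         *s *= gain;
//!     }
//! }
//! ```
//!
//! Calling `normalize_peak_sketch(&mut samples, 0.5)` on a recording whose peak is
//! `0.1` applies a gain of about 5. A production version would likely also guard
//! against amplifying background noise, which this sketch does not attempt.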
//!
//! **Post-transcription segment cleaning** ([`segments`]):
//! - Consecutive repetition (3+ similar segments collapsed)
//! - Interleaved A/B/A/B hallucination patterns
//! - Bracketed noise marker collapse (`[Śmiech]`, `[music]`, `[risas]`, any language)
//! - Foreign-script hallucination detection (e.g., CJK in a Latin transcript)
//! - Trailing noise trimming (`[music]`, `[BLANK_AUDIO]`, filler at the end)
//! - Voice command stripping (`stop recording`, `end recording` at the tail)
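//!
//! To make two of these guards concrete, here is a simplified, std-only sketch of
//! repetition collapsing and trailing-command stripping. It is not the crate's
//! implementation: both function names are hypothetical, matching is exact after
//! trimming and lowercasing (the real guard matches "similar" segments), and no
//! annotation line is emitted.
//!
//! ```rust
//! /// Hypothetical sketch: collapses any run of `min_run` or more consecutive
//! /// segments that are equal after trimming and lowercasing, keeping only the
//! /// first segment of the run.
//! fn collapse_repeats_sketch(segments: &[String], min_run: usize) -> Vec<String> {
//!     let key = |s: &String| s.trim().to_lowercase();
//!     let mut out = Vec::new();
//!     let mut i = 0;
//!     while i < segments.len() {
//!         // Advance `j` to the end of the run that starts at `i`.
//!         let mut j = i + 1;
//!         while j < segments.len() && key(&segments[j]) == key(&segments[i]) {
//!             j += 1;
//!         }
//!         if j - i >= min_run {
//!             out.push(segments[i].clone()); // long run: keep first occurrence only
//!         } else {
//!             out.extend_from_slice(&segments[i..j]); // short run: keep unchanged
//!         }
//!         i = j;
//!     }
//!     out
//! }
//!
//! /// Hypothetical sketch: removes a voice command from the tail of `text`.
//! /// Assumes every entry in `commands` is lowercase ASCII.
//! fn strip_trailing_command_sketch(text: &str, commands: &[&str]) -> String {
//!     // Ignore trailing whitespace and sentence punctuation when matching.
//!     let trimmed = text.trim_end().trim_end_matches(|c: char| c == '.' || c == '!');
//!     // ASCII lowercasing keeps byte offsets identical to `trimmed`.
//!     let lower = trimmed.to_ascii_lowercase();
//!     for cmd in commands {
//!         if lower.ends_with(cmd) {
//!             return trimmed[..trimmed.len() - cmd.len()].trim_end().to_string();
//!         }
//!     }
//!     text.to_string()
//! }
//! ```
//!
//! With `min_run = 3`, four consecutive `"Thank you."` segments followed by a real
//! sentence reduce to one `"Thank you."` plus the sentence. The command list passed
//! to the second function would mirror the `stop recording` / `end recording`
//! phrases listed above.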
//!
//! **Whisper parameter presets** ([`params`], requires `whisper` feature):
//! - Batch transcription params matching `whisper-cli` defaults
//! - Low-latency streaming params

pub mod audio;
pub mod segments;

#[cfg(feature = "whisper")]
pub mod params;

// Re-export the most common entry points
pub use audio::{normalize_audio, resample, strip_silence};
pub use segments::{
    clean_segments, clean_segments_with_options, clean_transcript, strip_trailing_commands,
    CleanOptions, CleanStats,
};