Detect prompt-echo hallucinations from cloud STT backends.
Whisper-family models (OpenAI whisper-1, gpt-4o-*-transcribe, and most
Whisper-derived APIs) condition decoding on the optional prompt parameter.
When the audio carries no speech the model has nothing to anchor decoding
to and falls back to its strongest prior — the prompt itself — emitting it
verbatim (or in long contiguous chunks) as the “transcription”.
Without filtering, whisrs types those echoes at the cursor; for a prompt several hundred characters long, that can take tens of seconds at the configured key delay. This module provides a conservative substring/word-run heuristic that flags the obvious cases without raising false positives on real speech that happens to share vocabulary with the prompt.
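To make the idea of a substring/word-run heuristic concrete, here is a minimal sketch of one way such a check could work. It is not the module's actual implementation: the function name `looks_like_prompt_echo`, the helpers, and both thresholds are hypothetical and chosen only for illustration. The sketch normalizes both strings into lowercase words, finds the longest contiguous run of words they share, and flags the response when that run is long, or when the whole response is a contiguous slice of the prompt.

```rust
/// Hypothetical threshold: a shared contiguous run of this many words
/// is treated as an echo even if the response contains other text.
const MIN_WORD_RUN: usize = 6;

/// Hypothetical threshold: a response that is entirely a contiguous slice
/// of the prompt must be at least this many words long to be flagged.
const MIN_FULL_MATCH_WORDS: usize = 4;

/// Lowercase and split into alphanumeric words so that punctuation and
/// case differences do not hide an echo.
fn normalize(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|w| !w.is_empty())
        .map(|w| w.to_lowercase())
        .collect()
}

/// Length of the longest contiguous run of words shared by `a` and `b`
/// (longest-common-substring dynamic programming, over words).
fn longest_common_word_run(a: &[String], b: &[String]) -> usize {
    let mut best = 0;
    let mut prev = vec![0usize; b.len() + 1];
    for wa in a {
        let mut cur = vec![0usize; b.len() + 1];
        for (j, wb) in b.iter().enumerate() {
            if wa == wb {
                // Extend the run that ended at the previous word of both inputs.
                cur[j + 1] = prev[j] + 1;
                best = best.max(cur[j + 1]);
            }
        }
        prev = cur;
    }
    best
}

/// Illustrative stand-in for an echo check (assumed name and signature).
pub fn looks_like_prompt_echo(prompt: &str, response: &str) -> bool {
    let p = normalize(prompt);
    let r = normalize(response);
    if p.is_empty() || r.is_empty() {
        return false;
    }
    let run = longest_common_word_run(&p, &r);
    // Case 1: the whole response is a contiguous slice of the prompt.
    if run == r.len() && r.len() >= MIN_FULL_MATCH_WORDS {
        return true;
    }
    // Case 2: the response contains a long contiguous chunk of the prompt.
    run >= MIN_WORD_RUN
}
```

Under these assumptions, a response that merely mentions a few words from the prompt in a different order shares only short runs and is left alone, while a verbatim or chunked echo crosses one of the two thresholds. Requiring contiguous runs rather than counting shared vocabulary is what keeps the check conservative.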
Functions
- is_prompt_echo - Heuristically classify response as an echo of prompt.