pub fn looks_corrupted(text: &str) -> Option<&'static str>Expand description
Detect output that almost-certainly came from a corrupted provider stream (binary bytes decoded as UTF-8, mojibake from wrong encoding, KV-cache poisoning after timeout/retry, etc.) and should NOT be committed to conversation history.
Returns Some(reason) when the text is corrupted; None when it looks
like real model output. Conservative by design: only fires on
unambiguously non-textual signals. False positives here would silently
drop legitimate responses, which is far worse than letting one garbage
turn through.
Trigger context (2026-05-02 datalog evidence): deepseek-v4-flash at
~28K ctx after a successful file write hung 155s on the next turn,
then the framework’s stream-timeout retry returned P<ďĎĎĎĎ (UTF-8
bytes 0x50 0x3C 0xC4 0x8F 0xC4 0x8E ×4 — Latin Extended-A mojibake of
what was almost certainly raw binary in the provider’s response
buffer). Once that string lands in conversation history the next turn
sees its own garbage as prior context and the session is unrecoverable.