Heuristic secrets redaction.
§Threat model
Hooks (post-tool-use, stop, pre-compact) capture raw conversation / tool-call text into the distill queue. That text occasionally contains:
- API keys (sk-…, pk-…, xai-…, xoxb-…)
- Bearer / JWT tokens (Bearer ey…, standalone eyJ…)
- GitHub tokens (ghp_…, gho_…, ghu_…, ghr_…, ghs_…)
- Long opaque base64 / hex blobs that might be tokens
We deliberately use simple regex heuristics, NOT a perfect parser. The goal is “strip the obvious 90%, surface a flag for the rest”. The flag is consumed by the Stop hook to decide whether the signal is safe to write or should be dropped entirely.
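To make the "strip the obvious, flag the rest" idea concrete, here is a minimal sketch for the GitHub token prefixes listed above. It is not this module's implementation: the function name `redact_github_tokens`, the placeholder string, and the word-splitting strategy are all assumptions for illustration, and plain prefix checks stand in for the regexes so the example builds with the standard library alone.

```rust
/// GitHub token prefixes from the threat model above.
const GITHUB_PREFIXES: [&str; 5] = ["ghp_", "gho_", "ghu_", "ghr_", "ghs_"];

/// Hypothetical helper: replaces every space-delimited word that starts with
/// a known prefix (and has at least one character after it) with a placeholder.
/// Returns a new String plus the number of replacements; the input is untouched.
fn redact_github_tokens(text: &str) -> (String, usize) {
    let mut count = 0;
    let mut out = String::with_capacity(text.len());
    for (i, word) in text.split(' ').enumerate() {
        if i > 0 {
            out.push(' ');
        }
        let looks_like_token = GITHUB_PREFIXES
            .iter()
            .any(|p| word.starts_with(*p) && word.len() > p.len());
        if looks_like_token {
            // Placeholder format is an assumption, not the module's actual output.
            out.push_str("[REDACTED:github_token]");
            count += 1;
        } else {
            out.push_str(word);
        }
    }
    (out, count)
}
```

A heuristic this coarse will miss tokens glued to punctuation or embedded in JSON, which is why a hit count is surfaced rather than a guarantee.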
§What this module does NOT do
- It does NOT promise zero-secret output. Format-specific tokens (custom envs, internal API formats) will slip through.
- It does NOT ship the redacted payload anywhere — the caller decides whether to record / drop / send to sampling.
- It does NOT redact in place; we always return a new String so callers can keep the original for audit / forensic logs.
§Output
redact returns a RedactReport with:
- redacted: the cleaned text
- hits: list of (pattern_name, count) for each pattern that matched. Stop hook uses this to decide:
  - 0 hits → write payload as-is
  - 1+ hits → write the redacted version + record redacted_kinds in metadata; or drop entirely if a strict policy is enabled (R4 may add this).
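The decision described above can be sketched as follows. The field types on `RedactReport`, the `Decision` enum, the `decide` function, and the `strict` flag are assumptions for illustration; only the field names `redacted` and `hits` and the `redacted_kinds` metadata key come from the text, and the strict policy is described as something R4 may add.

```rust
/// Assumed shape of the report; the real field types may differ.
#[derive(Debug, PartialEq)]
struct RedactReport {
    redacted: String,
    hits: Vec<(String, usize)>,
}

/// Hypothetical outcome type for the Stop hook.
#[derive(Debug, PartialEq)]
enum Decision {
    /// No pattern matched: write the payload unchanged.
    WriteAsIs(String),
    /// At least one pattern matched: write the cleaned text and record which kinds.
    WriteRedacted { text: String, redacted_kinds: Vec<String> },
    /// Strict policy (hypothetical): any hit drops the signal entirely.
    Drop,
}

fn decide(original: &str, report: &RedactReport, strict: bool) -> Decision {
    if report.hits.is_empty() {
        Decision::WriteAsIs(original.to_string())
    } else if strict {
        Decision::Drop
    } else {
        Decision::WriteRedacted {
            text: report.redacted.clone(),
            redacted_kinds: report.hits.iter().map(|(name, _)| name.clone()).collect(),
        }
    }
}
```

Keeping the policy in the caller matches the module's stated boundary: redaction only reports what it found, and the hook decides what to do with it.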
Structs§
Functions§
- redact
  Redact known secret patterns from text. Returns the cleaned text alongside per-pattern hit counts.