Module redact

Heuristic secrets redaction.

§Threat model

Hooks (post-tool-use, stop, pre-compact) capture raw conversation / tool-call text into the distill queue. That text occasionally contains:

API keys (sk-…, pk-…, xai-…, xoxb-…)
Bearer / JWT tokens (Bearer ey…, eyJ… standalone)
GitHub tokens (ghp_…, gho_…, ghu_…, ghr_…, ghs_…)
Long opaque base64 / hex blobs that might be tokens

We deliberately use simple regex heuristics, NOT a perfect parser. The goal is “strip the obvious 90%, surface a flag for the rest”. The flag is consumed by the Stop hook to decide whether the signal is safe to write or should be dropped entirely.
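To make the prefix-style heuristics concrete, here is a minimal sketch. It is not this module's implementation: the module uses regexes, while the sketch uses plain prefix scanning so that it needs only the standard library. The function name `redact_prefixed`, the kind labels, the `MIN_BODY_LEN` threshold, and the `[REDACTED:…]` marker format are all assumptions made for illustration.

```rust
/// Hypothetical pattern table: (kind label, literal prefix).
/// The labels are illustrative, not the module's real pattern names.
const PREFIXES: &[(&str, &str)] = &[
    ("api_key", "sk-"),
    ("api_key", "pk-"),
    ("api_key", "xai-"),
    ("api_key", "xoxb-"),
    ("github_token", "ghp_"),
    ("github_token", "gho_"),
    ("github_token", "ghu_"),
    ("github_token", "ghr_"),
    ("github_token", "ghs_"),
    ("jwt", "eyJ"),
];

/// Assumed minimum number of token characters required after the prefix.
const MIN_BODY_LEN: usize = 16;

fn is_token_char(c: char) -> bool {
    c.is_ascii_alphanumeric() || matches!(c, '-' | '_' | '.')
}

/// If `rest` starts with a known prefix followed by a long enough body,
/// return the kind label and the byte length of the whole token.
fn match_at(rest: &str) -> Option<(&'static str, usize)> {
    for &(kind, prefix) in PREFIXES {
        if let Some(body) = rest.strip_prefix(prefix) {
            // Token characters are ASCII, so the char count equals the byte count.
            let body_len = body.chars().take_while(|c| is_token_char(*c)).count();
            if body_len >= MIN_BODY_LEN {
                return Some((kind, prefix.len() + body_len));
            }
        }
    }
    None
}

/// Returns the cleaned text plus (kind, count) pairs for kinds that matched.
pub fn redact_prefixed(text: &str) -> (String, Vec<(&'static str, usize)>) {
    let mut out = String::with_capacity(text.len());
    let mut hits: Vec<(&'static str, usize)> = Vec::new();
    let mut i = 0;
    // Only try to match at the start of a token, so "task-…" never hits "sk-".
    let mut prev_is_token = false;
    while i < text.len() {
        let rest = &text[i..];
        if !prev_is_token {
            if let Some((kind, len)) = match_at(rest) {
                out.push_str("[REDACTED:");
                out.push_str(kind);
                out.push(']');
                match hits.iter().position(|(k, _)| *k == kind) {
                    Some(idx) => hits[idx].1 += 1,
                    None => hits.push((kind, 1)),
                }
                i += len;
                continue;
            }
        }
        let c = rest.chars().next().expect("i < text.len() implies a char");
        out.push(c);
        prev_is_token = is_token_char(c);
        i += c.len_utf8();
    }
    (out, hits)
}
```

The length threshold is what keeps short look-alikes such as `sk-short` untouched. It is also one reason a heuristic like this misses unusually short or custom-format secrets.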

§What this module does NOT do

It does NOT promise zero-secret output. Format-specific tokens (custom envs, internal API formats) will slip through.
It does NOT ship the redacted payload anywhere — the caller decides whether to record / drop / send to sampling.
It does NOT redact in place; we always return a new String so callers can keep the original for audit / forensic logs.
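The non-in-place contract can be sketched as a function that borrows its input and builds a fresh `String`. The example below is an assumption-laden illustration, not the module's code: `redact_hex_blobs`, the `MIN_HEX_LEN` threshold, the `hex_blob` label, and the struct layout (a hypothetical mirror of `RedactReport`) are all invented here, and only whitespace-delimited hex runs are handled.

```rust
/// Hypothetical mirror of the report shape described on this page.
#[derive(Debug)]
pub struct RedactReport {
    pub redacted: String,
    pub hits: Vec<(String, usize)>,
}

/// Assumed threshold: hex words at least this long are treated as opaque blobs.
const MIN_HEX_LEN: usize = 32;

/// Borrows `text` immutably and returns a new String inside the report,
/// so the caller's original stays available for audit logs.
pub fn redact_hex_blobs(text: &str) -> RedactReport {
    let mut redacted = String::with_capacity(text.len());
    let mut count = 0usize;
    // split_inclusive keeps each whitespace delimiter attached to its piece,
    // so the original spacing survives in the output.
    for piece in text.split_inclusive(char::is_whitespace) {
        let word = piece.trim_end();
        let tail = &piece[word.len()..];
        if word.len() >= MIN_HEX_LEN && word.chars().all(|c| c.is_ascii_hexdigit()) {
            redacted.push_str("[REDACTED:hex_blob]");
            redacted.push_str(tail);
            count += 1;
        } else {
            redacted.push_str(piece);
        }
    }
    let hits = if count > 0 {
        vec![("hex_blob".to_string(), count)]
    } else {
        Vec::new()
    };
    RedactReport { redacted, hits }
}
```

Because the input is taken as `&str`, the caller can still log or hash the original after the call. A blob glued to punctuation (for example a trailing comma) would slip past this particular sketch.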

§Output

redact returns a RedactReport with:

redacted: the cleaned text
hits: list of (pattern_name, count) for each pattern that matched. The Stop hook uses this to decide:
- 0 hits → write the payload as-is
- 1+ hits → write the redacted version and record redacted_kinds in metadata, or drop the payload entirely if a strict policy is enabled (R4 may add this).
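The decision rule above could be expressed roughly as follows. This is a sketch of the caller side only: the `Decision` enum, the `decide` function, and the `strict` flag are hypothetical names, and the strict policy itself is described on this page as something a later revision may add.

```rust
/// Hypothetical outcome of the Stop hook's decision.
#[derive(Debug, PartialEq)]
pub enum Decision {
    /// No pattern matched: write the original payload unchanged.
    WriteAsIs(String),
    /// At least one pattern matched: write the cleaned text and
    /// record which kinds were redacted in the metadata.
    WriteRedacted {
        payload: String,
        redacted_kinds: Vec<String>,
    },
    /// Strict policy (assumed, not yet part of the module): discard the payload.
    Drop,
}

/// `hits` is the (pattern_name, count) list from the report;
/// `strict` stands in for the policy switch that may be added later.
pub fn decide(
    original: &str,
    redacted: &str,
    hits: &[(String, usize)],
    strict: bool,
) -> Decision {
    // Treat an empty list, or a list of zero counts, as "no hits".
    if hits.iter().all(|(_, n)| *n == 0) {
        return Decision::WriteAsIs(original.to_string());
    }
    if strict {
        return Decision::Drop;
    }
    let redacted_kinds = hits
        .iter()
        .filter(|(_, n)| *n > 0)
        .map(|(name, _)| name.clone())
        .collect();
    Decision::WriteRedacted {
        payload: redacted.to_string(),
        redacted_kinds,
    }
}
```

Keeping the policy in the caller matches the split described earlier: the redaction step only reports what it found, and the hook chooses between recording and dropping.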

Structs§

RedactHit
RedactReport

Functions§

redact: Redact known secret patterns from text. Returns the cleaned text alongside per-pattern hit counts.
