aura-redact
Two-pass secret and PII scrubber. Regex patterns catch what they can, then a Shannon-entropy pass catches the rest.
[dependencies]
aura-redact = "0.1"
Why
Pattern-based secret scanners miss random tokens that don't match a known prefix. Entropy-only scanners tend to flag normal English. aura-redact runs both, in order: the regex pass catches the obvious stuff (ghp_…, sk-…, emails, IPs), and the entropy pass then catches the remaining tokens longer than 20 characters whose entropy exceeds 5.2 bits/char.
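For reference, here is a minimal sketch of the kind of entropy measurement described above. It assumes entropy is computed per token from that token's own character frequencies; aura-redact's internal calculation may differ. The names `shannon_entropy` and `looks_random` are hypothetical and not part of the crate's API.

```rust
use std::collections::HashMap;

/// Shannon entropy of `s` in bits per character, computed from the
/// character frequencies of `s` itself. Returns 0.0 for an empty string.
fn shannon_entropy(s: &str) -> f64 {
    let mut counts: HashMap<char, usize> = HashMap::new();
    let mut total = 0usize;
    for c in s.chars() {
        *counts.entry(c).or_insert(0) += 1;
        total += 1;
    }
    if total == 0 {
        return 0.0;
    }
    let n = total as f64;
    counts
        .values()
        .map(|&k| {
            let p = k as f64 / n;
            -p * p.log2()
        })
        .sum()
}

/// Hypothetical helper (not part of aura-redact): applies the limits quoted
/// in this README, more than 20 characters and more than 5.2 bits/char.
fn looks_random(token: &str) -> bool {
    token.chars().count() > 20 && shannon_entropy(token) > 5.2
}
```

Under this per-token calculation, a string with k distinct characters can reach at most log2(k) bits/char, so a token needs at least 37 distinct characters to exceed 5.2. Ordinary prose rarely gets there, while a 40-character random key with no repeated characters scores about 5.32.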
Use it as a last line of defense before sending text to a third party — LLM context, error trackers, telemetry, support bundles.
Example
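use aura_redact::Redactor;

let dirty = "
Auth: ghp_aBcDeFgHiJkLmNoPqRsTuVwXyZ12345
From: alice@example.com
Server: 10.0.42.17
Random: kJ8s2nF3lPq9wXvB7tYzM5cR1aH4dGuEi6oN0bVx
";
// Call shape assumed here: build a Redactor, then scrub the input string.
let clean = Redactor::new().scrub(dirty);
println!("{}", clean);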
Output:
Auth: [REDACTED_TOKEN]
From: [REDACTED_EMAIL]
Server: [REDACTED_IP]
Random: [REDACTED_HIGH_ENTROPY]
What it catches
| Category | Method |
|---|---|
| Emails | regex |
| IPv4 addresses | regex |
| sk-…, ghp_…, xoxb-…, AIza… tokens | regex |
| Random / base64 / cryptographic keys (>20 chars, >5.2 bits/char entropy) | Shannon entropy |
| Normal English | preserved (sits at ~4.0–4.8 bits/char) |
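To illustrate why the order of the two passes matters, here is a rough sketch of a per-token pipeline. It is not aura-redact's implementation: plain prefix checks stand in for the regex pass, tokens are split on single spaces, and every function name is hypothetical. The prefixes and thresholds are the ones listed in the table.

```rust
use std::collections::HashMap;

/// Bits per character, from the token's own character frequencies.
fn shannon_entropy(s: &str) -> f64 {
    let mut counts: HashMap<char, usize> = HashMap::new();
    for c in s.chars() {
        *counts.entry(c).or_insert(0) += 1;
    }
    let n = s.chars().count() as f64;
    if n == 0.0 {
        return 0.0;
    }
    counts
        .values()
        .map(|&k| {
            let p = k as f64 / n;
            -p * p.log2()
        })
        .sum()
}

/// Hypothetical two-pass check for a single token.
/// Pass 1: known prefixes (a stand-in for the regex pass).
/// Pass 2: entropy check, only reached if pass 1 did not match.
fn redact_token(token: &str) -> &str {
    const PREFIXES: [&str; 4] = ["sk-", "ghp_", "xoxb-", "AIza"];
    if PREFIXES.iter().any(|p| token.starts_with(p)) {
        return "[REDACTED_TOKEN]";
    }
    if token.chars().count() > 20 && shannon_entropy(token) > 5.2 {
        return "[REDACTED_HIGH_ENTROPY]";
    }
    token
}

/// Applies `redact_token` to each space-separated token of one line.
fn scrub_line(line: &str) -> String {
    line.split(' ')
        .map(redact_token)
        .collect::<Vec<_>>()
        .join(" ")
}
```

In this sketch, the `ghp_` token from the example is 35 characters long, so its per-token entropy cannot exceed log2(35) ≈ 5.13 bits/char. The entropy check alone would let it through, and the pattern pass is what catches it.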
Status
- ✅ Pattern pass (email, IP, common token prefixes)
- ✅ Entropy pass (Shannon; the threshold is configurable internally but not yet exposed)
- ⏳ Configurable patterns and threshold via builder API (PRs welcome)
Origin
Extracted from Aura — the semantic version control engine for AI-generated code. Aura uses aura-redact before forwarding any source-code-derived strings to external LLM APIs.
License
Apache-2.0