llm-pii-redact
Regex-based PII redaction for LLM prompts and tool outputs, with reversible placeholders.
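use llm_pii_redact::Redactor;
let r = Redactor::default();
let out = r.redact("Email me at ops@example.invalid or call 555-123-4567");
// out.text -> "Email me at <EMAIL_0> or call <PHONE_US_0>"
// out.mapping -> { "<EMAIL_0>": "ops@example.invalid",
//                  "<PHONE_US_0>": "555-123-4567" }
let answer_from_llm = format!("Confirmed: {}", "<EMAIL_0>");
// Argument shape of reveal is assumed here: the reply text plus the mapping from redact.
let restored = r.reveal(&answer_from_llm, &out.mapping);
// restored -> "Confirmed: ops@example.invalid"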
Stable placeholders let the LLM keep its references coherent: it can mention "<EMAIL_0>" five times in a reply, and every mention is restored to the real address. Repeated values share a single placeholder, so the redacted text is deterministic.
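As a rough illustration of the "one placeholder per distinct value" idea, here is a minimal std-only sketch. It is not the crate's implementation: `assign_placeholders` is a hypothetical helper, and it assumes detection has already produced a list of `(label, value)` pairs in order of appearance.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical helper (not part of llm-pii-redact): replaces each detected
/// value with a numbered placeholder and reuses that placeholder whenever
/// the same (label, value) pair shows up again.
fn assign_placeholders(
    text: &str,
    detected: &[(&str, &str)],
) -> (String, HashMap<String, String>) {
    let mut seen: HashSet<(&str, &str)> = HashSet::new(); // pairs already handled
    let mut counters: HashMap<&str, usize> = HashMap::new(); // label -> next index
    let mut mapping: HashMap<String, String> = HashMap::new(); // placeholder -> value
    let mut out = text.to_string();

    for &(label, value) in detected {
        // A repeated value keeps the placeholder it was given the first time.
        if !seen.insert((label, value)) {
            continue;
        }
        let next = counters.entry(label).or_insert(0);
        let placeholder = format!("<{}_{}>", label, *next);
        *next += 1;

        // Plain substring replacement keeps the sketch short; a real
        // redactor would more likely rewrite by match span.
        out = out.replace(value, &placeholder);
        mapping.insert(placeholder, value.to_string());
    }
    (out, mapping)
}
```

Because the index depends only on the order in which distinct values first appear, the same input always yields the same redacted text in this sketch.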
Catches by default:
| Type | Example shape |
|---|---|
| EMAIL | ops@example.invalid |
| PHONE_US | 555-123-4567, +1 (555) 123-4567 |
| SSN | 000-00-0000, 9 contiguous digits |
| CREDIT_CARD | 13-19 digit runs, Luhn-checked |
| IP_V4 | 192.0.2.10 |
| IP_V6 | 2001:db8::1, ::1 |
| IBAN | DE89370400440532013000 |
| URL | http://, https:// |
The credit-card detector runs the Luhn checksum on every candidate. A 16-digit run with a flipped last digit fails the checksum, so it is not treated as a card number and is left unredacted.
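For readers unfamiliar with the Luhn check, a minimal std-only sketch follows. `luhn_valid` is a hypothetical name used for illustration, not the crate's API, and it assumes the candidate has already been reduced to ASCII digits.

```rust
/// Returns true when `digits` (ASCII digits only) passes the Luhn checksum.
/// Hypothetical helper for illustration; not part of llm-pii-redact.
fn luhn_valid(digits: &str) -> bool {
    if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return false;
    }
    let sum: u32 = digits
        .bytes()
        .rev() // walk from the rightmost digit
        .enumerate()
        .map(|(i, b)| {
            let d = u32::from(b - b'0');
            if i % 2 == 1 {
                // Every second digit from the right is doubled;
                // two-digit results are folded back by subtracting 9.
                let doubled = d * 2;
                if doubled > 9 { doubled - 9 } else { doubled }
            } else {
                d
            }
        })
        .sum();
    sum % 10 == 0
}
```

With this sketch, the widely used test number `4111111111111111` passes, while `4111111111111112` (last digit flipped) does not.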
Why
tool-secret-scrubber covers API keys, JWTs, bearer tokens, AWS keys. It is the right tool for "do not log this." llm-pii-redact is the right tool for "send this through the LLM, then put the real values back." That second case wants:
- Reversible mapping, not a one-way redact.
- Per-value stable placeholders so the model can talk about the same person twice.
- PII detectors (emails, phones, SSN, cards) rather than credential detectors.
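To make the "reversible" requirement concrete, here is a minimal std-only sketch of the restore step. The free function `reveal` below is hypothetical and only mirrors the idea. It assumes placeholders have the `<LABEL_N>` shape shown earlier and that the mapping goes from placeholder to original value.

```rust
use std::collections::HashMap;

/// Hypothetical restore step (not the crate's implementation): scans `text`
/// once and swaps every known `<...>` placeholder back to its original value.
/// Unknown or unterminated `<` sequences are copied through unchanged.
fn reveal(text: &str, mapping: &HashMap<String, String>) -> String {
    let mut out = String::with_capacity(text.len());
    let mut rest = text;

    while let Some(start) = rest.find('<') {
        out.push_str(&rest[..start]);
        let tail = &rest[start..];
        match tail.find('>') {
            Some(end) => {
                let candidate = &tail[..=end]; // e.g. "<EMAIL_0>"
                match mapping.get(candidate) {
                    Some(original) => {
                        out.push_str(original);
                        rest = &tail[end + 1..];
                    }
                    None => {
                        // Not a known placeholder: keep the '<' and move on.
                        out.push('<');
                        rest = &tail[1..];
                    }
                }
            }
            None => {
                // No closing '>' anywhere: nothing left to restore.
                out.push_str(tail);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    out
}
```

A single pass means restored values are never rescanned, so an original value that happens to contain placeholder-like text is not expanded a second time.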
Install
[dependencies]
llm-pii-redact = "0.1"
Optional serde feature to derive Serialize/Deserialize for Redacted:
[dependencies]
llm-pii-redact = { version = "0.1", features = ["serde"] }
Use
Default detectors:
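use llm_pii_redact::Redactor;
let r = Redactor::default();
let out = r.redact("Email me at ops@example.invalid or call 555-123-4567");
assert!(out.text.contains("<EMAIL_0>"));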
One detector at a time:
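use llm_pii_redact::Redactor;
let r = Redactor::email(); // or ::phone(), ::ssn(), ::cc(), ::ip()
let out = r.redact("Email me at ops@example.invalid or call 555-123-4567");
assert!(out.text.contains("<EMAIL_0>"));
assert!(out.text.contains("555-123-4567")); // phone untouched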
Custom pattern:
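use llm_pii_redact::Redactor;
// The label and regex are illustrative; with_pattern is assumed to take (label, regex).
let r = Redactor::default()
    .with_pattern("EMPLOYEE_ID", r"EMP-\d{6}")
    .unwrap();
let out = r.redact("Ticket opened by EMP-123456");
assert_eq!(out.text, "Ticket opened by <EMPLOYEE_ID_0>");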
Round trip with an LLM:
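use llm_pii_redact::Redactor;
let r = Redactor::default();
let user_message = "Confirm subscription for ops@example.invalid";
let red = r.redact(user_message);
// send red.text to the LLM
let assistant_reply = format!("Confirmed: {}", "<EMAIL_0>"); // pretend the LLM said this
// Argument shape of reveal is assumed here: the reply text plus the mapping from redact.
let real_reply = r.reveal(&assistant_reply, &red.mapping);
assert_eq!(real_reply, "Confirmed: ops@example.invalid");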
What it does NOT do
- No name / address / DOB classifier. Regex only.
- No network calls, no async, no I/O.
- No secret / credential detection. Use tool-secret-scrubber for that.
Companion crates
- tool-secret-scrubber: API keys, JWTs, bearer tokens, AWS keys.
- agentguard-rs: network egress allowlist for agent tools.
- agentvet-rs: tool-arg validator for LLM tool calls.
License
MIT. See LICENSE.