llm-pii-redact 0.1.0

Regex-based PII redaction for LLM prompts and tool outputs, with reversible placeholders.
Homepage: MukundaKatta/llm-pii-redact-rs

use llm_pii_redact::Redactor;

let r = Redactor::default();
let out = r.redact("Email me at ops@example.invalid or call 555-123-4567");
// out.text    -> "Email me at <EMAIL_0> or call <PHONE_US_0>"
// out.mapping -> { "<EMAIL_0>": "ops@example.invalid",
//                  "<PHONE_US_0>": "555-123-4567" }

let answer_from_llm = format!("Confirmed: <EMAIL_0>");
let restored = r.reveal(&answer_from_llm, &out.mapping);
// restored -> "Confirmed: ops@example.invalid"

Placeholders are stable, so the LLM keeps coherent references: it can mention "<EMAIL_0>" five times in its reply, and every occurrence is restored to the real address. Repeated values share a single placeholder, so the redacted text is deterministic.
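
To make the idea of per-value stable placeholders concrete, here is a minimal std-only sketch. It is not the crate's implementation: `PlaceholderTable` and its methods are hypothetical names invented for illustration, and the real `Redactor` may be organized quite differently.

```rust
use std::collections::HashMap;

/// Hypothetical helper, NOT part of llm-pii-redact: hands out one
/// placeholder per distinct value and remembers how to undo it.
#[derive(Default)]
struct PlaceholderTable {
    /// value -> placeholder, so a repeated value reuses its placeholder
    by_value: HashMap<String, String>,
    /// placeholder -> value, the reverse mapping used to restore text
    mapping: HashMap<String, String>,
    /// next free index per label ("EMAIL" -> 1 after the first email)
    next_index: HashMap<String, usize>,
}

impl PlaceholderTable {
    /// Returns the placeholder for `value`, creating `<LABEL_N>` on first sight.
    fn placeholder_for(&mut self, label: &str, value: &str) -> String {
        if let Some(existing) = self.by_value.get(value) {
            return existing.clone();
        }
        let index = self.next_index.entry(label.to_string()).or_insert(0);
        let placeholder = format!("<{}_{}>", label, *index);
        *index += 1;
        self.by_value.insert(value.to_string(), placeholder.clone());
        self.mapping.insert(placeholder.clone(), value.to_string());
        placeholder
    }

    /// Replaces every known placeholder in `text` with its original value.
    fn reveal(&self, text: &str) -> String {
        let mut restored = text.to_string();
        for (placeholder, value) in &self.mapping {
            restored = restored.replace(placeholder.as_str(), value);
        }
        restored
    }
}
```

Because each placeholder ends in `>`, `<EMAIL_1>` is never a substring of `<EMAIL_10>`, so plain string replacement is enough for the restore step in this sketch.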

Catches by default:

Type         Example shape
EMAIL        ops@example.invalid
PHONE_US     555-123-4567, +1 (555) 123-4567
SSN          000-00-0000, 9 contiguous digits
CREDIT_CARD  13-19 digit runs, Luhn-checked
IP_V4        192.0.2.10
IP_V6        2001:db8::1, ::1
IBAN         DE89370400440532013000
URL          http://, https://

The credit-card detector runs the Luhn checksum on every candidate. A 16-digit run whose last digit has been changed fails the checksum and is not treated as a card number.
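
For readers unfamiliar with the Luhn checksum, a minimal std-only sketch follows. The function name `luhn_valid` is hypothetical, and skipping spaces and hyphens is an assumption made for illustration; the crate's own detector may handle separators differently.

```rust
/// Hypothetical helper, NOT the crate's code. Returns true when `candidate`
/// holds 13 to 19 digits (spaces and hyphens are skipped, which is an
/// assumption) and those digits pass the Luhn checksum.
fn luhn_valid(candidate: &str) -> bool {
    let mut digits = Vec::new();
    for ch in candidate.chars() {
        match ch {
            ' ' | '-' => continue,
            _ => match ch.to_digit(10) {
                Some(d) => digits.push(d),
                None => return false, // any other character disqualifies
            },
        }
    }
    if !(13..=19).contains(&digits.len()) {
        return false;
    }
    // Walk from the rightmost digit; double every second digit and
    // subtract 9 when the doubled value exceeds 9.
    let sum: u32 = digits
        .iter()
        .rev()
        .enumerate()
        .map(|(i, &d)| {
            if i % 2 == 1 {
                let doubled = d * 2;
                if doubled > 9 { doubled - 9 } else { doubled }
            } else {
                d
            }
        })
        .sum();
    sum % 10 == 0
}
```

With the common test number 4111111111111111 the weighted sum is 30, so the check passes; changing the last digit to 2 gives 31 and the check fails.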

Why

tool-secret-scrubber covers API keys, JWTs, bearer tokens, and AWS keys. It is the right tool for "do not log this." llm-pii-redact is the right tool for "send this through the LLM, then put the real values back." That second case needs:

  • Reversible mapping, not a one-way redact.
  • Per-value stable placeholders so the model can talk about the same person twice.
  • PII detectors (emails, phones, SSN, cards) rather than credential detectors.

Install

[dependencies]
llm-pii-redact = "0.1"

Enable the optional serde feature to derive Serialize/Deserialize for Redacted:

[dependencies]
llm-pii-redact = { version = "0.1", features = ["serde"] }

Use

Default detectors:

use llm_pii_redact::Redactor;

let r = Redactor::default();
let out = r.redact("ping ops@example.invalid");
assert!(out.text.contains("<EMAIL_0>"));

One detector at a time:

use llm_pii_redact::Redactor;

let r = Redactor::email(); // or ::phone(), ::ssn(), ::cc(), ::ip()
let out = r.redact("ops@example.invalid call 555-123-4567");
assert!(out.text.contains("<EMAIL_0>"));
assert!(out.text.contains("555-123-4567")); // phone untouched

Custom pattern:

use llm_pii_redact::Redactor;

let r = Redactor::default()
    .with_pattern("AWS_KEY", r"AKIA[0-9A-Z]{16}")
    .unwrap();
let out = r.redact("key=AKIAABCDEFGHIJKLMNOP ok");
assert_eq!(out.mapping["<AWS_KEY_0>"], "AKIAABCDEFGHIJKLMNOP");

Round trip with an LLM:

use llm_pii_redact::Redactor;

let r = Redactor::default();
let user_message = "Confirm subscription for ops@example.invalid";

let red = r.redact(user_message);
// send red.text to the LLM
let assistant_reply = String::from("Confirmed: <EMAIL_0>"); // pretend the LLM said this
let real_reply = r.reveal(&assistant_reply, &red.mapping);
assert_eq!(real_reply, "Confirmed: ops@example.invalid");
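
In a real round trip the model may emit a placeholder that has no entry in the mapping (for example `<EMAIL_1>` when only `<EMAIL_0>` was issued). The std-only sketch below shows one way a caller could scan restored text for leftover placeholder-shaped tokens. Both function names are hypothetical and not part of llm-pii-redact, and the `<LABEL_N>` shape is inferred from the examples above.

```rust
/// Hypothetical helper, NOT part of llm-pii-redact: collects tokens shaped
/// like `<LABEL_N>` (uppercase letters, digits and underscores, ending in
/// `_` plus at least one digit) that are still present in `text`.
fn leftover_placeholders(text: &str) -> Vec<String> {
    let mut found = Vec::new();
    let mut rest = text;
    while let Some(start) = rest.find('<') {
        let after = &rest[start + 1..];
        // Index of the first character that cannot be part of a label.
        let stop = after.find(|c: char| {
            !(c.is_ascii_uppercase() || c.is_ascii_digit() || c == '_')
        });
        match stop {
            Some(end)
                if after[end..].starts_with('>')
                    && is_label_with_index(&after[..end]) =>
            {
                found.push(format!("<{}>", &after[..end]));
                rest = &after[end + 1..];
            }
            // Not a placeholder: keep scanning after this '<'.
            _ => rest = after,
        }
    }
    found
}

/// True for bodies such as "EMAIL_0" or "PHONE_US_12".
fn is_label_with_index(body: &str) -> bool {
    match body.rsplit_once('_') {
        Some((label, index)) => {
            !label.is_empty()
                && !index.is_empty()
                && index.chars().all(|c| c.is_ascii_digit())
        }
        None => false,
    }
}
```

Calling such a check on the restored reply and treating a non-empty result as a warning is one possible policy; whether to reject, retry, or pass the text through is left to the application.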

What it does NOT do

  • No name / address / DOB classifier. Regex only.
  • No network calls, no async, no I/O.
  • No secret / credential detection. Use tool-secret-scrubber for that.

Companion crates

  • tool-secret-scrubber: secret / credential detection (API keys, JWTs, bearer tokens, AWS keys).

License

MIT. See LICENSE.