leakguard

Fast, zero-dependency redaction of secrets & PII from text and logs — in pure Rust.

leakguard finds and removes sensitive data — emails, credit cards, IP addresses, JWTs, SSNs, MAC addresses, AWS keys, and URLs with embedded credentials — from arbitrary strings and log lines. It's a library and a CLI.

use leakguard::Redactor;

let s = Redactor::new();
let clean = s.clean("Contact alice@example.com from 10.0.0.1");
assert_eq!(clean, "Contact [REDACTED:EMAIL] from [REDACTED:IPV4]");

Why leakguard?

The Rust ecosystem has crypto, parsers, and web frameworks, but no small, maintained, dependency-free library for the everyday job of keeping PII and secrets out of your logs. Python has scrubadub and JavaScript has redact-pii; leakguard fills that gap with:

Zero dependencies. No regex, no lazy_static, nothing. Just core + alloc. Tiny build, tiny binary, fast compile.
#![no_std] friendly. Works in embedded / WASM with default-features = false.
#![forbid(unsafe_code)]. 100% safe Rust.
Correct by construction. Match offsets always land on UTF-8 boundaries, card numbers are Luhn-validated, and IP octets are range-checked, which means fewer false positives.
Extensible. Plug in your own detectors with a closure.
Batteries included. A leakguard CLI you can pipe logs through.

Install

# Library
[dependencies]
leakguard = "0.4.0"

# CLI
cargo install leakguard

Library usage

Pick a masking strategy

use leakguard::{Redactor, Mask};

// [REDACTED:EMAIL]  (default)
Redactor::new();

// fixed string, from either a literal or a runtime String
Redactor::new().mask(Mask::fixed("***"));
Redactor::new().mask(Mask::fixed(String::from("***")));

// keep the last 4 chars: 4111 1111 1111 1111 -> ***************1111
Redactor::new().mask(Mask::Partial { keep_last: 4, ch: '*' });

// stable non-cryptographic fingerprint for correlation (not anonymization)
Redactor::new().mask(Mask::Hash);
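
To make the partial strategy concrete, here is a minimal standalone sketch of what "keep the last 4 chars" means for the example above. It is not leakguard's implementation: `mask_partial` is a hypothetical helper, and it assumes the mask counts characters (not bytes) and also covers separators such as spaces, which is what the `4111 1111 1111 1111` example suggests.

```rust
/// Replace every character except the last `keep_last` with `ch`.
/// Counts Unicode scalar values (chars), not bytes, so multi-byte
/// text is never split in the middle of a character.
/// Hypothetical helper for illustration; not part of the leakguard API.
fn mask_partial(value: &str, keep_last: usize, ch: char) -> String {
    let total = value.chars().count();
    // If the value is shorter than `keep_last`, nothing is masked.
    let masked = total.saturating_sub(keep_last);
    value
        .chars()
        .enumerate()
        .map(|(i, c)| if i < masked { ch } else { c })
        .collect()
}
```

Note that a value no longer than `keep_last` comes back unchanged in this sketch, so a partial mask on its own may not be appropriate for short secrets.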

Pick what to detect

use leakguard::{Redactor, Kind};

let s = Redactor::only(&[Kind::Email, Kind::CreditCard]);
let s = Redactor::new().without(&Kind::IpV4); // everything except IPv4

Inspect without mutating

use leakguard::Redactor;

let s = Redactor::new();
for m in s.find("email a@b.com ip 10.0.0.1") {
    println!("{} at {}..{}", m.kind, m.start, m.end);
}
assert!(s.is_dirty("token AKIAIOSFODNN7EXAMPLE"));
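
If you want to act on the reported offsets yourself, the sketch below shows one way to turn sorted, non-overlapping byte ranges into a redacted string. `Span` and `apply_spans` are hypothetical names used only for illustration (they are not leakguard types), and the sketch assumes offsets are byte offsets into the input, as `start` and `end` above appear to be.

```rust
/// Hypothetical stand-in for a detector match: a label plus a byte range.
struct Span {
    label: &'static str,
    start: usize,
    end: usize,
}

/// Build a redacted copy of `input`, replacing each span with `[REDACTED:<label>]`.
/// Spans must be sorted by `start` and must not overlap.
/// Returns `None` if a span is out of order, out of bounds, or off a char boundary.
fn apply_spans(input: &str, spans: &[Span]) -> Option<String> {
    let mut out = String::with_capacity(input.len());
    let mut cursor = 0;
    for span in spans {
        if span.start < cursor || span.end < span.start {
            return None;
        }
        // `str::get` returns None for out-of-bounds or non-boundary offsets.
        out.push_str(input.get(cursor..span.start)?);
        // Validate the matched range itself before dropping it.
        input.get(span.start..span.end)?;
        out.push_str("[REDACTED:");
        out.push_str(span.label);
        out.push(']');
        cursor = span.end;
    }
    out.push_str(input.get(cursor..)?);
    Some(out)
}
```

Using `str::get` instead of direct slicing means a bad offset yields `None` rather than a panic.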

Add a custom detector

use leakguard::{Redactor, Kind, FnDetector, Match};

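let tickets = FnDetector::new(Kind::Custom("TICKET"), |input, out| {
    const PREFIX: &str = "JIRA-";
    let bytes = input.as_bytes();
    let mut from = 0;
    while let Some(i) = input[from..].find(PREFIX) {
        let start = from + i;
        let digits_start = start + PREFIX.len();
        let mut end = digits_start;
        // Extend the match over the ASCII digits that follow the prefix.
        while end < bytes.len() && bytes[end].is_ascii_digit() { end += 1; }
        // Skip a bare "JIRA-" that has no ticket number after it.
        if end > digits_start {
            out.push(Match::new(Kind::Custom("TICKET"), start, end));
        }
        from = end;
    }
});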

let s = Redactor::new().with_detector(tickets);
assert_eq!(s.clean("see JIRA-1234"), "see [REDACTED:TICKET]");

CLI usage

# Pipe a live log through it
tail -f app.log | leakguard

# Redact a file to stdout, keeping last 4 chars
leakguard --mask partial --keep 4 access.log > clean.log

# Only redact emails and IPv4, masking with '#'
leakguard --only email,ipv4 --mask char --char '#' < input.txt

# Redact everything except phone numbers
leakguard --without phone app.log

# Print supported detector names
leakguard --list-kinds

# CI guard: print kinds/offsets to stderr and fail the build if a file contains secrets
leakguard --check --verbose secrets-scan.txt || { echo "found sensitive data!"; exit 1; }

Detectors

Kind	Example	Notes
`Email`	`alice@example.com`	requires a real-looking TLD
`CreditCard`	`4111 1111 1111 1111`	Luhn-validated, 13–19 digits
`IpV4`	`192.168.0.1`	each octet range-checked 0–255
`IpV6`	`2001:db8::1`	supports `::` compression
`Jwt`	`eyJ….eyJ….sig`	three base64url segments
`UsSsn`	`123-45-6789`	rejects invalid area numbers
`MacAddress`	`00:1A:2B:3C:4D:5E`	`:` or `-` separators
`AwsAccessKey`	`AKIAIOSFODNN7EXAMPLE`	AKIA/ASIA/… + 16 chars
`UrlCredentials`	`https://user:pass@host`	redacts the `user:pass` userinfo
`PhoneNumber`	`+1 (415) 555-0132`	conservative; needs grouping/`+`
`GitHubToken`	`ghp_…`, `github_pat_…`	PAT / OAuth / app / refresh
`SlackToken`	`xoxb-…`, `xoxp-…`	bot / user / app tokens
`StripeKey`	`sk_live_…`, `pk_test_…`	secret / restricted / publishable
`GoogleApiKey`	`AIza…` (39 chars)	fixed-length token
`OpenAiKey`	`sk-…`, `sk-proj-…`	hyphenated form (≠ Stripe `sk_`)
`PrivateKey`	`-----BEGIN … PRIVATE KEY-----`	whole PEM block, incl. body
`Iban`	`DE89370400440532013000`	mod-97 checksum-validated
`GenericSecret`	high-entropy tokens	opt-in `HighEntropy` detector
`Custom(&str)`	anything you want	via `FnDetector`
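
The two checksum notes in the table (Luhn for cards, mod-97 for IBANs) can be illustrated with a short standalone sketch. These functions are not leakguard's internals: `luhn_valid` and `iban_valid` are hypothetical names, skipping spaces and dashes in card numbers is an assumption, and the IBAN check uses only the general 15 to 34 character bounds rather than per-country lengths.

```rust
/// Luhn checksum over the digits of `s`, ignoring spaces and dashes.
/// Returns false for other characters or digit counts outside 13..=19.
fn luhn_valid(s: &str) -> bool {
    let mut digits = Vec::new();
    for c in s.chars() {
        match c {
            ' ' | '-' => continue,
            _ => match c.to_digit(10) {
                Some(d) => digits.push(d),
                None => return false,
            },
        }
    }
    if !(13..=19).contains(&digits.len()) {
        return false;
    }
    // Walk right to left, doubling every second digit.
    let sum: u32 = digits
        .iter()
        .rev()
        .enumerate()
        .map(|(i, &d)| {
            if i % 2 == 1 {
                let doubled = d * 2;
                if doubled > 9 { doubled - 9 } else { doubled }
            } else {
                d
            }
        })
        .sum();
    sum % 10 == 0
}

/// Mod-97 check: move the first four characters to the end, map letters
/// A..Z to 10..35, and require the resulting number mod 97 to equal 1.
fn iban_valid(iban: &str) -> bool {
    // General IBAN length bounds; per-country lengths are not checked here.
    if !iban.is_ascii() || iban.len() < 15 || iban.len() > 34 {
        return false;
    }
    let (head, tail) = iban.split_at(4);
    let mut rem: u32 = 0;
    for c in tail.chars().chain(head.chars()) {
        let v = match c.to_digit(36) {
            Some(v) if c.is_ascii_digit() || c.is_ascii_uppercase() => v,
            _ => return false,
        };
        // Letters contribute two decimal digits, digits contribute one.
        rem = if v >= 10 { (rem * 100 + v) % 97 } else { (rem * 10 + v) % 97 };
    }
    rem == 1
}
```

Reducing modulo 97 at every step keeps the running value small, so no big-integer arithmetic is needed.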

GenericSecret (the HighEntropy detector) is not in the defaults. It is the most false-positive-prone detector, so you enable it explicitly:

use leakguard::{Redactor, detectors::HighEntropy};

let s = Redactor::new().with_detector(HighEntropy::default());
// or tune it: HighEntropy::new(/* min_len */ 24, /* min_entropy bits */ 4.0)
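
To give a feel for the two tuning knobs, here is a minimal sketch of a length-plus-entropy test. It assumes the threshold is Shannon entropy measured in bits per byte, which the README does not state; `shannon_entropy` and `looks_like_secret` are hypothetical names, not leakguard functions.

```rust
/// Shannon entropy of `token` in bits per byte (0.0 for an empty token).
fn shannon_entropy(token: &str) -> f64 {
    let bytes = token.as_bytes();
    if bytes.is_empty() {
        return 0.0;
    }
    // Count how often each byte value occurs.
    let mut counts = [0usize; 256];
    for &b in bytes {
        counts[usize::from(b)] += 1;
    }
    let total = bytes.len() as f64;
    counts
        .iter()
        .filter(|&&n| n > 0)
        .map(|&n| {
            let p = n as f64 / total;
            -p * p.log2()
        })
        .sum()
}

/// Hypothetical threshold check mirroring `min_len` and `min_entropy`.
fn looks_like_secret(token: &str, min_len: usize, min_entropy: f64) -> bool {
    token.len() >= min_len && shannon_entropy(token) >= min_entropy
}
```

Under this measure a token needs at least 16 distinct byte values to reach 4.0 bits, so raising `min_entropy` trades recall for fewer false positives on ordinary words and identifiers.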

Security model and limitations

leakguard is a best-effort redaction tool intended to reduce accidental leakage of secrets and personally identifiable information in logs, text, and CI workflows. It is not a substitute for secret management, access control, code review, or incident response.

Important limitations:

Detectors are intentionally conservative in several places to reduce false positives, so some real secrets or PII formats may not be detected.
Some detectors can still produce false positives, especially phone numbers and opt-in high-entropy scanning.
Redaction should happen as early as possible, before sensitive data leaves your process or enters persistent logs.
Mask::Hash is a stable, non-cryptographic fingerprint for correlation only. It is not anonymization and does not protect low-entropy values from guessing or dictionary attacks.
Keep raw logs and unredacted inputs protected. Treat leakguard as a defense-in-depth layer, not as the only control protecting sensitive data.
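
The warning about fingerprints and low-entropy values can be demonstrated with a small sketch. The README does not say which hash Mask::Hash uses, so 64-bit FNV-1a is used here purely as an assumed example of a stable non-cryptographic hash; `fnv1a_64` and `recover_pin` are hypothetical names.

```rust
/// 64-bit FNV-1a: fast and stable across runs, but NOT cryptographic.
/// Used here only as an example of a non-cryptographic fingerprint.
fn fnv1a_64(bytes: &[u8]) -> u64 {
    const OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
    const PRIME: u64 = 0x0000_0100_0000_01b3;
    let mut hash = OFFSET_BASIS;
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(PRIME);
    }
    hash
}

/// Dictionary attack on a fingerprint: try every 4-digit PIN
/// ("0000" through "9999") and return the first whose hash matches.
fn recover_pin(fingerprint: u64) -> Option<String> {
    (0..10_000u32)
        .map(|n| format!("{n:04}"))
        .find(|pin| fnv1a_64(pin.as_bytes()) == fingerprint)
}
```

At most 10,000 hash evaluations are enough to find a matching PIN, which is why a fingerprint of a small value space supports correlation but offers no real protection.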

If you believe you found a vulnerability or a serious redaction bypass, please report it privately through GitHub's vulnerability reporting flow when available, or contact the maintainer through GitHub before opening a public issue.

Performance

leakguard uses hand-written, single-pass byte scanners — no regex backtracking. Detection is roughly linear in input size. Run the bundled example and benchmark harness:

cargo run --example redact_logs
cargo run --release --example bench

The benchmark harness is intentionally dependency-free and uses std::time::Instant, so run it several times on an otherwise idle machine when comparing changes.
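
For ad-hoc comparisons of your own, a dependency-free timing helper along the same lines might look like the sketch below. This is not the bundled harness: `best_of` is a hypothetical name, and reporting the fastest of several runs is just one common way to dampen noise.

```rust
/// Run `work` `runs` times and return the fastest wall-clock duration,
/// or `None` when `runs` is zero.
/// Taking the minimum filters out some scheduler and cache noise.
fn best_of<F: FnMut()>(runs: u32, mut work: F) -> Option<std::time::Duration> {
    (0..runs)
        .map(|_| {
            let started = std::time::Instant::now();
            work();
            started.elapsed()
        })
        .min()
}
```

When timing real work, pass results through `std::hint::black_box` inside the closure so the optimizer is less likely to remove the code being measured in release builds.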

`no_std`

[dependencies]
leakguard = { version = "0.4", default-features = false }

This drops the CLI and std-only conveniences but keeps the full detection and redaction API. An allocator (alloc) is still required.

Contributing

Issues and PRs welcome — especially new detectors and false-positive reports with sample inputs. Run cargo test && cargo clippy --all-targets -- -D warnings before submitting.

Author

Created and maintained by ptukovar.

License

Licensed under either of MIT or Apache-2.0 at your option.

leakguard 0.4.0