cloakrs
cloakrs is a Rust library and CLI for detecting and masking personally identifiable information in text, logs, JSON, CSV, and database dumps.
It ships universal recognizers for emails, phone numbers, credit cards, IBANs, IP addresses, URLs, API keys, JWTs, AWS access keys, MAC addresses, hostnames, user home paths, crypto wallet addresses, and context-dependent dates of birth. Locale bundles add identifiers such as US SSNs, Dutch BSNs, UK NINO/NHS numbers, German Steuer-IDs, Indian Aadhaar/PAN values, Brazilian CPF/CNPJ values, and French INSEE/NIR numbers.
See supported entities for the full detection matrix, including validation algorithms, confidence ranges, and examples.
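As an illustration of what a validation algorithm for one of these entities can look like, the sketch below implements the standard Luhn checksum, which is commonly used to filter out digit runs that cannot be payment card numbers. It is a generic, standalone example: the function name `luhn_valid` and the 12 to 19 digit length bounds are assumptions made here, not cloakrs's actual API or thresholds.

```rust
/// Returns true when `candidate` passes the Luhn checksum.
///
/// Hypothetical helper for illustration only; not part of the cloakrs API.
/// Spaces and hyphens are ignored; any other non-digit rejects the input.
pub fn luhn_valid(candidate: &str) -> bool {
    let mut digits: Vec<u32> = Vec::new();
    for ch in candidate.chars() {
        match ch {
            ' ' | '-' => continue,
            c => match c.to_digit(10) {
                Some(d) => digits.push(d),
                None => return false,
            },
        }
    }

    // Assumed length bounds for card-like numbers.
    if digits.len() < 12 || digits.len() > 19 {
        return false;
    }

    // Walk from the rightmost digit, doubling every second digit and
    // subtracting 9 whenever the doubled value exceeds 9.
    let sum: u32 = digits
        .iter()
        .rev()
        .enumerate()
        .map(|(i, &d)| {
            if i % 2 == 1 {
                let doubled = d * 2;
                if doubled > 9 { doubled - 9 } else { doubled }
            } else {
                d
            }
        })
        .sum();

    sum % 10 == 0
}
```

A checksum like this only rules out impossible candidates; a passing value is not proof that the number is a real card, which is one reason a detector might report a confidence range instead of a yes/no answer.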
Install
For local development:
Quick Start
use Locale;
let scanner = default_registry
.into_scanner_builder
.locale
.build?;
let result = scanner.scan?;
assert_eq!;
# Ok::
CLI Examples
# Scan a file and print a human-readable report.
# Produce SARIF for code scanning systems.
# Mask a CSV file, scanning selected columns only.
Architecture
The workspace is split into five crates with one-way dependencies:
cloakrs-core -> cloakrs-patterns -> cloakrs-locales -> cloakrs-adapters -> cloakrs-cli
- cloakrs-core: scanner, recognizer trait, shared types, masking strategies
- cloakrs-patterns: universal recognizers such as email, phone, card, IBAN
- cloakrs-locales: country-specific recognizers such as US SSN and Dutch BSN
- cloakrs-adapters: streaming handlers for text, JSON, CSV, logs, and SQL dumps
- cloakrs-cli: the cloakrs command-line interface
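To make the split between a recognizer trait, shared types, and masking strategies more concrete, here is a minimal standalone sketch of how those three pieces could fit together. Every name in it (`Finding`, `Recognizer`, `ToyEmailRecognizer`, `MaskStrategy`, `mask`) is hypothetical and chosen for illustration; the real cloakrs-core types and signatures may differ.

```rust
/// Hypothetical shared type: one detected span, as byte offsets into the input.
#[derive(Debug, Clone, PartialEq)]
pub struct Finding {
    pub entity: &'static str,
    pub start: usize,
    pub end: usize,
}

/// Hypothetical recognizer trait: each recognizer reports the spans it detects.
pub trait Recognizer {
    fn scan(&self, text: &str) -> Vec<Finding>;
}

/// Toy recognizer that flags space-delimited tokens shaped like `local@domain.tld`.
/// Far looser than a production email recognizer would be.
pub struct ToyEmailRecognizer;

impl Recognizer for ToyEmailRecognizer {
    fn scan(&self, text: &str) -> Vec<Finding> {
        let mut findings = Vec::new();
        let mut offset = 0;
        for token in text.split(' ') {
            let end = offset + token.len();
            if let Some((local, domain)) = token.split_once('@') {
                if !local.is_empty() && domain.contains('.') && !domain.contains('@') {
                    findings.push(Finding { entity: "EMAIL", start: offset, end });
                }
            }
            // Skip the single space that separated this token from the next.
            offset = end + 1;
        }
        findings
    }
}

/// Hypothetical masking strategies.
pub enum MaskStrategy {
    /// Replace the span with a label such as `[EMAIL]`.
    Redact,
    /// Replace every character in the span with the given fill character.
    Fill(char),
}

/// Applies `strategy` to each finding.
/// Assumes `findings` are sorted by `start` and do not overlap.
pub fn mask(text: &str, findings: &[Finding], strategy: &MaskStrategy) -> String {
    let mut out = String::with_capacity(text.len());
    let mut cursor = 0;
    for f in findings {
        out.push_str(&text[cursor..f.start]);
        match strategy {
            MaskStrategy::Redact => {
                out.push('[');
                out.push_str(f.entity);
                out.push(']');
            }
            MaskStrategy::Fill(c) => {
                out.extend(text[f.start..f.end].chars().map(|_| *c));
            }
        }
        cursor = f.end;
    }
    out.push_str(&text[cursor..]);
    out
}
```

Keeping detection (the trait) separate from masking (the strategy) means the same findings can feed either a report or a masked copy of the input, which is consistent with the one-way dependency chain described above.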
Comparison
| Tool | Language | Runtime requirements | Primary fit | Benchmark status |
|---|---|---|---|---|
| cloakrs | Rust | Single native binary | Fast local scanning and masking | Criterion suite included |
| Microsoft Presidio | Python | Python plus NLP dependencies | NLP-rich enterprise workflows | Run locally for same-hardware numbers |
| DataFog | Python | Python runtime | App-level PII detection | Run locally for same-hardware numbers |
| scrubadub | Python | Python runtime | Text scrubbing | Not benchmarked in-tree |
| piidetect | Go | Native binary | Lightweight PII detection | Not benchmarked in-tree |
Run the local benchmark suite with:
The benchmark harness covers plain text, JSON, and CSV inputs from 1 KB through 10 MB, each recognizer individually, and every masking strategy. See docs/benchmarking.md.
Guides
- Adding recognizers
- Adding locale recognizers
- Supported entities
- CI/CD integration
- Benchmarking
- Release checklist
Status
The first Rust release is published on crates.io. See implementation status for completed work and known gaps.
License
MIT. See LICENSE.md.