tiktag
Rust library + CLI for text anonymization.
tiktag uses a built-in ONNX NER model for PERSON, ORG, and LOCATION, then applies additive regex recognizers such as email.
Install
Or from source:
Quickstart
Download bundled model assets first:
tiktag download
CLI:
tiktag "<text>"
echo "<text>" | tiktag --stdin
Library:
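use std::path::Path;
use tiktag::Tiktag;

// Placeholder path; point it at your profiles file.
let profiles_path = Path::new("<path/to/profiles>");
let mut tiktag = Tiktag::new(profiles_path)?;
// Placeholder input.
let out = tiktag.anonymize("<text>")?;
// Assumes the result exposes `anonymized_text`, as the --json output does.
println!("{}", out.anonymized_text);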
Tiktag::new takes an explicit profiles_path; model_dir resolves relative to that file's parent.
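The resolution rule above can be illustrated with the standard library alone. This is a minimal sketch, not tiktag's actual code: `resolve_model_dir` is a hypothetical helper, and the file name `profiles.toml` and the `models/ner` value are made up for the example.

```rust
use std::path::{Path, PathBuf};

/// Resolve a `model_dir` value against the directory that holds the profiles file.
/// Hypothetical helper: tiktag's own resolution logic may differ.
fn resolve_model_dir(profiles_path: &Path, model_dir: &str) -> PathBuf {
    // A bare file name has the empty path as its parent; fall back to the
    // empty path as well when there is no parent at all.
    let base = profiles_path.parent().unwrap_or_else(|| Path::new(""));
    // `join` returns `model_dir` unchanged when it is already absolute.
    base.join(model_dir)
}

fn main() {
    // Illustrative paths only.
    let resolved = resolve_model_dir(Path::new("config/profiles.toml"), "models/ner");
    println!("{}", resolved.display());
}
```

Under this reading, moving the profiles file together with its model directory keeps a relative model_dir valid, regardless of the process's working directory.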
CLI
- tiktag "<text>": prints anonymized text
- tiktag --stdin: reads input from stdin
- tiktag --json: emits safe machine-readable output
- tiktag --debug-json: emits reversible replacement metadata for local debugging only
- tiktag --show-tokens: prints per-token predictions to stderr
- tiktag download: fetches bundled model assets
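For programs that prefer to call the CLI rather than link the library, the `--stdin` mode can be driven from Rust with `std::process::Command`. This is a sketch under assumptions: it expects a `tiktag` binary on `PATH`, assumes a non-zero exit status signals failure, and `anonymize_via_cli` is a hypothetical wrapper name.

```rust
use std::io::Write;
use std::process::{Command, Stdio};

/// Build a `tiktag --stdin` invocation with piped stdin and stdout.
fn tiktag_stdin_command() -> Command {
    let mut cmd = Command::new("tiktag");
    cmd.arg("--stdin").stdin(Stdio::piped()).stdout(Stdio::piped());
    cmd
}

/// Hypothetical wrapper: send `text` to tiktag and return what it prints.
/// Intended for small inputs, since all input is written before any output is read.
fn anonymize_via_cli(text: &str) -> std::io::Result<String> {
    let mut child = tiktag_stdin_command().spawn()?;
    // Write the input; the temporary handle is dropped at the end of the
    // statement, which closes the pipe so the child sees end-of-input.
    child
        .stdin
        .take()
        .expect("stdin was configured as piped")
        .write_all(text.as_bytes())?;
    let output = child.wait_with_output()?;
    if !output.status.success() {
        // Assumption: tiktag reports failure through its exit status.
        return Err(std::io::Error::new(
            std::io::ErrorKind::Other,
            "tiktag exited with an error",
        ));
    }
    Ok(String::from_utf8_lossy(&output.stdout).into_owned())
}

fn main() {
    match anonymize_via_cli("<text>") {
        Ok(out) => print!("{out}"),
        Err(err) => eprintln!("could not run tiktag: {err}"),
    }
}
```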
JSON
- --json fields: schema_version, provenance, profile, anonymized_text, stats
- stats.timings is machine-dependent; content-hash pipelines must ignore it
- additive field changes keep schema_version; breaking changes bump it
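One way a content-hash pipeline might honor the stats.timings rule is to hash only the stable fields. The sketch below is illustrative: the struct shapes and field types are assumptions rather than tiktag's real types, `entity_count` is invented for the example, and `DefaultHasher` is used only to stay within the standard library (its output is not guaranteed stable across Rust releases, so a persistent pipeline would likely want a fixed hash function).

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical mirror of the `--json` output; all field types are assumptions.
struct JsonOutput {
    schema_version: u32,
    provenance: String,
    profile: String,
    anonymized_text: String,
    stats: Stats,
}

/// Hypothetical stats shape; `entity_count` is invented for illustration.
struct Stats {
    entity_count: u64,
    timings_ms: Vec<f64>, // stands in for `stats.timings`
}

/// Hash every stable field and deliberately skip the timings.
fn content_hash(out: &JsonOutput) -> u64 {
    let mut hasher = DefaultHasher::new();
    out.schema_version.hash(&mut hasher);
    out.provenance.hash(&mut hasher);
    out.profile.hash(&mut hasher);
    out.anonymized_text.hash(&mut hasher);
    out.stats.entity_count.hash(&mut hasher);
    // `out.stats.timings_ms` is intentionally not hashed.
    hasher.finish()
}

/// Build a sample output whose only varying parts are the text and timings.
fn sample(text: &str, timings_ms: Vec<f64>) -> JsonOutput {
    JsonOutput {
        schema_version: 1,
        provenance: "example".to_string(),
        profile: "default".to_string(),
        anonymized_text: text.to_string(),
        stats: Stats { entity_count: 1, timings_ms },
    }
}

fn main() {
    let fast = sample("Hello [PERSON]", vec![1.2]);
    let slow = sample("Hello [PERSON]", vec![250.5]);
    // Two runs that differ only in timing share the same content hash.
    assert_eq!(content_hash(&fast), content_hash(&slow));
    println!("timings recorded: {:?} vs {:?}", fast.stats.timings_ms, slow.stats.timings_ms);
}
```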
Development
Contributions are very welcome.
Caveat
Model-based anonymization can miss entities. Treat tiktag as an assistive control, not a sole compliance or safety gate.
See AGENTS.md for project contract and invariants.