PII detection and anonymization interfaces.
This crate provides a deterministic pipeline for detecting and redacting personally identifiable information (PII). It is designed for CPU-only execution, explicit auditability, and controlled degradation when certain language features (lemma, POS, NER) are unavailable.
Typical usage:
- Build an Analyzer with an NlpEngine, recognizers, and policy.
- Run analyze to get detections with stable byte offsets.
- Pass detections to the Anonymizer with an operator configuration.
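The detect-then-redact flow described above can be illustrated with a minimal, self-contained sketch. It does not use this crate: `Span`, `analyze`, and `anonymize` are hypothetical stand-ins defined inline, and the digit-run rule is a toy detector chosen only to show why byte offsets into the original input are enough for a later redaction step.

```rust
// Hypothetical stand-ins for illustration; these are NOT the crate's real
// types or signatures.

/// A detected span, expressed as byte offsets into the original input.
#[derive(Debug, Clone, PartialEq)]
struct Span {
    entity: &'static str,
    start: usize, // inclusive byte offset
    end: usize,   // exclusive byte offset
}

/// Toy detection step: flag every run of 7 or more ASCII digits.
fn analyze(text: &str) -> Vec<Span> {
    let mut spans = Vec::new();
    let mut run_start: Option<usize> = None;
    for (i, b) in text.bytes().enumerate() {
        match (b.is_ascii_digit(), run_start) {
            // A digit run begins here.
            (true, None) => run_start = Some(i),
            // A digit run just ended; keep it only if it is long enough.
            (false, Some(s)) => {
                if i - s >= 7 {
                    spans.push(Span { entity: "PHONE", start: s, end: i });
                }
                run_start = None;
            }
            _ => {}
        }
    }
    // Handle a run that reaches the end of the input.
    if let Some(s) = run_start {
        if text.len() - s >= 7 {
            spans.push(Span { entity: "PHONE", start: s, end: text.len() });
        }
    }
    spans
}

/// Toy anonymization step: replace each span with a `<ENTITY>` placeholder.
/// Assumes spans are sorted by start offset and do not overlap.
fn anonymize(text: &str, spans: &[Span]) -> String {
    let mut out = String::with_capacity(text.len());
    let mut cursor = 0;
    for span in spans {
        out.push_str(&text[cursor..span.start]);
        out.push('<');
        out.push_str(span.entity);
        out.push('>');
        cursor = span.end;
    }
    out.push_str(&text[cursor..]);
    out
}
```

With the input `"Zoë: 5551234567 ok"`, `analyze` reports a single span at bytes 6..16 (the `ë` occupies two bytes, which is why the offsets are byte-based rather than character-based), and `anonymize` returns `"Zoë: <PHONE> ok"`.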
The pipeline is modular: you can swap in a custom NlpEngine, add custom
recognizers, or augment the system with Candle-based NER when needed.
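As a rough illustration of what a pluggable recognizer design can look like, the sketch below defines a trait and runs several implementations over the same text. The `Recognizer` trait, `Hit`, `KeywordRecognizer`, and `run_all` are hypothetical names invented for this example; the crate's actual traits in the recognizers and nlp modules may differ in shape and signature.

```rust
// Hypothetical trait and types for illustration; NOT the crate's actual API.

/// A single match, as byte offsets into the input text.
#[derive(Debug, Clone, PartialEq)]
struct Hit {
    entity: &'static str,
    start: usize,
    end: usize,
}

/// Anything that can scan text and report hits.
trait Recognizer {
    fn recognize(&self, text: &str) -> Vec<Hit>;
}

/// Flags every occurrence of a fixed, non-empty keyword.
struct KeywordRecognizer {
    entity: &'static str,
    keyword: &'static str,
}

impl Recognizer for KeywordRecognizer {
    fn recognize(&self, text: &str) -> Vec<Hit> {
        text.match_indices(self.keyword)
            .map(|(start, matched)| Hit {
                entity: self.entity,
                start,
                end: start + matched.len(),
            })
            .collect()
    }
}

/// Runs every registered recognizer and returns the hits sorted by
/// (start, end, entity), so the output order does not depend on the
/// order in which recognizers were registered.
fn run_all(recognizers: &[Box<dyn Recognizer>], text: &str) -> Vec<Hit> {
    let mut hits: Vec<Hit> = recognizers
        .iter()
        .flat_map(|r| r.recognize(text))
        .collect();
    hits.sort_by_key(|h| (h.start, h.end, h.entity));
    hits
}
```

Adding a new recognizer in this sketch means implementing the trait and pushing another boxed value into the slice; the sort at the end keeps the combined output deterministic.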
Specification: docs/rfc-1200-pii.md.
Re-exports
pub use analyzer::Analyzer;
pub use anonymize::AnonymizeConfig;
pub use anonymize::Anonymizer;
pub use anonymize::Operator;
pub use capabilities::Capabilities;
pub use config::PolicyConfig;
pub use error::PiiError;
pub use error::PiiResult;
pub use presets::default_recognizers;
pub use profile::ContextTerms;
pub use profile::LanguageProfile;
pub use profile::LanguageRegistry;
pub use types::AnalyzeResult;
pub use types::AnonymizeResult;
pub use types::AnonymizedItem;
pub use types::Detection;
pub use types::DetectionExplanation;
pub use types::EntityType;
pub use types::Language;
pub use types::NerSpan;
pub use types::NlpArtifacts;
pub use types::Token;
pub use types::LANGUAGE_DE;
pub use types::LANGUAGE_EN;
pub use types::LANGUAGE_ES;
Modules
- analyzer
- Analyzer pipeline wiring for PII detection.
- anonymize
- Anonymization operators and helpers.
- capabilities
- Capability flags produced by an NLP engine.
- config
- Policy configuration for entity filtering and thresholds.
- context
- Context-aware score boosting for ambiguous detections.
- decision
- Candidate resolution and deterministic overlap handling.
- error
- Error types for the PII library.
- nlp
- NLP engine traits and a simple reference implementation.
- presets
- Built-in recognizer presets for common PII types.
- profile
- Language profiles and context term configuration.
- recognizers
- Recognizer traits and implementations.
- types
- Core data types for PII detection and anonymization.
Constants
- SPEC_PATH
- Relative path to the RFC specification document.