Crate pii

PII detection and anonymization interfaces.

This crate provides a deterministic pipeline for detecting and redacting personally identifiable information (PII). It is designed for CPU-only execution, explicit auditability, and controlled degradation when linguistic features (lemmas, part-of-speech tags, named-entity recognition) are unavailable.
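
To make the idea of controlled degradation concrete, here is a minimal sketch of how a capability flag might gate a fallback. The `Capabilities` and `Token` structs below are hypothetical stand-ins defined locally for illustration; their fields and the `context_key` helper are assumptions and may not match the crate's real types.

```rust
// Hypothetical stand-ins: the crate's real `Capabilities` and `Token`
// types may expose different fields.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Capabilities {
    pub lemma: bool,
    pub pos: bool,
    pub ner: bool,
}

pub struct Token<'a> {
    pub text: &'a str,
    pub lemma: Option<&'a str>,
}

/// Returns the key used to match a token against context terms.
/// Uses the lemma when the engine reports lemma support and the token
/// carries one; otherwise falls back to the lowercased surface form.
pub fn context_key(token: &Token<'_>, caps: Capabilities) -> String {
    match (caps.lemma, token.lemma) {
        (true, Some(lemma)) => lemma.to_lowercase(),
        _ => token.text.to_lowercase(),
    }
}
```

With lemma support the token "Phones" would match the context term "phone"; without it, matching degrades to the surface form "phones" instead of failing.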

Typical usage:

  • Build an Analyzer with an NlpEngine, recognizers, and policy.
  • Run analyze to get detections with stable byte offsets.
  • Pass detections to the Anonymizer with an operator configuration.
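
The steps above can be sketched end to end without the crate itself. Everything in this block is a self-contained illustration: the `Detection` struct, the toy `analyze` function (which only flags e-mail-like tokens), and `anonymize` are local assumptions, not the signatures of the real `Analyzer` or `Anonymizer`. The point is to show why stable byte offsets matter: replacements are applied from the end of the text backwards so earlier offsets stay valid.

```rust
// Hypothetical stand-in for the crate's `Detection` type.
#[derive(Debug, Clone, PartialEq)]
pub struct Detection {
    pub entity_type: String,
    pub start: usize, // byte offset, inclusive
    pub end: usize,   // byte offset, exclusive
    pub score: f32,
}

fn looks_like_email(token: &str) -> bool {
    match token.split_once('@') {
        Some((local, domain)) => {
            !local.is_empty()
                && !domain.contains('@')
                && domain.contains('.')
                && !domain.starts_with('.')
                && !domain.ends_with('.')
        }
        None => false,
    }
}

/// Toy analyzer: reports whitespace-delimited tokens that look like
/// e-mail addresses, with byte offsets into the original text.
pub fn analyze(text: &str) -> Vec<Detection> {
    let mut detections = Vec::new();
    for token in text.split_whitespace() {
        // `token` is a subslice of `text`, so pointer subtraction
        // yields its byte offset.
        let start = token.as_ptr() as usize - text.as_ptr() as usize;
        let trimmed =
            token.trim_end_matches(|c: char| matches!(c, '.' | ',' | ';' | '!' | '?'));
        if looks_like_email(trimmed) {
            detections.push(Detection {
                entity_type: "EMAIL".to_string(),
                start,
                end: start + trimmed.len(),
                score: 1.0,
            });
        }
    }
    detections
}

/// Toy "replace" operator: substitutes each span with `<ENTITY_TYPE>`.
/// Assumes the detections do not overlap.
pub fn anonymize(text: &str, detections: &[Detection]) -> String {
    let mut sorted: Vec<&Detection> = detections.iter().collect();
    // Apply from the last span to the first so offsets remain valid.
    sorted.sort_by(|a, b| b.start.cmp(&a.start));
    let mut out = text.to_string();
    for d in sorted {
        out.replace_range(d.start..d.end, &format!("<{}>", d.entity_type));
    }
    out
}
```

Because offsets are in bytes rather than characters, slicing the original text with `&text[d.start..d.end]` recovers the detected value even when the text contains multi-byte characters.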

The pipeline is modular: you can swap in a custom NlpEngine, add custom recognizers, or augment the system with Candle-based NER when needed.
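
As a rough illustration of that modularity, the sketch below shows one way a trait-based recognizer could be plugged into a pipeline. The `Recognizer` trait, `Span`, `Pipeline`, and `PrefixedIdRecognizer` are all hypothetical names defined here for the example; the crate's actual recognizer trait in the `recognizers` module may have a different shape.

```rust
// Hypothetical types for illustration; not the crate's real API.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Span {
    pub entity_type: &'static str,
    pub start: usize, // byte offset, inclusive
    pub end: usize,   // byte offset, exclusive
}

pub trait Recognizer {
    fn recognize(&self, text: &str) -> Vec<Span>;
}

/// Custom recognizer: a fixed prefix followed by exactly `digits`
/// ASCII digits, e.g. "EMP-004217".
pub struct PrefixedIdRecognizer {
    pub prefix: &'static str,
    pub digits: usize,
    pub entity_type: &'static str,
}

impl Recognizer for PrefixedIdRecognizer {
    fn recognize(&self, text: &str) -> Vec<Span> {
        let mut spans = Vec::new();
        for (start, _) in text.match_indices(self.prefix) {
            let digits_start = start + self.prefix.len();
            let digit_len = text[digits_start..]
                .bytes()
                .take_while(|b| b.is_ascii_digit())
                .count();
            if digit_len == self.digits {
                spans.push(Span {
                    entity_type: self.entity_type,
                    start,
                    end: digits_start + digit_len,
                });
            }
        }
        spans
    }
}

/// Minimal pipeline that runs every registered recognizer and returns
/// spans in a deterministic order (sorted by start, then end).
pub struct Pipeline {
    recognizers: Vec<Box<dyn Recognizer>>,
}

impl Pipeline {
    pub fn new() -> Self {
        Pipeline { recognizers: Vec::new() }
    }

    pub fn with_recognizer(mut self, recognizer: Box<dyn Recognizer>) -> Self {
        self.recognizers.push(recognizer);
        self
    }

    pub fn analyze(&self, text: &str) -> Vec<Span> {
        let mut all: Vec<Span> = self
            .recognizers
            .iter()
            .flat_map(|r| r.recognize(text))
            .collect();
        all.sort_by_key(|s| (s.start, s.end));
        all
    }
}
```

Keeping recognizers behind a trait object means a domain-specific pattern can be added without touching the pipeline, and sorting the merged spans keeps the output independent of registration order.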

Specification: docs/rfc-1200-pii.md.

Re-exports

pub use analyzer::Analyzer;
pub use anonymize::AnonymizeConfig;
pub use anonymize::Anonymizer;
pub use anonymize::Operator;
pub use capabilities::Capabilities;
pub use config::PolicyConfig;
pub use error::PiiError;
pub use error::PiiResult;
pub use presets::default_recognizers;
pub use profile::ContextTerms;
pub use profile::LanguageProfile;
pub use profile::LanguageRegistry;
pub use types::AnalyzeResult;
pub use types::AnonymizeResult;
pub use types::AnonymizedItem;
pub use types::Detection;
pub use types::DetectionExplanation;
pub use types::EntityType;
pub use types::Language;
pub use types::NerSpan;
pub use types::NlpArtifacts;
pub use types::Token;
pub use types::LANGUAGE_DE;
pub use types::LANGUAGE_EN;
pub use types::LANGUAGE_ES;

Modules

  • analyzer: Analyzer pipeline wiring for PII detection.
  • anonymize: Anonymization operators and helpers.
  • capabilities: Capability flags produced by an NLP engine.
  • config: Policy configuration for entity filtering and thresholds.
  • context: Context-aware score boosting for ambiguous detections.
  • decision: Candidate resolution and deterministic overlap handling.
  • error: Error types for the PII library.
  • nlp: NLP engine traits and a simple reference implementation.
  • presets: Built-in recognizer presets for common PII types.
  • profile: Language profiles and context term configuration.
  • recognizers: Recognizer traits and implementations.
  • types: Core data types for PII detection and anonymization.

Constants

  • SPEC_PATH: Relative path to the RFC specification document.