Schema registry and model persistence for zer.
This crate provides three cooperating components:
- SchemaInferrer: automatic FieldKind detection from column names and value patterns; produces a Schema without requiring the caller to know the dataset structure upfront.
- SchemaFingerprint: a compact identity for a schema plus its data distribution (SHA-256 hash of field names/kinds, per-field null rates, cardinalities).
- SchemaRegistry: a sled-backed persistent store for ModelArtifacts (trained Fellegi-Sunter parameters). On startup the pipeline calls SchemaRegistry::lookup_startup_mode to decide whether to load params directly (exact match), warm-start EM (similar schema), or run full EM from priors (new/incompatible schema).
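The three-way startup decision described above can be sketched as a threshold check on a fingerprint distance. This is a minimal illustration, not the crate's implementation: the threshold values, the assumption that a smaller distance means a closer match, the `StartupModeSketch` enum with its variant names, and the `choose_startup_mode` helper are all hypothetical. The crate does export `EXACT_MATCH_THRESHOLD`, `WARM_START_THRESHOLD`, `fingerprint_distance`, and `StartupMode`, but their actual values and signatures are not shown on this page.

```rust
// Hypothetical stand-ins: the real crate exports constants with these names,
// but their values and types are not shown here, so these numbers are assumed.
const EXACT_MATCH_THRESHOLD: f64 = 0.05;
const WARM_START_THRESHOLD: f64 = 0.30;

/// Illustrative mirror of the three startup paths; variant names are assumptions.
#[derive(Debug, PartialEq)]
enum StartupModeSketch {
    /// Stored parameters are loaded as-is (exact match).
    LoadParams,
    /// EM is initialised from stored parameters (similar schema).
    WarmStart,
    /// EM runs from priors (new or incompatible schema).
    FullEm,
}

/// Maps the distance to the closest stored fingerprint onto a startup path.
/// `None` means the registry holds no candidate artifact at all.
/// Assumes distance 0.0 means identical and larger values mean less similar.
fn choose_startup_mode(distance: Option<f64>) -> StartupModeSketch {
    match distance {
        Some(d) if d <= EXACT_MATCH_THRESHOLD => StartupModeSketch::LoadParams,
        Some(d) if d <= WARM_START_THRESHOLD => StartupModeSketch::WarmStart,
        // No candidate, a distance above both thresholds, or NaN.
        _ => StartupModeSketch::FullEm,
    }
}
```

Under these assumed thresholds, a distance of 0.2 would select the warm-start path, while an empty registry (`None`) falls through to full EM from priors.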
Re-exports
pub use artifact::ModelArtifact;
pub use config::NameHeuristics;
pub use config::ValuePatterns;
pub use fingerprint::FieldStats;
pub use fingerprint::SchemaFingerprint;
pub use infer::SchemaInferrer;
pub use registry::SchemaRegistry;
pub use registry::StartupMode;
pub use similarity::fingerprint_distance;
pub use similarity::EXACT_MATCH_THRESHOLD;
pub use similarity::WARM_START_THRESHOLD;