zer-schema
Schema inference and Fellegi-Sunter model registry for the zer entity-resolution library.
This crate handles two concerns: automatically detecting the schema of an incoming dataset, and persisting trained Fellegi-Sunter parameters so they can be reused across pipeline runs without re-running EM.
What it provides
| Item | Description |
|---|---|
SchemaInferrer |
Detects FieldKind from column names and value patterns; no manual schema definition needed |
SchemaFingerprint |
Compact SHA-256 identity for a schema plus its data distribution |
SchemaRegistry |
sled-backed persistent store for trained ModelArtifacts |
ModelArtifact |
Serialisable container for m/u parameters, thresholds, and metadata |
StartupMode |
Decision enum: load exact match, warm-start EM, or run full EM from priors |
Usage
use ;
let schema = default.infer_from_csv_headers?;
let registry = open?;
let mode = registry.lookup_startup_mode?;
License
Apache-2.0 · GitHub