Module prelude

Re-exports

pub use crate::Backend;
pub use crate::Comparator;
pub use crate::Scorer;

Structs

AddressInitialKey
    Blocks on: (first token of address field) + “:” + (first char of first-name field). Handles surname transpositions: two records at the same address with the same initial should end up in the same bucket even if the surname differs.
AddressTokenOverlap
    Jaccard similarity on normalized token sets.
AliasPhoneticKey
    Emits a "phonetic_dob:CODE:YEAR" key for each name stored in a pipe-delimited alias field (e.g. SIS II alias_namen).
BlockerFactory
CameraTimeWindowKey
    Groups passages by camera identifier and a fixed-width time window.
ClusterConfig
    Parameters controlling cluster shape after graph construction.
ComparisonBatch
    Field-major SoA batch of comparison results for many pairs.
ComparisonVector
    Comparison result for a single candidate pair.
CompositeBlocker
    Composite blocker that applies multiple blocking keys.
ConnectedComponentsClusterer
    Connected-components clusterer with weak-edge removal and star pruning.
DateFragmentKey
    Blocking key that extracts the leading date fragment at a given granularity.
DocumentDigitSuffixKey
    Variant that strips ALL non-digit characters before taking the suffix.
DocumentSuffixKey
    Blocking key that strips non-alphanumeric characters from a document number and emits the last suffix_len characters as a key.
Entity
    A resolved entity grouping one or more records.
EntityMember
    A record’s membership in an entity, with its resolution score and method.
ExactFieldKey
FellegiSunterScorer
    Fellegi-Sunter scorer.
FieldComparator
    Pairwise field comparator that applies similarity functions to produce a field-major ComparisonBatch.
FuzzyYearKey
    Phonetic blocking key that emits year-range variants for records with an estimated date of birth (the YYYY-01-01 Jan-1 convention), so estimated DOBs that differ by up to fuzzy_range years still share a blocking key.
GeoGridKey
    Groups records by rounding geographic coordinates to a fixed grid cell.
InvertedIndex
    Inverted index mapping blocking keys to record IDs.
JaroWinklerSimilarity
LevelThresholds
    Configurable per-field thresholds for mapping a float similarity score to a ComparisonLevel.
LicensePlateNormKey
    Normalizes a license plate (strips hyphens/spaces, uppercases) and emits the result as a single exact blocking key.
ModelArtifact
    Everything that must be persisted after a successful EM training run.
ModelParams
    Learned Fellegi-Sunter m/u parameters and classification thresholds for one schema.
PhoneticEqualitySimilarity
PhoneticNameDobKey
    Blocking key that encodes the surname phonetically combined with the birth year.
PlateOCRFuzzyKey
    Emits the normalized plate plus a deletion-neighbourhood key for each character position.
Record
    A single data record with a unique ID and a map of field values.
RecordPool
    Column-major record store: columns[field_idx][record_idx].
Schema
    Ordered list of field definitions for a dataset.
SchemaBuilder
    Fluent builder for constructing a Schema.
SchemaFingerprint
    Fingerprint that identifies a schema structure plus its data distribution.
SchemaInferrer
    Automatic schema detector.
SchemaRegistry
    Persistent store for trained ModelArtifacts.
ScoredPair
    A candidate pair annotated with its match weight, probability, and band.
StreetNumberEditDistance
    Levenshtein edit distance on the leading street number.
SuffixKey
    Blocking key that extracts the last N digits from a field value.
TokenOverlapSimilarity
TransliteratedPhoneticKey
    Phonetic blocking key that first transliterates non-Latin script (Arabic, Cyrillic, Greek, etc.) to ASCII via any_ascii, then applies NFKD diacritic stripping and DoubleMetaphone encoding, combined with the DOB year.
VecRecordStore
    Default in-memory RecordStore backed by a Vec, zero-config.
ZalEntityStore
    SQLite-backed entity store persisted as a single .zes file.

Enums

ComparisonLevel
DateGranularity
    Controls how much of an ISO 8601 date is used as a blocking key.
FieldKind
FieldValue
    Typed value stored in a record field.
JudgeVerdict
MatchBand
    Coarse classification of a scored pair based on match probability.
PhoneticAlgo
    Phonetic encoding algorithm.
ResolutionMethod
    How an entity member was resolved.
SchemaCategory
    High-level domain category for a dataset.
StartupMode
    Decides how the pipeline should initialize when a new dataset arrives.
ZerError

Traits

BlockIndex
    Opaque blocking index.
Blocker
    Extracts blocking keys from records and looks up candidates in an index.
Clusterer
    Groups scored pairs into entity clusters.
ComparatorTrait
EntityStore
    Persistent store for resolved entities.
Judge
    Neural re-ranker that adjudicates borderline record pairs.
RecordStore
    Backing store for records used during ingestion and batch runs.
ScorerTrait
SimilarityFn
    Returns a similarity in [0.0, 1.0]. 0.0 = completely different, 1.0 = identical.

Type Aliases

EntityId
RecordId
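
Several of the blocking keys and similarity functions described above are simple string transformations. The following is a minimal, standard-library-only sketch of four of them: the AddressInitialKey rule, the DocumentSuffixKey rule, the Jaccard token overlap, and the PlateOCRFuzzyKey deletion neighbourhood. It does not use this crate's API. The function names, signatures, lowercasing, and edge-case behaviour (empty input, values shorter than suffix_len) are assumptions made for illustration only.

```rust
use std::collections::HashSet;

/// Hypothetical helper mirroring the AddressInitialKey description:
/// (first token of address) + ":" + (first char of first name).
/// Lowercasing both parts is an assumption, not documented behaviour.
fn address_initial_key(address: &str, first_name: &str) -> Option<String> {
    let token = address.split_whitespace().next()?.to_lowercase();
    let initial = first_name.trim().to_lowercase().chars().next()?;
    Some(format!("{}:{}", token, initial))
}

/// Hypothetical helper mirroring the DocumentSuffixKey description:
/// strip non-alphanumeric characters, keep the last `suffix_len` characters.
/// Returning the whole cleaned value when it is shorter than `suffix_len`
/// is an assumption.
fn document_suffix_key(doc_number: &str, suffix_len: usize) -> Option<String> {
    let cleaned: Vec<char> = doc_number.chars().filter(|c| c.is_alphanumeric()).collect();
    if cleaned.is_empty() || suffix_len == 0 {
        return None;
    }
    let start = cleaned.len().saturating_sub(suffix_len);
    Some(cleaned[start..].iter().collect())
}

/// Jaccard similarity on lowercased, whitespace-split token sets.
/// The result lies in [0.0, 1.0]; two empty inputs score 0.0 (an assumption).
fn token_jaccard(a: &str, b: &str) -> f64 {
    let ta: HashSet<String> = a.split_whitespace().map(|t| t.to_lowercase()).collect();
    let tb: HashSet<String> = b.split_whitespace().map(|t| t.to_lowercase()).collect();
    let union = ta.union(&tb).count();
    if union == 0 {
        return 0.0;
    }
    ta.intersection(&tb).count() as f64 / union as f64
}

/// Hypothetical helper mirroring the PlateOCRFuzzyKey description:
/// the normalized plate (hyphens/spaces stripped, uppercased) followed by
/// one key per character position with that character deleted.
fn plate_deletion_keys(plate: &str) -> Vec<String> {
    let norm: Vec<char> = plate
        .chars()
        .filter(|c| *c != '-' && !c.is_whitespace())
        .flat_map(|c| c.to_uppercase())
        .collect();
    let mut keys = vec![norm.iter().collect::<String>()];
    for i in 0..norm.len() {
        // Skip position i to build the deletion-neighbourhood key.
        keys.push(
            norm.iter()
                .enumerate()
                .filter(|(j, _)| *j != i)
                .map(|(_, c)| *c)
                .collect(),
        );
    }
    keys
}
```

Under these assumptions, two records at "12 Main Street" whose first names are "Anna" and "andrew" both map to the key `12:a`, regardless of surname. Two plates that differ in a single misread character, such as `AB12C` and `A812C`, share the deletion key `A12C`, so they land in the same candidate bucket even though their exact keys differ.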