pub struct MatchingEngine { /* private fields */ }
Worker matcher engine.
The engine is immutable after construction and cheap to clone (it
owns only a MatchConfig). Construct one and call its methods from any
thread.
use worker_matcher::{MatchConfig, MatchingEngine};
let engine_a = MatchingEngine::default_config();
let engine_b = MatchingEngine::new(MatchConfig::strict());
Implementations§
impl MatchingEngine
pub fn new(config: MatchConfig) -> Self
Construct an engine with the given configuration.
use worker_matcher::{MatchConfig, MatchingEngine};
let engine = MatchingEngine::new(MatchConfig::lenient());
pub fn default_config() -> Self
Construct an engine with MatchConfig::default.
use worker_matcher::MatchingEngine;
let engine = MatchingEngine::default_config();
pub fn match_workers(&self, worker1: &Worker, worker2: &Worker) -> MatchResult
Compare two workers probabilistically and return a MatchResult.
The score is the weight-renormalised sum of every component that scored on both records. Missing fields are skipped, not penalised.
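As a rough illustration of weight renormalisation, the sketch below averages only the components that produced a similarity on both records, so a missing field shrinks the denominator instead of dragging the score down. The `Component` type, its fields, and the weights are hypothetical stand-ins, not this crate's internals.

```rust
/// Hypothetical scoring component: a weight plus a similarity in [0, 1].
/// `similarity` is `None` when the field is missing on either record.
struct Component {
    weight: f64,
    similarity: Option<f64>,
}

/// Weighted mean over the components that actually scored.
/// Returns `None` when no component scored at all.
fn renormalised_score(components: &[Component]) -> Option<f64> {
    let mut weighted_sum = 0.0;
    let mut weight_total = 0.0;
    for c in components {
        // Skip, rather than penalise, components with no similarity.
        if let Some(s) = c.similarity {
            weighted_sum += c.weight * s;
            weight_total += c.weight;
        }
    }
    (weight_total > 0.0).then(|| weighted_sum / weight_total)
}
```

With weights 0.5, 0.3, and 0.2, where the middle component is missing, the divisor is 0.7 rather than 1.0.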
use worker_matcher::{MatchingEngine, Worker};
use chrono::NaiveDate;
let p = Worker::builder()
.given_name("Carys")
.family_name("Pritchard")
.date_of_birth(NaiveDate::from_ymd_opt(1985, 1, 1).unwrap())
.build();
let result = MatchingEngine::default_config().match_workers(&p, &p);
assert!(result.is_match);
assert!(result.score > 0.99);
pub fn match_one_to_many(&self, query: &Worker, candidates: &[Worker]) -> Vec<MatchResult>
Score a single query against many candidates in parallel-friendly
fashion. Returns one MatchResult per candidate, in the same
order as the input slice.
This is the building block for a Master Worker Index workflow:
given a new record and the existing population, produce a fully
audited score for each potential link. The engine is immutable
and Send + Sync, so call-sites that want parallel evaluation
can iterate the candidates with rayon's par_iter (or similar) and
call match_workers on each, without further changes to this crate.
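A minimal sketch of the parallel pattern, using only the standard library's scoped threads in place of rayon. `ToyEngine` and its `score` method are hypothetical stand-ins for an immutable, `Sync` engine; the point is that a shared reference can be handed to every thread and the output order still follows the input slice.

```rust
use std::thread;

/// Hypothetical stand-in for an immutable, `Sync` scoring engine.
struct ToyEngine;

impl ToyEngine {
    /// Toy pairwise score: 1.0 when the strings are equal, else 0.0.
    fn score(&self, query: &str, candidate: &str) -> f64 {
        if query == candidate { 1.0 } else { 0.0 }
    }
}

/// Score `query` against every candidate on up to `workers` scoped threads.
/// Results come back in the same order as `candidates`.
fn score_parallel(
    engine: &ToyEngine,
    query: &str,
    candidates: &[&str],
    workers: usize,
) -> Vec<f64> {
    if candidates.is_empty() {
        return Vec::new();
    }
    // At least one candidate per chunk, at most `workers` chunks.
    let chunk_len = candidates.len().div_ceil(workers.max(1));
    thread::scope(|s| {
        // Spawn every chunk first so the threads run concurrently.
        let handles: Vec<_> = candidates
            .chunks(chunk_len)
            .map(|chunk| {
                s.spawn(move || {
                    chunk
                        .iter()
                        .map(|c| engine.score(query, c))
                        .collect::<Vec<f64>>()
                })
            })
            .collect();
        // Joining in spawn order preserves the input order.
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker thread panicked"))
            .collect()
    })
}
```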
For sparse / large populations consider blocking — pre-filter
candidates with a cheap predicate (e.g. matching family-name
Soundex or postcode prefix) before passing the survivors to this
function. Blocking is a consumer concern and is intentionally
not baked into the API; the crate stays a pure scoring library.
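One possible shape for consumer-side blocking is sketched below, under stated assumptions: `Record` is a hypothetical minimal type (the real `Worker` fields are not assumed), and the blocking key is a postcode prefix. The function returns the surviving indices so the caller can map results back to the original population.

```rust
/// Hypothetical minimal record used only for this sketch.
struct Record {
    postcode: Option<String>,
}

/// Blocking key: the first `n` alphanumeric characters, upper-cased.
fn postcode_prefix(postcode: &str, n: usize) -> String {
    postcode
        .chars()
        .filter(|c| c.is_ascii_alphanumeric())
        .take(n)
        .map(|c| c.to_ascii_uppercase())
        .collect()
}

/// Indices of candidates that share the query's postcode prefix.
/// A missing postcode on either side never excludes a candidate.
fn block_by_postcode(query: &Record, candidates: &[Record], n: usize) -> Vec<usize> {
    let Some(query_key) = query.postcode.as_deref().map(|p| postcode_prefix(p, n)) else {
        // No key on the query: nothing can be ruled out.
        return (0..candidates.len()).collect();
    };
    candidates
        .iter()
        .enumerate()
        .filter(|(_, c)| match c.postcode.as_deref() {
            Some(p) => postcode_prefix(p, n) == query_key,
            None => true,
        })
        .map(|(i, _)| i)
        .collect()
}
```

Keeping records with a missing postcode mirrors the scoring rule that missing fields are skipped rather than penalised; a stricter filter would trade recall for speed.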
§Examples
use worker_matcher::{MatchingEngine, Worker};
let query = Worker::builder()
.given_name("Ada")
.family_name("Lovelace")
.build();
let candidates = vec![
Worker::builder().given_name("Ada").family_name("Lovelace").build(),
Worker::builder().given_name("Alan").family_name("Turing").build(),
Worker::builder().given_name("Grace").family_name("Hopper").build(),
];
let engine = MatchingEngine::default_config();
let results = engine.match_one_to_many(&query, &candidates);
assert_eq!(results.len(), 3);
assert!(results[0].is_match);
assert!(!results[1].is_match);
Empty candidates yield an empty result:
use worker_matcher::{MatchingEngine, Worker};
let q = Worker::builder().given_name("Solo").build();
let r = MatchingEngine::default_config().match_one_to_many(&q, &[]);
assert!(r.is_empty());
pub fn rank_one_to_many(&self, query: &Worker, candidates: &[Worker]) -> Vec<(usize, MatchResult)>
Score and rank: return (original_index, MatchResult) tuples
sorted by descending score. Ties are broken by ascending original
index, so the result is deterministic.
Convenience wrapper around MatchingEngine::match_one_to_many
for the common “give me the best matches first” workflow.
Consumers that need a filtered view (e.g. only is_match == true)
can filter the ranked list, while consumers that need to
pair results with external metadata can use the preserved
original index.
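The ordering rule (descending score, ties broken by ascending original index) can be sketched over bare scores as follows. `rank_scores` is a hypothetical helper working on `f64` values, not this crate's implementation, and the use of `f64::total_cmp` is an assumption made so the sort is total.

```rust
/// Pair each score with its original index, then sort by descending
/// score; equal scores keep ascending index order.
fn rank_scores(scores: &[f64]) -> Vec<(usize, f64)> {
    let mut ranked: Vec<(usize, f64)> = scores.iter().copied().enumerate().collect();
    ranked.sort_by(|a, b| {
        // Compare b to a for descending score, then a to b for ascending index.
        b.1.total_cmp(&a.1).then(a.0.cmp(&b.0))
    });
    ranked
}
```

Because the comparator names the index explicitly, the result does not rely on the sort being stable.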
§Examples
use worker_matcher::{MatchingEngine, Worker};
let query = Worker::builder().given_name("Ada").family_name("Lovelace").build();
let candidates = vec![
Worker::builder().given_name("Grace").family_name("Hopper").build(), // index 0
Worker::builder().given_name("Ada").family_name("Lovelace").build(), // index 1 — best match
Worker::builder().given_name("Alan").family_name("Turing").build(), // index 2
];
let ranked = MatchingEngine::default_config().rank_one_to_many(&query, &candidates);
assert_eq!(ranked.len(), 3);
assert_eq!(ranked[0].0, 1); // best match's original index
assert!(ranked[0].1.score >= ranked[1].1.score);
assert!(ranked[1].1.score >= ranked[2].1.score);
pub fn deterministic_match(&self, worker1: &Worker, worker2: &Worker) -> bool
Compare two workers deterministically and return a single boolean.
Returns true iff any of the following hold:
- Both UK NHS Numbers parse and are equal.
- Both France NIRs parse and are equal.
- Both España TSIs parse and are equal.
- Both Éire IHIs parse and are equal.
- Both UK Northern Ireland H&C Numbers parse and are equal.
- Both US Social Security Numbers parse and are equal.
- Both Australia IHIs parse and are equal.
- Both Germany KVNRs parse and are equal.
- Both Italy Codice Fiscale values parse and are equal.
- Both Netherlands BSNs parse and are equal.
- Both Sweden Workernummer values parse and are equal.
- Both UK Scotland CHI Numbers parse and are equal.
- The workers share at least one (country, number) passport-book pair after canonicalisation (see crate::PassportBook).
- Normalised given name matches, and normalised family name matches, and date of birth matches, and gender matches (or is missing on at least one side).
National identifiers from different schemes never cross-match: an NHS Number is only ever compared against another NHS Number, never against an H&C Number that happens to share the same 10 digits.
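One way to get the "no cross-scheme matching" property is to tag each parsed identifier with its scheme, so equality can only succeed within a scheme. The sketch below is an assumption-laden illustration: `NationalId`, `canonical_ten_digits`, and `ids_match` are hypothetical names, and the parse step only strips spaces and hyphens and checks for ten digits (no checksum validation).

```rust
/// Hypothetical scheme-tagged identifier; the payload is the canonical form.
#[derive(Debug, Clone, PartialEq, Eq)]
enum NationalId {
    UkNhs(String),
    UkHc(String),
}

/// Strip spaces and hyphens, then require exactly ten ASCII digits.
fn canonical_ten_digits(raw: &str) -> Option<String> {
    let digits: String = raw.chars().filter(|c| !matches!(c, ' ' | '-')).collect();
    (digits.len() == 10 && digits.chars().all(|c| c.is_ascii_digit())).then_some(digits)
}

/// True only when both sides parsed and are equal, variant included.
fn ids_match(a: Option<&NationalId>, b: Option<&NationalId>) -> bool {
    matches!((a, b), (Some(x), Some(y)) if x == y)
}
```

The derived `PartialEq` compares the variant as well as the digits, so the same ten digits under two different schemes never compare equal.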
use worker_matcher::{MatchingEngine, Worker};
// Same NHS number, different formatting → match.
let a = Worker::builder().uk_nhs_number("943 476 5919").build();
let b = Worker::builder().uk_nhs_number("9434765919").build();
assert!(MatchingEngine::default_config().deterministic_match(&a, &b));
Auto Trait Implementations§
impl Freeze for MatchingEngine
impl RefUnwindSafe for MatchingEngine
impl Send for MatchingEngine
impl Sync for MatchingEngine
impl Unpin for MatchingEngine
impl UnsafeUnpin for MatchingEngine
impl UnwindSafe for MatchingEngine
Blanket Implementations§
impl<T> BorrowMut<T> for T where T: ?Sized
fn borrow_mut(&mut self) -> &mut T
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left is true.
Converts self into a Right variant of Either<Self, Self>
otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self>
otherwise.