Struct MatchConfig 

pub struct MatchConfig {
    pub match_threshold: f64,
    pub uk_nhs_number_weight: f64,
    pub fr_nir_weight: f64,
    pub es_tsi_weight: f64,
    pub ie_ihi_weight: f64,
    pub uk_hc_number_weight: f64,
    pub us_ssn_weight: f64,
    pub au_ihi_weight: f64,
    pub de_kvnr_weight: f64,
    pub it_cf_weight: f64,
    pub nl_bsn_weight: f64,
    pub se_workernummer_weight: f64,
    pub uk_chi_number_weight: f64,
    pub be_nn_weight: f64,
    pub bg_egn_weight: f64,
    pub cz_rc_weight: f64,
    pub dk_cpr_weight: f64,
    pub ee_ik_weight: f64,
    pub es_dni_weight: f64,
    pub fi_hetu_weight: f64,
    pub hr_oib_weight: f64,
    pub is_kt_weight: f64,
    pub lt_ak_weight: f64,
    pub lv_pk_weight: f64,
    pub mt_id_weight: f64,
    pub no_fnr_weight: f64,
    pub pl_pesel_weight: f64,
    pub ro_cnp_weight: f64,
    pub si_emso_weight: f64,
    pub sk_rc_weight: f64,
    pub uk_nino_weight: f64,
    pub gr_dss_weight: f64,
    pub li_id_weight: f64,
    pub nl_id_weight: f64,
    pub pl_nip_weight: f64,
    pub pt_nif_weight: f64,
    pub br_cpf_weight: f64,
    pub cn_rrn_weight: f64,
    pub in_aadhaar_weight: f64,
    pub jp_my_number_weight: f64,
    pub mx_curp_weight: f64,
    pub nz_nhi_weight: f64,
    pub za_id_weight: f64,
    pub passport_book_weight: f64,
    pub given_name_weight: f64,
    pub family_name_weight: f64,
    pub date_of_birth_weight: f64,
    pub gender_weight: f64,
    pub blood_type_weight: f64,
    pub multiple_birth_weight: f64,
    pub address_weight: f64,
    pub birth_place_weight: f64,
    pub death_date_weight: f64,
    pub death_place_weight: f64,
    pub phone_weight: f64,
    pub email_weight: f64,
    pub use_phonetic_matching: bool,
    pub name_algorithm: SimilarityAlgorithm,
    pub strict_mode: bool,
    pub gmail_dot_folding: bool,
    pub nickname_table: NicknameTable,
    pub phone_default_country: Option<String>,
}
Tunable configuration for the matching engine.

All weights are dimensionless and contribute to a renormalised weighted sum — they do not need to add to 1.0. The matching pipeline divides the weighted sum by the sum of participating weights so that missing fields neither contribute nor penalise. The score is then compared against MatchConfig::match_threshold to produce the is_match boolean.
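
The renormalisation described above can be illustrated with a minimal sketch. It is not the crate's implementation: `FieldScore`, `renormalised_score`, and `is_match` are hypothetical names, and the use of `>=` for the threshold comparison is an assumption.

```rust
/// One scored field: its configured weight and, if both records supplied
/// the field, a similarity in 0.0..=1.0. `None` means the field is missing.
/// (Hypothetical type, for illustration only.)
pub struct FieldScore {
    pub weight: f64,
    pub similarity: Option<f64>,
}

/// Renormalised weighted sum: only fields that have a similarity participate,
/// so a missing field neither adds to the score nor drags it down.
/// Returns `None` when no field participates (there is nothing to divide by).
pub fn renormalised_score(fields: &[FieldScore]) -> Option<f64> {
    let mut weighted_sum = 0.0;
    let mut weight_total = 0.0;
    for field in fields {
        if let Some(similarity) = field.similarity {
            weighted_sum += field.weight * similarity;
            weight_total += field.weight;
        }
    }
    if weight_total > 0.0 {
        Some(weighted_sum / weight_total)
    } else {
        None
    }
}

/// Compare the renormalised score against the threshold.
/// Assumption: a score equal to the threshold counts as a match.
pub fn is_match(fields: &[FieldScore], match_threshold: f64) -> bool {
    renormalised_score(fields).map_or(false, |score| score >= match_threshold)
}
```

With a family-name weight of 0.20 scoring 1.0, a given-name weight of 0.15 scoring 0.8, and a missing phone field, the score under this sketch is 0.32 / 0.35, roughly 0.914. The phone weight drops out of both the numerator and the denominator.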

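Two presets cover most needs: MatchConfig::strict (match_threshold = 0.95, strict_mode = true) for cases where false positives are more dangerous than false negatives, and MatchConfig::lenient (match_threshold = 0.75, phonetic matching on) for triaging large candidate sets where false negatives are worse than false positives.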

§Example

use worker_matcher::{MatchConfig, SimilarityAlgorithm};

let custom = MatchConfig {
    match_threshold: 0.80,
    uk_nhs_number_weight: 0.30,
    fr_nir_weight: 0.30,
    es_tsi_weight: 0.30,
    ie_ihi_weight: 0.30,
    uk_hc_number_weight: 0.30,
    us_ssn_weight: 0.30,
    au_ihi_weight: 0.30,
    de_kvnr_weight: 0.30,
    it_cf_weight: 0.30,
    nl_bsn_weight: 0.30,
    se_workernummer_weight: 0.30,
    uk_chi_number_weight: 0.30,
    be_nn_weight: 0.30,
    bg_egn_weight: 0.30,
    cz_rc_weight: 0.30,
    dk_cpr_weight: 0.30,
    ee_ik_weight: 0.30,
    es_dni_weight: 0.30,
    fi_hetu_weight: 0.30,
    hr_oib_weight: 0.30,
    is_kt_weight: 0.30,
    lt_ak_weight: 0.30,
    lv_pk_weight: 0.30,
    mt_id_weight: 0.30,
    no_fnr_weight: 0.30,
    pl_pesel_weight: 0.30,
    ro_cnp_weight: 0.30,
    si_emso_weight: 0.30,
    sk_rc_weight: 0.30,
    uk_nino_weight: 0.30,
    gr_dss_weight: 0.30,
    li_id_weight: 0.30,
    nl_id_weight: 0.30,
    pl_nip_weight: 0.30,
    pt_nif_weight: 0.30,
    br_cpf_weight: 0.30,
    cn_rrn_weight: 0.30,
    in_aadhaar_weight: 0.30,
    jp_my_number_weight: 0.30,
    mx_curp_weight: 0.30,
    nz_nhi_weight: 0.30,
    za_id_weight: 0.30,
    passport_book_weight: 0.30,
    given_name_weight: 0.15,
    family_name_weight: 0.20,
    date_of_birth_weight: 0.15,
    gender_weight: 0.05,
    blood_type_weight: 0.05,
    multiple_birth_weight: 0.05,
    address_weight: 0.025,
    birth_place_weight: 0.05,
    death_date_weight: 0.10,
    death_place_weight: 0.05,
    phone_weight: 0.025,
    email_weight: 0.05,
    use_phonetic_matching: true,
    name_algorithm: SimilarityAlgorithm::JaroWinkler,
    strict_mode: false,
    nickname_table: worker_matcher::NicknameTable::empty(),
    gmail_dot_folding: false,
    phone_default_country: Some("GB".into()),
};
assert_eq!(custom.match_threshold, 0.80);
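
Because every field is public and the struct implements Default, Rust's struct update syntax should allow overriding only the fields of interest, for example `MatchConfig { match_threshold: 0.80, ..MatchConfig::default() }`. The sketch below demonstrates the pattern on a small stand-in type so that it compiles without the crate. `MiniConfig` and `with_threshold` are hypothetical, and the default weights shown are illustrative rather than the crate's actual defaults.

```rust
/// Stand-in for a large configuration struct (NOT the real MatchConfig).
#[derive(Debug, Clone, PartialEq)]
pub struct MiniConfig {
    pub match_threshold: f64,
    pub given_name_weight: f64,
    pub family_name_weight: f64,
    pub use_phonetic_matching: bool,
}

impl Default for MiniConfig {
    fn default() -> Self {
        // Illustrative values only.
        MiniConfig {
            match_threshold: 0.85,
            given_name_weight: 0.15,
            family_name_weight: 0.20,
            use_phonetic_matching: true,
        }
    }
}

/// Override only the threshold; every other field comes from `default()`.
pub fn with_threshold(match_threshold: f64) -> MiniConfig {
    MiniConfig {
        match_threshold,
        ..MiniConfig::default()
    }
}
```

This keeps call sites short and means a field added to the struct later picks up its default instead of breaking the literal.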

Fields§

§match_threshold: f64

Threshold score for considering two workers a match (0.0..=1.0).

§uk_nhs_number_weight: f64

Weight for UK NHS Number match (only contributes if both parse).

§fr_nir_weight: f64

Weight for France NIR match (only contributes if both parse).

§es_tsi_weight: f64

Weight for Spain TSI / CIP-SNS match (only contributes if both parse).

§ie_ihi_weight: f64

Weight for Ireland IHI match (only contributes if both parse).

§uk_hc_number_weight: f64

Weight for United Kingdom (Northern Ireland) H&C Number match (only contributes if both parse).

§us_ssn_weight: f64

Weight for United States Social Security Number match (only contributes if both parse).

§au_ihi_weight: f64

Weight for Australia IHI match (only contributes if both parse).

§de_kvnr_weight: f64

Weight for Germany KVNR match (only contributes if both parse).

§it_cf_weight: f64

Weight for Italy Codice Fiscale match (only contributes if both parse).

§nl_bsn_weight: f64

Weight for Netherlands BSN match (only contributes if both parse).

§se_workernummer_weight: f64

Weight for Sweden Personnummer match (only contributes if both parse).

§uk_chi_number_weight: f64

Weight for United Kingdom (Scotland) CHI Number match (only contributes if both parse).

§be_nn_weight: f64

Weight for Belgium National Number match (only contributes if both parse).

§bg_egn_weight: f64

Weight for Bulgaria EGN match (only contributes if both parse).

§cz_rc_weight: f64

Weight for Czech Rodné číslo match (only contributes if both parse).

§dk_cpr_weight: f64

Weight for Denmark CPR match (only contributes if both parse).

§ee_ik_weight: f64

Weight for Estonia Isikukood match (only contributes if both parse).

§es_dni_weight: f64

Weight for Spain DNI / NIE match (only contributes if both parse).

§fi_hetu_weight: f64

Weight for Finland HETU match (only contributes if both parse).

§hr_oib_weight: f64

Weight for Croatia OIB match (only contributes if both parse).

§is_kt_weight: f64

Weight for Iceland Kennitala match (only contributes if both parse).

§lt_ak_weight: f64

Weight for Lithuania Asmens kodas match (only contributes if both parse).

§lv_pk_weight: f64

Weight for Latvia Personas kods match (only contributes if both parse).

§mt_id_weight: f64

Weight for Malta National ID match (only contributes if both parse).

§no_fnr_weight: f64

Weight for Norway Fødselsnummer match (only contributes if both parse).

§pl_pesel_weight: f64

Weight for Poland PESEL match (only contributes if both parse).

§ro_cnp_weight: f64

Weight for Romania CNP match (only contributes if both parse).

§si_emso_weight: f64

Weight for Slovenia EMŠO match (only contributes if both parse).

§sk_rc_weight: f64

Weight for Slovakia Rodné číslo match (only contributes if both parse).

§uk_nino_weight: f64

Weight for UK NINO match (only contributes if both parse).

§gr_dss_weight: f64

Weight for Greece DSS investor-share match (only contributes if both parse).

§li_id_weight: f64

Weight for Liechtenstein National ID match (only contributes if both parse).

§nl_id_weight: f64

Weight for Netherlands National ID match (only contributes if both parse).

§pl_nip_weight: f64

Weight for Poland NIP match (only contributes if both parse).

§pt_nif_weight: f64

Weight for Portugal NIF match (only contributes if both parse).

§br_cpf_weight: f64

Weight for Brazil CPF match (only contributes if both parse).

§cn_rrn_weight: f64

Weight for China Resident Identity Card match (only contributes if both parse).

§in_aadhaar_weight: f64

Weight for India Aadhaar match (only contributes if both parse).

§jp_my_number_weight: f64

Weight for Japan My Number match (only contributes if both parse).

§mx_curp_weight: f64

Weight for Mexico CURP match (only contributes if both parse).

§nz_nhi_weight: f64

Weight for New Zealand NHI match (only contributes if both parse).

§za_id_weight: f64

Weight for South Africa ID Number match (only contributes if both parse).

§passport_book_weight: f64

Weight for passport-book match (contributes when both workers have at least one crate::PassportBook recorded). See spec §6.4a / FR-51.

§given_name_weight: f64

Weight for given-name similarity.

§family_name_weight: f64

Weight for family-name similarity.

§date_of_birth_weight: f64

Weight for date-of-birth exact match.

§gender_weight: f64

Weight for gender exact match.

§blood_type_weight: f64

Weight for ABO+RhD blood-type exact match (see crate::BloodType). Defaults to 0.05 — blood type is a weak positive signal but a strong negative signal (disagreement is reliable evidence of non-match because blood type doesn’t change over a lifetime).

§multiple_birth_weight: f64

Weight for multiple-birth indicator exact match (FHIR Patient.multipleBirth). Defaults to 0.05 — a weak positive signal in general, but a strong negative signal for distinguishing identical twins who otherwise share name, DOB, and address.

§address_weight: f64

Weight for address similarity.

§birth_place_weight: f64

Weight for place-of-birth match (FHIR Patient.birthPlace). Defaults to 0.05 — stable for life so disagreement is informative, but agreement alone is weak because many people are born in the same place. Scored against the city and country sub-fields of crate::Address.

§death_date_weight: f64

Weight for date-of-death match (FHIR Patient.deceasedDateTime). Defaults to 0.10 — exact agreement on a recorded death date is strong evidence the records refer to the same worker, scored with the same DOB transposition heuristic as Self::date_of_birth_weight.

§death_place_weight: f64

Weight for place-of-death match. Defaults to 0.05 — analogous to Self::birth_place_weight: stable per record (someone only dies once) so disagreement is informative, but agreement alone is weak. Scored against the city and country sub-fields of crate::Address.

§phone_weight: f64

Weight for phone-number exact match (after normalisation).

§email_weight: f64

Weight for email-address exact match (after normalisation via crate::Normalizer::normalize_email).

§use_phonetic_matching: bool

Whether to add a phonetic-name bonus when both names sound alike.

§name_algorithm: SimilarityAlgorithm

Similarity algorithm to use when comparing given and family names.
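
This page does not define the SimilarityAlgorithm variants. As one plausible illustration of how an edit-distance name comparison can yield a score in 0.0..=1.0, the sketch below normalises Levenshtein distance by the longer name's length. `levenshtein` and `levenshtein_similarity` are hypothetical names, and the normalisation is an assumption, not the crate's documented formula.

```rust
/// Levenshtein edit distance over Unicode scalar values (two-row DP).
pub fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] holds the distance between the processed prefix of `a` and b[..j].
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    let mut curr: Vec<usize> = vec![0; b.len() + 1];
    for (i, ca) in a.iter().enumerate() {
        curr[0] = i + 1;
        for (j, cb) in b.iter().enumerate() {
            let substitution = prev[j] + usize::from(ca != cb);
            let deletion = prev[j + 1] + 1;
            let insertion = curr[j] + 1;
            curr[j + 1] = substitution.min(deletion).min(insertion);
        }
        std::mem::swap(&mut prev, &mut curr);
    }
    prev[b.len()]
}

/// Similarity in 0.0..=1.0: 1.0 for identical strings, 0.0 when every
/// position differs. Assumption: normalise by the longer string's length.
pub fn levenshtein_similarity(a: &str, b: &str) -> f64 {
    let longest = a.chars().count().max(b.chars().count());
    if longest == 0 {
        return 1.0; // two empty names are treated as identical
    }
    1.0 - levenshtein(a, b) as f64 / longest as f64
}
```

Under this sketch "smith" and "smyth" differ by one substitution in five characters, giving a similarity of about 0.8.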

§strict_mode: bool

Reserved flag for stricter deterministic enforcement. See spec OQ-5.

§gmail_dot_folding: bool

Fold Gmail-specific localpart dots and +tag suffixes during email normalisation. When true, addresses on gmail.com / googlemail.com have every . in the localpart removed and any +anything suffix dropped before comparison, mirroring Gmail’s documented routing rules. Defaults to false so non-Gmail addresses are unaffected and the canonical form is unsurprising.
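
The folding rule described above can be sketched as a standalone function. `fold_gmail` is a hypothetical name, not the crate's API, and the trimming and lowercasing steps are assumptions about the surrounding normalisation.

```rust
/// Fold Gmail-specific localpart variations: on gmail.com / googlemail.com,
/// drop any `+tag` suffix and remove every `.` from the localpart.
/// Addresses on other domains are returned unchanged (apart from the assumed
/// trim + lowercase step).
pub fn fold_gmail(email: &str) -> String {
    let lowered = email.trim().to_lowercase();
    // Split at the last '@' so the domain is unambiguous.
    if let Some((local, domain)) = lowered.rsplit_once('@') {
        if domain == "gmail.com" || domain == "googlemail.com" {
            // Everything from the first '+' onwards is a tag.
            let base = local.split('+').next().unwrap_or(local);
            let folded: String = base.chars().filter(|c| *c != '.').collect();
            return format!("{}@{}", folded, domain);
        }
    }
    lowered
}
```

In this sketch `Jane.Doe+news@gmail.com` and `janedoe@gmail.com` fold to the same string, while `jane.doe+x@example.org` keeps its dots and tag.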

§nickname_table: NicknameTable

Optional table of nickname equivalence classes consulted by name scoring.

When a name pair is equivalent under this table — e.g. ("Michael", "Mike") — the matcher lifts the given-name (and family-name) similarity score to at least 0.9, ensuring the table-driven equivalence is not undone by a low Jaro-Winkler / Levenshtein score on dissimilar forms. The boost never lowers a score.

Defaults to NicknameTable::empty so existing behaviour is preserved. Opt in with NicknameTable::english (a built-in English-language dictionary) or build a custom table via NicknameTable::with_class.
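
The "lift to at least 0.9, never lower" rule can be sketched as follows. `NicknameClasses` is a hypothetical stand-in for NicknameTable, whose real constructors and lookup methods are not shown on this page, and the case-insensitive lookup is an assumption.

```rust
/// Hypothetical stand-in for a nickname table: each inner list is one
/// equivalence class of names, stored lowercase.
pub struct NicknameClasses {
    classes: Vec<Vec<String>>,
}

impl NicknameClasses {
    pub fn new(classes: &[&[&str]]) -> Self {
        let classes: Vec<Vec<String>> = classes
            .iter()
            .map(|class| class.iter().map(|name| name.to_lowercase()).collect())
            .collect();
        NicknameClasses { classes }
    }

    /// True when both names appear in the same equivalence class.
    pub fn equivalent(&self, a: &str, b: &str) -> bool {
        let (a, b) = (a.to_lowercase(), b.to_lowercase());
        self.classes
            .iter()
            .any(|class| class.contains(&a) && class.contains(&b))
    }
}

/// Lift `raw_similarity` to at least 0.9 for equivalent names.
/// Using `max` guarantees the boost never lowers an already-high score.
pub fn boosted_similarity(
    table: &NicknameClasses,
    a: &str,
    b: &str,
    raw_similarity: f64,
) -> f64 {
    if table.equivalent(a, b) {
        raw_similarity.max(0.9)
    } else {
        raw_similarity
    }
}
```

With a class containing "michael" and "mike", a raw similarity of 0.55 for that pair becomes 0.9, while a raw 0.97 for another equivalent pair stays at 0.97.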

§phone_default_country: Option<String>

ISO 3166-1 alpha-2 country code applied to phone numbers that lack an explicit international marker (+CC or 00CC).

When Some(cc), the matcher converts each phone to E.164 form via crate::Normalizer::normalize_phone_e164 using cc as the fallback jurisdiction. Numbers from different countries with overlapping national digits will no longer collide. When None, only inputs carrying an explicit international marker reach E.164; every other comparison falls back to the legacy crate::Normalizer::normalize_phone form.

Defaults to Some("GB") to preserve the crate’s historical UK-centric behaviour. Set to the worker population’s predominant jurisdiction in production deployments.
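
A much-simplified sketch of the fallback behaviour is shown below. It is not crate::Normalizer::normalize_phone_e164: `calling_code` and `to_e164` are hypothetical, the lookup covers only three countries, and stripping a single leading trunk `0` is a simplification that happens to hold for those three. Returning `None` stands in for "fall back to the legacy form".

```rust
/// Illustrative calling-code lookup for a few ISO 3166-1 alpha-2 codes.
fn calling_code(country: &str) -> Option<&'static str> {
    match country {
        "GB" => Some("44"),
        "FR" => Some("33"),
        "DE" => Some("49"),
        _ => None,
    }
}

/// Convert `raw` to an E.164-style string, or `None` when the number has no
/// international marker and no usable default country is configured.
pub fn to_e164(raw: &str, default_country: Option<&str>) -> Option<String> {
    let trimmed = raw.trim();
    let digits: String = trimmed.chars().filter(|c| c.is_ascii_digit()).collect();
    if digits.is_empty() {
        return None;
    }
    // Explicit international marker "+CC...": the digits already include CC.
    if trimmed.starts_with('+') {
        return Some(format!("+{}", digits));
    }
    // Explicit international marker "00CC...": replace the "00" with "+".
    if let Some(rest) = digits.strip_prefix("00") {
        if rest.is_empty() {
            return None;
        }
        return Some(format!("+{}", rest));
    }
    // No marker: apply the default country's calling code, if one is known.
    let code = calling_code(default_country?)?;
    // Simplification: drop one leading trunk '0' before prefixing the code.
    let national = digits.strip_prefix('0').unwrap_or(&digits);
    Some(format!("+{}{}", code, national))
}
```

Under this sketch the national digits `01 23 45 67 89` resolve to different E.164 strings depending on whether the default is `FR` or `GB`, which is the collision-avoidance property described above.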

Implementations§

impl MatchConfig

pub fn strict() -> Self

A stricter preset: match_threshold = 0.95, strict_mode = true.

Use when a clinician must rely on the answer and false positives are more dangerous than false negatives.

use worker_matcher::MatchConfig;
let c = MatchConfig::strict();
assert!((c.match_threshold - 0.95).abs() < 1e-9);
assert!(c.strict_mode);

pub fn lenient() -> Self

A more forgiving preset: match_threshold = 0.75, phonetic matching on.

Use when triaging large candidate sets where false negatives are worse than false positives.

use worker_matcher::MatchConfig;
let c = MatchConfig::lenient();
assert!((c.match_threshold - 0.75).abs() < 1e-9);
assert!(c.use_phonetic_matching);

Trait Implementations§

impl Clone for MatchConfig

fn clone(&self) -> MatchConfig

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for MatchConfig

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Default for MatchConfig

fn default() -> Self

Production-ready defaults tuned per spec §13.1.

use worker_matcher::{MatchConfig, SimilarityAlgorithm};
let c = MatchConfig::default();
assert!((c.match_threshold - 0.85).abs() < 1e-9);
assert!(c.use_phonetic_matching);
assert!(matches!(c.name_algorithm, SimilarityAlgorithm::Combined));

impl<'de> Deserialize<'de> for MatchConfig

fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where __D: Deserializer<'de>,

Deserialize this value from the given Serde deserializer.

impl Serialize for MatchConfig

fn serialize<__S>(&self, __serializer: __S) -> Result<__S::Ok, __S::Error>
where __S: Serializer,

Serialize this value into the given Serde serializer.
