Struct MatchingEngine 

pub struct MatchingEngine { /* private fields */ }

Thing matcher engine.

The engine is immutable after construction and cheap to clone (it owns only a MatchConfig). Construct one and call its methods from any thread.

use thing_matcher::{MatchConfig, MatchingEngine};

let engine_a = MatchingEngine::default_config();
let engine_b = MatchingEngine::new(MatchConfig::strict());

Implementations

impl MatchingEngine

pub fn new(config: MatchConfig) -> Self

Construct an engine with the given configuration.

use thing_matcher::{MatchConfig, MatchingEngine};
let engine = MatchingEngine::new(MatchConfig::lenient());

pub fn default_config() -> Self

Construct an engine with MatchConfig::default.

use thing_matcher::MatchingEngine;
let engine = MatchingEngine::default_config();

pub fn match_things(&self, thing1: &Thing, thing2: &Thing) -> MatchResult

Compare two things probabilistically and return a MatchResult.

The score is the weighted sum of every component that could be scored on both records, with the weights renormalised over just those components. A field missing from either record is skipped, not penalised.
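
The renormalisation described above can be sketched in isolation. This is a minimal illustration under stated assumptions, not the crate's implementation: the `Component` type, its fields, and the example weights are hypothetical, and the sketch assumes each component yields a similarity in [0, 1] or nothing when a field is missing.

```rust
/// One scored component (hypothetical type for illustration): its configured
/// weight and, when the field is present on both records, a similarity in [0, 1].
struct Component {
    weight: f64,
    similarity: Option<f64>,
}

/// Weighted mean over the components that produced a similarity.
/// Returns `None` when no component could be compared.
fn renormalised_score(components: &[Component]) -> Option<f64> {
    let mut weighted_sum = 0.0;
    let mut weight_total = 0.0;
    for c in components {
        // Components without a similarity are skipped: they add nothing to
        // the numerator and their weight is left out of the denominator.
        if let Some(sim) = c.similarity {
            weighted_sum += c.weight * sim;
            weight_total += c.weight;
        }
    }
    if weight_total > 0.0 {
        Some(weighted_sum / weight_total)
    } else {
        None
    }
}

fn main() {
    let components = [
        Component { weight: 0.6, similarity: Some(1.0) }, // e.g. name
        Component { weight: 0.3, similarity: Some(0.5) }, // e.g. url
        Component { weight: 0.1, similarity: None },      // missing on one record
    ];
    let score = renormalised_score(&components).unwrap();
    // (0.6 * 1.0 + 0.3 * 0.5) / (0.6 + 0.3)
    assert!((score - 0.75 / 0.9).abs() < 1e-12);
    println!("score = {score:.4}");
}
```

Because the skipped weight leaves the denominator, two records that agree on every field they share can still score close to 1 even when one of them is sparse.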

use thing_matcher::{MatchingEngine, Thing};

let t = Thing::builder()
    .name("Eiffel Tower")
    .url("https://www.toureiffel.paris/")
    .build();

let result = MatchingEngine::default_config().match_things(&t, &t);
assert!(result.is_match);
assert!(result.score > 0.99);

pub fn match_one_to_many( &self, query: &Thing, candidates: &[Thing], ) -> Vec<MatchResult>

Score a single query against many candidates. Returns one MatchResult per candidate, in the same order as the input slice.

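The engine is immutable and Send + Sync, so call-sites that want parallel evaluation can share one engine across threads, for example by iterating over the candidates with rayon's par_iter (or similar) and calling match_things on each, without further changes to this crate.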
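
As a rough sketch of what such a call-site might look like, the following uses scoped threads from the standard library rather than rayon, so it runs without extra dependencies. `Engine`, `score`, and `score_parallel` are hypothetical stand-ins for illustration and are not part of this crate; the point is only that a shared `&Engine` can be borrowed by several threads while the output order still follows the input slice.

```rust
use std::thread;

/// Stand-in for an immutable, `Send + Sync` engine (hypothetical; not the crate's type).
struct Engine {
    threshold: f64,
}

impl Engine {
    /// Toy pairwise comparison: 1.0 for equal names (ignoring ASCII case), else 0.0.
    fn score(&self, a: &str, b: &str) -> f64 {
        if a.eq_ignore_ascii_case(b) { 1.0 } else { 0.0 }
    }

    fn is_match(&self, score: f64) -> bool {
        score >= self.threshold
    }
}

/// Score `query` against every candidate using roughly `workers` threads.
/// Results come back in the same order as `candidates`.
fn score_parallel(engine: &Engine, query: &str, candidates: &[&str], workers: usize) -> Vec<f64> {
    if candidates.is_empty() {
        return Vec::new();
    }
    let chunk_len = candidates.len().div_ceil(workers.max(1));
    thread::scope(|s| {
        // One scoped thread per chunk; each thread borrows the shared engine.
        let handles: Vec<_> = candidates
            .chunks(chunk_len)
            .map(|chunk| {
                s.spawn(move || {
                    chunk.iter().map(|c| engine.score(query, c)).collect::<Vec<f64>>()
                })
            })
            .collect();
        // Joining in spawn order keeps the output aligned with the input.
        handles
            .into_iter()
            .flat_map(|h| h.join().unwrap())
            .collect::<Vec<f64>>()
    })
}

fn main() {
    let engine = Engine { threshold: 0.5 };
    let candidates = ["Eiffel Tower", "Big Ben", "eiffel tower"];
    let scores = score_parallel(&engine, "Eiffel Tower", &candidates, 2);
    assert_eq!(scores, vec![1.0, 0.0, 1.0]);
    let matches: Vec<bool> = scores.iter().map(|&s| engine.is_match(s)).collect();
    assert_eq!(matches, vec![true, false, true]);
}
```

Scoped threads only compile here because `Engine` is `Sync`; the same property is what lets a data-parallel library borrow one engine from many workers.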

Examples
use thing_matcher::{MatchingEngine, Thing};

let query = Thing::builder().name("Eiffel Tower").build();
let candidates = vec![
    Thing::builder().name("Eiffel Tower").build(),
    Thing::builder().name("Big Ben").build(),
];

let engine = MatchingEngine::default_config();
let results = engine.match_one_to_many(&query, &candidates);
assert_eq!(results.len(), 2);
assert!(results[0].is_match);
assert!(!results[1].is_match);

Empty candidates yield an empty result:

use thing_matcher::{MatchingEngine, Thing};

let q = Thing::builder().name("Solo").build();
let r = MatchingEngine::default_config().match_one_to_many(&q, &[]);
assert!(r.is_empty());

pub fn rank_one_to_many( &self, query: &Thing, candidates: &[Thing], ) -> Vec<(usize, MatchResult)>

Score and rank: return (original_index, MatchResult) tuples sorted by descending score. Ties are broken by ascending original index, so the result is deterministic.
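
The ordering rule can be illustrated on bare scores. This is a minimal sketch, assuming scores are plain `f64` values; `rank_scores` is a hypothetical helper and says nothing about how the crate sorts internally.

```rust
/// Pair each score with its original index, then sort by descending score,
/// breaking ties by ascending original index.
fn rank_scores(scores: &[f64]) -> Vec<(usize, f64)> {
    let mut ranked: Vec<(usize, f64)> = scores.iter().copied().enumerate().collect();
    // `total_cmp` is a total order on f64, so the comparator is well defined
    // for every input, including NaN.
    ranked.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    ranked
}

fn main() {
    let ranked = rank_scores(&[0.2, 0.9, 0.2, 0.9]);
    // Equal scores keep ascending index order: 1 before 3, 0 before 2.
    assert_eq!(ranked, vec![(1, 0.9), (3, 0.9), (0, 0.2), (2, 0.2)]);
}
```

Spelling out the index comparison makes the tie-break explicit instead of relying on the stability of the sort.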

Examples
use thing_matcher::{MatchingEngine, Thing};

let query = Thing::builder().name("Eiffel Tower").build();
let candidates = vec![
    Thing::builder().name("Big Ben").build(),                 // index 0
    Thing::builder().name("Eiffel Tower").build(),            // index 1 — best match
    Thing::builder().name("Statue of Liberty").build(),       // index 2
];

let ranked = MatchingEngine::default_config().rank_one_to_many(&query, &candidates);
assert_eq!(ranked.len(), 3);
assert_eq!(ranked[0].0, 1);
assert!(ranked[0].1.score >= ranked[1].1.score);
assert!(ranked[1].1.score >= ranked[2].1.score);

pub fn deterministic_match(&self, thing1: &Thing, thing2: &Thing) -> bool

Compare two things deterministically and return a single boolean.

Returns true iff any of the following hold:

  • the things share any (property_id, value) pair in their identifiers lists;
  • the things share any sameAs URL after URL normalisation;
  • both have a url that normalises to the same string.
use thing_matcher::{Identifier, MatchingEngine, Thing};

let id = Identifier::new("wikidata", "Q243").unwrap();
let a = Thing::builder().name("Eiffel Tower").add_identifier(id.clone()).build();
let b = Thing::builder().name("Tour Eiffel").add_identifier(id).build();
assert!(MatchingEngine::default_config().deterministic_match(&a, &b));
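
The three rules listed above can be sketched on a simplified record. Everything here is an assumption made for illustration: `Record` is a hypothetical stand-in for `Thing`, and `normalise_url` is a deliberately crude normaliser (trim, lower-case, drop one trailing slash), since the crate's actual normalisation rules are not described on this page.

```rust
use std::collections::HashSet;

/// Minimal stand-in for a record (hypothetical; the crate's `Thing` is richer).
#[derive(Default)]
struct Record {
    identifiers: Vec<(String, String)>, // (property_id, value)
    same_as: Vec<String>,
    url: Option<String>,
}

/// Toy normaliser, an assumption of this sketch: trim whitespace,
/// lower-case, and drop one trailing slash.
fn normalise_url(url: &str) -> String {
    let lower = url.trim().to_ascii_lowercase();
    lower.strip_suffix('/').unwrap_or(&lower).to_string()
}

fn deterministic_match(a: &Record, b: &Record) -> bool {
    // Rule 1: any shared (property_id, value) identifier pair.
    let ids: HashSet<&(String, String)> = a.identifiers.iter().collect();
    if b.identifiers.iter().any(|id| ids.contains(id)) {
        return true;
    }
    // Rule 2: any shared sameAs URL after normalisation.
    let same_as: HashSet<String> = a.same_as.iter().map(|u| normalise_url(u)).collect();
    if b.same_as.iter().any(|u| same_as.contains(&normalise_url(u))) {
        return true;
    }
    // Rule 3: both have a url and the two normalise to the same string.
    match (&a.url, &b.url) {
        (Some(x), Some(y)) => normalise_url(x) == normalise_url(y),
        _ => false,
    }
}

fn main() {
    let id = ("wikidata".to_string(), "Q243".to_string());
    let a = Record { identifiers: vec![id.clone()], ..Default::default() };
    let b = Record { identifiers: vec![id], ..Default::default() };
    assert!(deterministic_match(&a, &b));

    let c = Record { url: Some("https://www.toureiffel.paris/".into()), ..Default::default() };
    let d = Record { url: Some("https://www.toureiffel.paris".into()), ..Default::default() };
    assert!(deterministic_match(&c, &d));
    assert!(!deterministic_match(&a, &c));
}
```

Any one rule is sufficient, so the checks short-circuit; names play no part in the decision, which is why "Eiffel Tower" and "Tour Eiffel" match in the example above.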

Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.