pub struct DuplicateDetector { /* private fields */ }
Detects potential duplicates in a dataset.
Implementations
impl DuplicateDetector
pub fn new(similarity_threshold: f64, comparison_fields: Vec<String>) -> Self
Creates a new duplicate detector with the given similarity threshold and the list of fields to compare.
pub fn string_similarity(&self, a: &str, b: &str) -> f64
Calculates similarity between two strings (Jaccard similarity).
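As a rough illustration of Jaccard similarity (the size of the intersection of two sets divided by the size of their union), the sketch below compares two strings as sets of tokens. The function name jaccard_similarity, the lowercasing, and the whitespace tokenization are assumptions made for this example; the crate may tokenize differently (for instance by characters or n-grams) and may handle empty inputs in another way.

```rust
use std::collections::HashSet;

/// Jaccard similarity over lowercased, whitespace-separated tokens.
/// Hypothetical helper: the tokenization is an assumption, not the crate's
/// confirmed behavior.
fn jaccard_similarity(a: &str, b: &str) -> f64 {
    let a_lower = a.to_lowercase();
    let b_lower = b.to_lowercase();
    let set_a: HashSet<&str> = a_lower.split_whitespace().collect();
    let set_b: HashSet<&str> = b_lower.split_whitespace().collect();

    // Treat two empty inputs as identical to avoid dividing by zero
    // (an assumption made for this sketch).
    if set_a.is_empty() && set_b.is_empty() {
        return 1.0;
    }

    let intersection = set_a.intersection(&set_b).count() as f64;
    let union = set_a.union(&set_b).count() as f64;
    intersection / union
}
```

With this tokenization, "Acme Corp Ltd" and "acme corp" share two of three distinct tokens, so the score is 2/3.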
pub fn are_duplicates<T: Duplicatable>(&self, a: &T, b: &T) -> bool
Checks if two records are potential duplicates.
pub fn find_duplicates<T: Duplicatable>(
    &self,
    records: &[T],
) -> Vec<(usize, usize, f64)>
Finds all duplicate pairs in a collection.
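One plausible reading of the return type Vec<(usize, usize, f64)> is a list of (first index, second index, similarity score) triples. The sketch below shows a straightforward pairwise scan under that reading. The function find_duplicate_pairs, its similarity closure, and the "score >= threshold" rule are hypothetical stand-ins for illustration; they do not use the Duplicatable trait, whose methods are not shown here, and the crate's actual implementation may differ.

```rust
/// Scores every unordered pair of records and keeps those whose score
/// reaches `threshold`. Hypothetical stand-in for illustration only.
fn find_duplicate_pairs<T>(
    records: &[T],
    threshold: f64,
    similarity: impl Fn(&T, &T) -> f64,
) -> Vec<(usize, usize, f64)> {
    let mut pairs = Vec::new();
    for i in 0..records.len() {
        // Start at i + 1 so each pair is visited once and no record
        // is compared with itself.
        for j in (i + 1)..records.len() {
            let score = similarity(&records[i], &records[j]);
            // Assumption: a pair counts as a duplicate at or above the threshold.
            if score >= threshold {
                pairs.push((i, j, score));
            }
        }
    }
    pairs
}
```

A scan like this performs n(n-1)/2 comparisons for n records, which is fine for small datasets but grows quadratically.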
Auto Trait Implementations
impl Freeze for DuplicateDetector
impl RefUnwindSafe for DuplicateDetector
impl Send for DuplicateDetector
impl Sync for DuplicateDetector
impl Unpin for DuplicateDetector
impl UnsafeUnpin for DuplicateDetector
impl UnwindSafe for DuplicateDetector
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<SS, SP> SupersetOf<SS> for SP
where
    SS: SubsetOf<SP>,
fn to_subset(&self) -> Option<SS>
The inverse inclusion map: attempts to construct self from the equivalent element of its superset.
fn is_in_subset(&self) -> bool
Checks if self is actually part of its subset T (and can be converted to it).
fn to_subset_unchecked(&self) -> SS
Use with care! Same as self.to_subset but without any property checks. Always succeeds.
fn from_subset(element: &SS) -> SP
The inclusion map: converts self to the equivalent element of its superset.