pub struct DuplicateDetector { /* private fields */ }
Detects potential duplicates in a dataset.
Implementations
impl DuplicateDetector
pub fn new(similarity_threshold: f64, comparison_fields: Vec<String>) -> Self
Creates a new duplicate detector.
pub fn string_similarity(&self, a: &str, b: &str) -> f64
Calculates the Jaccard similarity between two strings.
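For readers unfamiliar with the measure, Jaccard similarity is the size of the intersection of two sets divided by the size of their union. The page does not say how `string_similarity` turns a string into a set, so the following is only a minimal sketch under stated assumptions: the free function `jaccard_similarity` is hypothetical, and lowercased whitespace-separated tokens and the handling of two empty inputs are illustrative choices, not the crate's documented behavior.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a Jaccard-based string comparison.
/// Assumption: the sets are lowercased, whitespace-separated tokens.
fn jaccard_similarity(a: &str, b: &str) -> f64 {
    let a_lower = a.to_lowercase();
    let b_lower = b.to_lowercase();
    let set_a: HashSet<&str> = a_lower.split_whitespace().collect();
    let set_b: HashSet<&str> = b_lower.split_whitespace().collect();

    // Assumption: two empty inputs count as identical, which also
    // avoids dividing by an empty union.
    if set_a.is_empty() && set_b.is_empty() {
        return 1.0;
    }

    // |A ∩ B| / |A ∪ B|
    let intersection = set_a.intersection(&set_b).count() as f64;
    let union = set_a.union(&set_b).count() as f64;
    intersection / union
}
```

Under these assumptions, "john smith" and "john doe" share one token out of three distinct ones, giving a score of 1/3, while inputs that differ only in letter case score 1.0.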
pub fn are_duplicates<T: Duplicatable>(&self, a: &T, b: &T) -> bool
Checks if two records are potential duplicates.
pub fn find_duplicates<T: Duplicatable>(&self, records: &[T]) -> Vec<(usize, usize, f64)>
Finds all duplicate pairs in a collection.
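The return type suggests that each entry holds the indices of two records together with their similarity score, although the page does not confirm this. A minimal sketch of such a pairwise scan follows. Because the `Duplicatable` trait is not documented here, the hypothetical helper `find_similar_pairs` takes a similarity closure instead, and the inclusive `>=` threshold comparison is an assumption.

```rust
/// Hypothetical helper, not the crate's implementation: compares every
/// unordered pair (i, j) with i < j and keeps those at or above `threshold`.
fn find_similar_pairs<T>(
    records: &[T],
    threshold: f64,
    similarity: impl Fn(&T, &T) -> f64,
) -> Vec<(usize, usize, f64)> {
    let mut pairs = Vec::new();
    for i in 0..records.len() {
        // Starting at i + 1 skips self-comparisons and mirrored pairs.
        for j in (i + 1)..records.len() {
            let score = similarity(&records[i], &records[j]);
            // Assumption: the threshold is inclusive.
            if score >= threshold {
                pairs.push((i, j, score));
            }
        }
    }
    pairs
}
```

A scan like this performs n(n - 1)/2 comparisons for n records, so large datasets usually need some form of pre-grouping before pairwise comparison.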
Auto Trait Implementations
impl Freeze for DuplicateDetector
impl RefUnwindSafe for DuplicateDetector
impl Send for DuplicateDetector
impl Sync for DuplicateDetector
impl Unpin for DuplicateDetector
impl UnsafeUnpin for DuplicateDetector
impl UnwindSafe for DuplicateDetector
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.