pub struct ClusterGraph { /* private fields */ }
Undirected similarity graph over records.
Each node is a RecordId; each edge weight is the match_probability of
the AutoMatch pair that connected those two records.
Implementations§
impl ClusterGraph
pub fn new() -> Self
pub fn add_pairs(&mut self, pairs: &[ScoredPair])
Add AutoMatch pairs to the graph. Non-AutoMatch pairs are ignored.
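A minimal sketch of the filtering behaviour described above, not the crate's implementation. `SketchGraph`, `MatchClass`, the fields of this `ScoredPair`, and the `RecordId = u64` alias are all hypothetical stand-ins, since the real definitions are private or not shown here.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins; the crate's real types may look different.
type RecordId = u64;

#[derive(Clone, Copy, PartialEq, Debug)]
enum MatchClass {
    AutoMatch,
    Review,
    NonMatch,
}

struct ScoredPair {
    a: RecordId,
    b: RecordId,
    match_probability: f64,
    class: MatchClass,
}

#[derive(Default)]
struct SketchGraph {
    // Undirected adjacency: node -> (neighbour -> edge weight).
    adj: HashMap<RecordId, HashMap<RecordId, f64>>,
}

impl SketchGraph {
    /// Insert only AutoMatch pairs; every other class is skipped.
    fn add_pairs(&mut self, pairs: &[ScoredPair]) {
        for p in pairs {
            if p.class != MatchClass::AutoMatch {
                continue;
            }
            // Store the edge in both directions because the graph is undirected.
            self.adj.entry(p.a).or_default().insert(p.b, p.match_probability);
            self.adj.entry(p.b).or_default().insert(p.a, p.match_probability);
        }
    }

    /// Number of undirected edges (each is stored twice in `adj`).
    fn edge_count(&self) -> usize {
        self.adj.values().map(|n| n.len()).sum::<usize>() / 2
    }
}
```

Under these assumptions, a record that appears only in non-AutoMatch pairs never becomes a node, so it cannot show up in any cluster later.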
pub fn compute_clusters(&self, config: &ClusterConfig) -> Vec<Vec<RecordId>>
Compute clusters using the two-phase chain-breaking algorithm:
- Weak-edge removal: remove all edges with weight < config.within_cluster_min, then extract connected components.
- Star pruning: for any component whose size exceeds config.max_cluster_size, find the hub (highest-degree node in the original graph), remove all non-hub edges below the min threshold, and re-extract components from that sub-graph.
Returns only non-trivial components (size ≥ 2).
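The first phase can be sketched roughly as follows. This is an illustration under stated assumptions, not the crate's code: `weak_edge_components` is a hypothetical free function, edges are passed as plain `(RecordId, RecordId, f64)` tuples, `RecordId` is aliased to `u64`, and `min_weight` plays the role of config.within_cluster_min. Star pruning is left out because the description above does not pin down its exact threshold.

```rust
use std::collections::{BTreeMap, BTreeSet};

// Hypothetical stand-in for the crate's record identifier.
type RecordId = u64;

/// Phase 1 only: drop edges with weight below `min_weight`, then return the
/// connected components of what remains. Nodes left without any edge are
/// never added, so every returned component has at least two members.
fn weak_edge_components(
    edges: &[(RecordId, RecordId, f64)],
    min_weight: f64,
) -> Vec<Vec<RecordId>> {
    // Build adjacency lists from the surviving edges (self-loops ignored).
    let mut adj: BTreeMap<RecordId, Vec<RecordId>> = BTreeMap::new();
    for &(a, b, w) in edges {
        if w >= min_weight && a != b {
            adj.entry(a).or_default().push(b);
            adj.entry(b).or_default().push(a);
        }
    }

    // Iterative depth-first search from every node not yet visited.
    let mut seen: BTreeSet<RecordId> = BTreeSet::new();
    let mut clusters = Vec::new();
    for &start in adj.keys() {
        if !seen.insert(start) {
            continue; // already part of an earlier component
        }
        let mut component = vec![start];
        let mut stack = vec![start];
        while let Some(node) = stack.pop() {
            for &next in &adj[&node] {
                if seen.insert(next) {
                    component.push(next);
                    stack.push(next);
                }
            }
        }
        component.sort_unstable();
        clusters.push(component);
    }
    clusters
}
```

With the edges `1-2 (0.95)`, `2-3 (0.40)`, `3-4 (0.90)` and a threshold of `0.5`, the weak middle edge is dropped and the chain `1-2-3-4` splits into the two clusters `[1, 2]` and `[3, 4]`, which is the chain-breaking effect the method name refers to.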
Trait Implementations§
Auto Trait Implementations§
impl Freeze for ClusterGraph
impl RefUnwindSafe for ClusterGraph
impl Send for ClusterGraph
impl Sync for ClusterGraph
impl Unpin for ClusterGraph
impl UnsafeUnpin for ClusterGraph
impl UnwindSafe for ClusterGraph
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.