Crate evoc_rs

EVoC - Embedding Vector Oriented Clustering

Efficient clustering of high-dimensional embedding vectors (e.g. from CLIP or sentence transformers) that combines a UMAP-like node embedding with HDBSCAN-style density-based clustering and multi-layer persistence analysis. This crate is a Rust port that supports several approximate nearest neighbour search algorithms (for details, see ann-search-rs). It is based on Leland McInnes's original Python implementation, evoc.

Modules§

clustering
This module contains the clustering-related submodules and functions, namely the generation of minimum spanning trees (MSTs), KD trees, and the core functions for density-based clustering.
graph
This module contains the graph-related functions: kNN graph generation, fuzzy graph generation, label propagation, and embedding optimisation.
prelude
Re-exports of commonly used types, traits, structures and functions across the crate.
utils
Utilities such as shared traits, disjoint sets, sparse structures and matrix multiplication.

Structs§

EvocParams
Parameters for EVoC clustering.
EvocResult
Result of EVoC clustering.

Functions§

evoc
Run EVoC clustering on high-dimensional embedding data.
search_for_n_clusters
Binary-searches over min_cluster_size to find approximately target_k clusters.
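
The idea behind a binary search over min_cluster_size can be sketched as follows. This is a minimal illustration and not the crate's implementation: the function name `search_min_cluster_size`, its signature and the `count_clusters` closure are hypothetical, and the sketch assumes that the number of clusters does not increase as min_cluster_size grows.

```rust
/// Hypothetical sketch (not the evoc_rs API): binary-search a
/// `min_cluster_size` in `[lo, hi]` whose cluster count is closest to
/// `target_k`.
///
/// `count_clusters` stands in for a full clustering run and returns the
/// number of clusters found for a given `min_cluster_size`. It is assumed
/// to be non-increasing in its argument.
fn search_min_cluster_size<F>(
    mut count_clusters: F,
    target_k: usize,
    lo: usize,
    hi: usize,
) -> (usize, usize)
where
    F: FnMut(usize) -> usize,
{
    let (mut lo, mut hi) = (lo, hi);

    // Start from the lower bound as the best candidate seen so far.
    let mut best_size = lo;
    let mut best_k = count_clusters(lo);

    while lo <= hi {
        let mid = lo + (hi - lo) / 2;
        let k = count_clusters(mid);

        // Remember the candidate whose cluster count is closest to the target.
        if k.abs_diff(target_k) < best_k.abs_diff(target_k) {
            best_size = mid;
            best_k = k;
        }

        if k == target_k {
            break;
        } else if k > target_k {
            // Too many clusters: demand larger clusters.
            lo = mid + 1;
        } else {
            // Too few clusters: allow smaller clusters.
            if mid == 0 {
                break; // avoid underflow on `mid - 1`
            }
            hi = mid - 1;
        }
    }

    (best_size, best_k)
}
```

With a toy stand-in such as `|s| 1000 / s`, calling `search_min_cluster_size(|s| 1000 / s, 10, 2, 500)` returns a size whose count equals the target. In a real run each evaluation is a full clustering pass, so the search returns the closest match it has seen (hence "approximately" target_k) rather than insisting on an exact hit.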