pub struct Hdbscan<'a, T> { /* private fields */ }
The HDBSCAN clustering algorithm in Rust. Generic over floating point numeric types.
Implementations§
impl<'a, T: Float> Hdbscan<'a, T>
pub fn new(data: &'a Vec<Vec<T>>, hyper_params: HdbscanHyperParams) -> Self
Creates an instance of the HDBSCAN clustering model using a custom hyper parameter configuration.
§Parameters
- data - a reference to the data to cluster, a collection of vectors of floating point numbers. The vectors must all be of the same dimensionality and contain no infinite values.
- hyper_params - the hyper parameter configuration.
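The constraints on data can be checked before a model is constructed. The following is a minimal sketch using only the standard library; `validate_input` is a hypothetical helper, not part of the hdbscan crate, and it also rejects an empty data set and NaN coordinates, which the documentation for cluster lists as error conditions.

```rust
/// Hypothetical pre-flight check (not part of the hdbscan crate).
/// Returns a description of the first violated constraint, if any.
fn validate_input(data: &[Vec<f32>]) -> Result<(), String> {
    // An empty data set has no dimensionality to compare against.
    let Some(first) = data.first() else {
        return Err("data set is empty".to_string());
    };
    let dims = first.len();
    for (i, point) in data.iter().enumerate() {
        // Every vector must have the same number of coordinates.
        if point.len() != dims {
            return Err(format!(
                "point {i} has {} dimensions, expected {dims}",
                point.len()
            ));
        }
        // `is_finite` is false for both infinities and NaN.
        if point.iter().any(|x| !x.is_finite()) {
            return Err(format!("point {i} contains a non-finite coordinate"));
        }
    }
    Ok(())
}
```

Calling `validate_input(&data)` before `Hdbscan::new(&data, config)` would surface malformed input early, assuming the caller wants a check separate from the one performed during clustering.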
§Returns
- The HDBSCAN model instance.
§Examples
use hdbscan::{DistanceMetric, Hdbscan, HdbscanHyperParams, NnAlgorithm};
let data: Vec<Vec<f32>> = vec![
vec![1.3, 1.1],
vec![1.3, 1.2],
vec![1.0, 1.1],
vec![1.2, 1.2],
vec![0.9, 1.0],
vec![0.9, 1.0],
vec![3.7, 4.0],
vec![3.9, 3.9],
];
let config = HdbscanHyperParams::builder()
.min_cluster_size(3)
.min_samples(2)
.dist_metric(DistanceMetric::Manhattan)
.nn_algorithm(NnAlgorithm::BruteForce)
.build();
let clusterer = Hdbscan::new(&data, config);
pub fn default(data: &'a Vec<Vec<T>>) -> Hdbscan<'_, T>
Creates an instance of the HDBSCAN clustering model using the default hyper parameters.
§Parameters
- data - a reference to the data to cluster, a collection of vectors of floating point numbers. The vectors must all be of the same dimensionality and contain no infinite values.
§Returns
- The HDBSCAN model instance.
§Examples
use hdbscan::Hdbscan;
let data: Vec<Vec<f32>> = vec![
vec![1.3, 1.1],
vec![1.3, 1.2],
vec![1.0, 1.1],
vec![1.2, 1.2],
vec![0.9, 1.0],
vec![0.9, 1.0],
vec![3.7, 4.0],
vec![3.9, 3.9],
];
let clusterer = Hdbscan::default(&data);
pub fn cluster(&self) -> Result<Vec<i32>, HdbscanError>
Performs clustering on the list of vectors passed to the constructor.
§Returns
- A result that, if successful, contains a list of cluster labels, with a length equal to the number of samples passed to the constructor. Positive integers mean a data point belongs to a cluster of that label. -1 labels mean that a data point is noise and does not belong to any cluster. An Error will be returned if the dimensionalities of the input vectors are mismatched, if any vector contains non-finite coordinates, or if the passed data set is empty.
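Because the labels are returned in the same order as the input points, they can be turned into per-cluster index lists with a single pass. The sketch below uses only the standard library; `group_by_label` is a hypothetical helper, not part of the hdbscan crate, and it assumes only that -1 marks noise.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper (not part of the hdbscan crate): splits a label
/// vector into per-cluster point indices and a list of noise indices.
fn group_by_label(labels: &[i32]) -> (BTreeMap<i32, Vec<usize>>, Vec<usize>) {
    let mut clusters: BTreeMap<i32, Vec<usize>> = BTreeMap::new();
    let mut noise = Vec::new();
    for (idx, &label) in labels.iter().enumerate() {
        if label == -1 {
            // -1 marks a point that belongs to no cluster.
            noise.push(idx);
        } else {
            // Any other label identifies the cluster the point belongs to.
            clusters.entry(label).or_default().push(idx);
        }
    }
    (clusters, noise)
}
```

Passing `&labels` from a successful call to cluster would give the indices into the original data for each cluster, with noise points kept separate.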
§Examples
use std::collections::HashSet;
use hdbscan::Hdbscan;
let data: Vec<Vec<f32>> = vec![
vec![1.5, 2.2],
vec![1.0, 1.1],
vec![1.2, 1.4],
vec![0.8, 1.0],
vec![1.1, 1.0],
vec![3.7, 4.0],
vec![3.9, 3.9],
vec![3.6, 4.1],
vec![3.8, 3.9],
vec![4.0, 4.1],
vec![10.0, 10.0],
];
let clusterer = Hdbscan::default(&data);
let labels = clusterer.cluster().unwrap();
// First five points form one cluster
assert_eq!(1, labels[..5].iter().collect::<HashSet<_>>().len());
// Next five points are a second cluster
assert_eq!(1, labels[5..10].iter().collect::<HashSet<_>>().len());
// The final point is noise
assert_eq!(-1, labels[10]);
pub fn calc_centers(
    &self,
    center: Center,
    labels: &[i32],
) -> Result<Vec<Vec<T>>, HdbscanError>
Calculates the centers of the clusters just calculated.
§Parameters
- center - the type of center to calculate. Currently only centroid (the element-wise mean of all the data points in a cluster) is supported.
- labels - a reference to the labels calculated by a call to Hdbscan::cluster.
§Returns
- A vector of the cluster centers, of shape num clusters by num dimensions/features.
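To make the centroid definition concrete, the following standard-library sketch computes the element-wise mean per cluster. It is an illustration under stated assumptions, not the crate's implementation: `centroids` is a hypothetical function, it skips points labelled -1, and it returns centers in ascending label order, which the crate may or may not do.

```rust
use std::collections::BTreeMap;

/// Hypothetical re-implementation for illustration only; the crate's own
/// `calc_centers` may order clusters or accumulate sums differently.
fn centroids(data: &[Vec<f64>], labels: &[i32]) -> Vec<Vec<f64>> {
    assert_eq!(data.len(), labels.len(), "one label per data point");
    // label -> (running per-dimension sum, number of points)
    let mut acc: BTreeMap<i32, (Vec<f64>, usize)> = BTreeMap::new();
    for (point, &label) in data.iter().zip(labels) {
        if label == -1 {
            continue; // noise points do not contribute to any centroid
        }
        let (sum, count) = acc
            .entry(label)
            .or_insert_with(|| (vec![0.0; point.len()], 0));
        for (s, x) in sum.iter_mut().zip(point) {
            *s += *x;
        }
        *count += 1;
    }
    // Divide each per-dimension sum by the cluster size to get the mean.
    acc.into_values()
        .map(|(sum, count)| sum.into_iter().map(|s| s / count as f64).collect())
        .collect()
}
```

The result has one row per cluster and one column per dimension, matching the shape described above.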
§Panics
- If the labels are of a different length to the data passed to the Hdbscan constructor.
§Examples
use hdbscan::{Center, Hdbscan};
let data: Vec<Vec<f32>> = vec![
vec![1.5, 2.2],
vec![1.0, 1.1],
vec![1.2, 1.4],
vec![0.8, 1.0],
vec![1.1, 1.0],
vec![3.7, 4.0],
vec![3.9, 3.9],
vec![3.6, 4.1],
vec![3.8, 3.9],
vec![4.0, 4.1],
vec![10.0, 10.0],
];
let clusterer = Hdbscan::default(&data);
let labels = clusterer.cluster().unwrap();
let centroids = clusterer.calc_centers(Center::Centroid, &labels).unwrap();
assert_eq!(2, centroids.len());
assert!(centroids.contains(&vec![3.8, 4.0]) && centroids.contains(&vec![1.12, 1.34]));
Trait Implementations§
impl<'a, T> StructuralPartialEq for Hdbscan<'a, T>
Auto Trait Implementations§
impl<'a, T> Freeze for Hdbscan<'a, T>
impl<'a, T> RefUnwindSafe for Hdbscan<'a, T>
where
    T: RefUnwindSafe,
impl<'a, T> Send for Hdbscan<'a, T>
where
    T: Sync,
impl<'a, T> Sync for Hdbscan<'a, T>
where
    T: Sync,
impl<'a, T> Unpin for Hdbscan<'a, T>
impl<'a, T> UnwindSafe for Hdbscan<'a, T>
where
    T: RefUnwindSafe,
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> CloneToUninit for T
where
    T: Clone,
default unsafe fn clone_to_uninit(&self, dst: *mut T)
🔬This is a nightly-only experimental API. (clone_to_uninit)