# sklears-clustering
Clustering algorithms for the sklears machine learning library.
Latest release: 0.1.0-beta.1 (January 1, 2026). See the workspace release notes for highlights and upgrade guidance.
## Overview
This crate provides implementations of clustering algorithms including:
- K-Means: Classic centroid-based clustering with k-means++ initialization
- Mini-Batch K-Means: Scalable variant for large datasets
- DBSCAN: Density-based clustering for arbitrary shaped clusters
- Hierarchical Clustering: Agglomerative clustering with various linkage methods
- Mean Shift: Mode-seeking clustering algorithm
## Usage

```toml
[dependencies]
sklears-clustering = { version = "0.1.0-beta.1", features = ["clustering"] }
```
## Examples
### K-Means Clustering
```rust
use sklears_clustering::{InitMethod, KMeans};

// Hyperparameter values here are illustrative; tune them for your data.
let model = KMeans::new(3)                    // number of clusters
    .init_method(InitMethod::KMeansPlusPlus)  // k-means++ seeding
    .max_iter(300)
    .n_init(10)
    .random_state(42);

let fitted = model.fit(&data)?;      // `data` is your sample matrix
let labels = fitted.predict(&data)?;
let centers = fitted.cluster_centers();
```
### DBSCAN
```rust
use sklears_clustering::DBSCAN;

// eps and min_samples values are illustrative; tune them for your data.
let model = DBSCAN::new()
    .eps(0.5)
    .min_samples(5)
    .metric("euclidean");

let labels = model.fit_predict(&data)?;
// -1 indicates noise points
```
## Performance Features
- SIMD Distance Calculations: Vectorized distance computations
- Parallel Assignment: Multi-threaded cluster assignment
- Efficient K-D Trees: For neighbor searches in DBSCAN
- Memory-Efficient Updates: In-place operations where possible
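The SIMD point above can be illustrated with a scalar sketch. This is a hypothetical stand-in, not the crate's actual kernel: a tight zip/fold over independent lanes is exactly the shape that compilers auto-vectorize.

```rust
/// Squared Euclidean distance written as a simple fold so the compiler
/// can auto-vectorize it. Hypothetical sketch, not the crate's SIMD code.
fn squared_euclidean(a: &[f64], b: &[f64]) -> f64 {
    a.iter()
        .zip(b.iter())
        .map(|(x, y)| (x - y) * (x - y)) // per-lane difference, squared
        .sum()
}

fn main() {
    // 3-4-5 triangle: squared distance is 25.
    println!("{}", squared_euclidean(&[0.0, 3.0], &[4.0, 0.0]));
}
```

Keeping the loop body branch-free and the slices contiguous is what lets the optimizer emit vector instructions without explicit intrinsics.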
### GPU-Accelerated Distances
Enable the optional gpu feature to experiment with WebGPU-powered distance kernels. GPU-backed tests are ignored by default because device discovery can be slow; run them explicitly when a compatible GPU is available:
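Assuming the feature is literally named `gpu` as described above, a typical invocation using Cargo's standard flags might look like:

```shell
# Build with the optional GPU feature and run only the ignored tests.
cargo test --features gpu -- --ignored
```

The trailing `-- --ignored` is passed to the test harness, which then runs only tests marked `#[ignore]`.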
## Metrics
The crate includes clustering metrics:
- Silhouette Score
- Calinski-Harabasz Index
- Davies-Bouldin Index
- Inertia
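Of these, inertia is the simplest to state: the sum of squared distances from each sample to its nearest cluster center. A minimal sketch under that definition (hypothetical helper, not the crate's API):

```rust
/// Inertia: sum over samples of the squared distance to the nearest
/// center. Hypothetical 2-D sketch, not the crate's implementation.
fn inertia(points: &[[f64; 2]], centers: &[[f64; 2]]) -> f64 {
    points
        .iter()
        .map(|p| {
            centers
                .iter()
                .map(|c| (p[0] - c[0]).powi(2) + (p[1] - c[1]).powi(2))
                .fold(f64::INFINITY, f64::min) // nearest center wins
        })
        .sum()
}

fn main() {
    let points = [[0.0, 0.0], [1.0, 0.0], [10.0, 0.0]];
    let centers = [[0.5, 0.0], [10.0, 0.0]];
    // First two points are 0.5 away from [0.5, 0]; the third sits on a center.
    println!("{}", inertia(&points, &centers)); // 0.25 + 0.25 + 0.0 = 0.5
}
```

Lower inertia means tighter clusters, but it always decreases as the number of clusters grows, so it is usually paired with a relative score such as silhouette.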
## License
Licensed under either the Apache License, Version 2.0 or the MIT license, at your option.