# sklears-clustering
Clustering algorithms for the sklears machine learning library.
Latest release: 0.1.0-beta.1 (January 1, 2026). See the workspace release notes for highlights and upgrade guidance.
## Overview
This crate provides implementations of clustering algorithms including:
- K-Means: Classic centroid-based clustering with k-means++ initialization
- Mini-Batch K-Means: Scalable variant for large datasets
- DBSCAN: Density-based clustering for arbitrary shaped clusters
- Hierarchical Clustering: Agglomerative clustering with various linkage methods
- Mean Shift: Mode-seeking clustering algorithm
## Usage

```toml
[dependencies]
sklears-clustering = { version = "0.1.0-beta.1", features = ["clustering"] }
```
## Examples
### K-Means Clustering
```rust
use sklears_clustering::{InitMethod, KMeans};

// Hyperparameter values here are illustrative; tune them for your data.
let model = KMeans::new(3)                    // number of clusters
    .init_method(InitMethod::KMeansPlusPlus)  // k-means++ seeding
    .max_iter(300)
    .n_init(10)
    .random_state(42);

let fitted = model.fit(&data)?;      // `data` is your sample matrix
let labels = fitted.predict(&data)?;
let centers = fitted.cluster_centers();
```
### DBSCAN
```rust
use sklears_clustering::DBSCAN;

// eps and min_samples values are illustrative; tune them for your data.
let model = DBSCAN::new()
    .eps(0.5)
    .min_samples(5)
    .metric("euclidean");

let labels = model.fit_predict(&data)?;
// -1 indicates noise points
```
## Performance Features
- SIMD Distance Calculations: Vectorized distance computations
- Parallel Assignment: Multi-threaded cluster assignment
- Efficient K-D Trees: For neighbor searches in DBSCAN
- Memory-Efficient Updates: In-place operations where possible
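The SIMD point above can be illustrated with a scalar sketch. This is a hypothetical stand-in, not the crate's actual kernel: a tight zip/fold over independent lanes is exactly the shape that compilers auto-vectorize.

```rust
/// Squared Euclidean distance written as a simple fold so the compiler
/// can auto-vectorize it. Hypothetical sketch, not the crate's SIMD code.
fn squared_euclidean(a: &[f64], b: &[f64]) -> f64 {
    a.iter()
        .zip(b.iter())
        .map(|(x, y)| (x - y) * (x - y)) // per-lane difference, squared
        .sum()
}

fn main() {
    // 3-4-5 triangle: squared distance is 25.
    println!("{}", squared_euclidean(&[0.0, 3.0], &[4.0, 0.0]));
}
```

Keeping the loop body branch-free and the slices contiguous is what lets the optimizer emit vector instructions without explicit intrinsics.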
### GPU-Accelerated Distances
Enable the optional gpu feature to experiment with WebGPU-powered distance kernels. GPU-backed tests are ignored by default because device discovery can be slow; run them explicitly when a compatible GPU is available:
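Assuming the feature is literally named `gpu` as described above, a typical invocation using Cargo's standard flags might look like:

```shell
# Build with the optional GPU feature and run only the ignored tests.
cargo test --features gpu -- --ignored
```

The trailing `-- --ignored` is passed to the test harness, which then runs only tests marked `#[ignore]`.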
## Metrics
The crate includes clustering metrics:
- Silhouette Score
- Calinski-Harabasz Index
- Davies-Bouldin Index
- Inertia
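Of these, inertia is the simplest to state: the sum of squared distances from each sample to its nearest cluster center. A minimal sketch under that definition (hypothetical helper, not the crate's API):

```rust
/// Inertia: sum over samples of the squared distance to the nearest
/// center. Hypothetical 2-D sketch, not the crate's implementation.
fn inertia(points: &[[f64; 2]], centers: &[[f64; 2]]) -> f64 {
    points
        .iter()
        .map(|p| {
            centers
                .iter()
                .map(|c| (p[0] - c[0]).powi(2) + (p[1] - c[1]).powi(2))
                .fold(f64::INFINITY, f64::min) // nearest center wins
        })
        .sum()
}

fn main() {
    let points = [[0.0, 0.0], [1.0, 0.0], [10.0, 0.0]];
    let centers = [[0.5, 0.0], [10.0, 0.0]];
    // First two points are 0.5 away from [0.5, 0]; the third sits on a center.
    println!("{}", inertia(&points, &centers)); // 0.25 + 0.25 + 0.0 = 0.5
}
```

Lower inertia means tighter clusters, but it always decreases as the number of clusters grows, so it is usually paired with a relative score such as silhouette.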
## License
Licensed under either the Apache License, Version 2.0 or the MIT license, at your option.