## Installation

```toml
[dependencies]
fastkmeans-rs = "0.1"
```
## Features

| Feature | Platform | Description |
|---|---|---|
| `cuda` | NVIDIA GPU | Flash-accelerated CUDA with cuBLAS GEMM and warp-cooperative kernels |
| `metal_gpu` | macOS (Apple Silicon) | Metal Performance Shaders GPU acceleration |
| `accelerate` | macOS | Apple Accelerate BLAS for CPU |
| `mkl` | Linux (Intel/AMD) | Intel MKL for CPU (recommended for Linux, fastest) |
| `openblas` | Linux / Windows | OpenBLAS for CPU (requires libopenblas-dev) |
```toml
# NVIDIA GPU (recommended for Linux)
fastkmeans-rs = { version = "0.1", features = ["cuda"] }

# Apple Silicon GPU
fastkmeans-rs = { version = "0.1", features = ["metal_gpu", "accelerate"] }

# CPU-only with BLAS
fastkmeans-rs = { version = "0.1", features = ["mkl"] }        # Linux (fastest)
fastkmeans-rs = { version = "0.1", features = ["accelerate"] } # macOS
fastkmeans-rs = { version = "0.1", features = ["openblas"] }   # Linux (fallback)
```
When `cuda` or `metal_gpu` is enabled, FastKMeans automatically uses the GPU. No code changes are needed.
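Because the backend is chosen by Cargo features, the selection happens at compile time rather than at runtime. The following is an illustrative sketch of how `cfg!`-gated backend dispatch typically looks in Rust; the feature names match the table above, but the dispatch code itself is assumed, not taken from fastkmeans-rs:

```rust
// Illustrative only: the dispatch logic is assumed, not the crate's actual source.
// `cfg!(feature = "...")` evaluates at compile time based on enabled Cargo features.
fn backend() -> &'static str {
    if cfg!(feature = "cuda") {
        "cuda" // NVIDIA GPU path
    } else if cfg!(feature = "metal_gpu") {
        "metal" // Apple Silicon GPU path
    } else {
        "cpu" // BLAS fallback
    }
}

fn main() {
    // With no GPU feature enabled at build time, the CPU path is chosen.
    println!("selected backend: {}", backend());
}
```

The dead branches are eliminated by the compiler, so a CPU-only build carries no GPU code.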
## Usage

```rust
use fastkmeans_rs::{FastKMeans, FastKMeansConfig}; // config type name assumed
use ndarray::Array2;
use ndarray_rand::RandomExt;
use ndarray_rand::rand_distr::Uniform;

// Generate data: 100K points, 128 dimensions
let data = Array2::random((100_000, 128), Uniform::new(0.0f32, 1.0));

// Create model with 256 clusters
let config = FastKMeansConfig::new(256)
    .with_max_iters(25)
    .with_max_points_per_centroid(256); // value assumed
let mut kmeans = FastKMeans::with_config(config);

// Train
kmeans.train(&data).unwrap();

// Predict
let labels = kmeans.predict(&data).unwrap();

// Or fit + predict in one call
let labels = kmeans.fit_predict(&data).unwrap();

// Access centroids
let centroids = kmeans.centroids().unwrap(); // shape: (256, 128)
```
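Under the hood, training amounts to standard Lloyd iterations: assign every point to its nearest centroid, then recompute each centroid as the mean of its assigned points. A minimal std-only sketch of that loop on 2-D data, independent of the crate (real implementations use smarter seeding such as random sampling or k-means++):

```rust
/// Squared Euclidean distance between two 2-D points.
fn dist2(a: &[f64; 2], b: &[f64; 2]) -> f64 {
    (a[0] - b[0]).powi(2) + (a[1] - b[1]).powi(2)
}

/// Lloyd's algorithm on 2-D points (fixed dimension kept for brevity).
fn kmeans(data: &[[f64; 2]], k: usize, iters: usize) -> (Vec<[f64; 2]>, Vec<usize>) {
    // Naive seeding: take the first k points as initial centroids.
    let mut centroids: Vec<[f64; 2]> = data[..k].to_vec();
    let mut labels = vec![0usize; data.len()];
    for _ in 0..iters {
        // Assignment step: nearest centroid by squared distance.
        for (i, p) in data.iter().enumerate() {
            labels[i] = (0..k)
                .min_by(|&a, &b| {
                    dist2(p, &centroids[a])
                        .partial_cmp(&dist2(p, &centroids[b]))
                        .unwrap()
                })
                .unwrap();
        }
        // Update step: move each centroid to the mean of its points.
        let mut sums = vec![[0.0f64; 2]; k];
        let mut counts = vec![0usize; k];
        for (p, &l) in data.iter().zip(&labels) {
            sums[l][0] += p[0];
            sums[l][1] += p[1];
            counts[l] += 1;
        }
        for c in 0..k {
            if counts[c] > 0 {
                centroids[c] = [sums[c][0] / counts[c] as f64, sums[c][1] / counts[c] as f64];
            }
        }
    }
    (centroids, labels)
}

fn main() {
    // Two obvious clusters near (0, 0) and (10, 10).
    let data = [
        [0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
        [10.0, 10.1], [10.2, 10.0], [9.9, 10.2],
    ];
    let (centroids, labels) = kmeans(&data, 2, 10);
    println!("labels: {:?}", labels);
    println!("centroids: {:?}", centroids);
}
```

Options like `max_points_per_centroid` bound the work per iteration by subsampling the data, which matters at the 100K-point scale shown above.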
## Benchmarks

All benchmarks run with 25 iterations.

### fastkmeans-rs vs fast-kmeans vs flash-kmeans

Train 100K vectors, 128 dimensions, 25 iterations.

Compared against fast-kmeans and flash-kmeans (optimized Triton kernels). CUDA benchmarks were run on an H100 and Metal GPU benchmarks on Apple Silicon. fastkmeans-rs is pure Rust with no Python dependency.
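The cuBLAS GEMM mentioned in the feature table refers to the usual trick of expanding ||x − c||² = ||x||² + ||c||² − 2·x·c, which turns the all-pairs point-to-centroid distance computation into one large matrix multiply plus cheap norm corrections. A plain-Rust CPU sketch of that decomposition (not the crate's kernels):

```rust
/// Pairwise squared Euclidean distances between rows of `x` (n x d) and rows of `c` (k x d),
/// via ||x - c||^2 = ||x||^2 + ||c||^2 - 2 * x.c, so the dominant cost is the dot-product
/// matrix — exactly the part a GPU replaces with a single GEMM.
fn pairwise_sq_dists(x: &[Vec<f64>], c: &[Vec<f64>]) -> Vec<Vec<f64>> {
    let x_norms: Vec<f64> = x.iter().map(|r| r.iter().map(|v| v * v).sum()).collect();
    let c_norms: Vec<f64> = c.iter().map(|r| r.iter().map(|v| v * v).sum()).collect();
    x.iter()
        .enumerate()
        .map(|(i, xi)| {
            c.iter()
                .enumerate()
                .map(|(j, cj)| {
                    // This inner loop is one entry of the x * c^T matrix product.
                    let dot: f64 = xi.iter().zip(cj).map(|(a, b)| a * b).sum();
                    x_norms[i] + c_norms[j] - 2.0 * dot
                })
                .collect()
        })
        .collect()
}

fn main() {
    let x = vec![vec![1.0, 2.0], vec![3.0, 4.0]];
    let c = vec![vec![0.0, 0.0], vec![1.0, 1.0]];
    let d = pairwise_sq_dists(&x, &c);
    // e.g. ||(1,2) - (1,1)||^2 = 0 + 1 = 1
    println!("{:?}", d);
}
```

Since the norm terms are rank-1 corrections computed once per iteration, nearly all of the FLOPs land in the GEMM, which is why a tuned BLAS (or cuBLAS) dominates end-to-end speed.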
## Acknowledgements
Based on fast-kmeans and flash-kmeans. Credit for the algorithm design goes to the original authors.
## License
Apache-2.0 — see LICENSE.