Crate linfa_clustering

linfa-clustering aims to provide pure Rust implementations of popular clustering algorithms.

The big picture

linfa-clustering is a crate in the linfa ecosystem, a wider effort to bootstrap a toolkit for classical Machine Learning implemented in pure Rust, kin in spirit to Python's scikit-learn.

You can find a roadmap (and a selection of good first issues) here - contributors are more than welcome!

Current state

Right now linfa-clustering provides three algorithms, K-Means, DBSCAN and Gaussian Mixture Models, along with a couple of helper functions.

Implementation choices, algorithmic details and a tutorial can be found here.

Check here for extensive benchmarks against scikit-learn's K-means implementation.

Structs

Dbscan

DBSCAN (Density-based Spatial Clustering of Applications with Noise) clusters together points that are close together and have enough neighbors, and labels points that are sparsely neighbored as noise. Because a point may be either part of a cluster or noise, the predict method returns Array1<Option<usize>>.
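To make the cluster-or-noise labelling concrete, here is a minimal sketch of the DBSCAN idea in plain Rust. It is not the crate's implementation: the function name `dbscan_sketch`, the parameters `tolerance` and `min_points`, and the use of `Vec<Option<usize>>` over fixed 2-D points (instead of `Array1<Option<usize>>`) are all assumptions made for illustration.

```rust
/// Squared Euclidean distance between two 2-D points.
fn dist_sq(a: &[f64; 2], b: &[f64; 2]) -> f64 {
    (a[0] - b[0]).powi(2) + (a[1] - b[1]).powi(2)
}

/// Indices of all points within `tolerance` of point `i` (including `i` itself).
fn neighbors(points: &[[f64; 2]], i: usize, tolerance: f64) -> Vec<usize> {
    (0..points.len())
        .filter(|&j| dist_sq(&points[i], &points[j]) <= tolerance * tolerance)
        .collect()
}

/// Illustrative DBSCAN (hypothetical helper, not the crate's API):
/// `Some(cluster_id)` for clustered points, `None` for noise.
fn dbscan_sketch(points: &[[f64; 2]], tolerance: f64, min_points: usize) -> Vec<Option<usize>> {
    let mut labels: Vec<Option<usize>> = vec![None; points.len()];
    let mut visited = vec![false; points.len()];
    let mut next_cluster = 0;

    for i in 0..points.len() {
        if visited[i] {
            continue;
        }
        visited[i] = true;
        let seeds = neighbors(points, i, tolerance);
        if seeds.len() < min_points {
            // Not a core point: it stays noise unless a later cluster reaches it.
            continue;
        }
        // `i` is a core point, so it starts a new cluster.
        labels[i] = Some(next_cluster);
        let mut queue = seeds;
        while let Some(j) = queue.pop() {
            // Unlabelled points reachable from a core point join the cluster.
            if labels[j].is_none() {
                labels[j] = Some(next_cluster);
            }
            if visited[j] {
                continue;
            }
            visited[j] = true;
            // Only core points keep expanding the cluster.
            let reach = neighbors(points, j, tolerance);
            if reach.len() >= min_points {
                queue.extend(reach);
            }
        }
        next_cluster += 1;
    }
    labels
}
```

Calling `dbscan_sketch(&points, 0.5, 3)` on two tight groups of three points plus one far-away point would label the groups with two distinct cluster ids and leave the isolated point as `None`.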

DbscanHyperParams

The set of hyperparameters that can be specified for the execution of the DBSCAN algorithm.

DbscanHyperParamsBuilder

Helper struct used to construct a set of hyperparameters for the DBSCAN algorithm.

GaussianMixtureModel

Gaussian Mixture Model (GMM) aims at clustering a dataset by finding normally distributed sub-datasets (hence the Gaussian Mixture name).

GmmHyperParams

The set of hyperparameters that can be specified for the execution of the GMM algorithm.

KMeans

K-means clustering aims to partition a set of unlabeled observations into clusters, where each observation belongs to the cluster with the nearest mean.
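The "nearest mean" rule can be sketched as two alternating steps: assign every observation to its closest centroid, then move each centroid to the mean of its observations. The code below is a plain-Rust illustration over 2-D points, not the crate's implementation; the names `assign`, `update` and `kmeans_sketch`, the caller-supplied initial centroids and the fixed iteration count are assumptions.

```rust
/// Squared Euclidean distance between two 2-D points.
fn squared_distance(a: &[f64; 2], b: &[f64; 2]) -> f64 {
    (a[0] - b[0]).powi(2) + (a[1] - b[1]).powi(2)
}

/// Assignment step: index of the nearest centroid for every observation.
fn assign(observations: &[[f64; 2]], centroids: &[[f64; 2]]) -> Vec<usize> {
    observations
        .iter()
        .map(|obs| {
            let mut best = 0;
            for (k, c) in centroids.iter().enumerate() {
                if squared_distance(obs, c) < squared_distance(obs, &centroids[best]) {
                    best = k;
                }
            }
            best
        })
        .collect()
}

/// Update step: move each centroid to the mean of the observations assigned to it.
/// A centroid with no assigned observations is left where it is.
fn update(observations: &[[f64; 2]], labels: &[usize], centroids: &mut [[f64; 2]]) {
    for (k, centroid) in centroids.iter_mut().enumerate() {
        let mut sum = [0.0, 0.0];
        let mut count = 0usize;
        for (obs, &label) in observations.iter().zip(labels) {
            if label == k {
                sum[0] += obs[0];
                sum[1] += obs[1];
                count += 1;
            }
        }
        if count > 0 {
            *centroid = [sum[0] / count as f64, sum[1] / count as f64];
        }
    }
}

/// Alternate assignment and update for a fixed number of iterations
/// (hypothetical helper; a real implementation would also check convergence).
fn kmeans_sketch(
    observations: &[[f64; 2]],
    mut centroids: Vec<[f64; 2]>,
    n_iterations: usize,
) -> (Vec<[f64; 2]>, Vec<usize>) {
    for _ in 0..n_iterations {
        let labels = assign(observations, &centroids);
        update(observations, &labels, &mut centroids);
    }
    let labels = assign(observations, &centroids);
    (centroids, labels)
}
```

The sketch stops after a fixed number of iterations for simplicity; stopping once the centroids move less than some tolerance is a common alternative.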

KMeansHyperParams

The set of hyperparameters that can be specified for the execution of the K-means algorithm.

KMeansHyperParamsBuilder

A helper struct used to construct a set of valid hyperparameters for the K-means algorithm (using the builder pattern).
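As a rough illustration of how a builder can guarantee that only valid hyperparameters are ever constructed, consider the sketch below. The type names `SketchHyperParams` and `SketchHyperParamsBuilder`, the fields, the default values and the `String` error type are all hypothetical and are not taken from the crate.

```rust
/// Validated hyperparameters (hypothetical type for illustration only).
#[derive(Debug, Clone, PartialEq)]
struct SketchHyperParams {
    n_clusters: usize,
    max_n_iterations: u64,
    tolerance: f64,
}

/// Builder that collects values and validates them in `build`.
struct SketchHyperParamsBuilder {
    n_clusters: usize,
    max_n_iterations: u64,
    tolerance: f64,
}

impl SketchHyperParams {
    /// Start a builder with the one required value and arbitrary defaults for the rest.
    fn new(n_clusters: usize) -> SketchHyperParamsBuilder {
        SketchHyperParamsBuilder {
            n_clusters,
            max_n_iterations: 300,
            tolerance: 1e-4,
        }
    }
}

impl SketchHyperParamsBuilder {
    /// Override the maximum number of iterations.
    fn max_n_iterations(mut self, value: u64) -> Self {
        self.max_n_iterations = value;
        self
    }

    /// Override the convergence tolerance.
    fn tolerance(mut self, value: f64) -> Self {
        self.tolerance = value;
        self
    }

    /// Validate the collected values and return the finished hyperparameters.
    fn build(self) -> Result<SketchHyperParams, String> {
        if self.n_clusters == 0 {
            return Err("n_clusters must be greater than zero".to_string());
        }
        if self.max_n_iterations == 0 {
            return Err("max_n_iterations must be greater than zero".to_string());
        }
        // Written as a negation so that NaN is rejected as well.
        if !(self.tolerance > 0.0) {
            return Err("tolerance must be greater than zero".to_string());
        }
        Ok(SketchHyperParams {
            n_clusters: self.n_clusters,
            max_n_iterations: self.max_n_iterations,
            tolerance: self.tolerance,
        })
    }
}
```

With this shape, `SketchHyperParams::new(3).tolerance(1e-3).build()` yields a validated value, while `SketchHyperParams::new(0).build()` returns an error instead of an unusable configuration.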

Enums

GmmCovarType

A specifier for the type of the relation between components' covariances.

GmmError

An error when modeling a GMM algorithm.

GmmInitMethod

A specifier for the method used to initialize the fitting algorithm of GMM.

KMeansError

An error when modeling a KMeans algorithm.

Functions

generate_blob

Generate blob_size data points (a "blob") around blob_centroid.

generate_blobs

Given an input matrix blob_centroids, with shape (n_blobs, n_features), generate blob_size data points (a "blob") around each of the blob centroids.
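The shape relationship described above (one blob of blob_size points per centroid row) can be sketched in plain Rust as follows. This is only an illustration: the `_sketch` function names, the `Vec<Vec<f64>>` representation, the tiny `XorShift64` generator and the uniform offsets in [-0.5, 0.5) are assumptions standing in for whatever array type, random number generator and distribution the crate actually uses.

```rust
/// Tiny deterministic pseudo-random generator (xorshift64), standing in for a real RNG.
/// The seed must be non-zero.
struct XorShift64(u64);

impl XorShift64 {
    /// Next offset, uniform in [-0.5, 0.5).
    fn next_offset(&mut self) -> f64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        // Use the top 53 bits to build a float in [0, 1), then shift to [-0.5, 0.5).
        (self.0 >> 11) as f64 / (1u64 << 53) as f64 - 0.5
    }
}

/// Generate `blob_size` points scattered around `blob_centroid`.
fn generate_blob_sketch(
    blob_size: usize,
    blob_centroid: &[f64],
    rng: &mut XorShift64,
) -> Vec<Vec<f64>> {
    (0..blob_size)
        .map(|_| blob_centroid.iter().map(|c| c + rng.next_offset()).collect())
        .collect()
}

/// Generate one blob per centroid row and stack them, giving
/// `n_blobs * blob_size` rows of `n_features` values each.
fn generate_blobs_sketch(
    blob_size: usize,
    blob_centroids: &[Vec<f64>],
    rng: &mut XorShift64,
) -> Vec<Vec<f64>> {
    blob_centroids
        .iter()
        .flat_map(|centroid| generate_blob_sketch(blob_size, centroid, &mut *rng))
        .collect()
}
```

With two centroids of two features each and a blob_size of 5, the sketch returns 10 rows of 2 values, the first five near the first centroid and the last five near the second.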