Crate linfa_clustering
linfa-clustering aims to provide pure Rust implementations
of popular clustering algorithms.
The big picture
linfa-clustering is a crate in the linfa ecosystem, a wider effort to
bootstrap a toolkit for classical Machine Learning implemented in pure Rust,
kin in spirit to Python’s scikit-learn.
You can find a roadmap (and a selection of good first issues) here - contributors are more than welcome!
Current state
Right now linfa-clustering provides the following clustering algorithms: K-Means, DBSCAN, Approximated DBSCAN and Gaussian Mixture Model.
Implementation choices, algorithmic details and tutorials can be found on the page dedicated to each algorithm.
Additionally, this crate provides the generate_blobs utility to quickly generate test datasets for clustering.
Check here for extensive benchmarks against scikit-learn’s K-means implementation.
Structs
| AppxDbscan | DBSCAN (Density-based Spatial Clustering of Applications with Noise) clusters together neighbouring points, while points in sparse regions are labelled as noise. Since points may be part of a cluster or noise, the transform method returns … |
| AppxDbscanHyperParams | The set of hyperparameters that can be specified for the execution of the Approximated DBSCAN algorithm. |
| AppxDbscanHyperParamsBuilder | Helper struct used to construct a set of hyperparameters for the approximated DBSCAN algorithm |
| AppxDbscanLabeler | Struct that labels a set of points according to the Approximated DBSCAN algorithm |
| Dbscan | DBSCAN (Density-based Spatial Clustering of Applications with Noise) clusters together points that are close to each other and have enough neighbors, and labels sparsely neighbored points as noise. As points may be part of a cluster or noise, the predict method returns … |
| DbscanHyperParams | The set of hyperparameters that can be specified for the execution of the DBSCAN algorithm. |
| DbscanHyperParamsBuilder | Helper struct used to construct a set of hyperparameters for the DBSCAN algorithm. |
| GaussianMixtureModel | Gaussian Mixture Model (GMM) aims at clustering a dataset by finding normally distributed sub-datasets (hence the Gaussian Mixture name). |
| GmmHyperParams | The set of hyperparameters that can be specified for the execution of the GMM algorithm. |
| KMeans | K-means clustering aims to partition a set of unlabeled observations into clusters, where each observation belongs to the cluster with the nearest mean. |
| KMeansHyperParams | The set of hyperparameters that can be specified for the execution of the K-means algorithm. |
| KMeansHyperParamsBuilder | A helper struct used to construct a set of valid hyperparameters for the K-means algorithm (using the builder pattern). |
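The KMeans entry above describes assigning each observation to the cluster with the nearest mean. As a rough illustration of that idea only, here is a minimal sketch of one assign-and-update iteration in plain Rust, using `Vec<f64>` points instead of the crate's own array types. The names `squared_distance`, `nearest_centroid` and `kmeans_step` are hypothetical and are not part of the linfa-clustering API.

```rust
/// Squared Euclidean distance between two points of equal dimension.
fn squared_distance(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Index of the centroid closest to `point` (hypothetical helper).
fn nearest_centroid(point: &[f64], centroids: &[Vec<f64>]) -> usize {
    let mut best = 0;
    let mut best_dist = f64::INFINITY;
    for (i, c) in centroids.iter().enumerate() {
        let d = squared_distance(point, c);
        if d < best_dist {
            best = i;
            best_dist = d;
        }
    }
    best
}

/// One iteration of the classic assign-then-update loop (hypothetical
/// helper, not the crate's implementation). Assumes `centroids` is non-empty
/// and every point has the same dimension as the centroids.
fn kmeans_step(points: &[Vec<f64>], centroids: &[Vec<f64>]) -> (Vec<usize>, Vec<Vec<f64>>) {
    // Assignment: each observation joins the cluster with the nearest mean.
    let labels: Vec<usize> = points
        .iter()
        .map(|p| nearest_centroid(p, centroids))
        .collect();

    // Update: accumulate per-cluster coordinate sums and member counts.
    let dim = centroids[0].len();
    let mut sums = vec![vec![0.0; dim]; centroids.len()];
    let mut counts = vec![0usize; centroids.len()];
    for (p, &label) in points.iter().zip(&labels) {
        counts[label] += 1;
        for (s, x) in sums[label].iter_mut().zip(p) {
            *s += x;
        }
    }

    // Each new centroid is the mean of its members; a cluster that received
    // no points keeps its previous centroid.
    let new_centroids: Vec<Vec<f64>> = sums
        .into_iter()
        .zip(&counts)
        .zip(centroids)
        .map(|((sum, &n), old)| {
            if n == 0 {
                old.clone()
            } else {
                sum.into_iter().map(|s| s / n as f64).collect()
            }
        })
        .collect();

    (labels, new_centroids)
}
```

Calling `kmeans_step` repeatedly until the labels stop changing gives the textbook procedure; the crate's own KMeans is configured through KMeansHyperParams and may differ in initialization, stopping criteria and data types.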
Enums
| GmmCovarType | A specifier for the type of the relation between components’ covariances. |
| GmmError | An error when modeling a GMM algorithm |
| GmmInitMethod | A specifier for the method used for the initialization of the fitting algorithm of GMM |
| KMeansError | An error when modeling a KMeans algorithm |
| KMeansInit | Specifies centroid initialization algorithm for KMeans. |
Functions
| compute_inertia | Computes the inertia, defined as the sum over all observations of the squared distance to the closest centroid. |
| generate_blob | Generate |
| generate_blobs | Given an input matrix |
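The inertia definition given for compute_inertia can be illustrated with a small standalone sketch. It assumes squared Euclidean distance and plain `Vec<f64>` points; `sq_dist` and `inertia_sketch` are hypothetical names, and the signature is not the one exposed by the crate's compute_inertia.

```rust
/// Squared Euclidean distance between two points of equal dimension.
fn sq_dist(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Sum, over all observations, of the squared distance to the closest
/// centroid. Hypothetical helper for illustration only.
/// With an empty `centroids` slice every per-point minimum is infinite.
fn inertia_sketch(points: &[Vec<f64>], centroids: &[Vec<f64>]) -> f64 {
    points
        .iter()
        .map(|p| {
            centroids
                .iter()
                .map(|c| sq_dist(p, c))
                // Keep the smallest squared distance seen for this point.
                .fold(f64::INFINITY, f64::min)
        })
        .sum()
}
```

A lower value means observations sit closer to their centroids, which is why inertia is commonly used to compare K-means runs with the same number of clusters.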