pub struct KMeans<T, const LANES: usize, D: DistanceFunction<T, LANES>>
where
    T: Primitive,
    LaneCount<LANES>: SupportedLaneCount,
    Simd<T, LANES>: SupportedSimdArray<T, LANES>,
{ /* private fields */ }
Entry point of this crate’s API surface.
Create an instance of this struct, passing the samples you want to operate on. The primitive type
of the passed samples slice is used internally for all calculations, as well as for the result
stored in the returned KMeansState structure.
§Supported variants
- k-Means clustering (Lloyd): KMeans::kmeans_lloyd
- Mini-Batch k-Means clustering: KMeans::kmeans_minibatch
§Supported initialization methods
- K-Means++: KMeans::init_kmeanplusplus
- Random-Sample: KMeans::init_random_sample
- Random-Partition: KMeans::init_random_partition
§Generics
- T: The primitive type to work with (e.g. f32 or f64)
- LANES: The number of SIMD lanes (values in one SIMD vector) to limit the generated code to. Note that the generated code selects the appropriate instructions for every platform.
- D: The distance function to use. Default is Euclidean distance.
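To make the role of LANES more concrete, the following is a minimal sketch of a lane-chunked squared Euclidean distance in plain Rust. It is not the crate's implementation: fixed-size arrays stand in for SIMD vectors, and the function name squared_euclidean is made up for this illustration.

```rust
/// Squared Euclidean distance, accumulated in LANES-wide chunks.
/// Illustrative only: a plain array stands in for a SIMD register.
fn squared_euclidean<const LANES: usize>(a: &[f64], b: &[f64]) -> f64 {
    assert_eq!(a.len(), b.len());
    // One partial sum per lane, mirroring a SIMD accumulator.
    let mut acc = [0.0f64; LANES];
    let mut a_chunks = a.chunks_exact(LANES);
    let mut b_chunks = b.chunks_exact(LANES);
    for (ca, cb) in a_chunks.by_ref().zip(b_chunks.by_ref()) {
        for lane in 0..LANES {
            let d = ca[lane] - cb[lane];
            acc[lane] += d * d;
        }
    }
    // Elements that do not fill a whole chunk are handled one by one.
    let tail: f64 = a_chunks
        .remainder()
        .iter()
        .zip(b_chunks.remainder())
        .map(|(x, y)| (x - y) * (x - y))
        .sum();
    acc.iter().sum::<f64>() + tail
}

fn main() {
    let a = [1.0, 2.0, 3.0, 4.0, 5.0];
    let b = [0.0; 5];
    // Four lanes: one full chunk plus a one-element tail.
    println!("{}", squared_euclidean::<4>(&a, &b));
}
```

A wider LANES value processes more values per loop step; whether that maps to wider hardware instructions depends on the target platform.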
Implementations§
impl<T, const LANES: usize, D: DistanceFunction<T, LANES>> KMeans<T, LANES, D>
where
    T: Primitive,
    LaneCount<LANES>: SupportedLaneCount,
    Simd<T, LANES>: SupportedSimdArray<T, LANES>,
pub fn new(
    samples: &[T],
    sample_cnt: usize,
    sample_dims: usize,
    distance_fn: D,
) -> Self
Create a new instance of the KMeans structure.
§Arguments
- samples: Slice of samples in row-major order, i.e. the samples stored one after another: [sample 0, sample 1, sample 2, …]
- sample_cnt: Number of samples contained in the passed samples slice
- sample_dims: Number of dimensions of each sample in the samples slice
- distance_fn: Distance function to use for the calculation
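The row-major layout means that sample i occupies the index range i * sample_dims .. (i + 1) * sample_dims of the flat slice. A small sketch in plain Rust, where the helper name sample is made up for this illustration and is not part of the crate:

```rust
/// Returns the `i`-th sample of a row-major sample buffer.
/// Hypothetical helper for illustration; the crate does not export it.
fn sample(samples: &[f64], sample_dims: usize, i: usize) -> &[f64] {
    &samples[i * sample_dims..(i + 1) * sample_dims]
}

fn main() {
    let (sample_cnt, sample_dims) = (3, 2);
    // Three 2-dimensional samples stored back to back.
    let samples = vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0];
    assert_eq!(samples.len(), sample_cnt * sample_dims);
    println!("{:?}", sample(&samples, sample_dims, 1));
}
```

The slice passed to new is therefore expected to hold sample_cnt * sample_dims values.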
pub fn kmeans_lloyd<F>(
    &self,
    k: usize,
    max_iter: usize,
    init: F,
    config: &KMeansConfig<'_, T>,
) -> KMeansState<T>
Standard k-means (Lloyd) algorithm implementation. This is the same algorithm as implemented in Matlab (one-phase). (See: https://uk.mathworks.com/help/stats/kmeans.html#bueq7aj-5, section “More About”.)
§Arguments
- k: Number of clusters to search for
- max_iter: Maximum number of iterations (pass a large value for an effectively unlimited run)
- init: Initialization method to use for the k initial centroids
- config: KMeansConfig instance, containing several configuration options for the calculation
§Returns
Instance of KMeansState, containing the final state (result).
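For orientation, the textbook Lloyd iteration alternates an assignment step and a centroid update step. The sketch below shows that loop in plain, scalar Rust under simplifying assumptions: f64 samples, squared Euclidean distance, at least one iteration, and "no assignment changed" as the stopping rule. The function names are made up, and the crate's SIMD implementation and its abort handling may differ.

```rust
/// Index of, and squared distance to, the centroid closest to `point`.
fn nearest(point: &[f64], centroids: &[f64], dims: usize) -> (usize, f64) {
    let mut best = (0, f64::INFINITY);
    for (c, centroid) in centroids.chunks_exact(dims).enumerate() {
        let d: f64 = point.iter().zip(centroid).map(|(p, q)| (p - q) * (p - q)).sum();
        if d < best.1 {
            best = (c, d);
        }
    }
    best
}

/// Plain Lloyd iterations. Returns (centroids, assignments, distsum).
/// Assumes max_iter >= 1; stops early once no assignment changes.
fn lloyd(
    samples: &[f64],
    dims: usize,
    mut centroids: Vec<f64>,
    max_iter: usize,
) -> (Vec<f64>, Vec<usize>, f64) {
    let k = centroids.len() / dims;
    let n = samples.len() / dims;
    // usize::MAX marks "not assigned yet", so the first pass always counts as a change.
    let mut assignments = vec![usize::MAX; n];
    let mut distsum = 0.0f64;
    for _ in 0..max_iter {
        // Assignment step: attach every sample to its nearest centroid.
        distsum = 0.0;
        let mut changed = false;
        for (i, s) in samples.chunks_exact(dims).enumerate() {
            let (c, d) = nearest(s, &centroids, dims);
            changed |= assignments[i] != c;
            assignments[i] = c;
            distsum += d;
        }
        if !changed {
            break;
        }
        // Update step: move every non-empty centroid to the mean of its samples.
        let mut sums = vec![0.0f64; k * dims];
        let mut counts = vec![0usize; k];
        for (s, &c) in samples.chunks_exact(dims).zip(&assignments) {
            counts[c] += 1;
            for (acc, v) in sums[c * dims..(c + 1) * dims].iter_mut().zip(s) {
                *acc += v;
            }
        }
        for (c, &cnt) in counts.iter().enumerate() {
            if cnt > 0 {
                for j in 0..dims {
                    centroids[c * dims + j] = sums[c * dims + j] / cnt as f64;
                }
            }
        }
    }
    (centroids, assignments, distsum)
}

fn main() {
    // Four 1-dimensional samples forming two obvious groups.
    let samples = [0.0, 1.0, 10.0, 11.0];
    let (centroids, assignments, distsum) = lloyd(&samples, 1, vec![0.0, 10.0], 10);
    println!("{:?} {:?} {}", centroids, assignments, distsum);
}
```

The three returned values correspond in spirit to the centroids, assignments, and distsum fields used in the example that follows.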
§Example
use kmeans::*;
let (sample_cnt, sample_dims, k, max_iter) = (20000, 200, 4, 100);
// Generate some random data
let mut samples = vec![0.0f64;sample_cnt * sample_dims];
samples.iter_mut().for_each(|v| *v = rand::random());
// Calculate k-means, using k-means++ as initialization method
// KMeans<_, 8, _> specifies f64 SIMD vectors with 8 lanes (e.g. AVX512)
let kmean: KMeans<_, 8, _> = KMeans::new(&samples, sample_cnt, sample_dims, EuclideanDistance);
let result = kmean.kmeans_lloyd(k, max_iter, KMeans::init_kmeanplusplus, &KMeansConfig::default());
println!("Centroids: {:?}", result.centroids);
println!("Cluster-Assignments: {:?}", result.assignments);
println!("Error: {}", result.distsum);
pub fn kmeans_minibatch<F>(
    &self,
    batch_size: usize,
    k: usize,
    max_iter: usize,
    init: F,
    config: &KMeansConfig<'_, T>,
) -> KMeansState<T>
where
    for<'c> F: FnOnce(&KMeans<T, LANES, D>, &mut KMeansState<T>, &KMeansConfig<'c, T>),
    T: Primitive,
    LaneCount<LANES>: SupportedLaneCount,
    Simd<T, LANES>: SupportedSimdArray<T, LANES>,
Mini-Batch k-Means implementation. (see: https://dl.acm.org/citation.cfm?id=1772862)
§Arguments
- batch_size: Number of samples to use per iteration (higher -> better approximation but slower)
- k: Number of clusters to search for
- max_iter: Maximum number of iterations (pass a large value for an effectively unlimited run)
- init: Initialization method to use for the k initial centroids
- config: KMeansConfig instance, containing several configuration options for the calculation
§Returns
Instance of KMeansState, containing the final state (result).
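The idea behind the mini-batch variant can be sketched as follows: each iteration draws batch_size random samples, assigns them to their nearest centroids, and nudges each of those centroids towards its sample with a per-centroid learning rate of 1 / (number of samples seen so far). This is the update rule usually associated with mini-batch k-means and is an assumption about the general technique, not a description of this crate's internals. The XorShift generator and all function names are made up to keep the sketch dependency-free.

```rust
/// Tiny xorshift generator, used only to avoid external crates in this sketch.
/// The seed must be non-zero.
struct XorShift(u64);

impl XorShift {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
    /// Pseudo-random index in 0..n.
    fn below(&mut self, n: usize) -> usize {
        (self.next_u64() % n as u64) as usize
    }
}

/// Index of the centroid closest to `point` (squared Euclidean distance).
fn nearest(point: &[f64], centroids: &[f64], dims: usize) -> usize {
    let mut best = (0, f64::INFINITY);
    for (c, centroid) in centroids.chunks_exact(dims).enumerate() {
        let d: f64 = point.iter().zip(centroid).map(|(p, q)| (p - q) * (p - q)).sum();
        if d < best.1 {
            best = (c, d);
        }
    }
    best.0
}

/// Mini-batch updates with a per-centroid learning rate. Returns the centroids.
fn minibatch(
    samples: &[f64],
    dims: usize,
    mut centroids: Vec<f64>,
    batch_size: usize,
    max_iter: usize,
    rng: &mut XorShift,
) -> Vec<f64> {
    let n = samples.len() / dims;
    let k = centroids.len() / dims;
    let mut counts = vec![0usize; k];
    for _ in 0..max_iter {
        // Draw the batch and cache each member's nearest centroid
        // before any centroid is moved in this iteration.
        let batch: Vec<(usize, usize)> = (0..batch_size)
            .map(|_| {
                let i = rng.below(n);
                (i, nearest(&samples[i * dims..(i + 1) * dims], &centroids, dims))
            })
            .collect();
        // Move each affected centroid towards its sample.
        for (i, c) in batch {
            counts[c] += 1;
            let eta = 1.0 / counts[c] as f64;
            for j in 0..dims {
                let x = samples[i * dims + j];
                let v = &mut centroids[c * dims + j];
                *v = (1.0 - eta) * *v + eta * x;
            }
        }
    }
    centroids
}

fn main() {
    let samples = [0.0, 1.0, 10.0, 11.0];
    let mut rng = XorShift(42);
    let centroids = minibatch(&samples, 1, vec![0.0, 10.0], 2, 50, &mut rng);
    println!("{:?}", centroids);
}
```

Because only batch_size samples are touched per iteration, the cost of one iteration no longer depends on the total number of samples, at the price of a noisier result.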
§Example
use kmeans::*;
let (sample_cnt, sample_dims, k, max_iter) = (20000, 200, 4, 100);
// Generate some random data
let mut samples = vec![0.0f64;sample_cnt * sample_dims];
samples.iter_mut().for_each(|v| *v = rand::random());
// Calculate mini-batch k-means, using random-sample as initialization method
// KMeans<_, 8, _> specifies f64 SIMD vectors with 8 lanes (e.g. AVX512)
let kmean: KMeans<_, 8, _> = KMeans::new(&samples, sample_cnt, sample_dims, EuclideanDistance);
let result = kmean.kmeans_minibatch(4, k, max_iter, KMeans::init_random_sample, &KMeansConfig::default());
println!("Centroids: {:?}", result.centroids);
println!("Cluster-Assignments: {:?}", result.assignments);
println!("Error: {}", result.distsum);
pub fn init_kmeanplusplus(
    kmean: &KMeans<T, LANES, D>,
    state: &mut KMeansState<T>,
    config: &KMeansConfig<'_, T>,
)
K-Means++ initialization method, as implemented in Matlab
§Description
This initialization method starts by selecting one sample as the first centroid. From there, it selects one new centroid per iteration by calculating each sample’s probability of “being a centroid”. The farther a sample is from its currently assigned centroid, the higher this probability. One sample is then selected at random, weighted by these probabilities. As a result, the method tends to select centroids that are far away from the centroid of the cluster they are currently assigned to. (See: https://uk.mathworks.com/help/stats/kmeans.html#bueq7aj-5, section “More About”.)
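The selection procedure described above can be sketched in plain Rust as follows. The sketch assumes squared Euclidean distances as weights, which is the usual k-means++ formulation; the crate may weight samples differently. Randomness is supplied by the caller as a closure returning values in [0, 1), and the function names are made up for this illustration.

```rust
/// Picks index `i` with probability weights[i] / sum(weights), given `u` in [0, 1).
fn pick_weighted(weights: &[f64], u: f64) -> usize {
    let total: f64 = weights.iter().sum();
    let mut threshold = u * total;
    for (i, w) in weights.iter().enumerate() {
        if threshold < *w {
            return i;
        }
        threshold -= w;
    }
    weights.len() - 1 // guard against rounding at the upper end
}

/// Returns the indices of the samples chosen as initial centroids.
/// `uniform` must yield values in [0, 1).
fn kmeanspp_indices(
    samples: &[f64],
    dims: usize,
    k: usize,
    mut uniform: impl FnMut() -> f64,
) -> Vec<usize> {
    let n = samples.len() / dims;
    // Squared Euclidean distance between samples `a` and `b`.
    let sq_dist = |a: usize, b: usize| -> f64 {
        (0..dims)
            .map(|j| {
                let d = samples[a * dims + j] - samples[b * dims + j];
                d * d
            })
            .sum()
    };
    // First centroid: one sample chosen uniformly at random.
    let first = ((uniform() * n as f64) as usize).min(n - 1);
    let mut chosen = vec![first];
    // Weight of a sample: squared distance to its closest chosen centroid.
    let mut weights: Vec<f64> = (0..n).map(|i| sq_dist(i, first)).collect();
    while chosen.len() < k {
        let next = pick_weighted(&weights, uniform());
        chosen.push(next);
        for i in 0..n {
            weights[i] = weights[i].min(sq_dist(i, next));
        }
    }
    chosen
}

fn main() {
    // A fixed sequence stands in for a random number generator.
    let mut us = vec![0.0, 0.5].into_iter();
    let chosen = kmeanspp_indices(&[0.0, 0.0, 10.0], 1, 2, move || us.next().unwrap());
    println!("{:?}", chosen);
}
```

Samples that coincide with an already chosen centroid get weight zero and are therefore not picked again, barring rounding at the upper end.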
§Note
This method is not meant for direct invocation. Instead, pass it as the init argument to an instance method of KMeans, such as kmeans_lloyd or kmeans_minibatch.
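The pattern of passing an initializer instead of calling it can be illustrated with simplified stand-in types. Model, State, run, init_first_k, and init_fixed below are all hypothetical and only mirror the shape of the crate's API (an associated function used as a callback, and a function returning a closure); they are not the crate's types.

```rust
// Hypothetical stand-ins, reduced to what this sketch needs.
struct Model {
    samples: Vec<f64>,
}

struct State {
    centroids: Vec<f64>,
}

impl Model {
    /// Stand-in for a clustering method: it creates the state, lets `init`
    /// fill in the initial centroids, and would then start iterating.
    fn run<F>(&self, k: usize, init: F) -> State
    where
        F: FnOnce(&Model, &mut State),
    {
        let mut state = State { centroids: vec![0.0; k] };
        init(self, &mut state);
        state
    }

    /// Initializer written as an associated function: uses the first k samples.
    fn init_first_k(model: &Model, state: &mut State) {
        let k = state.centroids.len();
        state.centroids.copy_from_slice(&model.samples[..k]);
    }

    /// Initializer factory: returns a closure that installs fixed centroids.
    fn init_fixed(centroids: Vec<f64>) -> impl Fn(&Model, &mut State) {
        move |_model: &Model, state: &mut State| {
            state.centroids = centroids.clone();
        }
    }
}

fn main() {
    let model = Model { samples: vec![1.0, 2.0, 3.0] };
    // The function itself is passed; `run` decides when to call it.
    let a = model.run(2, Model::init_first_k);
    // A factory is called first, and the closure it returns is passed on.
    let b = model.run(2, Model::init_fixed(vec![7.0, 8.0]));
    println!("{:?} {:?}", a.centroids, b.centroids);
}
```

This keeps the clustering method in control of when the state exists and when initialization runs, which is why the initializers are not invoked by user code.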
pub fn init_random_partition(
    kmean: &KMeans<T, LANES, D>,
    state: &mut KMeansState<T>,
    config: &KMeansConfig<'_, T>,
)
Random-Partition initialization method
§Description
This initialization method randomly partitions the samples into k partitions and then calculates the mean of each partition. These means are then used as the initial centroids.
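A minimal sketch of this idea in plain Rust: every sample receives a random partition label, and the per-partition means become the initial centroids. The caller supplies the labels through a closure so that the sketch needs no random number crate; the function name is made up, and the way empty partitions are handled here (left at zero) is an assumption.

```rust
/// Assigns every sample to a partition chosen by `pick` (reduced modulo k)
/// and returns the k partition means as a row-major buffer.
/// Empty partitions are left at zero in this sketch.
fn random_partition_centroids(
    samples: &[f64],
    dims: usize,
    k: usize,
    mut pick: impl FnMut() -> usize,
) -> Vec<f64> {
    let mut sums = vec![0.0f64; k * dims];
    let mut counts = vec![0usize; k];
    for s in samples.chunks_exact(dims) {
        let p = pick() % k;
        counts[p] += 1;
        for (acc, v) in sums[p * dims..(p + 1) * dims].iter_mut().zip(s) {
            *acc += v;
        }
    }
    // Turn the per-partition sums into means.
    for (p, &cnt) in counts.iter().enumerate() {
        if cnt > 0 {
            for v in &mut sums[p * dims..(p + 1) * dims] {
                *v /= cnt as f64;
            }
        }
    }
    sums
}

fn main() {
    // A fixed label sequence stands in for a random number generator.
    let mut labels = vec![0usize, 0, 1, 1].into_iter();
    let centroids =
        random_partition_centroids(&[1.0, 3.0, 10.0, 20.0], 1, 2, move || labels.next().unwrap());
    println!("{:?}", centroids);
}
```

Since each mean averages over a random subset of all samples, the resulting centroids tend to start close to the overall data mean.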
pub fn init_random_sample(
    kmean: &KMeans<T, LANES, D>,
    state: &mut KMeansState<T>,
    config: &KMeansConfig<'_, T>,
)
pub fn init_precomputed(
    centroids: Vec<T>,
) -> impl Fn(&KMeans<T, LANES, D>, &mut KMeansState<T>, &KMeansConfig<'_, T>)
Auto Trait Implementations§
impl<T, const LANES: usize, D> Freeze for KMeans<T, LANES, D>
impl<T, const LANES: usize, D> RefUnwindSafe for KMeans<T, LANES, D>
impl<T, const LANES: usize, D> Send for KMeans<T, LANES, D>
impl<T, const LANES: usize, D> Sync for KMeans<T, LANES, D>
impl<T, const LANES: usize, D> Unpin for KMeans<T, LANES, D>
impl<T, const LANES: usize, D> UnwindSafe for KMeans<T, LANES, D>
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.