Struct kmeans::KMeans

pub struct KMeans<T> where
    T: Primitive,
    [T; 8]: SimdArray,
    Simd<[T; 8]>: SimdWrapper<T>, 
{ /* fields omitted */ }

Entry point of this crate's API surface.

Create an instance of this struct, passing the samples you want to operate on. The primitive type of the passed samples vector is the type used internally for all calculations, and it is also the type of the results stored in the returned KMeansState structure.

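Supported variants

  • kmeans_lloyd: standard k-means (Lloyd's algorithm)
  • kmeans_minibatch: mini-batch k-means

Supported initialization methods

  • init_kmeanplusplus: k-means++
  • init_random_partition: random partition
  • init_random_sample: random sample (Forgy)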

Implementations

impl<T> KMeans<T> where
    T: Primitive,
    [T; 8]: SimdArray,
    Simd<[T; 8]>: SimdWrapper<T>, 

pub fn new(samples: Vec<T>, sample_cnt: usize, sample_dims: usize) -> Self

Create a new instance of the KMeans structure.

Arguments

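  • samples: Vector of samples in row-major order: all sample_dims values of the first sample, followed by those of the second sample, and so on (sample_cnt * sample_dims values in total)
  • sample_cnt: Number of samples contained in the passed samples vector
  • sample_dims: Number of dimensions of each sample in the samples vector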
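The row-major layout described above can be sketched with the standard library alone. The helpers flatten_samples and sample_value are hypothetical names used for illustration only; they are not part of the kmeans crate.

```rust
/// Flattens per-sample rows into one row-major vector.
/// Returns (flat samples, sample_cnt, sample_dims).
/// Hypothetical helper, not part of the kmeans crate.
fn flatten_samples(rows: &[Vec<f64>]) -> (Vec<f64>, usize, usize) {
    let sample_cnt = rows.len();
    let sample_dims = rows.first().map_or(0, |r| r.len());
    let mut flat = Vec::with_capacity(sample_cnt * sample_dims);
    for row in rows {
        // Every sample must have the same number of dimensions.
        assert_eq!(row.len(), sample_dims, "ragged sample rows");
        flat.extend_from_slice(row);
    }
    (flat, sample_cnt, sample_dims)
}

/// Returns dimension `d` of sample `i` in a row-major buffer.
fn sample_value(flat: &[f64], sample_dims: usize, i: usize, d: usize) -> f64 {
    flat[i * sample_dims + d]
}
```

The returned tuple lines up with the three arguments of new, so it could be passed on as KMeans::new(flat, sample_cnt, sample_dims).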

pub fn kmeans_lloyd<'a, F>(
    &self,
    k: usize,
    max_iter: usize,
    init: F,
    config: &KMeansConfig<'a, T>
) -> KMeansState<T> where
    F: FnOnce(&KMeans<T>, &mut KMeansState<T>, &KMeansConfig<'c, T>), 

Standard k-means (Lloyd's algorithm) implementation. This is the same algorithm that Matlab implements (one-phase). (see: https://uk.mathworks.com/help/stats/kmeans.html#bueq7aj-5 Section: More About)

Arguments

  • k: Number of clusters to search for
  • max_iter: Upper limit on the number of iterations (pass a very large number to effectively remove the limit)
  • init: Initialization method used to place the k initial centroids
  • config: KMeansConfig instance containing configuration options for the calculation

Returns

Instance of KMeansState, containing the final state (result).

Example

use kmeans::*;
fn main() {
    let (sample_cnt, sample_dims, k, max_iter) = (20000, 200, 4, 100);
 
    // Generate some random data
    let mut samples = vec![0.0f64;sample_cnt * sample_dims];
    samples.iter_mut().for_each(|v| *v = rand::random());
 
    // Run k-means, using k-means++ as the initialization method
    let kmean = KMeans::new(samples, sample_cnt, sample_dims);
    let result = kmean.kmeans_lloyd(k, max_iter, KMeans::init_kmeanplusplus, &KMeansConfig::default());
 
    println!("Centroids: {:?}", result.centroids);
    println!("Cluster-Assignments: {:?}", result.assignments);
    println!("Error: {}", result.distsum);
}
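To illustrate what a Lloyd iteration does, here is a minimal scalar sketch on row-major data. It is not the crate's implementation (which is SIMD-based according to the trait bounds above); the function names lloyd and sq_dist, the stopping rule (stop when no assignment changes), and the handling of empty clusters are assumptions made for this example.

```rust
/// Squared Euclidean distance between two equally long slices.
fn sq_dist(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Runs Lloyd iterations on row-major `samples`, starting from `centroids`
/// (row-major, k * dims values, updated in place). Returns the assignments
/// and the distance sum of the last assignment step.
/// Hypothetical sketch, not the kmeans crate's code.
fn lloyd(samples: &[f64], dims: usize, centroids: &mut [f64], max_iter: usize) -> (Vec<usize>, f64) {
    let k = centroids.len() / dims;
    let n = samples.len() / dims;
    let mut assignments = vec![usize::MAX; n];
    let mut distsum = 0.0;
    for _ in 0..max_iter {
        // Assignment step: attach each sample to its nearest centroid.
        let mut changed = false;
        distsum = 0.0;
        for (i, s) in samples.chunks_exact(dims).enumerate() {
            let (best, d) = centroids
                .chunks_exact(dims)
                .map(|c| sq_dist(s, c))
                .enumerate()
                .fold((0, f64::INFINITY), |acc, (j, d)| if d < acc.1 { (j, d) } else { acc });
            changed |= assignments[i] != best;
            assignments[i] = best;
            distsum += d;
        }
        if !changed {
            break; // converged: no sample switched clusters
        }
        // Update step: move each centroid to the mean of its samples.
        let mut sums = vec![0.0; k * dims];
        let mut counts = vec![0usize; k];
        for (s, &a) in samples.chunks_exact(dims).zip(&assignments) {
            counts[a] += 1;
            for (acc, v) in sums[a * dims..(a + 1) * dims].iter_mut().zip(s) {
                *acc += v;
            }
        }
        for j in 0..k {
            if counts[j] > 0 {
                // Empty clusters keep their previous centroid in this sketch.
                for d in 0..dims {
                    centroids[j * dims + d] = sums[j * dims + d] / counts[j] as f64;
                }
            }
        }
    }
    (assignments, distsum)
}
```

The two alternating steps (assign, then recompute means) are the whole algorithm; the initial centroids come from whichever initialization method is passed as init.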

pub fn kmeans_minibatch<'a, F>(
    &self,
    batch_size: usize,
    k: usize,
    max_iter: usize,
    init: F,
    config: &KMeansConfig<'a, T>
) -> KMeansState<T> where
    F: FnOnce(&KMeans<T>, &mut KMeansState<T>, &KMeansConfig<'c, T>), 

Mini-Batch k-Means implementation. (see: https://dl.acm.org/citation.cfm?id=1772862)

Arguments

  • batch_size: Number of samples to use per iteration (a larger batch gives a better approximation but is slower)
  • k: Number of clusters to search for
  • max_iter: Upper limit on the number of iterations (pass a very large number to effectively remove the limit)
  • init: Initialization method used to place the k initial centroids
  • config: KMeansConfig instance containing configuration options for the calculation

Returns

Instance of KMeansState, containing the final state (result).

Example

use kmeans::*;
fn main() {
    let (sample_cnt, sample_dims, k, max_iter) = (20000, 200, 4, 100);

    // Generate some random data
    let mut samples = vec![0.0f64;sample_cnt * sample_dims];
    samples.iter_mut().for_each(|v| *v = rand::random());

    // Run mini-batch k-means, using random-sample (Forgy) initialization
    let kmean = KMeans::new(samples, sample_cnt, sample_dims);
    let result = kmean.kmeans_minibatch(4, k, max_iter, KMeans::init_random_sample, &KMeansConfig::default());

    println!("Centroids: {:?}", result.centroids);
    println!("Cluster-Assignments: {:?}", result.assignments);
    println!("Error: {}", result.distsum);
}
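A single mini-batch update, in the style of the referenced paper, can be sketched as follows. This is an illustration under assumptions, not the crate's code: minibatch_step and sq_dist are hypothetical names, and the batch is passed in as explicit sample indices so that the example stays deterministic (a real run would draw batch_size indices at random each iteration).

```rust
/// Squared Euclidean distance between two equally long slices.
fn sq_dist(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Applies one mini-batch update using the samples at `batch` (indices into
/// the row-major `samples`). `counts[j]` holds how many samples centroid j
/// has absorbed so far and determines its per-centroid learning rate.
/// Hypothetical sketch, not the kmeans crate's code.
fn minibatch_step(
    samples: &[f64],
    dims: usize,
    centroids: &mut [f64],
    counts: &mut [usize],
    batch: &[usize],
) {
    // First cache the nearest centroid for every sample of the batch.
    let nearest: Vec<usize> = batch
        .iter()
        .map(|&i| {
            let s = &samples[i * dims..(i + 1) * dims];
            let mut best = (0, f64::INFINITY);
            for (j, c) in centroids.chunks_exact(dims).enumerate() {
                let d = sq_dist(s, c);
                if d < best.1 {
                    best = (j, d);
                }
            }
            best.0
        })
        .collect();
    // Then pull each selected centroid toward its sample.
    for (&i, &j) in batch.iter().zip(&nearest) {
        counts[j] += 1;
        let eta = 1.0 / counts[j] as f64; // learning rate shrinks over time
        for d in 0..dims {
            let c = &mut centroids[j * dims + d];
            *c = (1.0 - eta) * *c + eta * samples[i * dims + d];
        }
    }
}
```

Because only the batch is touched per iteration, each step is cheap; the trade-off is that the centroids are an approximation of what a full pass would produce.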

pub fn init_kmeanplusplus<'a>(
    kmean: &KMeans<T>,
    state: &mut KMeansState<T>,
    config: &KMeansConfig<'a, T>
)

K-Means++ initialization method, as implemented in Matlab

Description

This initialization method starts by selecting one sample as the first centroid. It then selects one new centroid per iteration: each sample is given a probability of becoming the next centroid, and that probability grows with the sample's distance from its nearest already-selected centroid. One sample is then drawn at random according to these probabilities. As a result, new centroids tend to lie far away from the centroids selected so far. (see: https://uk.mathworks.com/help/stats/kmeans.html#bueq7aj-5 Section: More About)

Note

This method is not meant to be called directly. Pass it as the init argument to kmeans_lloyd or kmeans_minibatch.
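The seeding procedure described above can be sketched as follows. This is a scalar illustration, not the crate's code: kmeanspp_indices, row, and sq_dist are hypothetical names, the weights are assumed to be squared Euclidean distances (the usual k-means++ choice), and the uniform closure stands in for a real random number generator returning values in [0, 1).

```rust
/// Squared Euclidean distance between two equally long slices.
fn sq_dist(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Returns sample `i` of a row-major buffer.
fn row(samples: &[f64], dims: usize, i: usize) -> &[f64] {
    &samples[i * dims..(i + 1) * dims]
}

/// Picks k sample indices as initial centroids using distance weighting.
/// `uniform` must return values in [0, 1).
/// Hypothetical sketch, not the kmeans crate's code.
fn kmeanspp_indices(
    samples: &[f64],
    dims: usize,
    k: usize,
    mut uniform: impl FnMut() -> f64,
) -> Vec<usize> {
    let n = samples.len() / dims;
    assert!(n > 0, "need at least one sample");
    // The first centroid is a uniformly chosen sample.
    let first = ((uniform() * n as f64) as usize).min(n - 1);
    let mut chosen = vec![first];
    // Squared distance of each sample to its nearest chosen centroid.
    let mut d2: Vec<f64> = (0..n)
        .map(|i| sq_dist(row(samples, dims, i), row(samples, dims, first)))
        .collect();
    while chosen.len() < k {
        // Draw a sample with probability proportional to its weight.
        let total: f64 = d2.iter().sum();
        let mut target = uniform() * total;
        let mut next = n - 1;
        for (i, &w) in d2.iter().enumerate() {
            if target < w {
                next = i;
                break;
            }
            target -= w;
        }
        chosen.push(next);
        // Refresh the nearest-centroid distances with the new centroid.
        for i in 0..n {
            d2[i] = d2[i].min(sq_dist(row(samples, dims, i), row(samples, dims, next)));
        }
    }
    chosen
}
```

Samples that are already centroids have weight zero, so they are not drawn again as long as some other sample has a positive weight.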

pub fn init_random_partition<'a>(
    kmean: &KMeans<T>,
    state: &mut KMeansState<T>,
    config: &KMeansConfig<'a, T>
)

Random-partition initialization method

Description

This initialization method randomly assigns each sample to one of k partitions and then calculates the mean of each partition. These means are used as the initial centroids.
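As a rough sketch of this scheme: draw a random label per sample, then average each partition. The names random_labels and partition_means are hypothetical, the uniform closure stands in for a real random number generator returning values in [0, 1), and leaving an empty partition at an all-zero mean is a simplification of this example, not a statement about the crate's behavior.

```rust
/// Draws one partition label in 0..k for each of the n samples.
/// `uniform` must return values in [0, 1).
/// Hypothetical helper, not part of the kmeans crate.
fn random_labels(n: usize, k: usize, mut uniform: impl FnMut() -> f64) -> Vec<usize> {
    (0..n)
        .map(|_| ((uniform() * k as f64) as usize).min(k - 1))
        .collect()
}

/// Computes the mean of every partition of the row-major `samples`.
/// Returns k * dims values, again in row-major order.
fn partition_means(samples: &[f64], dims: usize, k: usize, labels: &[usize]) -> Vec<f64> {
    let mut means = vec![0.0; k * dims];
    let mut counts = vec![0usize; k];
    // Accumulate per-partition sums and sample counts.
    for (s, &l) in samples.chunks_exact(dims).zip(labels) {
        counts[l] += 1;
        for (m, v) in means[l * dims..(l + 1) * dims].iter_mut().zip(s) {
            *m += v;
        }
    }
    // Turn the sums into means.
    for (j, m) in means.chunks_exact_mut(dims).enumerate() {
        // An empty partition keeps an all-zero mean in this sketch.
        let cnt = counts[j].max(1) as f64;
        m.iter_mut().for_each(|v| *v /= cnt);
    }
    means
}
```

Because every mean averages a random subset of all samples, the initial centroids produced this way tend to start out close to the overall data mean.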

pub fn init_random_sample<'a>(
    kmean: &KMeans<T>,
    state: &mut KMeansState<T>,
    config: &KMeansConfig<'a, T>
)

Random sample initialization method (a.k.a. Forgy)

Description

This initialization method randomly selects k centroids from the samples as initial centroids.

Note

This method is not meant to be called directly. Pass it as the init argument to kmeans_lloyd or kmeans_minibatch.
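A minimal sketch of Forgy-style initialization follows. The name forgy_centroids is hypothetical, the uniform closure stands in for a real random number generator returning values in [0, 1), and drawing the k samples without replacement is an assumption of this example; the text above does not say whether the crate avoids duplicates.

```rust
/// Copies k distinct, randomly chosen samples into a k * dims centroid buffer.
/// `uniform` must return values in [0, 1).
/// Hypothetical sketch, not the kmeans crate's code.
fn forgy_centroids(
    samples: &[f64],
    dims: usize,
    k: usize,
    mut uniform: impl FnMut() -> f64,
) -> Vec<f64> {
    let n = samples.len() / dims;
    assert!(k <= n, "cannot draw more centroids than samples");
    let mut idx: Vec<usize> = (0..n).collect();
    let mut centroids = Vec::with_capacity(k * dims);
    for j in 0..k {
        // Partial Fisher-Yates shuffle: swap a random remaining index into slot j.
        let pick = j + ((uniform() * (n - j) as f64) as usize).min(n - j - 1);
        idx.swap(j, pick);
        let i = idx[j];
        centroids.extend_from_slice(&samples[i * dims..(i + 1) * dims]);
    }
    centroids
}
```

Unlike the random-partition method, the centroids produced here are actual samples, so they start out spread across the data rather than near its mean.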

Auto Trait Implementations

impl<T> RefUnwindSafe for KMeans<T> where
    T: RefUnwindSafe

impl<T> Send for KMeans<T>

impl<T> Sync for KMeans<T>

impl<T> Unpin for KMeans<T> where
    T: Unpin

impl<T> UnwindSafe for KMeans<T> where
    T: UnwindSafe

Blanket Implementations

impl<T> Any for T where
    T: 'static + ?Sized

impl<T> Borrow<T> for T where
    T: ?Sized

impl<T> BorrowMut<T> for T where
    T: ?Sized

impl<T, U> Cast<U> for T where
    U: FromCast<T>,

impl<T> From<T> for T

impl<T> FromCast<T> for T

impl<T, U> Into<U> for T where
    U: From<T>,

impl<T, U> TryFrom<U> for T where
    U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

impl<T, U> TryInto<U> for T where
    U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

impl<V, T> VZip<V> for T where
    V: MultiLane<T>,