Crate ckmeans

Ckmeans clustering is an improvement on heuristic-based 1-dimensional (univariate) clustering approaches such as Jenks. The algorithm was developed by Haizhou Wang and Mingzhou Song (2011) as a dynamic programming approach to the problem of clustering numeric data into groups with the least within-group sum-of-squared-deviations.
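
To make the dynamic programming idea concrete, here is a minimal sketch of optimal 1-dimensional clustering using a naive O(k·n²) recurrence over sorted data. It is an illustration only: the function name `optimal_1d_clusters` is hypothetical, it works on `f64` alone, and it is not the crate's implementation, which may use a faster formulation.

```rust
/// Hypothetical helper for illustration; not part of the ckmeans crate.
/// Splits `data` into `k` contiguous groups of the sorted values so that the
/// total within-group sum of squared deviations is minimal.
fn optimal_1d_clusters(data: &[f64], k: usize) -> Option<Vec<Vec<f64>>> {
    let n = data.len();
    if k == 0 || k > n {
        return None;
    }
    let mut sorted = data.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));

    // Prefix sums of values and squared values give O(1) group costs.
    let mut sum = vec![0.0; n + 1];
    let mut sum_sq = vec![0.0; n + 1];
    for (i, &x) in sorted.iter().enumerate() {
        sum[i + 1] = sum[i] + x;
        sum_sq[i + 1] = sum_sq[i] + x * x;
    }
    // Sum of squared deviations from the mean for sorted[j..i] (requires j < i).
    let sse = |j: usize, i: usize| -> f64 {
        let count = (i - j) as f64;
        let s = sum[i] - sum[j];
        (sum_sq[i] - sum_sq[j] - s * s / count).max(0.0)
    };

    // cost[c][i]: minimal total cost of splitting the first i points into c groups.
    // start[c][i]: index where the last of those c groups begins.
    let mut cost = vec![vec![f64::INFINITY; n + 1]; k + 1];
    let mut start = vec![vec![0usize; n + 1]; k + 1];
    cost[0][0] = 0.0;
    for c in 1..=k {
        for i in c..=n {
            for j in (c - 1)..i {
                let candidate = cost[c - 1][j] + sse(j, i);
                if candidate < cost[c][i] {
                    cost[c][i] = candidate;
                    start[c][i] = j;
                }
            }
        }
    }

    // Walk the recorded start indices backwards to recover the groups.
    let mut clusters = Vec::with_capacity(k);
    let mut end = n;
    for c in (1..=k).rev() {
        let begin = start[c][end];
        clusters.push(sorted[begin..end].to_vec());
        end = begin;
    }
    clusters.reverse();
    Some(clusters)
}
```

Because the optimal groups of 1-dimensional data are always contiguous in sorted order, the search only has to consider where each group starts, which is what makes an exact solution tractable.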

§Example

use ckmeans::ckmeans;

let input = vec![
    1.0, 12.0, 13.0, 14.0, 15.0, 16.0, 2.0,
    2.0, 3.0, 5.0, 7.0, 1.0, 2.0, 5.0, 7.0,
    1.0, 5.0, 82.0, 1.0, 1.3, 1.1, 78.0,
];
let expected = vec![
    vec![
        1.0, 1.0, 1.0, 1.0, 1.1, 1.3, 2.0, 2.0,
        2.0, 3.0, 5.0, 5.0, 5.0, 7.0, 7.0,
    ],
    vec![12.0, 13.0, 14.0, 15.0, 16.0],
    vec![78.0, 82.0],
];

let result = ckmeans(&input, 3).unwrap();
assert_eq!(result, expected);

Structs§

ExternalArray
Wrapper for a void pointer to a sequence of floats representing data to be clustered using ckmeans, and the sequence length. Used for FFI.
InternalArray
Wrapper for a void pointer to a sequence of floats representing a single ckmeans result class, and the sequence length. Used for FFI.
WrapperArray
Wrapper for a void pointer to a sequence of InternalArrays, and the sequence length. Used for FFI.

Enums§

CkmeansErr
Ckmeans Errors

Traits§

CkNum
A trait that encompasses most common numeric types (integer and floating point)

Functions§

ckmeans
Minimizing the difference within groups (what Wang & Song refer to as withinss, or within sum-of-squares) means that each group is as homogeneous as possible and the data is split into representative groups. This is very useful for visualization, where one may wish to represent a continuous variable in discrete colour or style groups. This function can provide groups, or “classes”, that emphasize differences between data.
ckmeans_ffi
An FFI wrapper for ckmeans. Data returned by this function must be freed by calling drop_ckmeans_result before exiting.
ckmeans_indices
Cluster data and return the sorted data with cluster index ranges. This avoids copying data into separate cluster vectors.
ckmeans_wasm
A WASM wrapper for ckmeans
drop_ckmeans_result
Drop data returned by ckmeans_ffi.
roundbreaks
The boundaries of the classes returned by ckmeans are “ugly” in the sense that the values returned are the lower bound of each cluster, which can’t be used for labelling, since they might have many decimal places. To create a legend, the values should be rounded, but the rounding might be either too loose (and would result in spurious decimal places), or too strict, resulting in classes ranging “from x to x”. A better approach is to choose the roundest number that separates the lowest point of a class from the highest point in the preceding class, thus giving just enough precision to distinguish the classes.
roundbreaks_wasm
A WASM wrapper for roundbreaks
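
The rounding idea described for roundbreaks can be sketched as follows. This is a simplified heuristic and not the crate's algorithm: it picks the largest multiple of a power of ten that fits in the gap between two neighbouring classes, which keeps only as much precision as the gap requires. The names `round_break` and `legend_breaks` are hypothetical, and the sketch assumes each class is sorted in ascending order.

```rust
/// Hypothetical helper: a "round" value b with lo < b <= hi, where `lo` is the
/// highest point of the preceding class and `hi` the lowest point of the next.
fn round_break(lo: f64, hi: f64) -> Option<f64> {
    if !(lo.is_finite() && hi.is_finite()) || hi <= lo {
        return None;
    }
    // Largest power of ten that is no bigger than the gap between the classes.
    let exponent = (hi - lo).log10().floor() as i32;
    let step = 10f64.powi(exponent);
    // Round `hi` down to a multiple of that step; since step <= hi - lo,
    // the result stays above `lo`.
    Some((hi / step).floor() * step)
}

/// Hypothetical helper: one break per pair of adjacent, ascending-sorted classes.
fn legend_breaks(clusters: &[Vec<f64>]) -> Vec<f64> {
    clusters
        .windows(2)
        .filter_map(|pair| {
            let lo = *pair[0].last()?;
            let hi = *pair[1].first()?;
            round_break(lo, hi)
        })
        .collect()
}
```

For the three classes in the crate-level example, this sketch separates 7.0 from 12.0 with a break at 12.0, and 16.0 from 78.0 with a break at 70.0. With fractional steps the result is subject to ordinary floating-point rounding.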

Type Aliases§

ClusterIndices
Result type for ckmeans_indices: (sorted_data, cluster_ranges)
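
The (sorted_data, cluster_ranges) shape lets callers view each cluster as a slice of the sorted data instead of owning a separate vector per cluster. The sketch below illustrates that pattern under an assumption: the exact definition of ClusterIndices is not shown on this page, so `SortedWithRanges` is a hypothetical stand-in that uses `f64` values and half-open `Range<usize>` indices.

```rust
use std::ops::Range;

/// Hypothetical stand-in for a (sorted_data, cluster_ranges) pair;
/// the crate's actual `ClusterIndices` alias may be defined differently.
type SortedWithRanges = (Vec<f64>, Vec<Range<usize>>);

/// Borrow each cluster as a slice of the sorted data; no values are copied.
fn cluster_slices(indexed: &SortedWithRanges) -> Vec<&[f64]> {
    let (sorted, ranges) = indexed;
    ranges.iter().map(|r| &sorted[r.clone()]).collect()
}
```

Each returned slice borrows from the sorted vector, so the clusters stay valid for as long as the pair itself is alive.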