Ckmeans clustering is an improvement on heuristic-based 1-dimensional (univariate) clustering approaches such as Jenks. The algorithm was developed by Haizhou Wang and Mingzhou Song (2011) as a dynamic programming approach to the problem of clustering numeric data into groups with the least within-group sum-of-squared-deviations.
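To make the objective concrete, the following is a minimal sketch of a dynamic program that finds the partition of sorted 1-dimensional data with the least total within-group sum of squared deviations. It is not the crate's implementation: the names `withinss` and `cluster_1d` are hypothetical, the input is assumed to contain only finite values, and the naive cost computation makes it far slower than the algorithm described by Wang and Song.

```rust
/// Sum of squared deviations from the mean of one group.
/// (Hypothetical helper for illustration, not part of the crate.)
fn withinss(group: &[f64]) -> f64 {
    if group.is_empty() {
        return 0.0;
    }
    let mean = group.iter().sum::<f64>() / group.len() as f64;
    group.iter().map(|x| (x - mean).powi(2)).sum()
}

/// Illustrative dynamic program: split `data` into `k` contiguous groups
/// (after sorting) so that the total withinss is minimal.
/// Returns `None` when `k` is zero or exceeds the number of points.
/// Assumes all values are finite.
fn cluster_1d(data: &[f64], k: usize) -> Option<Vec<Vec<f64>>> {
    let n = data.len();
    if k == 0 || k > n {
        return None;
    }
    let mut sorted = data.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));

    // cost[m][i]: least total withinss for the first `i` points in `m` groups.
    // split[m][i]: start index of the last group in that optimal solution.
    let mut cost = vec![vec![f64::INFINITY; n + 1]; k + 1];
    let mut split = vec![vec![0usize; n + 1]; k + 1];
    cost[0][0] = 0.0;

    for m in 1..=k {
        for i in m..=n {
            for j in (m - 1)..i {
                // The last group is sorted[j..i]; the rest is solved already.
                let c = cost[m - 1][j] + withinss(&sorted[j..i]);
                if c < cost[m][i] {
                    cost[m][i] = c;
                    split[m][i] = j;
                }
            }
        }
    }

    // Walk the split table backwards to recover the groups.
    let mut groups = Vec::with_capacity(k);
    let mut end = n;
    for m in (1..=k).rev() {
        let start = split[m][end];
        groups.push(sorted[start..end].to_vec());
        end = start;
    }
    groups.reverse();
    Some(groups)
}
```

For instance, `cluster_1d(&[1.0, 2.0, 10.0, 11.0, 50.0], 3)` yields the groups `[1.0, 2.0]`, `[10.0, 11.0]` and `[50.0]`. Because the optimal groups are always contiguous in sorted order, the search only has to consider where each group ends, which is what makes a dynamic program applicable.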
§Example
use ckmeans::ckmeans;
let input = vec![
1.0, 12.0, 13.0, 14.0, 15.0, 16.0, 2.0,
2.0, 3.0, 5.0, 7.0, 1.0, 2.0, 5.0, 7.0,
1.0, 5.0, 82.0, 1.0, 1.3, 1.1, 78.0,
];
let expected = vec![
vec![
1.0, 1.0, 1.0, 1.0, 1.1, 1.3, 2.0, 2.0,
2.0, 3.0, 5.0, 5.0, 5.0, 7.0, 7.0,
],
vec![12.0, 13.0, 14.0, 15.0, 16.0],
vec![78.0, 82.0],
];
let result = ckmeans(&input, 3).unwrap();
assert_eq!(result, expected);
Structs§
- ExternalArray - Wrapper for a void pointer to a sequence of floats representing data to be clustered using ckmeans, and the sequence length. Used for FFI.
- InternalArray - Wrapper for a void pointer to a sequence of floats representing a single ckmeans result class, and the sequence length. Used for FFI.
- WrapperArray - Wrapper for a void pointer to a sequence of InternalArrays, and the sequence length. Used for FFI.
Enums§
- CkmeansErr - Ckmeans errors
Traits§
- CkNum - A trait that encompasses most common numeric types (integer and floating point)
Functions§
- ckmeans - Minimizing the difference within groups (what Wang & Song refer to as withinss, or within sum-of-squares) means that groups are optimally homogeneous and the data is split into representative groups. This is very useful for visualization, where one may wish to represent a continuous variable in discrete colour or style groups. This function can provide groups, or “classes”, that emphasize differences between data.
- ckmeans_ffi - An FFI wrapper for ckmeans. Data returned by this function must be freed by calling drop_ckmeans_result before exiting.
- ckmeans_indices - Cluster data and return the sorted data with cluster index ranges. This avoids copying data into separate cluster vectors.
- ckmeans_wasm - A WASM wrapper for ckmeans
- drop_ckmeans_result - Drop data returned by ckmeans_ffi.
- roundbreaks - The boundaries of the classes returned by ckmeans are “ugly” in the sense that the values returned are the lower bound of each cluster, which can’t be used for labelling, since they might have many decimal places. To create a legend, the values should be rounded, but the rounding might be either too loose (resulting in spurious decimal places) or too strict (resulting in classes ranging “from x to x”). A better approach is to choose the roundest number that separates the lowest point of a class from the highest point of the preceding class, giving just enough precision to distinguish the classes.
- roundbreaks_wasm - A WASM wrapper for roundbreaks
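The idea behind roundbreaks can be illustrated with a small sketch. The helper below is hypothetical and is not the crate's implementation: it assumes "roundest" means a multiple of the coarsest power of ten that still falls between the two points, and it ignores floating-point edge cases near exact multiples.

```rust
/// Hypothetical helper: find a "round" break `x` with `lo < x <= hi`, where
/// `lo` is the highest point of the preceding class and `hi` is the lowest
/// point of the current class. Steps are tried from coarse (10^15) to fine
/// (10^-15); the first step with a multiple in the interval wins.
fn roundest_between(lo: f64, hi: f64) -> Option<f64> {
    if !(lo.is_finite() && hi.is_finite()) || lo >= hi {
        return None;
    }
    for exp in (-15..=15).rev() {
        // Largest multiple of 10^exp that does not exceed `hi`.
        let candidate = if exp >= 0 {
            let step = 10f64.powi(exp);
            (hi / step).floor() * step
        } else {
            // Dividing by 10^-exp keeps decimal results such as 1.7 tidy.
            let scale = 10f64.powi(-exp);
            (hi * scale).floor() / scale
        };
        if candidate > lo {
            return Some(candidate);
        }
    }
    None
}
```

With the classes from the example above, the break between 7.0 and 12.0 comes out as 10.0, since 10 is the only multiple of ten in that interval. For two nearby points such as 1.25 and 1.75, no whole number separates them, so the sketch falls back to one decimal place and returns 1.7.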
Type Aliases§
- ClusterIndices - Result type for ckmeans_indices: (sorted_data, cluster_ranges)
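A result of the shape (sorted_data, cluster_ranges) can be read without copying by borrowing slices of the sorted data. The sketch below uses a hypothetical stand-in type, `SortedWithRanges`, and assumes each range is a half-open `(start, end)` pair of indices; the crate's actual `ClusterIndices` definition may use different element types.

```rust
/// Hypothetical stand-in for a (sorted_data, cluster_ranges) result.
/// Assumption: each range is a half-open (start, end) index pair.
type SortedWithRanges = (Vec<f64>, Vec<(usize, usize)>);

/// Borrow one slice per cluster from the sorted data; no values are copied.
fn cluster_slices(result: &SortedWithRanges) -> Vec<&[f64]> {
    let (sorted, ranges) = result;
    ranges
        .iter()
        .map(|&(start, end)| &sorted[start..end])
        .collect()
}
```

Each returned slice borrows from the single sorted vector, so the per-cluster views cost only a pointer and a length, which is the saving the index-based result is meant to provide.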