Crate exact_clustering

Source
Expand description

Find optimal clusterings and hierarchical clusterings on up to 32 points. If you only need approximate clusterings, there are excellent other crates available that run significantly faster.

To create discrete (weighted or unweighted) clustering-problems, see Discrete. For continuous kmeans-clustering, see KMeans and WeightedKMeans. If you’d like to solve other clustering-problems, implement the Cost-trait (and feel free to submit a pull-request!), or submit an issue on GitHub.

Among others, the Cost-trait allows you to calculate:

§Example

use ndarray::prelude::*;
use std::collections::BTreeSet;
use exact_clustering::{Cost as _, Discrete, KMeans};

// Set of 2d-points looking like ⠥
let points = vec![
    array![0.0, 0.0],
    array![1.0, 0.0],
    array![0.0, 2.0],
];
// Instances are mutable to allow caching cluster-costs
let mut discrete_kmedian = Discrete::kmedian(&points).unwrap();
// All optimal clusterings are calculated at once to permit some speedups.
let (cost, clusters) = &discrete_kmedian.optimal_clusterings()[2];

assert_eq!(*cost, 1.0);
// Each cluster in the returned clustering is a set of point-indices:
assert_eq!(
    BTreeSet::from([BTreeSet::from([0, 1]), BTreeSet::from([2])]),
    clusters
        .iter()
        .cloned()
        .map(BTreeSet::from_iter)
        .collect(),
);

let price_of_hierarchy = discrete_kmedian.price_of_hierarchy().0;
assert_eq!(price_of_hierarchy, 1.0);

let price_of_greedy = discrete_kmedian.price_of_greedy().0;
assert_eq!(price_of_greedy, 1.0);

Structs§

Cluster
A compact representation of a cluster of points, using a bitset.
ClusterIter
An iterator over the point-indices in a cluster.
Discrete
A clustering-problem where each center must be one of the points that are to be clustered.
KMeans
A clustering-problem where each center can be any point in the metric space.
WeightedKMeans
A weighted clustering-problem where each center can be any point in the metric space.

Enums§

Error
An error-type for creating clustering-problems.

Constants§

MAX_POINT_COUNT
The maximum number of points we can cluster before we overflow Storage.

Traits§

Cost
A trait for cost-functions for a class of clustering-problems.

Functions§

cluster_from_iterator
Construct a cluster from an iterator of point-indices.

Type Aliases§

Clustering
A partition of a set of points into disjoint clusters.
Point
A single point.
Storage
The storage-medium for representing clusters compactly via a bitset.
WeightedPoint
A weighted point.