Crate larch

§larch

This crate provides a generic, trait-based framework for clustering and sampling algorithms. Its main goal is to implement a hierarchical clustering method that determines k for seeding a k-means clustering algorithm.

It supports:

  • Metric-based clustering via the [Metric] trait.
  • Hierarchical seeding of k-means via agglomerative clustering with tree cutting.
  • Centroid-based clustering methods like k-means and hierarchical clustering through [ClusterCompare].
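To make the idea of "tree cutting to determine k" concrete, here is a minimal sketch of one common heuristic: cut the dendrogram at the largest jump between consecutive merge distances. This is an assumption for illustration only; the crate's actual cutting rule is not documented here, and the function `k_from_largest_gap` is hypothetical and not part of larch.

```rust
/// Hypothetical helper (not part of larch): picks k from the merge distances
/// of an agglomerative clustering of `n` points, given in ascending order
/// (so `merge_dists.len() == n - 1`).
///
/// The tree is cut at the largest gap between consecutive merge distances:
/// merges below the gap are kept, merges above it are undone.
fn k_from_largest_gap(merge_dists: &[f64]) -> usize {
    let n = merge_dists.len() + 1;
    let mut best_gap = 0.0;
    // Default: no clear gap, so keep every merge and end up with one cluster.
    let mut merges_kept = merge_dists.len();
    for (i, pair) in merge_dists.windows(2).enumerate() {
        let gap = pair[1] - pair[0];
        if gap > best_gap {
            best_gap = gap;
            // Keep merges 0..=i, undo everything after the gap.
            merges_kept = i + 1;
        }
    }
    // Each kept merge reduces the cluster count by one.
    n - merges_kept
}
```

For three points where two lie close together and one is far away, the merge distances contain one small and one large value, so the sketch cuts between them and yields two clusters.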

§Overview

The crate is designed for composability and generic use. Implement the [Metric] trait for your own data type, then derive additional clustering behavior automatically through the provided default methods.
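The "implement one required method, get the rest through default methods" pattern can be sketched in isolation. The trait `Dist`, its method `nearest_index`, and the type `P` below are hypothetical stand-ins chosen for illustration; they are not larch's actual definitions.

```rust
/// Hypothetical stand-in for a metric trait; not larch's actual definition.
trait Dist: Sized {
    /// Required: distance between two values.
    fn dist(&self, other: &Self) -> f64;

    /// Provided: index of the candidate closest to `self`, or `None` if
    /// `candidates` is empty. Implementors get this without writing it.
    fn nearest_index(&self, candidates: &[Self]) -> Option<usize> {
        candidates
            .iter()
            .enumerate()
            .map(|(i, c)| (i, self.dist(c)))
            .min_by(|a, b| a.1.total_cmp(&b.1))
            .map(|(i, _)| i)
    }
}

#[derive(Clone, Copy, Debug, PartialEq)]
struct P(f64, f64);

impl Dist for P {
    // Euclidean distance; `hypot` computes sqrt(dx^2 + dy^2).
    fn dist(&self, other: &Self) -> f64 {
        (self.0 - other.0).hypot(self.1 - other.1)
    }
}
```

With only `dist` implemented, `P(1.0, 1.0).nearest_index(&[P(0.0, 0.0), P(10.0, 10.0)])` is already available and selects the first candidate.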

§Example

use larch::prelude::*;

#[derive(Clone, Copy, Debug)]
struct Point(f64, f64);

impl Metric<Point> for Point {
    fn distance(&self, other: &Point) -> f64 {
        ((self.0 - other.0).powi(2) + (self.1 - other.1).powi(2)).sqrt()
    }
}

impl ClusterCompare<Point> for Point {
    fn compute_centroid(data: &Vec<Point>) -> Point {
        let (sx, sy): (f64, f64) = data.iter().fold((0.0, 0.0), |acc, p| (acc.0 + p.0, acc.1 + p.1));
        let n = data.len() as f64;
        Point(sx / n, sy / n)
    }
}


let points = vec![Point(0.0, 0.0), Point(1.0, 1.0), Point(10.0, 10.0)];
// normal k-means
let kmeans_clusters = Point::kmeans(&points, 2);
// determine k by hierarchical clustering + tree cutting then run kmeans
let seeded_clusters = Point::hierarchical_seeded_kmeans(&points);
println!("K-means clusters: {:?}", kmeans_clusters);
println!("Seeded K-means clusters: {:?}", seeded_clusters);

§Note

This library is not highly optimized, and due to Rust's implementation of [BinaryHeap], the hierarchical clustering is not deterministic.
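For readers who need reproducible results, one general technique is to give heap entries a total order with an explicit tie-breaker, since the standard library's `BinaryHeap` makes no stability guarantee for equal elements. The sketch below illustrates that technique only. `Candidate` and `merge_order` are hypothetical names, not part of larch, and it is an assumption that tied distances are what causes the behavior described in the note above.

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;

/// Hypothetical heap entry: a merge candidate between clusters `a` and `b`.
#[derive(Debug, Clone, Copy)]
struct Candidate {
    dist: f64,
    a: usize,
    b: usize,
}

impl Ord for Candidate {
    fn cmp(&self, other: &Self) -> Ordering {
        // BinaryHeap is a max-heap, so the comparison is reversed to pop the
        // smallest distance first; ties are broken on the index pair.
        other
            .dist
            .total_cmp(&self.dist)
            .then_with(|| other.a.cmp(&self.a))
            .then_with(|| other.b.cmp(&self.b))
    }
}

impl PartialOrd for Candidate {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

// Equality is defined through `cmp` so it stays consistent with the ordering.
impl PartialEq for Candidate {
    fn eq(&self, other: &Self) -> bool {
        self.cmp(other) == Ordering::Equal
    }
}

impl Eq for Candidate {}

/// Drains a heap built from `candidates` and returns the `(a, b)` pairs in
/// pop order, which no longer depends on insertion order.
fn merge_order(candidates: &[Candidate]) -> Vec<(usize, usize)> {
    let mut heap: BinaryHeap<Candidate> = candidates.iter().copied().collect();
    let mut order = Vec::with_capacity(heap.len());
    while let Some(c) = heap.pop() {
        order.push((c.a, c.b));
    }
    order
}
```

Because no two distinct candidates compare as equal, the pop order is the same regardless of how the heap was filled.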

§License

larch is licensed under the BSD 3-Clause license (see LICENSE in the main repository).

Modules§

cluster
prelude