
Crate torsh_cluster


Unsupervised learning and clustering algorithms for ToRSh

This crate provides PyTorch-compatible clustering algorithms built on top of the SciRS2 ecosystem. It implements the clustering methods listed below, most of which are configured through a dedicated config type (for example `KMeansConfig` or `DBSCANConfig`).

§Key Features

  • K-Means Clustering: Classic centroid-based clustering with multiple variants (Lloyd, Elkan, Mini-batch)
  • Gaussian Mixture Models: Probabilistic clustering fitted with the EM algorithm (full, diagonal, or spherical covariance)
  • Spectral Clustering: Graph-based clustering using eigendecomposition
  • DBSCAN: Density-based clustering for arbitrary-shaped clusters with noise detection
  • HDBSCAN: Hierarchical DBSCAN for varying density clusters
  • OPTICS: Ordering Points To Identify the Clustering Structure with reachability plots
  • Hierarchical Clustering: Agglomerative clustering with multiple linkage methods
  • Online K-Means: Incremental clustering for streaming data with concept drift detection
  • Evaluation Metrics: Silhouette score, adjusted Rand index (ARI), normalized mutual information (NMI), gap statistic, and other internal and external validity measures
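
To make the "Online K-Means" item above concrete, the following is a minimal, standard-library-only sketch of the running-mean update that incremental centroid methods are commonly based on. It is not the implementation behind `OnlineKMeans` or `IncrementalCentroidUpdater`; the names `RunningCentroid` and `assign_and_update` are hypothetical and exist only for this illustration, and drift detection is left out.

```rust
/// Running-mean centroid for a stream of points.
/// Hypothetical helper for illustration; not a torsh_cluster type.
#[derive(Debug, Clone)]
pub struct RunningCentroid {
    pub mean: Vec<f64>,
    pub count: u64,
}

impl RunningCentroid {
    /// Seed the centroid with its first observed point.
    pub fn seeded(point: &[f64]) -> Self {
        Self { mean: point.to_vec(), count: 1 }
    }

    /// Fold one point into the mean: mean += (x - mean) / n.
    pub fn update(&mut self, point: &[f64]) {
        assert_eq!(point.len(), self.mean.len(), "dimension mismatch");
        self.count += 1;
        let n = self.count as f64;
        for (m, &x) in self.mean.iter_mut().zip(point) {
            *m += (x - *m) / n;
        }
    }
}

/// Assign `point` to the nearest centroid (squared Euclidean distance),
/// update that centroid in place, and return its index.
pub fn assign_and_update(centroids: &mut [RunningCentroid], point: &[f64]) -> usize {
    assert!(!centroids.is_empty(), "need at least one centroid");
    let mut best = 0;
    let mut best_dist = f64::INFINITY;
    for (i, c) in centroids.iter().enumerate() {
        let d: f64 = c.mean.iter().zip(point).map(|(m, x)| (m - x).powi(2)).sum();
        if d < best_dist {
            best = i;
            best_dist = d;
        }
    }
    centroids[best].update(point);
    best
}
```

Each point is touched once and then discarded, so memory use depends on the number of centroids rather than on the length of the stream.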

§SciRS2 Integration

All clustering algorithms are built on the scirs2-cluster foundation. The crate:

  • Leverages scirs2-core for random number generation and array operations
  • Uses scirs2-stats for statistical computations
  • Integrates with scirs2-metrics for clustering evaluation
  • Employs scirs2-linalg for linear algebra operations

§Example Usage

use torsh_cluster::prelude::*;
use torsh_tensor::creation::randn;

// Create sample data
let data = randn::<f32>(&[100, 2])?;

// Perform K-means clustering
let kmeans = KMeans::new(3)
    .max_iters(100)
    .tolerance(1e-4);

let result = kmeans.fit(&data)?;
println!("Cluster centers: {:?}", result.centroids);
println!("Labels: {:?}", result.labels);
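
As a rough guide to what the `max_iters` and `tolerance` settings in the example control, here is a standard-library-only sketch of Lloyd's algorithm. It is an illustration under stated assumptions, not the crate's `KMeans` implementation: the function `lloyd`, the struct `LloydResult`, the use of caller-supplied initial centroids, and the choice of "largest centroid shift" as the convergence test are all hypothetical.

```rust
/// Output of the sketch. Hypothetical; not torsh_cluster's `KMeansResult`.
pub struct LloydResult {
    pub centroids: Vec<Vec<f64>>,
    pub labels: Vec<usize>,
    pub n_iters: usize,
}

fn sq_dist(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum()
}

/// Lloyd's algorithm starting from caller-supplied centroids.
/// Stops after `max_iters` iterations, or earlier once no centroid
/// moves farther than `tolerance` (assumed stopping rule).
pub fn lloyd(
    data: &[Vec<f64>],
    init: Vec<Vec<f64>>,
    max_iters: usize,
    tolerance: f64,
) -> LloydResult {
    assert!(!init.is_empty(), "need at least one initial centroid");
    let k = init.len();
    let dim = init[0].len();
    let mut centroids = init;
    let mut labels = vec![0usize; data.len()];
    let mut n_iters = 0;

    for _ in 0..max_iters {
        n_iters += 1;

        // Assignment step: label each point with its nearest centroid.
        for (label, p) in labels.iter_mut().zip(data) {
            let mut best = 0;
            let mut best_d = f64::INFINITY;
            for (j, c) in centroids.iter().enumerate() {
                let d = sq_dist(p, c);
                if d < best_d {
                    best = j;
                    best_d = d;
                }
            }
            *label = best;
        }

        // Update step: move each centroid to the mean of its points.
        let mut sums = vec![vec![0.0; dim]; k];
        let mut counts = vec![0usize; k];
        for (&label, p) in labels.iter().zip(data) {
            counts[label] += 1;
            for (s, &x) in sums[label].iter_mut().zip(p) {
                *s += x;
            }
        }
        let mut shift = 0.0f64;
        for j in 0..k {
            if counts[j] == 0 {
                continue; // an empty cluster keeps its previous centroid
            }
            let new_c: Vec<f64> = sums[j].iter().map(|s| s / counts[j] as f64).collect();
            shift = shift.max(sq_dist(&new_c, &centroids[j]).sqrt());
            centroids[j] = new_c;
        }

        if shift <= tolerance {
            break;
        }
    }

    LloydResult { centroids, labels, n_iters }
}
```

A smaller `tolerance` generally means more iterations before the loop exits, while `max_iters` bounds the work when the centroids keep moving.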

Re-exports§

pub use algorithms::dbscan::DBSCANConfig;
pub use algorithms::dbscan::DBSCANResult;
pub use algorithms::dbscan::HDBSCANConfig;
pub use algorithms::dbscan::HDBSCANResult;
pub use algorithms::dbscan::DBSCAN;
pub use algorithms::dbscan::HDBSCAN;
pub use algorithms::gaussian_mixture::GMConfig;
pub use algorithms::gaussian_mixture::GMResult;
pub use algorithms::gaussian_mixture::GaussianMixture;
pub use algorithms::hierarchical::AgglomerativeClustering;
pub use algorithms::hierarchical::HierarchicalResult;
pub use algorithms::hierarchical::Linkage;
pub use algorithms::incremental::IncrementalClustering;
pub use algorithms::incremental::OnlineKMeans;
pub use algorithms::incremental::OnlineKMeansConfig;
pub use algorithms::incremental::OnlineKMeansResult;
pub use algorithms::incremental::SlidingWindowConfig;
pub use algorithms::incremental::SlidingWindowKMeans;
pub use algorithms::incremental::SlidingWindowResult;
pub use algorithms::kmeans::InitMethod;
pub use algorithms::kmeans::KMeans;
pub use algorithms::kmeans::KMeansAlgorithm;
pub use algorithms::kmeans::KMeansConfig;
pub use algorithms::kmeans::KMeansResult;
pub use algorithms::optics::OPTICSConfig;
pub use algorithms::optics::OPTICSResult;
pub use algorithms::optics::OPTICS;
pub use algorithms::spectral::SpectralClustering;
pub use algorithms::spectral::SpectralConfig;
pub use algorithms::spectral::SpectralResult;
pub use evaluation::metrics::adjusted_mutual_info_score;
pub use evaluation::metrics::adjusted_rand_score;
pub use evaluation::metrics::calinski_harabasz_score;
pub use evaluation::metrics::davies_bouldin_score;
pub use evaluation::metrics::fowlkes_mallows_score;
pub use evaluation::metrics::homogeneity_score;
pub use evaluation::metrics::normalized_mutual_info_score;
pub use evaluation::metrics::silhouette_score;
pub use evaluation::metrics::v_measure_score;
pub use evaluation::ClusteringMetric;
pub use evaluation::EvaluationResult;
pub use initialization::forgy::Forgy;
pub use initialization::kmeans_plus_plus::KMeansPlusPlus;
pub use initialization::random_partition::RandomPartition;
pub use initialization::InitializationStrategy;
pub use traits::ClusteringAlgorithm;
pub use traits::ClusteringResult;
pub use traits::Fit;
pub use traits::FitPredict;
pub use traits::Transform;
pub use utils::adaptive::suggest_dbscan_params;
pub use utils::adaptive::suggest_epsilon;
pub use utils::distance::cosine_distance;
pub use utils::distance::euclidean_distance;
pub use utils::distance::manhattan_distance;
pub use utils::distance::DistanceMetric;
pub use utils::drift_detection::CompositeDriftDetector;
pub use utils::drift_detection::DriftStatus;
pub use utils::drift_detection::PageHinkleyTest;
pub use utils::drift_detection::ADWIN;
pub use utils::drift_detection::DDM;
pub use utils::memory_efficient::ChunkedDataProcessor;
pub use utils::memory_efficient::IncrementalCentroidUpdater;
pub use utils::memory_efficient::MemoryEfficientConfig;
pub use utils::preprocessing::normalize_features;
pub use utils::preprocessing::standardize_features;
pub use utils::preprocessing::PreprocessingMethod;
pub use utils::validation::validate_cluster_input;
pub use utils::validation::validate_n_clusters;
pub use utils::validation::ClusterValidation;
pub use error::ClusterError;
pub use error::ClusterResult;
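
The re-exported `euclidean_distance`, `manhattan_distance`, and `cosine_distance` helpers are listed above without their signatures. For orientation, the textbook definitions of these three metrics can be sketched on plain slices as follows. The function names and slice-based signatures here are hypothetical and are not the crate's API, which presumably operates on tensors; returning `1.0` for a zero-length vector in the cosine case is likewise an assumption of this sketch.

```rust
/// Euclidean (L2) distance: sqrt(sum((a_i - b_i)^2)).
pub fn euclidean(a: &[f64], b: &[f64]) -> f64 {
    assert_eq!(a.len(), b.len(), "dimension mismatch");
    a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum::<f64>().sqrt()
}

/// Manhattan (L1) distance: sum(|a_i - b_i|).
pub fn manhattan(a: &[f64], b: &[f64]) -> f64 {
    assert_eq!(a.len(), b.len(), "dimension mismatch");
    a.iter().zip(b).map(|(x, y)| (x - y).abs()).sum()
}

/// Cosine distance: 1 - (a . b) / (|a| * |b|).
/// Assumption: returns 1.0 when either vector has zero norm.
pub fn cosine(a: &[f64], b: &[f64]) -> f64 {
    assert_eq!(a.len(), b.len(), "dimension mismatch");
    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f64>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f64>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 1.0;
    }
    1.0 - dot / (norm_a * norm_b)
}
```

Euclidean and Manhattan distance depend on vector magnitude, whereas cosine distance depends only on direction, which is why the choice of metric can change the clusters that density-based and hierarchical methods find.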

Modules§

algorithms
Clustering algorithm implementations
error
Error types for clustering operations
evaluation
Clustering evaluation metrics and utilities
initialization
Initialization strategies for clustering algorithms
prelude
Prelude module for convenient imports
traits
Core traits for clustering algorithms
utils
Utility functions for clustering operations

Constants§

VERSION
Version information
VERSION_MAJOR
VERSION_MINOR
VERSION_PATCH