Unsupervised learning and clustering algorithms for ToRSh
This crate provides PyTorch-compatible clustering algorithms built on top of the SciRS2 ecosystem, offering high-performance implementations of popular clustering methods with extensive customization options.
§Key Features
- K-Means Clustering: Classic centroid-based clustering with multiple variants (Lloyd, Elkan, Mini-batch)
- Gaussian Mixture Models: Probabilistic clustering with EM algorithm (Full, Diagonal, Spherical covariance)
- Spectral Clustering: Graph-based clustering using eigendecomposition
- DBSCAN: Density-based clustering for arbitrary-shaped clusters with noise detection
- HDBSCAN: Hierarchical DBSCAN for varying density clusters
- OPTICS: Ordering Points To Identify the Clustering Structure with reachability plots
- Hierarchical Clustering: Agglomerative clustering with multiple linkage methods
- Online K-Means: Incremental clustering for streaming data with concept drift detection
- Evaluation Metrics: Comprehensive metrics including silhouette, ARI, NMI, Gap Statistic, and more
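To make the first item above concrete, here is a minimal, dependency-free sketch of the Lloyd variant of K-means: alternate between assigning each point to its nearest centroid and moving each centroid to the mean of its points. This is not torsh-cluster's implementation. The names `sq_dist` and `lloyd_kmeans`, the `Vec<Vec<f32>>` data layout, and the stopping rule (largest centroid shift at or below a tolerance) are assumptions made for illustration only.

```rust
/// Squared Euclidean distance between two points of equal dimension.
fn sq_dist(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Hypothetical helper (not part of torsh-cluster): runs Lloyd's algorithm
/// from the given initial centroids and returns (centroids, labels).
/// Assumes `centroids` is non-empty and every point has the same dimension.
fn lloyd_kmeans(
    data: &[Vec<f32>],
    mut centroids: Vec<Vec<f32>>,
    max_iters: usize,
    tolerance: f32,
) -> (Vec<Vec<f32>>, Vec<usize>) {
    let dim = centroids[0].len();
    let mut labels = vec![0usize; data.len()];

    for _ in 0..max_iters {
        // Assignment step: attach each point to its nearest centroid.
        for (label, point) in labels.iter_mut().zip(data) {
            let mut best = 0;
            let mut best_d = f32::INFINITY;
            for (j, c) in centroids.iter().enumerate() {
                let d = sq_dist(point, c);
                if d < best_d {
                    best = j;
                    best_d = d;
                }
            }
            *label = best;
        }

        // Update step: accumulate per-cluster sums and counts.
        let mut sums = vec![vec![0.0f32; dim]; centroids.len()];
        let mut counts = vec![0usize; centroids.len()];
        for (point, &label) in data.iter().zip(&labels) {
            counts[label] += 1;
            for (s, x) in sums[label].iter_mut().zip(point) {
                *s += x;
            }
        }

        // Move each non-empty cluster's centroid to the mean of its points
        // and track the largest distance any centroid moved.
        let mut shift = 0.0f32;
        for (j, c) in centroids.iter_mut().enumerate() {
            if counts[j] == 0 {
                continue; // an empty cluster keeps its previous centroid
            }
            let new_c: Vec<f32> = sums[j].iter().map(|s| s / counts[j] as f32).collect();
            shift = shift.max(sq_dist(c, &new_c).sqrt());
            *c = new_c;
        }

        // Stop once no centroid moved further than the tolerance.
        if shift <= tolerance {
            break;
        }
    }
    (centroids, labels)
}
```

Calling `lloyd_kmeans(&data, initial_centroids, 100, 1e-4)` on a slice of equal-length points returns the final centroids together with one cluster index per point. The sketch takes the initial centroids as an argument so that initialization stays a separate concern from the iteration itself.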
§SciRS2 Integration
All clustering algorithms are built on the scirs2-cluster foundation:
- Leverages scirs2-core for random number generation and array operations
- Uses scirs2-stats for statistical computations
- Integrates with scirs2-metrics for clustering evaluation
- Employs scirs2-linalg for linear algebra operations
§Example Usage
use torsh_cluster::prelude::*;
use torsh_tensor::creation::randn;
// Create sample data
let data = randn::<f32>(&[100, 2])?;
// Perform K-means clustering
let kmeans = KMeans::new(3)
    .max_iters(100)
    .tolerance(1e-4);
let result = kmeans.fit(&data)?;
println!("Cluster centers: {:?}", result.centroids);
println!("Labels: {:?}", result.labels);
Re-exports§
- pub use algorithms::dbscan::DBSCANConfig;
- pub use algorithms::dbscan::DBSCANResult;
- pub use algorithms::dbscan::HDBSCANConfig;
- pub use algorithms::dbscan::HDBSCANResult;
- pub use algorithms::dbscan::DBSCAN;
- pub use algorithms::dbscan::HDBSCAN;
- pub use algorithms::gaussian_mixture::GMConfig;
- pub use algorithms::gaussian_mixture::GMResult;
- pub use algorithms::gaussian_mixture::GaussianMixture;
- pub use algorithms::hierarchical::AgglomerativeClustering;
- pub use algorithms::hierarchical::HierarchicalResult;
- pub use algorithms::hierarchical::Linkage;
- pub use algorithms::incremental::IncrementalClustering;
- pub use algorithms::incremental::OnlineKMeans;
- pub use algorithms::incremental::OnlineKMeansConfig;
- pub use algorithms::incremental::OnlineKMeansResult;
- pub use algorithms::incremental::SlidingWindowConfig;
- pub use algorithms::incremental::SlidingWindowKMeans;
- pub use algorithms::incremental::SlidingWindowResult;
- pub use algorithms::kmeans::InitMethod;
- pub use algorithms::kmeans::KMeans;
- pub use algorithms::kmeans::KMeansAlgorithm;
- pub use algorithms::kmeans::KMeansConfig;
- pub use algorithms::kmeans::KMeansResult;
- pub use algorithms::optics::OPTICSConfig;
- pub use algorithms::optics::OPTICSResult;
- pub use algorithms::optics::OPTICS;
- pub use algorithms::spectral::SpectralClustering;
- pub use algorithms::spectral::SpectralConfig;
- pub use algorithms::spectral::SpectralResult;
- pub use evaluation::metrics::adjusted_mutual_info_score;
- pub use evaluation::metrics::adjusted_rand_score;
- pub use evaluation::metrics::calinski_harabasz_score;
- pub use evaluation::metrics::davies_bouldin_score;
- pub use evaluation::metrics::fowlkes_mallows_score;
- pub use evaluation::metrics::homogeneity_score;
- pub use evaluation::metrics::normalized_mutual_info_score;
- pub use evaluation::metrics::silhouette_score;
- pub use evaluation::metrics::v_measure_score;
- pub use evaluation::ClusteringMetric;
- pub use evaluation::EvaluationResult;
- pub use initialization::forgy::Forgy;
- pub use initialization::kmeans_plus_plus::KMeansPlusPlus;
- pub use initialization::random_partition::RandomPartition;
- pub use initialization::InitializationStrategy;
- pub use traits::ClusteringAlgorithm;
- pub use traits::ClusteringResult;
- pub use traits::Fit;
- pub use traits::FitPredict;
- pub use traits::Transform;
- pub use utils::adaptive::suggest_dbscan_params;
- pub use utils::adaptive::suggest_epsilon;
- pub use utils::distance::cosine_distance;
- pub use utils::distance::euclidean_distance;
- pub use utils::distance::manhattan_distance;
- pub use utils::distance::DistanceMetric;
- pub use utils::drift_detection::CompositeDriftDetector;
- pub use utils::drift_detection::DriftStatus;
- pub use utils::drift_detection::PageHinkleyTest;
- pub use utils::drift_detection::ADWIN;
- pub use utils::drift_detection::DDM;
- pub use utils::memory_efficient::ChunkedDataProcessor;
- pub use utils::memory_efficient::IncrementalCentroidUpdater;
- pub use utils::memory_efficient::MemoryEfficientConfig;
- pub use utils::preprocessing::normalize_features;
- pub use utils::preprocessing::standardize_features;
- pub use utils::preprocessing::PreprocessingMethod;
- pub use utils::validation::validate_cluster_input;
- pub use utils::validation::validate_n_clusters;
- pub use utils::validation::ClusterValidation;
- pub use error::ClusterError;
- pub use error::ClusterResult;
Modules§
- algorithms: Clustering algorithm implementations
- error: Error types for clustering operations
- evaluation: Clustering evaluation metrics and utilities
- initialization: Initialization strategies for clustering algorithms
- prelude: Prelude module for convenient imports
- traits: Core traits for clustering algorithms
- utils: Utility functions for clustering operations
Constants§
- VERSION: Version information
- VERSION_MAJOR
- VERSION_MINOR
- VERSION_PATCH