Crate sklears_clustering

Clustering algorithms for sklears

This crate provides implementations of clustering algorithms including:

  • K-Means clustering with various initialization methods
  • X-Means for automatic cluster number selection
  • G-Means for Gaussian cluster detection with automatic number selection
  • Mini-batch K-Means for large datasets
  • Fuzzy C-Means clustering with membership degrees
  • DBSCAN (Density-Based Spatial Clustering)
  • Incremental DBSCAN for streaming data and large datasets
  • HDBSCAN (Hierarchical Density-Based Spatial Clustering)
  • OPTICS (Ordering Points To Identify Clustering Structure)
  • BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies)
  • Hierarchical clustering
  • Mean Shift with adaptive bandwidth estimation
  • Density Peaks clustering for automatic cluster center detection
  • KDE Clustering using kernel density estimation for density-based clustering
  • Spectral Clustering
  • Gaussian Mixture Models with model selection criteria (AIC, BIC, ICL)
  • Dirichlet Process Mixture Models for infinite mixture modeling
  • Local Outlier Factor (LOF) for density-based outlier detection
  • CURE (Clustering Using REpresentatives) for large datasets with irregular shapes
  • ROCK (RObust Clustering using linKs) for categorical data clustering
  • Streaming clustering algorithms (Online K-Means, CluStream, Sliding Window K-Means)
  • Graph clustering algorithms (Modularity-based, Louvain, Label Propagation, Spectral)
  • Evolutionary and bio-inspired clustering algorithms (PSO, GA, ACO, ABC, Differential Evolution)
  • Comprehensive validation metrics for clustering evaluation including stability analysis

These implementations leverage scirs2’s cluster module for efficient computation.
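
To make the first entry in the list concrete, here is a minimal, standard-library-only sketch of Lloyd's algorithm, the iteration usually meant by "K-Means". It does not use this crate's `KMeans` type or scirs2, and the names `squared_distance`, `nearest_centroid`, and `lloyd_kmeans` are hypothetical helpers introduced only for illustration. Initialization is left to the caller, which is where the "various initialization methods" mentioned above would plug in.

```rust
/// Squared Euclidean distance between two points of equal dimension.
fn squared_distance(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Index of the centroid closest to `point` (ties go to the lowest index).
fn nearest_centroid(point: &[f64], centroids: &[Vec<f64>]) -> usize {
    let mut best = 0;
    let mut best_dist = f64::INFINITY;
    for (i, c) in centroids.iter().enumerate() {
        let d = squared_distance(point, c);
        if d < best_dist {
            best = i;
            best_dist = d;
        }
    }
    best
}

/// Runs Lloyd's algorithm from caller-supplied initial centroids.
/// Returns the final centroids and one cluster label per input point.
/// Illustrative only: this is NOT the sklears_clustering API.
fn lloyd_kmeans(
    points: &[Vec<f64>],
    mut centroids: Vec<Vec<f64>>,
    max_iter: usize,
) -> (Vec<Vec<f64>>, Vec<usize>) {
    // Nothing to cluster, or nothing to assign to: return no labels.
    if points.is_empty() || centroids.is_empty() {
        return (centroids, Vec::new());
    }
    let dim = centroids[0].len();
    // Starts empty so the first iteration never counts as converged.
    let mut labels: Vec<usize> = Vec::new();
    for _ in 0..max_iter {
        // Assignment step: attach every point to its nearest centroid.
        let new_labels: Vec<usize> = points
            .iter()
            .map(|p| nearest_centroid(p, &centroids))
            .collect();
        // Stop once the assignment no longer changes.
        if new_labels == labels {
            break;
        }
        labels = new_labels;
        // Update step: move each centroid to the mean of its points.
        let mut sums = vec![vec![0.0; dim]; centroids.len()];
        let mut counts = vec![0usize; centroids.len()];
        for (p, &label) in points.iter().zip(&labels) {
            counts[label] += 1;
            for (s, x) in sums[label].iter_mut().zip(p) {
                *s += *x;
            }
        }
        for (k, c) in centroids.iter_mut().enumerate() {
            // An empty cluster keeps its previous centroid.
            if counts[k] > 0 {
                for (cj, s) in c.iter_mut().zip(&sums[k]) {
                    *cj = *s / counts[k] as f64;
                }
            }
        }
    }
    (centroids, labels)
}
```

Calling `lloyd_kmeans` with four 2-D points and two starting centroids returns one label per point plus the updated centroids. A production implementation would add an initialization strategy, a tolerance-based stopping rule, and handling for empty clusters.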

Re-exports

pub use birch::BIRCHConfig;
pub use birch::ClusteringFeature;
pub use birch::BIRCH;
pub use cure::CUREConfig;
pub use cure::CUREDistanceMetric;
pub use cure::CUREFitted;
pub use cure::CURE;
pub use dbscan::DBSCANConfig;
pub use dbscan::DBSCAN;
pub use dbscan::NOISE;
pub use density_peaks::DensityPeaks;
pub use density_peaks::DensityPeaksConfig;
pub use density_peaks::DistanceMetric as DensityPeaksDistanceMetric;
pub use dirichlet_process::DirichletProcessConfig;
pub use dirichlet_process::DirichletProcessMixture;
pub use dirichlet_process::PredictProbaDP;
pub use ensemble::BaggingClustering;
pub use ensemble::EnsembleConfig;
pub use ensemble::EnsembleConfigBuilder;
pub use ensemble::EnsembleMethod;
pub use ensemble::EnsembleResult;
pub use ensemble::EvidenceAccumulationClustering;
pub use ensemble::VotingEnsemble;
pub use evolutionary::PSOClustering;
pub use evolutionary::PSOClusteringBuilder;
pub use evolutionary::PSOClusteringFitted;
pub use feature_selection::FeatureSelectionConfig;
pub use feature_selection::FeatureSelectionConfigBuilder;
pub use feature_selection::FeatureSelectionMethod;
pub use feature_selection::FeatureSelectionResult;
pub use feature_selection::FeatureSelector;
pub use fuzzy_cmeans::FuzzyCMeans;
pub use fuzzy_cmeans::FuzzyCMeansConfig;
pub use fuzzy_cmeans::PredictMembership;
pub use gmm::BayesianGaussianMixture;
pub use gmm::CovarianceType;
pub use gmm::GaussianMixture;
pub use gmm::GaussianMixtureConfig;
pub use gmm::ModelSelectionCriterion;
pub use gmm::ModelSelectionResult;
pub use gmm::PredictProba;
pub use gmm::WeightInit;
pub use graph_clustering::Graph;
pub use graph_clustering::GraphClusteringResult;
pub use graph_clustering::LabelPropagationClustering;
pub use graph_clustering::LabelPropagationConfig as GraphLabelPropagationConfig;
pub use graph_clustering::LouvainClustering;
pub use graph_clustering::LouvainConfig;
pub use graph_clustering::LouvainResult;
pub use graph_clustering::ModularityClustering;
pub use graph_clustering::ModularityClusteringConfig;
pub use graph_clustering::SpectralGraphClustering;
pub use graph_clustering::SpectralGraphConfig;
pub use hdbscan::ClusterStat;
pub use hdbscan::HDBSCANConfig;
pub use hdbscan::HDBSCAN;
pub use hierarchical::AgglomerativeClustering;
pub use hierarchical::AgglomerativeClusteringConfig;
pub use hierarchical::Constraint;
pub use hierarchical::ConstraintSet;
pub use hierarchical::Dendrogram;
pub use hierarchical::DendrogramExport;
pub use hierarchical::DendrogramLinkExport;
pub use hierarchical::DendrogramNode;
pub use hierarchical::DendrogramNodeExport;
pub use hierarchical::MemoryStrategy;
pub use incremental_dbscan::DistanceMetric as IncrementalDistanceMetric;
pub use incremental_dbscan::IncrementalDBSCAN;
pub use incremental_dbscan::IncrementalDBSCANConfig;
pub use kde_clustering::BandwidthMethod;
pub use kde_clustering::KDEClustering;
pub use kde_clustering::KDEClusteringConfig;
pub use kde_clustering::KernelType;
pub use kmeans::GMeans;
pub use kmeans::GMeansConfig;
pub use kmeans::InformationCriterion;
pub use kmeans::KMeans;
pub use kmeans::KMeansConfig;
pub use kmeans::KMeansInit;
pub use kmeans::MiniBatchKMeans;
pub use kmeans::MiniBatchKMeansConfig;
pub use kmeans::XMeans;
pub use kmeans::XMeansConfig;
pub use locality_sensitive_hashing::LSHConfig;
pub use locality_sensitive_hashing::LSHFamily;
pub use locality_sensitive_hashing::LSHIndex;
pub use locality_sensitive_hashing::LSHIndexStats;
pub use locality_sensitive_hashing::MemoryUsage;
pub use locality_sensitive_hashing::TableStats;
pub use lof::DistanceMetric as LOFDistanceMetric;
pub use lof::LOFConfig;
pub use lof::LOF;
pub use mean_shift::MeanShift;
pub use mean_shift::MeanShiftConfig;
pub use memory_mapped::MemoryMappedConfig;
pub use memory_mapped::MemoryMappedDistanceMatrix;
pub use memory_mapped::MemoryStats;
pub use multi_view::ConsensusClustering;
pub use multi_view::ConsensusClusteringConfig;
pub use multi_view::ConsensusClusteringFitted;
pub use multi_view::ConsensusMethod;
pub use multi_view::MultiViewData;
pub use multi_view::MultiViewKMeans;
pub use multi_view::MultiViewKMeansConfig;
pub use multi_view::MultiViewKMeansFitted;
pub use multi_view::ViewWeighting;
pub use multi_view::WeightLearning;
pub use optics::Algorithm;
pub use optics::ClusterMethod;
pub use optics::DistanceMetric as OpticsDistanceMetric;
pub use optics::Optics;
pub use optics::OpticsConfig;
pub use optics::OpticsOrdering;
pub use out_of_core::ClusterSummary;
pub use out_of_core::OutOfCoreConfig;
pub use out_of_core::OutOfCoreDataLoader;
pub use out_of_core::OutOfCoreKMeans;
pub use rock::ROCKConfig;
pub use rock::ROCKFitted;
pub use rock::ROCKSimilarity;
pub use rock::ROCK;
pub use semi_supervised::ConstrainedKMeans;
pub use semi_supervised::ConstrainedKMeansConfig;
pub use semi_supervised::ConstrainedKMeansFitted;
pub use semi_supervised::ConstraintHandling;
pub use semi_supervised::ConstraintType;
pub use semi_supervised::LabelPropagation;
pub use semi_supervised::LabelPropagationConfig;
pub use semi_supervised::LabelPropagationFitted;
pub use simd_distances::simd_distance;
pub use simd_distances::simd_distance_batch;
pub use simd_distances::simd_k_nearest_neighbors;
pub use simd_distances::DistanceMetric;
pub use simd_distances::OptimizedDistanceComputer;
pub use simd_distances::SimdDistanceMetric;
pub use sparse_matrix::GraphStats;
pub use sparse_matrix::SparseDistanceMatrix;
pub use sparse_matrix::SparseEntry;
pub use sparse_matrix::SparseMatrixConfig;
pub use sparse_matrix::SparseMatrixStats;
pub use sparse_matrix::SparseNeighborhoodGraph;
pub use spectral::Affinity;
pub use spectral::EigenSolver;
pub use spectral::NormalizationMethod;
pub use spectral::SpectralClustering;
pub use spectral::SpectralClusteringConfig;
pub use streaming::CluStream;
pub use streaming::MicroCluster;
pub use streaming::OnlineKMeans;
pub use streaming::SlidingWindowKMeans;
pub use streaming::StreamingConfig;
pub use text_clustering::DocumentClustering;
pub use text_clustering::DocumentClusteringConfig;
pub use text_clustering::DocumentClusteringResult;
pub use text_clustering::SphericalInit;
pub use text_clustering::SphericalKMeans;
pub use text_clustering::SphericalKMeansConfig;
pub use text_clustering::SphericalKMeansFitted;
pub use time_series::CentroidAveraging;
pub use time_series::ChangeDetectionTest;
pub use time_series::DTWKMeans;
pub use time_series::DTWKMeansConfig;
pub use time_series::DTWKMeansFitted;
pub use time_series::RegimeChangeConfig;
pub use time_series::RegimeChangeDetector;
pub use time_series::RegimeChangeResult;
pub use time_series::ShapeClustering;
pub use time_series::ShapeClusteringConfig;
pub use time_series::ShapeClusteringFitted;
pub use time_series::ShapeDistanceMetric;
pub use time_series::TemporalSegmentationClustering;
pub use time_series::TemporalSegmentationConfig;
pub use time_series::TemporalSegmentationResult;
pub use validation::ClusteringValidator;
pub use validation::GapStatisticResult;
pub use validation::SilhouetteResult;
pub use validation::ValidationMetric;
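
Several modules in the list above (`density_peaks`, `incremental_dbscan`, `lof`, `optics`, `simd_distances`) each export a type named `DistanceMetric`, and the crate root renames most of them with `as` so they can coexist in one namespace. The sketch below reproduces that pattern in a self-contained form. The module names mirror two of the re-exports, but the enum variants and the `describe` function are hypothetical and are not taken from this crate.

```rust
mod optics {
    // Hypothetical variants, for illustration only.
    #[allow(dead_code)]
    #[derive(Debug, Clone, Copy, PartialEq)]
    pub enum DistanceMetric {
        Euclidean,
        Manhattan,
    }
}

mod lof {
    // A different enum that happens to share the same name.
    #[allow(dead_code)]
    #[derive(Debug, Clone, Copy, PartialEq)]
    pub enum DistanceMetric {
        Euclidean,
        Cosine,
    }
}

// Re-export both under distinct names so neither shadows the other
// at the crate root.
pub use lof::DistanceMetric as LOFDistanceMetric;
pub use optics::DistanceMetric as OpticsDistanceMetric;

/// Shows that the two aliases are distinct types usable side by side.
pub fn describe(o: OpticsDistanceMetric, l: LOFDistanceMetric) -> String {
    format!("optics={:?}, lof={:?}", o, l)
}
```

Because the aliases name different types, passing an `OpticsDistanceMetric` where a `LOFDistanceMetric` is expected is a compile-time error, even when both have a variant with the same name.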

Modules

birch: BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies) implementation
cure: CURE (Clustering Using REpresentatives) Algorithm
dbscan: DBSCAN (Density-Based Spatial Clustering of Applications with Noise) implementation using scirs2
density_peaks: Density Peaks Clustering algorithm
dirichlet_process: Dirichlet Process Mixture Models
ensemble: Ensemble Clustering Algorithms
evolutionary: Evolutionary and bio-inspired clustering algorithms
feature_selection: Feature Selection for Clustering
fuzzy_cmeans: Fuzzy C-Means clustering implementation
gmm: Gaussian Mixture Models (GMM)
graph_clustering: Graph Clustering Algorithms
hdbscan: HDBSCAN (Hierarchical Density-Based Spatial Clustering of Applications with Noise) implementation using scirs2
hierarchical: Hierarchical clustering implementation using scirs2
incremental_dbscan: Incremental DBSCAN for Streaming Data
kde_clustering: Kernel Density Estimation (KDE) Clustering
kmeans: K-Means clustering implementations
locality_sensitive_hashing: Locality-Sensitive Hashing (LSH) for approximate distance computations
lof: Local Outlier Factor (LOF) implementation for density-based outlier detection
mean_shift: Mean Shift Clustering
memory_mapped: Memory-mapped distance matrix computation for large datasets
multi_view: Multi-View Clustering Algorithms
optics: OPTICS (Ordering Points To Identify Clustering Structure) implementation
out_of_core: Out-of-Core Clustering Algorithms
performance: Performance optimizations for clustering algorithms
prelude: Prelude module for convenient imports
rock: ROCK (RObust Clustering using linKs) Algorithm
semi_supervised: Semi-Supervised Clustering Algorithms
simd_distances: SIMD-optimized distance computations for clustering algorithms
sparse_matrix: Sparse matrix representations for large-scale clustering
spectral: Spectral Clustering
streaming: Streaming Clustering Algorithms
text_clustering: Text and High-Dimensional Clustering Algorithms
time_series: Time Series Clustering Algorithms
validation: Comprehensive Clustering Validation Framework

Enums

DensityDistanceMetric: Distance metric enumeration for clustering algorithms
LinkageMethod: Linkage methods for hierarchical clustering
Metric: Distance metrics for hierarchical clustering
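
As background on what a linkage method selects, the sketch below computes the distance between two clusters under three textbook linkage rules. The `Linkage` enum, its variants, and the `euclidean` and `cluster_distance` helpers are hypothetical and written for this illustration. They are not the variants or signatures of this crate's `LinkageMethod` or `Metric`, which are not shown on this page.

```rust
/// Hypothetical linkage rules (NOT the crate's `LinkageMethod`).
#[derive(Debug, Clone, Copy)]
enum Linkage {
    Single,   // distance between the closest pair of points
    Complete, // distance between the farthest pair of points
    Average,  // mean distance over all cross-cluster pairs
}

/// Euclidean distance between two points of equal dimension.
fn euclidean(a: &[f64], b: &[f64]) -> f64 {
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y).powi(2))
        .sum::<f64>()
        .sqrt()
}

/// Distance between clusters `a` and `b` under the chosen linkage.
/// Both clusters are assumed to be non-empty; with an empty cluster,
/// `Average` yields NaN and the other rules yield an infinity.
fn cluster_distance(a: &[Vec<f64>], b: &[Vec<f64>], linkage: Linkage) -> f64 {
    // Lazily enumerate every cross-cluster pairwise distance.
    let pairwise = a
        .iter()
        .flat_map(|p| b.iter().map(move |q| euclidean(p, q)));
    match linkage {
        Linkage::Single => pairwise.fold(f64::INFINITY, f64::min),
        Linkage::Complete => pairwise.fold(f64::NEG_INFINITY, f64::max),
        Linkage::Average => pairwise.sum::<f64>() / (a.len() * b.len()) as f64,
    }
}
```

Agglomerative clustering repeatedly merges the two clusters with the smallest such distance, so the linkage rule determines which merge happens next and therefore the shape of the resulting dendrogram.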