sklears-decomposition
High-performance matrix decomposition and dimensionality reduction algorithms for Rust, featuring streaming capabilities and 10-50x speedup over scikit-learn.
Latest release:
0.1.0-beta.1(January 1, 2026). See the workspace release notes for highlights and upgrade guidance.
Overview
sklears-decomposition provides state-of-the-art decomposition algorithms:
- Classic Methods: PCA, SVD, NMF, FastICA, Factor Analysis
- Advanced Algorithms: Kernel PCA, Sparse PCA, Mini-batch Dictionary Learning
- Streaming: Incremental PCA, Online NMF, Streaming SVD
- Specialized: Tensor decomposition, Robust PCA, Randomized algorithms
- Performance: SIMD optimization, GPU support (coming), memory efficiency
Quick Start
use ;
use array;
// Principal Component Analysis
let pca = PCAbuilder
.n_components
.whiten
.svd_solver
.build;
// Non-negative Matrix Factorization
let nmf = NMFbuilder
.n_components
.init
.solver
.build;
// Independent Component Analysis
let ica = builder
.n_components
.algorithm
.fun
.build;
// Fit and transform
let X = array!;
let fitted = pca.fit?;
let X_transformed = fitted.transform?;
let X_reconstructed = fitted.inverse_transform?;
Advanced Features
Kernel PCA
use ;
let kpca = builder
.n_components
.kernel
.fit_inverse_transform
.build;
// Non-linear dimensionality reduction
let X_kpca = kpca.fit_transform?;
Sparse PCA
use ;
let sparse_pca = builder
.n_components
.alpha // Sparsity parameter
.method
.build;
// Get sparse components
let fitted = sparse_pca.fit?;
let sparse_components = fitted.components; // Many zeros
Streaming Decomposition
use ;
// Incremental PCA for large datasets
let mut ipca = builder
.n_components
.batch_size
.build;
for batch in data_stream
// Online NMF
let mut online_nmf = builder
.n_components
.learning_rate
.build;
for batch in data_stream
Dictionary Learning
use ;
// Sparse coding with learned dictionary
let dict_learning = builder
.n_components
.alpha
.transform_algorithm
.build;
// Mini-batch version for large datasets
let mb_dict = builder
.n_components
.batch_size
.n_iter
.build;
Specialized Algorithms
Robust PCA
use RobustPCA;
// Separate low-rank and sparse components
let rpca = builder
.lambda
.max_iter
.build;
let = rpca.fit_transform?;
Factor Analysis
use ;
let fa = builder
.n_components
.rotation
.svd_method
.build;
let fitted = fa.fit?;
let noise_variance = fitted.noise_variance;
Tensor Decomposition
use ;
// 3D tensor decomposition
let tensor_pca = builder
.n_components
.build;
// Tucker decomposition
let tucker = builder
.ranks
.init
.build;
// PARAFAC/CP decomposition
let parafac = PARAFACbuilder
.rank
.init
.build;
Performance Optimizations
Randomized Algorithms
use ;
// Fast approximate SVD
let rsvd = builder
.n_components
.n_oversamples
.n_iter
.build;
// Handles massive matrices efficiently
let U_sigma_Vt = rsvd.fit_transform?;
Memory-Efficient Operations
use ;
// Out-of-core PCA for datasets larger than RAM
let ooc_pca = builder
.n_components
.chunk_size
.build;
// Process data from disk
ooc_pca.fit_from_files?;
Signal Processing
use ;
// Blind source separation
let signal_ica = builder
.contrast_function
.build;
// Empirical Mode Decomposition
let emd = EMDbuilder
.n_imfs
.build;
// Variational Mode Decomposition
let vmd = VMDbuilder
.n_modes
.alpha
.build;
Quality Metrics
use ;
// Assess decomposition quality
let pca = PCAnew.fit?;
let var_ratio = pca.explained_variance_ratio;
let cumsum = var_ratio.iter.scan.;
// Reconstruction error
let X_reduced = pca.transform?;
let X_reconstructed = pca.inverse_transform?;
let error = reconstruction_error;
Benchmarks
Performance on standard datasets:
| Algorithm | scikit-learn | sklears-decomposition | Speedup |
|---|---|---|---|
| PCA | 125ms | 8ms | 15.6x |
| NMF | 450ms | 35ms | 12.9x |
| FastICA | 280ms | 18ms | 15.6x |
| Sparse PCA | 890ms | 65ms | 13.7x |
| Kernel PCA | 1200ms | 95ms | 12.6x |
Architecture
sklears-decomposition/
├── linear/ # PCA, SVD, Factor Analysis
├── matrix_factorization/ # NMF, Dictionary Learning
├── ica/ # FastICA, JADE, InfoMax
├── sparse/ # Sparse PCA, Sparse coding
├── kernel/ # Kernel PCA variants
├── streaming/ # Incremental algorithms
├── tensor/ # Multi-dimensional decomposition
└── gpu/ # GPU kernels (WIP)
Status
- Core Algorithms: 90% complete
- Streaming Support: Fully implemented
- Advanced Methods: Tensor decomposition, robust PCA ✓
- GPU Acceleration: In development
- Compilation Issues: Being resolved
Contributing
Priority areas:
- GPU acceleration for matrix operations
- Additional tensor decomposition methods
- Distributed decomposition algorithms
- Performance optimizations
See CONTRIBUTING.md for guidelines.
License
Licensed under either of:
- Apache License, Version 2.0
- MIT license
Citation