§sklears - Machine Learning in Rust
A comprehensive machine learning library inspired by scikit-learn’s intuitive API, combining it with Rust’s performance, safety guarantees, and fearless concurrency.
§Overview
sklears brings the familiar scikit-learn API to Rust with:
- >99% scikit-learn API coverage validated for version 0.1.0-beta.1
- 14-20x performance improvements over Python implementations (validated)
- Memory safety without garbage collection overhead
- Type-safe APIs that catch errors at compile time
- Zero-copy operations for efficient data handling
- Native parallelism with fearless concurrency via Rayon
- GPU acceleration with optional CUDA and WebGPU backends
§Quick Start
use sklears::linear::LinearRegression;
use sklears::traits::{Fit, Predict};
use scirs2_core::ndarray::Array2;
// Create training data
let x_train = Array2::from_shape_vec((100, 5), (0..500).map(|i| i as f64).collect()).unwrap();
let y_train = Array2::from_shape_vec((100, 1), (0..100).map(|i| i as f64).collect()).unwrap();
// Train a linear regression model
let model = LinearRegression::new();
let trained_model = model.fit(&x_train, &y_train).unwrap();
// Make predictions
let predictions = trained_model.predict(&x_train).unwrap();
§Feature Flags
sklears uses feature flags to allow selective compilation of algorithm modules:
§Algorithm Modules
- linear - Linear models (LinearRegression, Ridge, Lasso, LogisticRegression)
- clustering - Clustering algorithms (KMeans, DBSCAN, etc.)
- ensemble - Ensemble methods (RandomForest, GradientBoosting, AdaBoost)
- svm - Support Vector Machines
- tree - Decision trees
- neural - Neural networks (MLP, autoencoders)
- neighbors - K-Nearest Neighbors algorithms
- decomposition - Dimensionality reduction (PCA, NMF, ICA)
- naive-bayes - Naive Bayes classifiers
- gaussian-process - Gaussian Process models
§Utilities
- preprocessing - Data preprocessing and transformers
- metrics - Evaluation metrics
- model-selection - Cross-validation and hyperparameter search
- datasets - Dataset generators and loaders
- feature-selection - Feature selection algorithms
- feature-extraction - Feature extraction methods
§Performance & Interop
- parallel - Enable Rayon parallelism (enabled by default)
- serde - Serialization support
- simd - SIMD optimizations
- gpu - GPU acceleration (CUDA/WebGPU)
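As an illustration of how a feature flag such as `parallel` can select code at compile time, here is a minimal, std-only sketch. It is not taken from the sklears source: the function `sum_of_squares` is hypothetical, and the two scoped threads are only a stand-in for a Rayon-backed path.

```rust
// Hypothetical helper, not part of sklears: one name, two implementations,
// chosen at compile time by the `parallel` feature flag.

#[cfg(feature = "parallel")]
fn sum_of_squares(xs: &[f64]) -> f64 {
    // Stand-in for a Rayon-backed path: split the slice across two scoped threads.
    let (left, right) = xs.split_at(xs.len() / 2);
    std::thread::scope(|s| {
        let l = s.spawn(|| left.iter().map(|x| x * x).sum::<f64>());
        let r = s.spawn(|| right.iter().map(|x| x * x).sum::<f64>());
        l.join().unwrap() + r.join().unwrap()
    })
}

#[cfg(not(feature = "parallel"))]
fn sum_of_squares(xs: &[f64]) -> f64 {
    // Sequential fallback compiled when the feature is disabled.
    xs.iter().map(|x| x * x).sum()
}

fn main() {
    let data = [1.0, 2.0, 3.0];
    println!("parallel enabled: {}", cfg!(feature = "parallel"));
    println!("sum of squares: {}", sum_of_squares(&data));
}
```

Only one of the two functions is compiled into the binary, so a disabled feature adds no code and, in a real crate, no dependency.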
§Architecture
sklears follows a three-layer architecture:
- Data Layer: Polars DataFrames for efficient data manipulation
- Computation Layer: NumRS2/ndarray arrays with BLAS/LAPACK backends
- Algorithm Layer: ML algorithms leveraging SciRS2’s scientific computing
§Type-Safe State Machines
Models use Rust’s type system to prevent common errors at compile time:
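This design is commonly called the typestate pattern. The following std-only sketch shows one way it can be built; the names `ToyRegressor`, `Untrained`, and `Trained` are hypothetical and the one-parameter model is deliberately trivial, so it should not be read as the sklears implementation.

```rust
use std::marker::PhantomData;

// Hypothetical marker types for the two states.
struct Untrained;
struct Trained;

// Hypothetical one-feature model: y = slope * x (no intercept).
struct ToyRegressor<State> {
    slope: f64,
    _state: PhantomData<State>,
}

impl ToyRegressor<Untrained> {
    fn new() -> Self {
        ToyRegressor { slope: 0.0, _state: PhantomData }
    }

    // Consumes the untrained model and returns a trained one.
    // Least-squares slope through the origin: sum(x*y) / sum(x*x).
    fn fit(self, x: &[f64], y: &[f64]) -> Result<ToyRegressor<Trained>, String> {
        if x.len() != y.len() || x.is_empty() {
            return Err("x and y must be non-empty and of equal length".to_string());
        }
        let sxy: f64 = x.iter().zip(y).map(|(a, b)| a * b).sum();
        let sxx: f64 = x.iter().map(|a| a * a).sum();
        if sxx == 0.0 {
            return Err("x must contain a non-zero value".to_string());
        }
        Ok(ToyRegressor { slope: sxy / sxx, _state: PhantomData })
    }
}

impl ToyRegressor<Trained> {
    // `predict` exists only on the trained type.
    fn predict(&self, x: &[f64]) -> Vec<f64> {
        x.iter().map(|a| self.slope * a).collect()
    }
}

fn main() {
    let model = ToyRegressor::<Untrained>::new();
    // model.predict(&[1.0]); // would not compile: no `predict` on ToyRegressor<Untrained>
    let trained = model.fit(&[1.0, 2.0, 3.0], &[2.0, 4.0, 6.0]).unwrap();
    println!("{:?}", trained.predict(&[4.0]));
}
```

Because `fit` takes `self` by value, the untrained handle cannot be reused after training. With sklears itself, the same idea is expressed through `fit` and `predict`: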
use sklears::linear::LinearRegression;
use sklears::traits::{Fit, Predict};
use scirs2_core::ndarray::{Array1, Array2};
let model = LinearRegression::new(); // Untrained state
// ❌ This won't compile - can't predict with untrained model:
// let predictions = model.predict(&x);
let x = Array2::zeros((10, 5));
let y = Array1::zeros(10);
// ✅ After fitting, model transitions to Trained state
let trained = model.fit(&x, &y).unwrap();
let predictions = trained.predict(&x).unwrap();
§Performance
Benchmarks show significant speedups over scikit-learn:
| Operation | Dataset Size | scikit-learn | sklears | Speedup |
|---|---|---|---|---|
| Linear Regression | 1M × 100 | 2.3s | 0.52s | 4.4x |
| K-Means | 100K × 50 | 5.1s | 0.48s | 10.6x |
| Random Forest | 50K × 20 | 12.8s | 0.71s | 18.0x |
| StandardScaler | 1M × 100 | 0.84s | 0.016s | 52.5x |
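The Speedup column is the scikit-learn time divided by the sklears time. As a quick check, this sketch recomputes it from the timings in the table; the `speedup` helper and the `ROWS` constant are illustrative names, and no new measurements are involved.

```rust
// Rows copied from the table: (operation, scikit-learn seconds, sklears seconds).
const ROWS: [(&str, f64, f64); 4] = [
    ("Linear Regression", 2.3, 0.52),
    ("K-Means", 5.1, 0.48),
    ("Random Forest", 12.8, 0.71),
    ("StandardScaler", 0.84, 0.016),
];

// Speedup = baseline time / candidate time.
fn speedup(baseline_secs: f64, candidate_secs: f64) -> f64 {
    baseline_secs / candidate_secs
}

fn main() {
    for (name, sklearn_secs, sklears_secs) in ROWS.iter() {
        println!("{}: {:.1}x", name, speedup(*sklearn_secs, *sklears_secs));
    }
}
```

The per-row ratios in this table range from 4.4x to 52.5x.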
§Integration with SciRS2
sklears is built on the SciRS2 ecosystem for scientific computing:
- scirs2-core - Core array operations and random number generation
- scirs2-linalg - Linear algebra (SVD, QR, eigenvalues, BLAS/LAPACK)
- scirs2-optimize - Optimization algorithms (L-BFGS, gradient descent)
- scirs2-stats - Statistical functions and distributions
- scirs2-neural - Neural network primitives and autograd
§Examples
See the examples/ directory for comprehensive examples:
- Basic linear regression
- Classification pipelines
- Cross-validation and hyperparameter tuning
- Custom estimators
- Neural network training
§Documentation
§Minimum Supported Rust Version (MSRV)
Rust 1.70 or later is required.
Re-exports§
- pub use sklears_utils as utils;
- pub use sklears_linear as linear;
- pub use sklears_clustering as clustering;
- pub use sklears_neighbors as neighbors;
- pub use sklears_model_selection as model_selection;
- pub use sklears_metrics as metrics;
Modules§
- advanced_array_ops
- advanced_benchmarking - Advanced Benchmarking Suite with Performance Regression Detection
- algorithm_markers
- api_analyzers - API Analysis Engines and Validation Components
- api_data_structures - Core Data Structures for API Reference Generation
- api_formatters - Output Formatters and Document Generators
- api_generator_config - API Generator Configuration Module
- async_traits
- auto_benchmark_generation - Automatic Benchmark Generation System
- autodiff
- benchmarking
- code_coverage
- compatibility
- compile_time_macros - Compile-Time Model Verification and Macro System
- compile_time_validation
- contribution
- dataset
- dependency_audit
- dependent_types - Dependent Type Experiments for sklears-core
- derive_macros
- distributed
- distributed_algorithms - Distributed Machine Learning Algorithms
- dsl_impl - Domain-Specific Language (DSL) implementation for machine learning pipelines
- effect_types
- ensemble_improvements
- error
- exhaustive_error_handling
- exotic_hardware - Auto-generated module structure
- exotic_hardware_impls - Concrete Exotic Hardware Implementations
- fallback_strategies
- features
- formal_verification - Formal Verification System for Machine Learning Algorithms
- format_io
- formatting
- input_sanitization
- interactive_api_reference - Interactive API Reference Generator
- interactive_playground - Auto-generated module structure
- macros
- memory_safety
- mock_objects
- parallel
- performance_profiling - Advanced Performance Profiling and Optimization Framework
- performance_reporting
- plugin - Plugin System Module
- plugin_marketplace_impl - Concrete Plugin Marketplace Implementation
- prelude
- public
- refinement_types - Refinement Types System for sklears-core
- search_engines
- streaming_lifetimes
- trait_explorer - Trait Explorer Module
- traits
- tutorial_examples - Concrete Tutorial Examples and Learning Paths
- tutorial_system
- types
- unsafe_audit
- validation
- validation_examples
- wasm_playground_impl - WebAssembly Playground Implementation
Macros§
- auto_benchmark - Macro to automatically generate benchmarks for a type
- benchmark_suite - Creates comprehensive benchmarking suite for ML algorithms
- cfg_feature - Macro for conditional compilation based on feature flags
- cfg_impl - Macro for feature-gated function implementations
- cfg_type - Macro for conditional type definitions based on features
- define_algorithm_category - Macro for defining algorithm categories with compile-time checking
- define_estimator - Advanced macro for creating ML estimators with builder pattern and validation
- define_ml_algorithm - Creates a complete ML algorithm with all necessary boilerplate
- define_ml_float_bounds - Helper macro for creating trait bound combinations commonly used in ML
- destructure - Macro for easy destructuring of complex types
- error_context - Macro for adding location context automatically
- estimator_test_suite - Creates a test suite for an estimator implementation
- impl_algorithm_markers - Macro for implementing multiple marker traits at once
- impl_default_config - Helper macro for creating default trait implementations
- impl_ml_traits - Implements standard machine learning traits for an estimator
- io_effect
- parameter_map - Creates a simple parameter mapping for algorithm configurations
- pattern_guard - Macro for creating pattern guards with custom validation logic
- pure_effect - Convenience macros for effect creation
- quick_dataset - Advanced macro definitions for sklears-core
- random_effect
- refinement_predicate - Macro to create a custom refinement predicate
- simd_operations - Creates SIMD-optimized operation implementations
- validate - Convenience macro for validation
- validate_performance - Macro for performance validation
- validated_param - Macro for creating compile-time validated parameters
- validation_rules - Macro for creating type-safe validation rules
- verify_dimensions - Macro for dimension verification
- verify_model - Macro for model verification
- with_fallback - Convenience macro for executing operations with fallback