§RustyML - A Comprehensive Machine Learning and Deep Learning Library in Pure Rust
RustyML is a high-performance machine learning and deep learning library written entirely in Rust, leveraging Rust’s memory safety, concurrency features, and zero-cost abstractions to provide efficient implementations of classical ML algorithms, neural networks, and data processing utilities.
§Overview
This crate offers a complete ecosystem for machine learning tasks, from data preprocessing and feature engineering to model training and evaluation. All implementations are designed with production use in mind, featuring robust error handling, parallel processing optimization, and comprehensive input validation.
§Architecture
The library is organized into six main modules, each gated by feature flags:
§machine_learning
Classical machine learning algorithms for supervised and unsupervised learning:
- Regression: Linear Regression with L1/L2 regularization
- Classification: Logistic Regression, KNN, Decision Tree, SVC, Linear SVC, LDA
- Clustering: KMeans, DBSCAN, MeanShift
- Anomaly Detection: Isolation Forest
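To illustrate the kind of algorithm this module implements, here is a minimal, dependency-free sketch of k-nearest-neighbours classification. It shows the underlying idea only; it is not the crate's own `KNN` API, and the helper name `knn_predict` is invented for this example.

```rust
// Minimal k-nearest-neighbours classifier on 2-D points.
// Illustrates the idea behind KNN, not the crate's API.
fn knn_predict(train: &[([f64; 2], u8)], query: [f64; 2], k: usize) -> u8 {
    // Compute squared Euclidean distance from every training point to the query.
    let mut dists: Vec<(f64, u8)> = train
        .iter()
        .map(|(p, label)| {
            let d = (p[0] - query[0]).powi(2) + (p[1] - query[1]).powi(2);
            (d, *label)
        })
        .collect();
    // Sort ascending by distance.
    dists.sort_by(|a, b| a.0.partial_cmp(&b.0).unwrap());
    // Majority vote among the k closest neighbours.
    let mut votes = [0usize; 256];
    for &(_, label) in dists.iter().take(k) {
        votes[label as usize] += 1;
    }
    (0..256).max_by_key(|&l| votes[l]).unwrap() as u8
}

fn main() {
    let train = [
        ([0.0, 0.0], 0), ([0.1, 0.2], 0), ([0.2, 0.1], 0),
        ([5.0, 5.0], 1), ([5.1, 4.9], 1), ([4.9, 5.2], 1),
    ];
    // A point near the first cluster is assigned class 0.
    assert_eq!(knn_predict(&train, [0.3, 0.3], 3), 0);
    // A point near the second cluster is assigned class 1.
    assert_eq!(knn_predict(&train, [4.8, 5.0], 3), 1);
    println!("ok");
}
```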
§neural_network
Complete neural network framework with flexible architecture design:
- Layers: Dense, RNN, LSTM, Convolution, Pooling, Dropout
- Optimizers: SGD, Adam, RMSProp, AdaGrad
- Loss Functions: MSE, MAE, Binary/Categorical Cross-Entropy
- Models: Sequential architecture for feed-forward networks
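As a rough sketch of what a Dense layer with ReLU activation computes in the forward pass (y = relu(Wx + b)), the following dependency-free function mirrors the math only; the crate's actual `Dense` layer operates on `ndarray` tensors and this helper name is invented for illustration.

```rust
// Forward pass of one dense layer with ReLU activation: y = relu(W x + b).
// A sketch of the math, not the crate's Dense implementation.
fn dense_relu(w: &[Vec<f64>], b: &[f64], x: &[f64]) -> Vec<f64> {
    w.iter()
        .zip(b)
        .map(|(row, bias)| {
            // Dot product of one weight row with the input, plus the bias.
            let z: f64 = row.iter().zip(x).map(|(wi, xi)| wi * xi).sum::<f64>() + bias;
            z.max(0.0) // ReLU clips negative pre-activations to zero
        })
        .collect()
}

fn main() {
    // 2 output units, 3 input features.
    let w = vec![vec![1.0, 0.0, -1.0], vec![0.5, 0.5, 0.5]];
    let b = vec![0.0, -1.0];
    let y = dense_relu(&w, &b, &[1.0, 2.0, 3.0]);
    // Unit 0: 1 - 3 = -2, clipped to 0. Unit 1: 0.5 + 1.0 + 1.5 - 1.0 = 2.0.
    assert_eq!(y, vec![0.0, 2.0]);
    println!("{:?}", y);
}
```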
§utility
Data preprocessing and dimensionality reduction utilities:
- Dimensionality Reduction: PCA, Kernel PCA, LDA, t-SNE
- Preprocessing: Standardization, train-test splitting
- Kernel Functions: RBF, Linear, Polynomial, Sigmoid, Cosine
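The RBF kernel listed above, as used in Kernel PCA and SVMs, is k(x, y) = exp(-γ·‖x − y‖²). A small self-contained sketch of that formula (the function name `rbf_kernel` is illustrative, not the crate's API):

```rust
// RBF (Gaussian) kernel: k(x, y) = exp(-gamma * ||x - y||^2).
fn rbf_kernel(x: &[f64], y: &[f64], gamma: f64) -> f64 {
    let sq_dist: f64 = x.iter().zip(y).map(|(a, b)| (a - b).powi(2)).sum();
    (-gamma * sq_dist).exp()
}

fn main() {
    // Identical points have kernel value 1 regardless of gamma.
    assert!((rbf_kernel(&[1.0, 2.0], &[1.0, 2.0], 0.5) - 1.0).abs() < 1e-12);
    // ||x - y||^2 = 2 with gamma = 0.5 gives exp(-1).
    let k = rbf_kernel(&[0.0, 0.0], &[1.0, 1.0], 0.5);
    assert!((k - (-1.0f64).exp()).abs() < 1e-12);
    println!("k = {k}");
}
```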
§metric
Comprehensive evaluation metrics for model performance assessment:
- Regression: MSE, RMSE, MAE, R² score
- Classification: Accuracy, Confusion Matrix, AUC-ROC, F1-score
- Clustering: Adjusted Rand Index, Normalized/Adjusted Mutual Information, Silhouette Score
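For reference, the regression metrics above follow their textbook definitions. A dependency-free sketch of MSE and R² (these helpers are illustrative; consult the `metric` module docs for the crate's actual function signatures):

```rust
// Mean squared error: average of squared residuals.
fn mse(y_true: &[f64], y_pred: &[f64]) -> f64 {
    y_true.iter().zip(y_pred).map(|(t, p)| (t - p).powi(2)).sum::<f64>()
        / y_true.len() as f64
}

// R^2 score: 1 - SSE / SST, i.e. variance explained by the model.
fn r2_score(y_true: &[f64], y_pred: &[f64]) -> f64 {
    let mean = y_true.iter().sum::<f64>() / y_true.len() as f64;
    let sse: f64 = y_true.iter().zip(y_pred).map(|(t, p)| (t - p).powi(2)).sum();
    let sst: f64 = y_true.iter().map(|t| (t - mean).powi(2)).sum();
    1.0 - sse / sst
}

fn main() {
    let y_true = [3.0, -0.5, 2.0, 7.0];
    let y_pred = [2.5, 0.0, 2.0, 8.0];
    // Perfect predictions give MSE 0 and R^2 1.
    assert!(mse(&y_true, &y_true).abs() < 1e-12);
    assert!((r2_score(&y_true, &y_true) - 1.0).abs() < 1e-12);
    println!("mse = {}, r2 = {}", mse(&y_true, &y_pred), r2_score(&y_true, &y_pred));
}
```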
§math
Mathematical utilities and statistical functions:
- Distance Metrics: Euclidean, Manhattan, Minkowski
- Impurity Measures: Entropy, Gini, Information Gain
- Statistical Functions: Variance, standard deviation, SST, SSE
- Activation Functions: Sigmoid, logistic loss
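The impurity measures above are the quantities a decision tree minimizes when choosing splits. A self-contained sketch of entropy and Gini impurity from class counts (illustrative helpers, not the crate's own `math` functions):

```rust
// Shannon entropy (in bits) of a class distribution given by counts.
fn entropy(counts: &[usize]) -> f64 {
    let total: usize = counts.iter().sum();
    counts
        .iter()
        .filter(|&&c| c > 0) // skip empty classes: 0 * log(0) is taken as 0
        .map(|&c| {
            let p = c as f64 / total as f64;
            -p * p.log2()
        })
        .sum()
}

// Gini impurity: 1 - sum of squared class probabilities.
fn gini(counts: &[usize]) -> f64 {
    let total: usize = counts.iter().sum();
    1.0 - counts
        .iter()
        .map(|&c| (c as f64 / total as f64).powi(2))
        .sum::<f64>()
}

fn main() {
    // A 50/50 split is maximally impure: entropy = 1 bit, Gini = 0.5.
    assert!((entropy(&[5, 5]) - 1.0).abs() < 1e-12);
    assert!((gini(&[5, 5]) - 0.5).abs() < 1e-12);
    // A pure node has zero impurity under both measures.
    assert!(entropy(&[10, 0]).abs() < 1e-12);
    assert!(gini(&[10, 0]).abs() < 1e-12);
    println!("ok");
}
```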
§dataset
Access to standardized datasets for experimentation:
- Iris, Diabetes, Boston Housing, Wine Quality, Titanic
- Pre-processed and ready for immediate use
§Quick Start
§Machine Learning Example
Add RustyML to your Cargo.toml:
[dependencies]
rustyml = { version = "*", features = ["machine_learning"] }
# Or use `features = ["full"]` to enable all modules
# Or use `features = ["default"]` to enable default modules (`machine_learning` and `neural_network`)

In your Rust code, write:
use rustyml::machine_learning::linear_regression::*;
use ndarray::{Array1, Array2};
// Create a linear regression model
let mut model = LinearRegression::new(true, 0.01, 1000, 1e-6, None).unwrap();
// Prepare training data
let raw_x = vec![vec![1.0, 2.0], vec![2.0, 3.0], vec![3.0, 4.0]];
let raw_y = vec![6.0, 9.0, 12.0];
// Convert Vec to ndarray types
let x = Array2::from_shape_vec((3, 2), raw_x.into_iter().flatten().collect()).unwrap();
let y = Array1::from_vec(raw_y);
// Train the model
model.fit(&x.view(), &y.view()).unwrap();
// Make predictions
let new_data = Array2::from_shape_vec((1, 2), vec![4.0, 5.0]).unwrap();
let _predictions = model.predict(&new_data.view());
// Save the trained model to a file
model.save_to_path("linear_regression_model.json").unwrap();
// Load the model from the file
let loaded_model = LinearRegression::load_from_path("linear_regression_model.json").unwrap();
// Use the loaded model for predictions
let _loaded_predictions = loaded_model.predict(&new_data.view());
// Since Clone is implemented, the model can be easily cloned
let _model_copy = model.clone();
// Since Debug is implemented, detailed model information can be printed
println!("{:?}", model);

§Neural Network Example
Add RustyML to your Cargo.toml:
[dependencies]
rustyml = { version = "*", features = ["neural_network"] }
# Or use `features = ["full"]` to enable all modules
# Or use `features = ["default"]` to enable default modules (`machine_learning` and `neural_network`)

In your Rust code, write:
use rustyml::neural_network::{
sequential::Sequential,
layer::{Dense, ReLU, Softmax},
optimizer::Adam,
loss_function::CategoricalCrossEntropy,
};
use ndarray::Array;
// Create training data
let x = Array::ones((32, 784)).into_dyn(); // 32 samples, 784 features
let y = Array::ones((32, 10)).into_dyn(); // 32 samples, 10 classes
// Build a neural network
let mut model = Sequential::new();
model
.add(Dense::new(784, 128, ReLU::new()).unwrap())
.add(Dense::new(128, 64, ReLU::new()).unwrap())
.add(Dense::new(64, 10, Softmax::new()).unwrap())
.compile(Adam::new(0.001, 0.9, 0.999, 1e-8).unwrap(), CategoricalCrossEntropy::new());
// Display model structure
model.summary();
// Train the model
model.fit(&x, &y, 10).unwrap();
// Save model weights to file
model.save_to_path("model.json").unwrap();
// Create a new model with the same architecture
let mut new_model = Sequential::new();
new_model
.add(Dense::new(784, 128, ReLU::new()).unwrap())
.add(Dense::new(128, 64, ReLU::new()).unwrap())
.add(Dense::new(64, 10, Softmax::new()).unwrap());
// Load weights from file
new_model.load_from_path("model.json").unwrap();
// Compile before using (required for training, optional for prediction)
new_model.compile(Adam::new(0.001, 0.9, 0.999, 1e-8).unwrap(), CategoricalCrossEntropy::new());
// Make predictions with loaded model
let predictions = new_model.predict(&x);
println!("Predictions shape: {:?}", predictions.shape());

§Feature Flags
The crate uses feature flags for modular compilation:
| Feature | Description |
|---|---|
| `machine_learning` | Classical ML algorithms (depends on `math`) |
| `neural_network` | Neural network framework |
| `utility` | Data preprocessing and dimensionality reduction |
| `metric` | Evaluation metrics |
| `math` | Mathematical utilities |
| `dataset` | Standard datasets |
| `default` | Enables `machine_learning` and `neural_network` |
| `full` | Enables all features |
Modules§
- dataset - Module `dataset` provides access to standardized datasets for machine learning experimentation and algorithm benchmarking.
- error - Error handling module containing custom error types for machine learning operations.
- machine_learning - Module `machine_learning` provides implementations of various machine learning algorithms and models.
- math - Module `math` contains mathematical utility functions for statistical operations and model evaluation.
- metric - Module `metric` provides comprehensive evaluation metrics for statistical analysis and machine learning model performance assessment.
- neural_network - Module `neural_network` provides components for building and training neural networks with flexible architecture design.
- prelude - Module `prelude` re-exports the most commonly used types and traits from this crate.
- utility - Module `utility` provides a collection of utility functions and data processing tools to support machine learning operations.
Enums§
- KernelType - Kernel function types for Support Vector Machines