Aprender: Next-generation machine learning library in pure Rust.
Aprender provides production-grade ML algorithms with a focus on ergonomic APIs, comprehensive testing, and backend-agnostic compute.
§Quick Start
use aprender::prelude::*;
// Create training data (y = 2*x + 1)
let x = Matrix::from_vec(4, 1, vec![
    1.0,
    2.0,
    3.0,
    4.0,
]).unwrap();
let y = Vector::from_slice(&[3.0, 5.0, 7.0, 9.0]);
// Train linear regression
let mut model = LinearRegression::new();
model.fit(&x, &y).unwrap();
// Make predictions
let predictions = model.predict(&x);
let r2 = model.score(&x, &y);
assert!(r2 > 0.99);
§Modules
- primitives: Core Vector and Matrix types
- data: DataFrame for named columns
- linear_model: Linear regression algorithms
- cluster: Clustering algorithms (K-Means)
- code: Code analysis and code2vec embeddings
- classification: Classification algorithms (Logistic Regression)
- tree: Decision tree classifiers
- metrics: Evaluation metrics
- mining: Pattern mining algorithms (Apriori for association rules)
- model_selection: Cross-validation and train/test splitting
- preprocessing: Data transformers (scalers, encoders)
- optim: Optimization algorithms (SGD, Adam)
- loss: Loss functions for training (MSE, MAE, Huber)
- serialization: Model serialization (SafeTensors format)
- stats: Traditional descriptive statistics (quantiles, histograms)
- graph: Graph construction and analysis (centrality, community detection)
- bayesian: Bayesian inference (conjugate priors, MCMC, variational inference)
- glm: Generalized Linear Models (Poisson, Gamma, Binomial families)
- decomposition: Matrix decomposition (ICA, PCA)
- text: Text processing and NLP (tokenization, stop words, stemming)
- time_series: Time series analysis and forecasting (ARIMA)
- index: Approximate nearest neighbor search (HNSW)
- recommend: Recommendation systems (content-based, collaborative filtering)
- synthetic: Synthetic data generation for AutoML (EDA, back-translation, MixUp)
- bundle: Model bundling and memory paging for large models
- cache: Cache hierarchy and model registry for large model management
- chaos: Chaos engineering configuration (from renacer)
- inspect: Model inspection tooling (header analysis, diff, quality scoring)
- loading: Model loading subsystem with WCET and cryptographic agility
- scoring: 100-point model quality scoring system
- zoo: Model zoo protocol for sharing and discovery
- embed: Data embedding with test data and tiny model representations
- native: SIMD-native model format for zero-copy inference
- stack: Sovereign AI Stack integration types
- online: Online learning and dynamic retraining infrastructure
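For readers who want to see the arithmetic behind the Quick Start, the following is a minimal plain-Rust sketch of a single-feature least-squares fit and an R² score. It uses only the standard library. The names `fit_line` and `r_squared` are hypothetical helpers introduced for illustration and are not part of Aprender, and the sketch assumes (the documentation above does not confirm it) that `LinearRegression::score` reports R².

```rust
/// Fit `y = slope * x + intercept` by ordinary least squares (one feature).
/// Returns `None` if the inputs are empty, differ in length, or `x` is constant.
/// Hypothetical helper for illustration; not an Aprender API.
fn fit_line(x: &[f64], y: &[f64]) -> Option<(f64, f64)> {
    if x.is_empty() || x.len() != y.len() {
        return None;
    }
    let n = x.len() as f64;
    let mean_x = x.iter().sum::<f64>() / n;
    let mean_y = y.iter().sum::<f64>() / n;

    // Covariance-like and variance-like sums (unnormalized).
    let sxy: f64 = x
        .iter()
        .zip(y)
        .map(|(xi, yi)| (xi - mean_x) * (yi - mean_y))
        .sum();
    let sxx: f64 = x.iter().map(|xi| (xi - mean_x).powi(2)).sum();
    if sxx == 0.0 {
        return None; // A constant feature has no defined slope.
    }

    let slope = sxy / sxx;
    Some((slope, mean_y - slope * mean_x))
}

/// Coefficient of determination: R² = 1 - SS_res / SS_tot.
/// Assumes both slices have the same non-zero length and `y_true` is not constant.
fn r_squared(y_true: &[f64], y_pred: &[f64]) -> f64 {
    let n = y_true.len() as f64;
    let mean = y_true.iter().sum::<f64>() / n;
    let ss_res: f64 = y_true
        .iter()
        .zip(y_pred)
        .map(|(t, p)| (t - p).powi(2))
        .sum();
    let ss_tot: f64 = y_true.iter().map(|t| (t - mean).powi(2)).sum();
    1.0 - ss_res / ss_tot
}
```

Calling `fit_line(&[1.0, 2.0, 3.0, 4.0], &[3.0, 5.0, 7.0, 9.0])` on the Quick Start data recovers a slope of 2 and an intercept of 1, so the predictions match the targets and R² is 1, which is consistent with the `r2 > 0.99` assertion above.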
Re-exports§
- pub use error::AprenderError;
- pub use error::Result;
- pub use primitives::Matrix;
- pub use primitives::Vector;
- pub use traits::Estimator;
- pub use traits::Transformer;
- pub use traits::UnsupervisedEstimator;
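A `pub use` re-export makes an item reachable under a shorter path without creating a new type. The sketch below shows the mechanism with a hypothetical stand-in module named `mini`. It does not reproduce Aprender's actual definitions, and the `Vector` struct here is a placeholder.

```rust
// Hypothetical miniature crate layout, used only to illustrate `pub use`.
mod mini {
    pub mod primitives {
        // Placeholder type; Aprender's real `Vector` is not reproduced here.
        #[derive(Debug, PartialEq)]
        pub struct Vector(pub Vec<f32>);
    }

    // Re-export: `mini::Vector` now names the same type as
    // `mini::primitives::Vector`.
    pub use self::primitives::Vector;
}

/// Builds a value through the full module path.
fn long_path() -> mini::primitives::Vector {
    mini::primitives::Vector(vec![1.0, 2.0])
}

/// Builds a value through the re-exported short path.
fn short_path() -> mini::Vector {
    mini::Vector(vec![1.0, 2.0])
}
```

Because both paths resolve to one type, a value returned by `short_path()` can be assigned to a binding annotated as `mini::primitives::Vector` and compared directly with the result of `long_path()`. By the same mechanism, the re-exports listed above mean that `aprender::Matrix` and `aprender::primitives::Matrix` refer to the same item.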
Modules§
- active_learning
- Active Learning strategies for label-efficient training.
- autograd
- Reverse-mode automatic differentiation engine for neural network training.
- automl
- Automated Machine Learning (AutoML) module.
- bayesian
- Bayesian inference and probability methods.
- bench
- Model evaluation and benchmarking framework (spec §7.10)
- bundle
- Model Bundling and Memory Paging
- cache
- Model Cache and Registry
- calibration
- Model calibration for confidence estimation.
- chaos
- Chaos Engineering Configuration
- citl
- Compiler-in-the-Loop Learning (CITL) for transpiler support.
- classification
- Classification algorithms.
- cluster
- Clustering algorithms.
- code
- Code Analysis and Code2Vec Embeddings
- data
- DataFrame module for named column containers.
- decomposition
- Dimensionality reduction and matrix decomposition algorithms.
- embed
- Data embedding with test data and tiny model representations (spec §4)
- ensemble
- Mixture of Experts (MoE) ensemble learning (GH-101)
- error
- Error types for Aprender operations.
- format
- Aprender Model Format (.apr)
- glm
- Generalized Linear Models (GLM)
- gnn
- Graph Neural Network layers for learning on graph-structured data.
- graph
- Graph construction and analysis with cache-optimized CSR representation.
- index
- Indexing data structures for efficient nearest neighbor search.
- inspect
- Model inspection tooling (spec §7.2)
- interpret
- Model Interpretability and Explainability.
- linear_model
- Linear models for regression.
- loading
- Model loading subsystem with WCET and cryptographic agility (spec §7.1)
- loss
- Loss functions for training machine learning models.
- metaheuristics
- Derivative-free global optimization (metaheuristics).
- metrics
- Evaluation metrics for ML models.
- mining
- Pattern mining algorithms for association rule discovery.
- model_selection
- Model selection utilities for cross-validation and train/test splitting.
- monte_carlo
- Monte Carlo Simulation Framework
- native
- SIMD-native model format for zero-copy Trueno inference (spec §5)
- nn
- Neural network modules for deep learning.
- online
- Online learning and dynamic retraining infrastructure
- optim
- Optimization algorithms for gradient-based learning.
- prelude
- Convenience re-exports for common usage.
- preprocessing
- Preprocessing transformers for data standardization and normalization.
- primitives
- Core compute primitives (Vector, Matrix).
- qa
- Model Quality Assurance module (spec §7.9)
- recommend
- Recommendation systems.
- regularization
- Regularization techniques for neural network training.
- scoring
- 100-point model quality scoring system (spec §7)
- serialization
- Model Serialization Module
- stack
- Sovereign AI Stack integration types (spec §9)
- stats
- Traditional descriptive statistics for vector data.
- synthetic
- Synthetic Data Generation for AutoML.
- text
- Text processing and NLP utilities.
- time_series
- Time series analysis and forecasting.
- traits
- Core traits for ML estimators and transformers.
- transfer
- Transfer Learning module for cross-project knowledge sharing.
- tree
- Decision tree algorithms and ensemble methods.
- weak_supervision
- Weak Supervision and Label Model.
- zoo
- Model zoo protocol for sharing and discovery (spec §8)