Crate aprender

Aprender: Next-generation machine learning library in pure Rust.

Aprender provides production-grade ML algorithms with a focus on ergonomic APIs, comprehensive testing, and backend-agnostic compute.

Quick Start

use aprender::prelude::*;

// Create training data (y = 2*x + 1)
let x = Matrix::from_vec(4, 1, vec![
    1.0,
    2.0,
    3.0,
    4.0,
]).unwrap();
let y = Vector::from_slice(&[3.0, 5.0, 7.0, 9.0]);

// Train linear regression
let mut model = LinearRegression::new();
model.fit(&x, &y).unwrap();

// Make predictions
let predictions = model.predict(&x);
let r2 = model.score(&x, &y);
assert!(r2 > 0.99);
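
For readers who want to see what the Quick Start computes, here is a plain-Rust sketch that does not depend on aprender. It assumes that `score` reports the coefficient of determination R² (the variable name `r2` suggests this, but this page does not confirm it). The helpers `fit_line` and `r_squared` are hypothetical names for illustration only and are not part of the crate's API.

```rust
/// Ordinary least squares for a single feature: returns (slope, intercept).
/// Hypothetical helper, not an aprender function.
fn fit_line(x: &[f64], y: &[f64]) -> (f64, f64) {
    let n = x.len() as f64;
    let mean_x = x.iter().sum::<f64>() / n;
    let mean_y = y.iter().sum::<f64>() / n;
    // Covariance-style and variance-style sums (unnormalized).
    let sxy: f64 = x
        .iter()
        .zip(y)
        .map(|(xi, yi)| (xi - mean_x) * (yi - mean_y))
        .sum();
    let sxx: f64 = x.iter().map(|xi| (xi - mean_x).powi(2)).sum();
    let slope = sxy / sxx;
    (slope, mean_y - slope * mean_x)
}

/// Coefficient of determination: R^2 = 1 - SS_res / SS_tot.
/// Hypothetical helper, not an aprender function.
fn r_squared(y_true: &[f64], y_pred: &[f64]) -> f64 {
    let mean = y_true.iter().sum::<f64>() / y_true.len() as f64;
    let ss_res: f64 = y_true
        .iter()
        .zip(y_pred)
        .map(|(t, p)| (t - p).powi(2))
        .sum();
    let ss_tot: f64 = y_true.iter().map(|t| (t - mean).powi(2)).sum();
    1.0 - ss_res / ss_tot
}

fn main() {
    // Same training data as the Quick Start (y = 2*x + 1).
    let x = [1.0, 2.0, 3.0, 4.0];
    let y = [3.0, 5.0, 7.0, 9.0];

    let (slope, intercept) = fit_line(&x, &y);
    let predictions: Vec<f64> = x.iter().map(|xi| slope * xi + intercept).collect();
    let r2 = r_squared(&y, &predictions);

    assert!((slope - 2.0).abs() < 1e-9);
    assert!((intercept - 1.0).abs() < 1e-9);
    assert!(r2 > 0.99);
    println!("slope = {slope}, intercept = {intercept}, r2 = {r2}");
}
```

Because the data lie exactly on a line, the residual sum of squares is zero and R² reaches its maximum of 1, which is why the Quick Start can assert `r2 > 0.99`.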

Modules

  • primitives: Core Vector and Matrix types
  • data: DataFrame for named columns
  • linear_model: Linear regression algorithms
  • cluster: Clustering algorithms (K-Means)
  • classification: Classification algorithms (Logistic Regression)
  • tree: Decision tree classifiers
  • metrics: Evaluation metrics
  • mining: Pattern mining algorithms (Apriori for association rules)
  • model_selection: Cross-validation and train/test splitting
  • preprocessing: Data transformers (scalers, encoders)
  • optim: Optimization algorithms (SGD, Adam)
  • loss: Loss functions for training (MSE, MAE, Huber)
  • serialization: Model serialization (SafeTensors format)
  • stats: Traditional descriptive statistics (quantiles, histograms)
  • graph: Graph construction and analysis (centrality, community detection)
  • bayesian: Bayesian inference (conjugate priors, MCMC, variational inference)
  • glm: Generalized Linear Models (Poisson, Gamma, Binomial families)
  • decomposition: Matrix decomposition (ICA, PCA)
  • text: Text processing and NLP (tokenization, stop words, stemming)
  • time_series: Time series analysis and forecasting (ARIMA)
  • index: Approximate nearest neighbor search (HNSW)
  • recommend: Recommendation systems (content-based, collaborative filtering)
  • synthetic: Synthetic data generation for AutoML (EDA, back-translation, MixUp)
  • bundle: Model bundling and memory paging for large models
  • chaos: Chaos engineering configuration (from renacer)

Re-exports

pub use error::AprenderError;
pub use error::Result;
pub use primitives::Matrix;
pub use primitives::Vector;
pub use traits::Estimator;
pub use traits::Transformer;
pub use traits::UnsupervisedEstimator;

Modules

active_learning
Active Learning strategies for label-efficient training.
autograd
Reverse-mode automatic differentiation engine for neural network training.
automl
Automated Machine Learning (AutoML) module.
bayesian
Bayesian inference and probability methods.
bundle
Model Bundling and Memory Paging
calibration
Model calibration for confidence estimation.
chaos
Chaos Engineering Configuration
citl
Compiler-in-the-Loop Learning (CITL) for transpiler support.
classification
Classification algorithms.
cluster
Clustering algorithms.
data
DataFrame module for named column containers.
decomposition
Dimensionality reduction and matrix decomposition algorithms.
ensemble
Mixture of Experts (MoE) ensemble learning (GH-101)
error
Error types for Aprender operations.
format
Aprender Model Format (.apr)
glm
Generalized Linear Models (GLM)
gnn
Graph Neural Network layers for learning on graph-structured data.
graph
Graph construction and analysis with cache-optimized CSR representation.
index
Indexing data structures for efficient nearest neighbor search.
interpret
Model Interpretability and Explainability.
linear_model
Linear models for regression.
loss
Loss functions for training machine learning models.
metaheuristics
Derivative-free global optimization (metaheuristics).
metrics
Evaluation metrics for ML models.
mining
Pattern mining algorithms for association rule discovery.
model_selection
Model selection utilities for cross-validation and train/test splitting.
nn
Neural network modules for deep learning.
optim
Optimization algorithms for gradient-based learning.
prelude
Convenience re-exports for common usage.
preprocessing
Preprocessing transformers for data standardization and normalization.
primitives
Core compute primitives (Vector, Matrix).
recommend
Recommendation systems.
regularization
Regularization techniques for neural network training.
serialization
Model Serialization Module
stats
Traditional descriptive statistics for vector data.
synthetic
Synthetic Data Generation for AutoML.
text
Text processing and NLP utilities.
time_series
Time series analysis and forecasting.
traits
Core traits for ML estimators and transformers.
transfer
Transfer Learning module for cross-project knowledge sharing.
tree
Decision tree algorithms and ensemble methods.
weak_supervision
Weak Supervision and Label Model.