A pure Rust training framework providing autograd, LoRA/QLoRA fine-tuning, quantization (Int4/Int8), model merging, knowledge distillation, and Compiler-in-the-Loop (CITL) training. Built on trueno for SIMD-accelerated compute and aprender for ML algorithms.
Features | Installation | Usage | Architecture | Quality | Sovereign Stack | Documentation | License
Table of Contents
- What is entrenar?
- Installation
- Usage
- Features
- Architecture
- Quality
- Sovereign AI Stack
- Documentation
- Contributing
- License
What is entrenar?
Entrenar (Spanish: "to train") is a production-grade neural network training library in pure Rust. It provides everything needed to train, fine-tune, quantize, merge, and distill models -- with no Python dependency.
Core capabilities:
- Autograd Engine -- Tape-based reverse-mode automatic differentiation
- Optimizers -- SGD, Adam, AdamW with cosine scheduling and gradient clipping
- LoRA / QLoRA -- Parameter-efficient fine-tuning with 4-bit quantized base weights
- Quantization -- QAT, PTQ, GGUF-compatible Q4_0/Q8_0, NF4 training
- Model Merging -- TIES, DARE, SLERP algorithms
- Knowledge Distillation -- Multi-teacher, progressive layer-wise
- CITL -- Compiler-in-the-Loop training for transpiler optimization
- GPU Training -- WGPU backend (AMD/Intel/cross-platform), CUDA/cuBLAS (NVIDIA)
- Monitoring -- Real-time metrics, drift detection, Andon alerts
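Several of these components reduce to compact formulas. For example, the cosine schedule listed under Optimizers can be sketched as follows; the function name and signature here are illustrative, not entrenar's API:

```rust
/// Cosine learning-rate decay from lr_max to lr_min over `total` steps:
/// lr(t) = lr_min + 0.5 * (lr_max - lr_min) * (1 + cos(pi * t / total))
fn cosine_lr(step: usize, total: usize, lr_max: f64, lr_min: f64) -> f64 {
    let progress = (step as f64 / total as f64).min(1.0);
    lr_min + 0.5 * (lr_max - lr_min) * (1.0 + (std::f64::consts::PI * progress).cos())
}

fn main() {
    // Full rate at step 0, smoothly decaying to the minimum at the final step.
    println!("{}", cosine_lr(0, 1000, 1e-3, 1e-5));
    println!("{}", cosine_lr(500, 1000, 1e-3, 1e-5));
    println!("{}", cosine_lr(1000, 1000, 1e-3, 1e-5));
}
```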
Part of the PAIML Sovereign AI Stack.
Installation
Library
Add to your Cargo.toml:
[dependencies]
entrenar = "0.7"
CLI
From source
Usage
Basic Training
// NOTE: module paths and call arguments were stripped from this page;
// `/* … */` marks the elided pieces rather than guessing the API.
use /* … */;
use /* … */::Adam;
use /* … */::Tensor;

let params = vec![/* … */];
let optimizer = Adam::new(/* … */);
let mut trainer = Trainer::new(/* … */);
trainer.set_loss(/* … */);
trainer.add_callback(/* … */);
let result = trainer.train(/* … */);
println!(/* … */);
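A Trainer of this shape encapsulates a forward/backward/update loop. As a minimal sketch in plain Rust (no entrenar types, so it runs as-is), here is that loop fitting y = 2x with one scalar weight:

```rust
/// Fit y = 2x with one scalar weight by gradient descent on squared error.
fn fit() -> f64 {
    let data = [(1.0f64, 2.0), (2.0, 4.0), (3.0, 6.0)];
    let (mut w, lr) = (0.0f64, 0.05);
    for _epoch in 0..200 {
        for (x, y) in data {
            let pred = w * x;                // forward pass
            let grad = 2.0 * (pred - y) * x; // d/dw of (w*x - y)^2
            w -= lr * grad;                  // SGD step
        }
    }
    w
}

fn main() {
    let w = fit();
    println!("learned w = {w}"); // converges to ~2.0
}
```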
Autograd
// NOTE: the import path and arguments were stripped from this page;
// `/* … */` marks the elided pieces.
use /* … */;

let y = matmul(/* … */);
let s = softmax(/* … */);
let n = layer_norm(/* … */);
let a = attention(/* … */);
LoRA / QLoRA Fine-Tuning
// NOTE: constructor types and arguments were stripped from this page;
// `/* … */` marks the elided pieces.
use /* … */;

// Standard LoRA
let lora = /* … */::new(/* … */);

// QLoRA: 4-bit base + FP16 adapters (7B model: 28GB -> 3.5GB)
let qlora = /* … */::new(/* … */);
Quantization
use /* … */; // import path stripped from this page; `/* … */` marks elisions

let fq = /* … */::new(/* … */);                // QAT with STE
let calibrator = /* … */::percentile(/* … */); // Post-training
let quantizer = /* … */::q4_0(/* … */);        // GGUF export
Model Merging
use /* … */; // import path stripped from this page; `/* … */` marks elisions

// TIES, DARE, and SLERP (merger types were stripped from this page)
let merged = /* … */::new(/* … */).merge(/* … */);
let merged = /* … */::new(/* … */).merge(/* … */);
let merged = /* … */::new(/* … */).merge(/* … */);
Declarative Configuration
# train.yaml
model:
  path: base-model.gguf
data:
  train: train.parquet
  batch_size: 8
optimizer:
  name: adamw
  lr: 0.0001
lora:
  rank: 64
  alpha: 16
training:
  epochs: 10
  grad_clip: 1.0
CLI Commands
Features
Autograd Engine
Tape-based reverse-mode automatic differentiation with verified gradients. Supports matmul, softmax, layer normalization, and scaled dot-product attention. All gradients validated against finite-difference reference implementations.
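The finite-difference validation mentioned above can be sketched in a few lines; `numerical_grad` is illustrative, not entrenar's API:

```rust
/// Central finite-difference approximation of df/dx at x.
fn numerical_grad(f: impl Fn(f64) -> f64, x: f64, eps: f64) -> f64 {
    (f(x + eps) - f(x - eps)) / (2.0 * eps)
}

fn main() {
    // For f(x) = x^2 the analytic gradient is 2x; the numeric estimate
    // should agree to within O(eps^2).
    let f = |x: f64| x * x;
    let (x, analytic) = (3.0, 6.0);
    let numeric = numerical_grad(f, x, 1e-5);
    println!("analytic = {analytic}, numeric = {numeric}");
}
```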
LoRA / QLoRA Fine-Tuning
Parameter-efficient fine-tuning with up to 99.75% parameter reduction. QLoRA combines 4-bit NF4 quantized base weights with FP16 low-rank adapters, reducing 7B model memory from 28GB to 3.5GB. PEFT-compatible adapter export for interoperability with HuggingFace tooling.
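The memory arithmetic above follows directly from the LoRA parameter counts; this sketch just does the counting for a single projection (numbers illustrative, not entrenar's API):

```rust
/// Parameters in a dense d×k weight vs. a rank-r adapter pair (A: d×r, B: r×k).
fn lora_params(d: usize, k: usize, r: usize) -> (usize, usize) {
    (d * k, r * (d + k))
}

fn main() {
    // One 4096x4096 projection with a rank-8 adapter:
    let (full, lora) = lora_params(4096, 4096, 8);
    let reduction = 100.0 * (1.0 - lora as f64 / full as f64);
    println!("full = {full}, lora = {lora}, reduction = {reduction:.2}%");
    // 16777216 frozen parameters vs 65536 trainable -> ~99.61% reduction
}
```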
Quantization
Three quantization strategies: Quantization-Aware Training (QAT) with straight-through estimator, Post-Training Quantization (PTQ) with percentile calibration, and GGUF-compatible Q4_0/Q8_0 export for llama.cpp interoperability. NF4 training with cuBLAS backward pass support.
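As a hedged sketch of the symmetric int8 scheme behind Q8_0 (GGUF stores one f32 scale per 32-value block; this quantizes a single block, and the names are illustrative, not entrenar's API):

```rust
/// Quantize one block: scale = max|x| / 127, values rounded to i8.
fn quantize_block(xs: &[f32]) -> (f32, Vec<i8>) {
    let max_abs = xs.iter().fold(0.0f32, |m, &x| m.max(x.abs()));
    let scale = if max_abs == 0.0 { 1.0 } else { max_abs / 127.0 };
    let q = xs.iter().map(|&x| (x / scale).round() as i8).collect();
    (scale, q)
}

/// Recover approximate f32 values from the quantized block.
fn dequantize_block(scale: f32, q: &[i8]) -> Vec<f32> {
    q.iter().map(|&v| v as f32 * scale).collect()
}

fn main() {
    let xs = [0.5f32, -1.0, 0.25];
    let (scale, q) = quantize_block(&xs);
    let back = dequantize_block(scale, &q);
    // Round-trip error is bounded by about half the quantization step.
    println!("scale = {scale}, q = {q:?}, back = {back:?}");
}
```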
Model Merging
Three model merging algorithms for combining fine-tuned checkpoints: TIES (Trim, Elect Sign, Merge) for multi-model consolidation, DARE (Dropout And Rescale) for parameter-efficient merging, and SLERP (Spherical Linear Interpolation) for smooth two-model blending.
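SLERP is the simplest of the three to show. A minimal sketch on flattened weight vectors, with an illustrative signature rather than entrenar's merge API:

```rust
/// Spherical linear interpolation between vectors a and b at t in [0, 1].
fn slerp(a: &[f32], b: &[f32], t: f32) -> Vec<f32> {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    let theta = (dot / (na * nb)).clamp(-1.0, 1.0).acos();
    if theta < 1e-6 {
        // Nearly parallel vectors: fall back to linear interpolation.
        return a.iter().zip(b).map(|(x, y)| x + t * (y - x)).collect();
    }
    let s = theta.sin();
    let (wa, wb) = (((1.0 - t) * theta).sin() / s, (t * theta).sin() / s);
    a.iter().zip(b).map(|(x, y)| wa * x + wb * y).collect()
}

fn main() {
    let (a, b) = ([1.0f32, 0.0], [0.0f32, 1.0]);
    // Endpoints are recovered exactly; t = 0.5 stays on the unit arc.
    println!("{:?}", slerp(&a, &b, 0.5));
}
```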
Knowledge Distillation
Temperature-scaled KD loss with configurable alpha weighting between hard and soft targets. Multi-teacher ensemble distillation with weighted aggregation. Progressive layer-wise distillation for large-to-small model transfer.
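The soft-target part of that loss can be sketched as follows; names and signatures are illustrative, not entrenar's API:

```rust
/// Softmax with temperature T applied to the logits.
fn softmax_t(logits: &[f64], t: f64) -> Vec<f64> {
    let m = logits.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = logits.iter().map(|&z| ((z - m) / t).exp()).collect();
    let sum: f64 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

/// Soft-target part of the KD loss: T^2 * KL(teacher || student).
/// The full loss blends this with hard-label cross-entropy via alpha.
fn kd_soft_loss(student_logits: &[f64], teacher_logits: &[f64], t: f64) -> f64 {
    let p = softmax_t(teacher_logits, t);
    let q = softmax_t(student_logits, t);
    t * t * p.iter().zip(&q).map(|(pi, qi)| pi * (pi / qi).ln()).sum::<f64>()
}

fn main() {
    let teacher = [2.0, 1.0, 0.1];
    let student = [1.5, 1.2, 0.3];
    println!("soft loss = {:.6}", kd_soft_loss(&student, &teacher, 2.0));
}
```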
CITL (Compiler-in-the-Loop)
Training loop that incorporates compiler feedback for transpiler optimization. Uses RAG-based fix suggestions via trueno-rag to guide training toward compilable outputs. Designed for the depyler/bashrs/decy transpilation stack.
GPU Training
WGPU backend for cross-platform GPU training (AMD, Intel, Apple Silicon). NVIDIA CUDA/cuBLAS backend for dedicated GPU acceleration. NVML integration for real-time GPU monitoring. VRAM ledger with file-based locking for multi-process coordination.
Monitoring
Toyota Way-inspired quality monitoring with real-time metrics collection, drift detection (z-score based), and Andon alert system for automatic anomaly notification. NaN/Inf detection, gradient explosion guards, and loss divergence tracking.
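The z-score drift check described above reduces to a few lines; names here are illustrative, not entrenar's monitor API:

```rust
/// z-score of a new observation against a history window.
fn zscore(history: &[f64], x: f64) -> f64 {
    let n = history.len() as f64;
    let mean = history.iter().sum::<f64>() / n;
    let var = history.iter().map(|v| (v - mean).powi(2)).sum::<f64>() / n;
    (x - mean) / var.sqrt().max(1e-12)
}

/// Flag drift (e.g. to raise an Andon alert) when |z| exceeds a threshold.
fn is_drift(history: &[f64], x: f64, threshold: f64) -> bool {
    zscore(history, x).abs() > threshold
}

fn main() {
    let loss_history = [1.0, 1.1, 0.9, 1.0];
    println!("1.05 drift? {}", is_drift(&loss_history, 1.05, 3.0)); // false
    println!("2.00 drift? {}", is_drift(&loss_history, 2.0, 3.0));  // true
}
```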
Feature Flags
| Flag | Purpose |
|---|---|
| gpu | GPU-accelerated training via wgpu |
| cuda | NVIDIA CUDA/cuBLAS training |
| citl | Compiler-in-the-Loop with trueno-rag |
| monitor | Training monitoring with trueno-db persistence |
| server | REST/HTTP API server via axum |
| parquet | Parquet batch loading via alimentar |
| hub | HuggingFace Hub model fetching |
| wasm | Browser-compatible WASM build |
| tracing | Renacer distributed tracing integration |
| nvml | Real GPU monitoring via NVIDIA NVML |
Architecture
entrenar/
├── autograd/   Tape-based automatic differentiation
├── optim/      SGD, Adam, AdamW, schedulers
├── lora/       LoRA, QLoRA fine-tuning
├── quant/      QAT, PTQ, GGUF quantization
├── merge/      TIES, DARE, SLERP merging
├── distill/    Knowledge distillation
├── finetune/   ClassifyPipeline, ClassifyTrainer, evaluation
├── eval/       Classification metrics, drift detection, Andon
├── train/      Trainer, callbacks, metrics, WGPU transformer trainer
├── monitor/    Real-time monitoring, Andon alerts
├── config/     Declarative YAML configuration
└── io/         Model persistence (SafeTensors, APR)
Quality
| Metric | Value |
|---|---|
| Tests | 7,527+ passing |
| Coverage | 96% |
| TDG Score | A+ (96.8/100) |
| Critical Defects | 0 |
| Property Tests | 200K+ iterations |
| Gradient Checking | Finite-difference validated |
| Mutation Testing | >80% kill rate |
| MSRV | 1.87 |
Sovereign AI Stack
| Crate | Purpose | Version |
|---|---|---|
| trueno | SIMD/GPU compute primitives | 0.16.x |
| aprender | ML algorithms, APR v2 format | 0.27.x |
| entrenar | Training and optimization | 0.7.x |
| realizar | Inference engine (APR/GGUF/SafeTensors) | 0.8.x |
| repartir | Distributed compute (CPU/GPU/Remote) | 2.0.x |
| whisper-apr | Pure Rust Whisper ASR | 0.2.x |
| simular | Simulation engine | 0.3.x |
| batuta | Stack orchestration | 0.7.x |
Documentation
- API Reference -- Generated from source
- Book -- Comprehensive guide with examples
- Examples -- Runnable training, merging, and monitoring examples
Contributing
- Fork the repository
- Create your changes on the `master` branch
- Run quality gates: `make lint && make test`
- Run coverage: `make coverage`
- Submit a pull request
Cookbook
See entrenar-cookbook for examples and recipes.
License
MIT
Part of the Aprender monorepo — 70 workspace crates.