Quick Start
For release notes see GitHub Releases.
What is Aprender?
A complete ML framework in pure Rust. One cargo install, one apr binary,
the full model lifecycle — inference, training, quantization, profiling,
publishing — all backed by YAML provable contracts that fail CI on drift.
At HEAD
| Metric | Count | Source of truth |
|---|---|---|
| Workspace crates | 80 workspace crates | ls crates/ |
| Provable contracts | 1158 provable contracts | find contracts/ -name '*.yaml' |
| CLI commands | 103 CLI commands | apr --help |
These numbers are enforced by contracts/readme-claims-v1.yaml.
Any drift between this table and the live repo state makes bash scripts/check_readme_claims.sh fail
(see FALSIFY-README-001..004).
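The drift check described above can be pictured as a comparison between the counts this table claims and the counts measured from the repo. The following is only a sketch of that idea in std-only Rust, not the contents of scripts/check_readme_claims.sh; the function names `find_drift` and `exit_code` and the message wording are assumptions made for illustration.

```rust
/// Compares the counts claimed in the README table against counts observed
/// in the repository. Returns one human-readable message per mismatch.
/// (Hypothetical helper; the real check lives in a shell script.)
pub fn find_drift(claimed: &[(&str, u64)], observed: &[(&str, u64)]) -> Vec<String> {
    let mut drift = Vec::new();
    for &(metric, want) in claimed {
        // Look up the live count for this metric, if one was measured.
        let got = observed
            .iter()
            .find(|&&(name, _)| name == metric)
            .map(|&(_, n)| n);
        match got {
            Some(n) if n == want => {} // claim matches the repo, nothing to report
            Some(n) => drift.push(format!("{metric}: README claims {want}, repo has {n}")),
            None => drift.push(format!("{metric}: no live count available")),
        }
    }
    drift
}

/// Maps the drift report to a process exit code: non-zero fails CI.
/// (The specific value 1 is an assumption.)
pub fn exit_code(drift: &[String]) -> i32 {
    if drift.is_empty() { 0 } else { 1 }
}
```

With the three rows above as `claimed` and a live measurement of 81 crates, `find_drift` would report a single mismatch for "Workspace crates" and `exit_code` would return a non-zero value.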
Command surface
| Stage | Commands |
|---|---|
| Inference | apr run, apr chat, apr serve |
| Training | apr finetune, apr train, apr pretrain, apr distill |
| Model ops | apr convert, apr quantize, apr merge, apr export, apr compile |
| Inspection | apr inspect, apr validate, apr tensors, apr diff, apr trace, apr lint |
| Profiling | apr profile, apr bench, apr qa |
| Registry | apr pull, apr list, apr rm, apr publish, apr registry |
| GPU | apr gpu, apr parity, apr ptx |
| Observability | apr tui, apr monitor, apr cbtop |
Cookbook
End-to-end recipes (data prep → train → quantize → publish → serve) live in
paiml/apr-cookbook — 341 worked
examples with local book/src/ walkthroughs.
Install
A Qwen story
Eight beats, one narrative, every core command group. Anchored on the Qwen
series so the story scales from a 494 MB safetensors model to a 30B-parameter
MoE GGUF. Every beat is a falsifier in contracts/qwen-story-v1.yaml; the runnable
form is scripts/qwen-story.sh; the nightly cron is .github/workflows/qwen-story-daily.yml;
the dogfood gate is /dogfood Gate 18.
# Reproduce locally (uses ~/models cache; ~3-5 min on RTX 4090):
Beat 1 — Discover (Registry)
Beat 2 — Trust (QA gates)
Beat 3 — Explore (Inspection)
Beat 4 — Adapt (Model ops)
Beat 5 — Use (Inference)
Beat 6 — Serve (REST API)
# → {"choices":[{"message":{"content":"2 + 2 equals 4."}}],...}
Beat 7 — Operate (Profiling)
Beat 8 — Scale (MoE introspection)
Publish (separate flow)
# Publish a derived model to HuggingFace Hub (see SPEC-HF-PUBLISH-001 for the 12-file pipeline)
When run with PMAT_HUNT=1 (default), each beat emits a manifest of high-risk
untested code in the command modules it just exercised. A nightly cron opens an
issue when this manifest grows so untested branches in command handlers can't
accumulate quietly. See contracts/qwen-story-v1.yaml.
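One way to read "the manifest grows" is that the current run lists entries the previous run did not. The sketch below assumes that reading and assumes a manifest is a list of string identifiers; the entry format and the names `new_entries` and `should_open_issue` are hypothetical and are not taken from the nightly workflow.

```rust
use std::collections::BTreeSet;

/// Returns the entries present in `current` but absent from `previous`,
/// in sorted order. Each entry is an opaque identifier for an untested
/// code location (the exact format is an assumption).
pub fn new_entries(previous: &[&str], current: &[&str]) -> Vec<String> {
    let seen: BTreeSet<&str> = previous.iter().copied().collect();
    let now: BTreeSet<&str> = current.iter().copied().collect();
    // Set difference: everything in `now` that was not in `seen`.
    now.difference(&seen).map(|s| s.to_string()).collect()
}

/// The cron would open an issue only when at least one new entry appears.
pub fn should_open_issue(previous: &[&str], current: &[&str]) -> bool {
    !new_entries(previous, current).is_empty()
}
```

Comparing sets rather than raw counts means a run that removes one untested branch and adds another is still flagged, which matches the stated goal of not letting untested branches accumulate quietly.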
Publishing a model? See SPEC-HF-PUBLISH-001 for the 12-file integration pipeline, three-path verification protocol, and HF API gotchas (NDJSON commits, LFS batch sizing, Q4_K stride constraints).
Library usage
[dependencies]
aprender = "0.35"
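// Module paths are assumed; x and y are placeholder names for training data.
use aprender::linear_model::LinearRegression;
use aprender::traits::Estimator;
let mut model = LinearRegression::new();
model.fit(&x, &y)?;
let predictions = model.predict(&x)?;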
Algorithms: Linear/Logistic Regression, Decision Trees, Random Forest, GBM, Naive Bayes, KNN, SVM, K-Means, PCA, ARIMA, ICA, GLMs, graph algorithms, Bayesian inference, text + audio processing.
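To make the fit/predict pattern concrete without depending on the crate, here is a std-only sketch of single-feature ordinary least squares. It is not aprender's implementation or API; the type `SimpleLinearRegression` and its method signatures are assumptions chosen for illustration.

```rust
/// Minimal single-feature linear model: y is approximated by slope * x + intercept.
/// (Illustrative type, not part of aprender.)
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct SimpleLinearRegression {
    pub slope: f64,
    pub intercept: f64,
}

impl SimpleLinearRegression {
    /// Fits the model by ordinary least squares.
    /// Fails if the inputs differ in length, have fewer than two points,
    /// or if all x values are identical (zero variance).
    pub fn fit(x: &[f64], y: &[f64]) -> Result<Self, String> {
        if x.len() != y.len() || x.len() < 2 {
            return Err("need at least two (x, y) pairs of equal length".to_string());
        }
        let n = x.len() as f64;
        let mean_x = x.iter().sum::<f64>() / n;
        let mean_y = y.iter().sum::<f64>() / n;
        // Sum of squared deviations of x, and of cross deviations of x and y.
        let sxx: f64 = x.iter().map(|xi| (xi - mean_x).powi(2)).sum();
        let sxy: f64 = x
            .iter()
            .zip(y)
            .map(|(xi, yi)| (xi - mean_x) * (yi - mean_y))
            .sum();
        if sxx == 0.0 {
            return Err("x has zero variance".to_string());
        }
        let slope = sxy / sxx;
        Ok(Self { slope, intercept: mean_y - slope * mean_x })
    }

    /// Predicts one output per input value.
    pub fn predict(&self, x: &[f64]) -> Vec<f64> {
        x.iter().map(|xi| self.slope * xi + self.intercept).collect()
    }
}
```

Fitting on x = [1, 2, 3, 4] and y = [3, 5, 7, 9] recovers a slope of 2 and an intercept of 1, so predicting at x = 5 gives 11.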
Architecture
Monorepo, flat crates/aprender-* layout (same pattern as Polars, Burn, Nushell):
paiml/aprender/
├── Cargo.toml # Workspace root + `cargo install aprender`
├── crates/
│ ├── aprender-core/ # ML library (use aprender::*)
│ ├── apr-cli/ # CLI logic (103 subcommands)
│ ├── aprender-compute/ # SIMD/GPU compute kernels
│ ├── aprender-gpu/ # CUDA PTX
│ ├── aprender-serve/ # Inference server
│ ├── aprender-train/ # Training loops
│ ├── aprender-orchestrate/ # Agents + RAG
│ ├── aprender-contracts/ # Provable contracts engine
│ ├── aprender-profile/ # Profiling
│ ├── aprender-db/ aprender-graph/ aprender-rag/
│ └── ... (80 crates total)
├── contracts/ # 1158 provable YAML contracts
└── book/ # mdBook documentation
Performance
| Model | Format | Speed | Hardware |
|---|---|---|---|
| Qwen2.5-Coder 1.5B | Q4_K | 40+ tok/s | CPU (AVX2) |
| Qwen2.5-Coder 7B | Q4_K | 225+ tok/s | RTX 4090 |
| TinyLlama 1.1B | Q4_0 | 17 tok/s | CPU (APR format) |
Reproduced from candle-vs-apr and ground-truth-apr-ludwig.
Provable contracts
Every CLI command and kernel is bound to a YAML contract with equations, preconditions, postconditions, and falsification tests:
equations:
validate_exit_code:
formula: exit_code = if score < 50 then 5 else 0
invariants:
- score < 50 implies exit_code != 0
falsification_tests:
- id: FALSIFY-CLI-001
prediction: apr validate bad-model.apr exits non-zero
1158 contracts across inference, training, quantization, attention, FFN, tokenization, model formats, CLI safety — and this README itself.
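The example contract above can be read as a small executable specification. The sketch below transcribes its formula and invariant into std-only Rust and adds a brute-force falsifier; it is not the contracts engine itself. The integer score type, the 0..=100 score range, and the function names are assumptions.

```rust
/// Exit code implied by the contract formula:
/// exit_code = if score < 50 then 5 else 0
pub fn validate_exit_code(score: u32) -> i32 {
    if score < 50 { 5 } else { 0 }
}

/// Invariant from the contract: score < 50 implies exit_code != 0.
/// An implication "a implies b" is equivalent to "!a || b".
pub fn invariant_holds(score: u32, exit_code: i32) -> bool {
    score >= 50 || exit_code != 0
}

/// Tries every score in an assumed 0..=100 range and returns the first
/// score for which the given implementation violates the invariant.
pub fn find_counterexample(exit_code_of: fn(u32) -> i32) -> Option<u32> {
    (0..=100).find(|&score| !invariant_holds(score, exit_code_of(score)))
}
```

Calling `find_counterexample(validate_exit_code)` finds no violation, while an implementation that always returns 0 is falsified at the first failing score. That is the sense in which a falsification test such as FALSIFY-CLI-001 turns a prediction into a check that can fail.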
Migration from old crates
| Old | New | Status |
|---|---|---|
| trueno = "0.18" | aprender-compute = "0.33" | Shim available |
| entrenar = "0.7" | aprender-train = "0.33" | Shim available |
| realizar = "0.8" | aprender-serve = "0.33" | Shim available |
| batuta = "0.7" | aprender-orchestrate = "0.33" | Shim available |
Old repositories are archived. All development happens here.
Contributing
License
MIT