Crate realizar


§Realizar

Pure Rust, portable, high-performance ML library with unified CPU/GPU/WASM support.

Realizar (Spanish: “to accomplish, to achieve”) provides a unified API for machine learning operations that automatically dispatches to the optimal backend based on data size, operation complexity, and available hardware.

§Features

  • Unified API: Single interface for CPU SIMD, GPU, and WASM execution
  • Native Integration: First-class support for trueno and aprender
  • Memory Safe: Zero unsafe code in public API, leveraging Rust’s type system
  • Production Ready: EXTREME TDD, 85%+ coverage, zero tolerance for defects

§Example

use realizar::Tensor;

// Create tensors
let a = Tensor::from_vec(vec![3, 3], vec![
    1.0, 2.0, 3.0,
    4.0, 5.0, 6.0,
    7.0, 8.0, 9.0,
]).unwrap();

// Check tensor properties
assert_eq!(a.shape(), &[3, 3]);
assert_eq!(a.ndim(), 2);
assert_eq!(a.size(), 9);

§Future Operations (Phase 1+)

// Element-wise operations (SIMD-accelerated) - Coming in Phase 1
let sum = a.add(&b).unwrap();

// Matrix multiplication (GPU-accelerated for large matrices) - Coming in Phase 2
let product = a.matmul(&b).unwrap();

§Architecture

Realizar is built on top of:

  • Trueno: Low-level compute primitives with SIMD/GPU/WASM backends
  • Aprender: High-level ML algorithms (will be refactored to use Realizar)

§Quality Standards

Following EXTREME TDD methodology:

  • Test Coverage: ≥85%
  • Mutation Score: ≥80%
  • TDG Score: ≥90/100
  • Clippy Warnings: 0 (enforced)
  • Cyclomatic Complexity: ≤10 per function

Re-exports§

pub use error::RealizarError;
pub use error::Result;
pub use tensor::Tensor;

Modules§

api
HTTP API for model inference
apr
Aprender .apr format support (PRIMARY inference format)
apr_transformer
APR Transformer format for WASM-compatible LLM inference
audit
Audit trail and provenance logging
bench
Benchmark harness for model runner comparison
bench_preflight
Preflight validation protocol for deterministic benchmarking
cache
Model caching and warming for reduced latency
cli
CLI command implementations (extracted for testability)
convert
GGUF to APR Transformer converter
error
Error types for Realizar
explain
Model explainability (SHAP, Attention)
format
Unified model format detection (APR, GGUF, SafeTensors)
generate
Text generation and sampling strategies
gguf
GGUF (GPT-Generated Unified Format) parser
gpu
GPU acceleration module (Phase 4: ≥100 tok/s target)
grammar
Grammar-constrained generation for structured output
inference
SIMD-accelerated inference engine using trueno
layers
Neural network layers for transformer models
memory
Memory management for hot expert pinning
metrics
Metrics collection and reporting for production monitoring
model_loader
Unified model loader for APR, GGUF, and SafeTensors
moe
Mixture-of-Experts (MOE) routing with Capacity Factor load balancing
observability
Observability: metrics, tracing, and A/B testing
paged_kv
PagedAttention KV cache management
parallel
Multi-GPU and Distributed Inference
quantize
Quantization and dequantization for model weights
registry
Model registry for multi-model serving
safetensors
Safetensors parser
scheduler
Continuous batching scheduler
speculative
Speculative decoding for LLM inference acceleration
stats
Statistical analysis for A/B testing with log-normal latency support
target
Multi-target deployment support (Lambda, Docker, WASM)
tensor
Tensor implementation
tokenizer
Tokenizer for text encoding and decoding
tui
TUI monitoring for LLM inference performance
uri
Pacha URI scheme support for model loading
viz
Benchmark visualization using trueno-viz
warmup
Model warm-up and pre-loading

Constants§

VERSION
Library version