rust-ai-core


Unified AI engineering toolkit that orchestrates the complete rust-ai ecosystem into a cohesive API for fine-tuning, quantization, and GPU-accelerated AI operations.

What is rust-ai-core?

rust-ai-core is the central hub for 8 specialized AI/ML crates, providing:

  • Unified API: Single entry point (RustAI) for all AI engineering tasks
  • Foundation Layer: Device selection, error handling, memory tracking
  • Ecosystem Re-exports: Direct access to all component crates

Integrated Ecosystem

Crate             Purpose                        Version
peft-rs           LoRA, DoRA, AdaLoRA adapters   1.0.3
qlora-rs          4-bit quantized LoRA           1.0.5
unsloth-rs        Optimized transformer blocks   1.0
axolotl-rs        YAML-driven fine-tuning        1.1
bitnet-quantize   BitNet 1.58-bit quantization   0.2
trit-vsa          Ternary VSA operations         0.2
vsa-optim-rs      VSA-based optimization         0.1
tritter-accel     Ternary GPU acceleration       0.1

Installation

Rust

[dependencies]
# Full ecosystem integration (the entries below are alternatives; pick one)
rust-ai-core = "0.3"

# With CUDA support
rust-ai-core = { version = "0.3", features = ["cuda"] }

# With Python bindings
rust-ai-core = { version = "0.3", features = ["python"] }

# Everything
rust-ai-core = { version = "0.3", features = ["full"] }

Python

pip install rust-ai-core-bindings

Quick Start Guide

Option 1: Unified API (Recommended)

The RustAI facade provides a simplified interface for common tasks:

use rust_ai_core::{RustAI, RustAIConfig, AdapterType, QuantizeMethod};

fn main() -> rust_ai_core::Result<()> {
    // Initialize with sensible defaults
    let ai = RustAI::new(RustAIConfig::default())?;

    println!("Device: {:?}", ai.device());
    println!("Ecosystem crates: {:?}", ai.ecosystem());

    // Configure fine-tuning with LoRA
    let _finetune = ai.finetune()
        .model("meta-llama/Llama-2-7b")
        .adapter(AdapterType::Lora)
        .rank(64)
        .alpha(16.0)
        .build()?;

    // Configure quantization
    let _quant = ai.quantize()
        .method(QuantizeMethod::Nf4)
        .bits(4)
        .group_size(64)
        .build();

    // Configure VSA operations
    let _vsa = ai.vsa()
        .dimension(10000)
        .build();

    Ok(())
}

Option 2: Direct Crate Access

Access individual crates through the ecosystem module:

use rust_ai_core::ecosystem::peft::{LoraConfig, LoraLinear};
use rust_ai_core::ecosystem::qlora::{QLoraConfig, QuantizedTensor};
use rust_ai_core::ecosystem::bitnet::{BitNetConfig, TernaryLinear};
use rust_ai_core::ecosystem::trit::{TritVector, PackedTritVec};

fn main() -> rust_ai_core::Result<()> {
    // Use peft-rs directly
    let _lora_config = LoraConfig::new(64, 16.0);

    // Use trit-vsa directly
    let _vec = TritVector::random(10000);

    Ok(())
}
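
Because these are re-exports, your Cargo.toml carries a single rust-ai-core dependency while each component's full API (at the versions listed in the table above) stays reachable.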

Option 3: Foundation Layer Only

Use just the core utilities without ecosystem crates:

use rust_ai_core::{get_device, DeviceConfig, CoreError, Result};
use rust_ai_core::{estimate_tensor_bytes, MemoryTracker};

fn main() -> Result<()> {
    // CUDA-first device selection
    let device = get_device(&DeviceConfig::default())?;

    // Memory estimation
    let shape = [1, 4096, 4096];
    let bytes = estimate_tensor_bytes(&shape, candle_core::DType::F32);
    println!("Tensor size: {} MB", bytes / 1024 / 1024);

    // Memory tracking
    let tracker = MemoryTracker::new(8 * 1024 * 1024 * 1024); // 8 GB
    tracker.try_allocate(bytes)?;

    Ok(())
}
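
Assuming estimate_tensor_bytes() multiplies the element count by the dtype width, the shape above works out to 1 × 4096 × 4096 × 4 bytes = 67,108,864 bytes, so the example prints 64 MB.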

API Reference

Unified API (RustAI)

// Initialize
let ai = RustAI::new(RustAIConfig::default())?;

// Workflows
ai.finetune()   // -> FinetuneBuilder (LoRA, DoRA, AdaLoRA)
ai.quantize()   // -> QuantizeBuilder (NF4, FP4, BitNet, INT8)
ai.vsa()        // -> VsaBuilder (Vector Symbolic Architectures)
ai.train()      // -> TrainBuilder (Axolotl-style YAML config)

// Info
ai.device()     // Active device (CPU or CUDA)
ai.ecosystem()  // Ecosystem crate versions
ai.is_cuda()    // Whether CUDA is active
ai.info()       // Full environment info

Configuration Options

let config = RustAIConfig::new()
    .with_verbose(true)                         // Enable verbose logging
    .with_memory_limit(8 * 1024 * 1024 * 1024)  // 8 GB limit
    .with_cpu()                                 // Force CPU execution
    .with_cuda_device(0);                       // Select CUDA device 0
// Note: with_cpu() and with_cuda_device() are alternatives; use one or the other.

Ecosystem Modules

Module                 Crate             Key Types
ecosystem::peft        peft-rs           LoraConfig, LoraLinear, DoraConfig
ecosystem::qlora       qlora-rs          QLoraConfig, QuantizedTensor, Nf4Quantizer
ecosystem::unsloth     unsloth-rs        FlashAttention, SwiGLU, RMSNorm
ecosystem::axolotl     axolotl-rs        AxolotlConfig, TrainingPipeline
ecosystem::bitnet      bitnet-quantize   BitNetConfig, TernaryLinear
ecosystem::trit        trit-vsa          TritVector, PackedTritVec, HdcEncoder
ecosystem::vsa_optim   vsa-optim-rs      VsaOptimizer, GradientPredictor
ecosystem::tritter     tritter-accel     TritterRuntime, TernaryMatmul

Foundation Utilities

Function                      Purpose
get_device()                  CUDA-first device selection with fallback
warn_if_cpu()                 One-time warning when running on CPU
estimate_tensor_bytes()       Memory estimation for tensor shapes
estimate_attention_memory()   O(n²) attention memory estimation
MemoryTracker::new()          Thread-safe memory tracking with limits
init_logging()                Initialize tracing with env filter
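
The O(n²) in estimate_attention_memory() comes from the attention-score tensor, whose last two dimensions are both the sequence length. A minimal sketch of that arithmetic follows; it is an assumption about the dominant term, not the crate's actual implementation:

/// Illustrative lower bound: bytes for one attention-score tensor of shape
/// [batch, heads, seq_len, seq_len]. The real estimate_attention_memory()
/// may also count softmax outputs, masks, or other buffers.
fn attention_scores_bytes(batch: usize, heads: usize, seq_len: usize, dtype_bytes: usize) -> usize {
    batch * heads * seq_len * seq_len * dtype_bytes
}

fn main() {
    // 1 batch, 32 heads, 4096 tokens, bf16 (2 bytes): 1 GiB for scores alone.
    let bytes = attention_scores_bytes(1, 32, 4096, 2);
    println!("{} MiB", bytes / 1024 / 1024); // prints: 1024 MiB
}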

Environment Variables

Variable              Description           Default
RUST_AI_FORCE_CPU     Force CPU execution   false
RUST_AI_CUDA_DEVICE   CUDA device ordinal   0
RUST_LOG              Logging level         info
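
A hypothetical sketch of how a CUDA-first library might consult these variables; the crate's actual parsing rules are not documented here, so treat the "1"/"true" handling below as an assumption:

use std::env;

// Assumed parsing: RUST_AI_FORCE_CPU accepts "1" or "true" (hypothetical).
fn force_cpu() -> bool {
    env::var("RUST_AI_FORCE_CPU")
        .map(|v| v == "1" || v.eq_ignore_ascii_case("true"))
        .unwrap_or(false)
}

// Assumed parsing: RUST_AI_CUDA_DEVICE is a bare ordinal, defaulting to 0.
fn cuda_ordinal() -> usize {
    env::var("RUST_AI_CUDA_DEVICE")
        .ok()
        .and_then(|v| v.parse().ok())
        .unwrap_or(0)
}

fn main() {
    println!("force_cpu={}, cuda_ordinal={}", force_cpu(), cuda_ordinal());
}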

Feature Flags

Feature   Description                Dependencies
cuda      CUDA support via CubeCL    cubecl, cubecl-cuda
python    Python bindings via PyO3   pyo3, numpy
full      All features enabled       cuda, python
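
Flags combine with standard Cargo switches, e.g. cargo test --features cuda or cargo build --features full.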

Design Philosophy

  • CUDA-first: GPU preferred, CPU fallback with warnings
  • Zero-cost abstractions: Traits compile to static dispatch
  • Fail-fast validation: Configuration errors caught at construction (see the sketch below)
  • Unified API: Single entry point for all AI engineering tasks
  • Direct access: Full crate APIs available when needed
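
To illustrate the fail-fast point: builders return Result from build(), so a bad setting surfaces before any model is touched. The invalid value below is hypothetical; the crate's exact validation rules may differ.

use rust_ai_core::{RustAI, RustAIConfig, AdapterType};

fn main() {
    let ai = RustAI::new(RustAIConfig::default()).expect("init");

    // Hypothetical invalid setting: a zero LoRA rank. Whatever the exact
    // rules are, build() returning Result means the error surfaces here,
    // not halfway through a training run.
    match ai.finetune()
        .model("meta-llama/Llama-2-7b")
        .adapter(AdapterType::Lora)
        .rank(0) // presumably invalid
        .build()
    {
        Ok(_) => println!("configuration accepted"),
        Err(e) => eprintln!("rejected at construction: {e}"),
    }
}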

Examples

See the examples/ directory:

  • device_selection.rs - Device configuration patterns
  • memory_tracking.rs - Memory estimation and tracking
  • error_handling.rs - Error handling patterns
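
Each can be run with Cargo's example runner, e.g. cargo run --example device_selection.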

Python Bindings

import rust_ai_core_bindings as rac

# Memory estimation
n_bytes = rac.estimate_tensor_bytes([1, 4096, 4096], "f32")
print(f"Tensor: {n_bytes / 1024**2:.1f} MB")

# Attention memory (for model planning)
attn_bytes = rac.estimate_attention_memory(1, 32, 4096, 128, "bf16")
print(f"Attention: {attn_bytes / 1024**2:.1f} MB")

# Device detection
if rac.cuda_available():
    info = rac.get_device_info()
    print(f"CUDA device: {info['name']}")

# Memory tracking
tracker = rac.create_memory_tracker(8 * 1024**3)  # 8 GB limit
rac.tracker_allocate(tracker, n_bytes)
print(f"Allocated: {rac.tracker_allocated_bytes(tracker)} bytes")

License

MIT License - see LICENSE-MIT

Contributing

Contributions welcome! Please ensure:

  • All public items have documentation
  • Tests pass: cargo test
  • Lints pass: cargo clippy --all-targets --all-features
  • Code is formatted: cargo fmt

See Also