Crate ruvllm_esp32


RuvLLM ESP32 - Tiny LLM Inference for Microcontrollers

This crate provides a minimal inference engine designed for ESP32 and similar resource-constrained microcontrollers.

§Constraints

  • ~520KB SRAM available
  • 4-16MB flash for model storage
  • Single-precision FPU only on ESP32 and ESP32-S3 (none on ESP32-S2 or ESP32-C3); double-precision math is emulated in software
  • Single/dual core @ 240MHz
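To make the SRAM constraint concrete, here is a minimal budget check. It is a sketch, not part of this crate: `SRAM_BYTES`, `fits_in_sram`, and the idea of a fixed reserve for stack and runtime are illustrative assumptions, and the usable SRAM on a real board is usually lower than the nominal 520KB.

```rust
/// Nominal SRAM size taken from the constraints above (assumed fully usable here).
const SRAM_BYTES: usize = 520 * 1024;

/// Hypothetical helper: returns true if INT8 weights (one byte per parameter),
/// activation buffers, and a reserve for stack/runtime all fit in SRAM.
fn fits_in_sram(param_count: usize, activation_bytes: usize, reserved_bytes: usize) -> bool {
    // Saturating adds avoid overflow on 32-bit targets with large inputs.
    let needed = param_count
        .saturating_add(activation_bytes)
        .saturating_add(reserved_bytes);
    needed <= SRAM_BYTES
}
```

For example, 256K INT8 parameters with 64KB of activations and a 128KB reserve need 448KB and fit, while 512K parameters with 64KB of activations do not. Larger models would have to keep weights in flash.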

§Features

  • INT8 quantized inference
  • Fixed-point arithmetic option
  • Tiny transformer blocks
  • Memory-mapped model loading
  • Optional ESP32-S3 SIMD acceleration
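The first two features can be illustrated with a short host-side sketch. It assumes symmetric per-tensor INT8 quantization and a Q8.8 fixed-point format. This page does not say which schemes the crate uses, so `quantize_i8`, `dot_i8`, and `mul_q8_8` are hypothetical names, not the crate's API. The sketch uses `Vec` for brevity, where a firmware build would more likely use fixed-size buffers.

```rust
/// Symmetric per-tensor quantization (assumed scheme):
/// scale = max|x| / 127, q = round(x / scale).
fn quantize_i8(values: &[f32]) -> (Vec<i8>, f32) {
    let max_abs = values.iter().fold(0.0f32, |m, v| m.max(v.abs()));
    // An all-zero tensor gets scale 1.0 to avoid dividing by zero.
    let scale = if max_abs == 0.0 { 1.0 } else { max_abs / 127.0 };
    let q = values
        .iter()
        .map(|v| (v / scale).round().clamp(-127.0, 127.0) as i8)
        .collect();
    (q, scale)
}

/// Integer dot product; products are widened to i32 so they cannot overflow i8.
fn dot_i8(a: &[i8], b: &[i8]) -> i32 {
    a.iter().zip(b).map(|(&x, &y)| x as i32 * y as i32).sum()
}

/// Q8.8 fixed-point multiply: 8 integer bits, 8 fractional bits, saturating.
fn mul_q8_8(a: i16, b: i16) -> i16 {
    let wide = (a as i32 * b as i32) >> 8;
    wide.clamp(i16::MIN as i32, i16::MAX as i32) as i16
}
```

A real-valued dot product is recovered by multiplying the `i32` accumulator by the two tensor scales. In Q8.8, the value 1.5 is stored as 384 and 2.0 as 512, so `mul_q8_8(384, 512)` yields 768, which is 3.0.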

Re-exports§

pub use micro_inference::MicroEngine;
pub use micro_inference::InferenceConfig;
pub use micro_inference::InferenceResult;
pub use quantized::QuantizedTensor;
pub use quantized::QuantizationType;
pub use model::TinyModel;
pub use model::ModelConfig;
pub use optimizations::BinaryVector;
pub use optimizations::BinaryEmbedding;
pub use optimizations::hamming_distance;
pub use optimizations::hamming_similarity;
pub use optimizations::ProductQuantizer;
pub use optimizations::PQCode;
pub use optimizations::SoftmaxLUT;
pub use optimizations::ExpLUT;
pub use optimizations::DistanceLUT;
pub use optimizations::MicroLoRA;
pub use optimizations::LoRAConfig;
pub use optimizations::SparseAttention;
pub use optimizations::AttentionPattern;
pub use optimizations::LayerPruner;
pub use optimizations::PruningConfig;
pub use federation::FederationConfig;
pub use federation::FederationMode;
pub use federation::FederationSpeedup;
pub use federation::PipelineNode;
pub use federation::PipelineConfig;
pub use federation::PipelineRole;
pub use federation::FederationMessage;
pub use federation::MessageType;
pub use federation::ChipId;
pub use federation::FederationCoordinator;
pub use federation::ClusterTopology;
pub use federation::MicroFastGRNN;
pub use federation::MicroGRNNConfig;
pub use federation::SpeculativeDecoder;
pub use federation::DraftVerifyConfig;
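The signatures of the re-exported `hamming_distance` and `hamming_similarity` are not shown on this page. As a rough illustration of what such functions typically compute on bit-packed vectors, the sketch below uses hypothetical stand-ins (`hamming_distance_sketch`, `hamming_similarity_sketch`) over byte slices. The crate's own versions may take `BinaryVector` or differ in other ways.

```rust
/// Number of differing bits between two bit-packed vectors.
/// Compares only the overlapping prefix if the lengths differ.
fn hamming_distance_sketch(a: &[u8], b: &[u8]) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}

/// Fraction of matching bits in [0.0, 1.0]; empty inputs count as identical.
fn hamming_similarity_sketch(a: &[u8], b: &[u8]) -> f32 {
    let bits = (a.len().min(b.len()) * 8) as f32;
    if bits == 0.0 {
        return 1.0;
    }
    1.0 - hamming_distance_sketch(a, b) as f32 / bits
}
```

XOR followed by a population count needs no floating-point work for the distance itself, which is why binary embeddings suit small microcontrollers.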

Modules§

attention
Attention mechanisms for ESP32
benchmark
Benchmark Suite for RuvLLM ESP32
diagnostics
Error Diagnostics with Fix Suggestions
embedding
Embedding operations for ESP32
federation
Federation Module for Multi-ESP32 Distributed Inference
micro_inference
Micro Inference Engine for ESP32
model
Model definition and loading for ESP32
models
Model Zoo - Pre-quantized Models for RuvLLM ESP32
optimizations
Advanced Optimizations from Ruvector
ota
Over-the-Air (OTA) Update System for RuvLLM ESP32
prelude
Prelude for common imports
quantized
Quantized tensor operations for memory-efficient inference
ruvector
RuVector Integration for ESP32

Enums§

Error
Error types for ESP32 inference
Esp32Variant
Memory budget for ESP32 variants

Type Aliases§

Result