Hodu, a user-friendly ML framework built in Rust.
Hodu (호두) is a Korean word meaning "walnut".
About Hodu
Hodu is a machine learning library built with user convenience at its core, designed for both rapid prototyping and seamless production deployment—including embedded environments.
Core Differentiators
Built on Rust's foundation of memory safety and zero-cost abstractions, Hodu offers unique advantages:
- Hybrid Execution Model: Seamlessly switch between dynamic execution for rapid prototyping and static computation graphs for optimized production deployment
- Memory Safety by Design: Leverage Rust's ownership system to eliminate common ML deployment issues like memory leaks and data races
- Embedded-First Architecture: Full `no_std` support enables ML inference on microcontrollers and resource-constrained devices
- Zero-Cost Abstractions: High-level APIs that compile down to efficient machine code without runtime overhead
Execution Modes and Compilers
Dynamic Execution: Immediate tensor operations for rapid prototyping
- CPU operations
- Metal GPU support for macOS (with the `metal` feature)
- CUDA GPU acceleration (with the `cuda` feature)
Static Execution: Compiled computation graphs with two compiler backends
- HODU Compiler: In-house implementation with `no_std` support
  - Optimized constant caching eliminates repeated device transfers
  - CPU, Metal, and CUDA device support
  - Embedded-friendly for resource-constrained environments
- XLA Compiler: JIT compilation via OpenXLA/PJRT (requires `std`)
  - Advanced graph-level optimizations with compilation caching
  - Production-grade performance comparable to JAX
  - CPU and CUDA device support
[!WARNING]
This is a personal learning and development project. As such:
- The framework is under active development
- Features may be experimental or incomplete
- Functionality is not guaranteed for production use
It is recommended to use the latest version.
[!CAUTION]
Current Development Status:
- CUDA GPU support is not yet fully implemented and is under active development
Get started
Requirements
Required
- Rust 1.90.0 or later (latest stable version recommended)
Optional
- OpenBLAS 0.3.30+ (recommended) - For optimized linear algebra operations on CPU
  - macOS: `brew install openblas`
  - Linux: `sudo apt install libopenblas-dev`
  - Windows: Install via vcpkg or MinGW
- LLVM/Clang - Required when building with the `xla` feature
  - macOS: `brew install llvm`
  - Linux: `sudo apt install llvm clang`
  - Windows: Install from LLVM releases
- CUDA Toolkit - Required when using the `cuda` feature
  - Download from NVIDIA CUDA Toolkit
- Xcode Command Line Tools - Required when using the `metal` feature on macOS
  - `xcode-select --install`
Examples
Here are some examples that demonstrate matrix multiplication using both dynamic execution and static computation graphs.
Dynamic Execution
This example shows direct tensor operations that are executed immediately.
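A minimal sketch of dynamic execution, assuming a `hodu::prelude` glob import and hypothetical `Tensor::new` / `matmul` names; only `set_runtime_device` and `Device::CPU` are confirmed by the snippets in this README:

```rust
// Hedged sketch: the prelude path, `Tensor::new`, and `matmul` are assumed
// names for illustration and may differ from the actual API.
use hodu::prelude::*;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Select the device for eager (dynamic) execution.
    set_runtime_device(Device::CPU);

    // Create two 2x2 matrices (hypothetical constructor).
    let a = Tensor::new(vec![vec![1.0_f32, 2.0], vec![3.0, 4.0]])?;
    let b = Tensor::new(vec![vec![5.0_f32, 6.0], vec![7.0, 8.0]])?;

    // In dynamic mode the matmul runs immediately and the result is available right away.
    let c = a.matmul(&b)?;
    println!("{:?}", c);
    Ok(())
}
```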
With the cuda feature enabled, you can use CUDA in dynamic execution with the following setting:
- set_runtime_device(Device::CPU);
+ set_runtime_device(Device::CUDA(0));
Static Computation Graphs
For more complex workflows, or when you need reusable computation graphs, you can use the Builder pattern.
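A hedged sketch of the Builder flow. Only `builder.build()?`, `script.set_device(...)`, and `script.set_compiler(...)` appear in this README; `Builder::new`, `input`, `matmul`, `output`, `run`, and the tensor constructor are hypothetical names used for illustration:

```rust
// Hedged sketch: everything except `build()`, `set_device`, and `set_compiler`
// is an assumed name and may differ from the actual API.
use hodu::prelude::*;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Describe the computation once as a static graph.
    let mut builder = Builder::new();
    let a = builder.input("a", &[2, 2]);
    let b = builder.input("b", &[2, 2]);
    let c = a.matmul(&b)?;
    builder.output("c", &c);

    // `build()` produces a reusable script (confirmed by the snippets below).
    let mut script = builder.build()?;

    // Feed concrete tensors and execute the compiled graph (assumed signature).
    let x = Tensor::new(vec![vec![1.0_f32, 2.0], vec![3.0, 4.0]])?;
    let y = Tensor::new(vec![vec![5.0_f32, 6.0], vec![7.0, 8.0]])?;
    let out = script.run(&[("a", x), ("b", y)])?;
    println!("{:?}", out);
    Ok(())
}
```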
With the cuda feature enabled, you can use CUDA in static computation graphs with the following setting:
let mut script = builder.build()?;
+ script.set_device(Device::CUDA(0));
With the xla feature enabled, you can use XLA in static computation graphs with the following setting:
let mut script = builder.build()?;
+ script.set_compiler(Compiler::XLA);
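The compiler and device settings are independent and can be combined on the same script. Below is a hedged helper sketch: the `Builder` and `Script` type names and the consuming `build()` signature are assumptions, while `build()`, `set_compiler(Compiler::XLA)`, and `set_device(Device::CUDA(0))` come from the snippets above.

```rust
use hodu::prelude::*; // assumed prelude path

// Build a script and target the XLA backend on the first CUDA device.
// Requires the `xla` and `cuda` features respectively.
fn compile_for_cuda_xla(builder: Builder) -> Result<Script, Box<dyn std::error::Error>> {
    let mut script = builder.build()?;     // from the snippets above
    script.set_compiler(Compiler::XLA);    // `xla` feature
    script.set_device(Device::CUDA(0));    // `cuda` feature
    Ok(script)
}
```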
Features
Default Features
| Feature | Description | Dependencies |
|---|---|---|
| `std` | Standard library support | - |
| `serde` | Serialization/deserialization support | - |
| `rayon` | Parallel processing support | `std` |
Optional Features
| Feature | Description | Dependencies | Required Features |
|---|---|---|---|
| `cuda` | NVIDIA CUDA GPU support | CUDA toolkit | - |
| `metal` | Apple Metal GPU support | Metal framework | `std` |
| `xla` | Google XLA compiler backend | XLA libraries | `std` |
XLA Feature Requirements
Building with the xla feature requires:
- LLVM and Clang installed on your system
- RAM: 8GB+ free memory
- Disk Space: 20GB+ free storage
Optional Data Type Features
By default, Hodu supports these data types: bool, f8e4m3, bf16, f16, f32, u8, u32, i8, i32.
Additional data types are gated behind feature flags, so unused types do not add to compilation time:
| Feature | Description |
|---|---|
| `f8e5m2` | Enable 8-bit floating point (E5M2) support |
| `f64` | Enable 64-bit floating point support |
| `u16` | Enable unsigned 16-bit integer support |
| `u64` | Enable unsigned 64-bit integer support |
| `i16` | Enable signed 16-bit integer support |
| `i64` | Enable signed 64-bit integer support |
Compilation Performance: Disabling unused data types can reduce compilation time by up to 30-40%. If you don't need these specific data types, consider building without these features.
Supported Platforms
Standard Environments
| Target Triple | Backend | Device | Features | Status |
|---|---|---|---|---|
| x86_64-unknown-linux-gnu | HODU | CPU | `std` | ✅ Stable |
| | HODU | CUDA | `std`, `cuda` | 🚧 In Development |
| | XLA | CPU | `std`, `xla` | ✅ Stable |
| | XLA | CUDA | `std`, `xla`, `cuda` | 🚧 In Development |
| aarch64-unknown-linux-gnu | HODU | CPU | `std` | ✅ Stable |
| | XLA | CPU | `std`, `xla` | ✅ Stable |
| x86_64-apple-darwin | HODU | CPU | `std` | 🧪 Experimental |
| | XLA | CPU | `std`, `xla` | 🚧 In Development |
| aarch64-apple-darwin | HODU | CPU | `std` | ✅ Stable |
| | HODU | Metal | `std`, `metal` | 🧪 Experimental |
| | XLA | CPU | `std`, `xla` | ✅ Stable |
| x86_64-pc-windows-msvc | HODU | CPU | `std` | ✅ Stable |
| | HODU | CUDA | `std`, `cuda` | 🚧 In Development |
| | XLA | CPU | `std`, `xla` | 🚧 In Development |
| | XLA | CUDA | `std`, `xla`, `cuda` | 🚧 In Development |
Embedded Environments
🧪 Experimental: Embedded platforms (ARM Cortex-M, RISC-V, Embedded Linux) are supported via the `no_std` feature, but they are experimental and not extensively tested in production environments.
Note: Development should be done in a standard (std) host environment. Cross-compilation for embedded targets is supported.
ARM Cortex-M
Basic Build
With OpenBLAS (Optional)
For better performance, you can cross-compile OpenBLAS for ARM on your host machine:
- Build OpenBLAS for ARM on host (e.g., macOS/Linux):
# Install ARM cross-compiler
# macOS: brew install arm-none-eabi-gcc
# Linux: sudo apt install gcc-arm-none-eabi
# Clone and build OpenBLAS
- Build Hodu with the cross-compiled OpenBLAS:
# The OpenBLAS binaries are on host filesystem but built for ARM
Note: The build script runs on the host machine and accesses OpenBLAS from the host filesystem, even though the resulting binaries are for the target ARM architecture.
Environment Variables
- `OPENBLAS_DIR`, `OPENBLAS_INCLUDE_DIR`, `OPENBLAS_LIB_DIR` - OpenBLAS paths for cross-compilation
- `HODU_DISABLE_BLAS` - Force disable OpenBLAS
- `HODU_DISABLE_NATIVE` - Disable native CPU optimizations
- `HODU_DISABLE_SIMD` - Disable SIMD auto-detection
Docs
- CHANGELOG - Project changelog and version history
- TODOS - Planned features and improvements
- CONTRIBUTING - Contribution guide
Guide
- Tensor Creation Guide (Korean) - Tensor creation guide (in Korean)
- Tensor Creation Guide (English) - Tensor creation guide
- Tensor Data Type Guide - Tensor data type guide
- Tensor Operations Guide - Tensor operations guide (English only)
- Neural Network Modules Guide (Korean) - Neural network modules guide (in Korean)
- Neural Network Modules Guide (English) - Neural network modules guide
- Tensor Utils Guide (Korean) - Tensor utilities guide (DataLoader, Dataset, Sampler; in Korean)
- Tensor Utils Guide (English) - Tensor utilities guide (DataLoader, Dataset, Sampler)
- Builder/Script Guide (Korean) - Builder/Script guide (in Korean)
- Builder/Script Guide (English) - Builder/Script guide
- Gradient Tape Management Guide (Korean) - Gradient tape management guide (in Korean)
- Gradient Tape Management Guide (English) - Gradient tape management guide
Related Projects
Here are some other Rust ML frameworks you might find interesting:
- maidenx - The predecessor project to Hodu
- cetana - An advanced machine learning library empowering developers to build intelligent applications with ease.
Inspired by
Hodu draws inspiration from the following amazing projects:
- maidenx - The predecessor project to Hodu
- candle - Minimalist ML framework for Rust
- GoMlx - An Accelerated Machine Learning Framework For Go
Credits
Hodu Character Design: Created by Eira