# NNL - Neural Network Library

A high-performance neural network library for Rust with comprehensive GPU and CPU support.
## Features
- 🚀 Multi-backend Support: NVIDIA CUDA, AMD ROCm/Vulkan, and optimized CPU execution
- 🎯 Automatic Hardware Detection: Seamlessly selects the best available compute backend
- 🧠 Advanced Optimizers: Adam, SGD, AdaGrad, RMSprop, AdamW, LBFGS, and more
- 🏗️ Flexible Architecture: Dense layers, CNN, batch normalization, dropout, and custom layers
- 💾 Model Persistence: Save/load models with metadata in multiple formats (Binary, JSON, MessagePack)
- ⚡ Production Ready: SIMD optimizations, parallel processing, and zero-copy operations
- 🔧 Comprehensive Training: Learning rate scheduling, early stopping, metrics tracking
- 🎛️ Fine-grained Control: Custom loss functions, weight initialization, and gradient computation
## Quick Start

Add this to your `Cargo.toml`:

```toml
[dependencies]
nnl = "0.1.0"
```
### Basic XOR Example

```rust
// Import path may differ; see the crate root docs.
use nnl::prelude::*;
// The complete runnable example is in `examples/xor.rs`.
```
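For a sense of what the XOR example is doing under the hood, here is a dependency-free sketch: a 2-2-1 sigmoid network trained by plain backpropagation. None of these names are nnl API; the crate replaces all of this with its `Tensor`/`Network` builders.

```rust
// From-scratch XOR: a 2-2-1 sigmoid network trained with plain
// backpropagation and squared-error loss. Illustration only.

fn sigmoid(x: f64) -> f64 {
    1.0 / (1.0 + (-x).exp())
}

/// Trains the tiny network and returns (initial_mse, final_mse).
pub fn train_xor() -> (f64, f64) {
    let data: [([f64; 2], f64); 4] =
        [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)];

    // Small asymmetric init so the run is reproducible.
    let mut w1 = [[0.5, -0.4], [-0.3, 0.6]]; // input -> hidden weights
    let mut b1 = [0.1, -0.1];
    let mut w2 = [0.7, -0.5]; // hidden -> output weights
    let mut b2 = 0.0;
    let lr = 0.5;

    let forward = |w1: &[[f64; 2]; 2], b1: &[f64; 2], w2: &[f64; 2], b2: f64, x: &[f64; 2]| {
        let h = [
            sigmoid(w1[0][0] * x[0] + w1[0][1] * x[1] + b1[0]),
            sigmoid(w1[1][0] * x[0] + w1[1][1] * x[1] + b1[1]),
        ];
        let y = sigmoid(w2[0] * h[0] + w2[1] * h[1] + b2);
        (h, y)
    };
    let mse = |w1: &[[f64; 2]; 2], b1: &[f64; 2], w2: &[f64; 2], b2: f64| {
        data.iter()
            .map(|(x, t)| {
                let (_, y) = forward(w1, b1, w2, b2, x);
                (y - t) * (y - t)
            })
            .sum::<f64>()
            / 4.0
    };

    let initial = mse(&w1, &b1, &w2, b2);
    for _ in 0..20_000 {
        for (x, t) in data.iter() {
            let (h, y) = forward(&w1, &b1, &w2, b2, x);
            let dy = (y - t) * y * (1.0 - y); // d(loss)/d(output pre-activation)
            for j in 0..2 {
                let dh = dy * w2[j] * h[j] * (1.0 - h[j]);
                w2[j] -= lr * dy * h[j];
                w1[j][0] -= lr * dh * x[0];
                w1[j][1] -= lr * dh * x[1];
                b1[j] -= lr * dh;
            }
            b2 -= lr * dy;
        }
    }
    (initial, mse(&w1, &b1, &w2, b2))
}

fn main() {
    let (initial, fin) = train_xor();
    println!("MSE before training: {initial:.4}, after: {fin:.4}");
}
```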
## Installation

### CPU-only (default)

```toml
[dependencies]
nnl = "0.1.0"
```

### With GPU Support

```toml
[dependencies]
nnl = { version = "0.1.0", features = ["cuda"] }         # NVIDIA CUDA
# or
nnl = { version = "0.1.0", features = ["vulkan"] }       # Vulkan (AMD/Intel/NVIDIA)
# or
nnl = { version = "0.1.0", features = ["all-backends"] } # all GPU backends
```
## System Requirements
- Rust: 1.70 or later
- CPU: Any modern x86_64 or ARM64 processor
- GPU (optional):
  - CUDA: NVIDIA GPU with compute capability 3.5+, CUDA 11.0+
  - Vulkan: any Vulkan 1.2+ compatible GPU
  - ROCm: AMD GPU with ROCm 4.0+ (experimental)
## Examples

Run the included examples to see the library in action:

```bash
# Basic XOR problem (CPU)
cargo run --example xor

# XOR with GPU acceleration
cargo run --example xor_gpu --features cuda

# MNIST digit classification
cargo run --example mnist

# Convolutional neural network
cargo run --example simple_cnn

# CNN with GPU support
cargo run --example simple_cnn_gpu --features cuda
```

### Available Examples

- `xor.rs` - solve the XOR problem with a simple neural network
- `mnist.rs` - MNIST handwritten digit classification
- `simple_cnn.rs` - convolutional neural network example
- GPU variants: `*_gpu.rs` - the same examples with GPU acceleration
## Core Concepts

### Device Management

```rust
// Constructor names are as shown in context; see the `device`
// module docs for exact signatures.

// Automatic device selection (CPU/GPU)
let device = Device::auto_select()?;

// Specific device types
let cpu_device = Device::cpu()?;
let cuda_device = Device::cuda(0)?; // GPU 0
let vulkan_device = Device::vulkan()?;
```
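The fallback order behind auto-selection can be sketched without the crate. The probe functions below are placeholders, not nnl symbols; a real implementation would query the CUDA and Vulkan runtimes.

```rust
// Sketch of backend auto-selection: try the fastest backend first
// and fall back to CPU, which is always available.

#[derive(Debug, PartialEq)]
enum Backend {
    Cuda(usize), // device index
    Vulkan,
    Cpu,
}

fn cuda_available() -> bool {
    false // placeholder: would call into the CUDA driver API
}

fn vulkan_available() -> bool {
    false // placeholder: would enumerate Vulkan physical devices
}

fn auto_select() -> Backend {
    if cuda_available() {
        Backend::Cuda(0)
    } else if vulkan_available() {
        Backend::Vulkan
    } else {
        Backend::Cpu
    }
}

fn main() {
    println!("selected backend: {:?}", auto_select());
}
```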
### Tensors

```rust
// Shapes below are illustrative; see the `tensor` module docs
// for exact signatures.

// Create tensors
let zeros = Tensor::zeros(&[2, 3], &device)?;
let ones = Tensor::ones(&[2, 3], &device)?;
let from_data = Tensor::from_slice(&[1.0, 2.0, 3.0, 4.0], &[2, 2], &device)?;

// Tensor operations
let a = Tensor::randn(&[3, 3], &device)?;
let b = Tensor::randn(&[3, 3], &device)?;
let result = a.add(&b)?;    // element-wise addition
let matmul = a.matmul(&b)?; // matrix multiplication
```
### Network Architecture

```rust
// Placeholder comments mark where the layer, loss, and optimizer
// configurations go; see the `layers` and `optimizers` module docs.
let network = NetworkBuilder::new()
    .add_layer(/* input layer */)
    .add_layer(/* hidden layer */)
    .add_layer(/* output layer */)
    .loss(/* loss function */)
    .optimizer(/* optimizer */)
    .build()?;
```
### Training with Advanced Features

```rust
// Placeholders mark the config fields and train arguments; see the
// `TrainingConfig` docs for the full option list.
let config = TrainingConfig { /* epochs, batch size, early stopping, ... */ };
let history = network.train(/* data, targets, &config */)?;
println!("{:?}", history);
```
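Early stopping, one of the training features listed above, is simple to state: stop when the validation loss has not improved for `patience` consecutive evaluations. A dependency-free sketch (not the nnl implementation):

```rust
// Generic early-stopping loop: returns the epoch at which training
// stops. The loss sequence here is synthetic; in practice each value
// would come from a validation pass.

fn early_stop_epoch(losses: &[f64], patience: usize) -> usize {
    let mut best = f64::INFINITY;
    let mut since_best = 0;
    for (epoch, &loss) in losses.iter().enumerate() {
        if loss < best {
            best = loss;
            since_best = 0;
        } else {
            since_best += 1;
            if since_best >= patience {
                return epoch; // patience exhausted: stop here
            }
        }
    }
    losses.len() - 1 // ran to completion
}

fn main() {
    // Loss improves, then plateaus from epoch 4 onward.
    let val_losses = [0.9, 0.7, 0.5, 0.4, 0.41, 0.42, 0.40, 0.43];
    let stop = early_stop_epoch(&val_losses, 3);
    println!("stopped at epoch {stop}");
}
```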
### Model Persistence

```rust
// Paths and metadata fields are illustrative.

// Save model
save_model(&network, "model.bin")?;

// Load model
let loaded_network = load_model("model.bin")?;

// Save with metadata
let metadata = ModelMetadata { /* name, version, description, ... */ };
save_model_with_metadata(&network, "model.bin", &metadata)?;
```
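A binary save format like the one above boils down to writing raw weights behind a small header. Here is a toy round-trip using only the standard library; this is not nnl's on-disk format, which also stores architecture metadata.

```rust
use std::fs::File;
use std::io::{self, Read, Write};

// Toy weight persistence: a 4-byte length header followed by f32
// weights in little-endian order.

fn save_weights(path: &str, weights: &[f32]) -> io::Result<()> {
    let mut f = File::create(path)?;
    f.write_all(&(weights.len() as u32).to_le_bytes())?;
    for w in weights {
        f.write_all(&w.to_le_bytes())?;
    }
    Ok(())
}

fn load_weights(path: &str) -> io::Result<Vec<f32>> {
    let mut f = File::open(path)?;
    let mut len_buf = [0u8; 4];
    f.read_exact(&mut len_buf)?;
    let len = u32::from_le_bytes(len_buf) as usize;
    let mut out = Vec::with_capacity(len);
    for _ in 0..len {
        let mut buf = [0u8; 4];
        f.read_exact(&mut buf)?;
        out.push(f32::from_le_bytes(buf));
    }
    Ok(out)
}

fn main() -> io::Result<()> {
    let path = std::env::temp_dir().join("toy_model.bin");
    let path = path.to_str().unwrap();
    let weights = vec![0.25_f32, -1.5, 3.0];
    save_weights(path, &weights)?;
    assert_eq!(load_weights(path)?, weights);
    println!("round-trip ok");
    Ok(())
}
```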
## Performance

### Benchmarks

Performance comparison on common tasks (Intel i7-10700K, RTX 3080):
| Task | CPU (8 threads) | CUDA GPU | Speedup |
|---|---|---|---|
| Dense 1000x1000 MatMul | 12.5ms | 0.8ms | 15.6x |
| Conv2D 224x224x64 | 145ms | 8.2ms | 17.7x |
| MNIST Training (60k samples) | 45s | 3.2s | 14.1x |
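Timings like these are easy to sanity-check on your own hardware with a `std::time::Instant` harness. The naive triple-loop matmul below is only a timing scaffold; the library's SIMD/GPU paths will be far faster.

```rust
use std::time::Instant;

// Naive row-major f32 matrix multiply (C = A * B, all n x n).
// Used only as a timing harness; real benchmarks should use the
// library's own ops and include warm-up iterations.
fn matmul(a: &[f32], b: &[f32], n: usize) -> Vec<f32> {
    let mut c = vec![0.0_f32; n * n];
    for i in 0..n {
        for k in 0..n {
            let aik = a[i * n + k];
            for j in 0..n {
                c[i * n + j] += aik * b[k * n + j];
            }
        }
    }
    c
}

fn main() {
    let n = 256; // kept small so the naive loop finishes quickly
    let a = vec![1.0_f32; n * n];
    let b = vec![2.0_f32; n * n];

    let start = Instant::now();
    let c = matmul(&a, &b, n);
    println!("{}x{} matmul took {:?}", n, n, start.elapsed());

    // Every entry is the sum over k of 1.0 * 2.0, i.e. 2n.
    assert!((c[0] - 2.0 * n as f32).abs() < 1e-3);
}
```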
### Optimization Tips

- Use appropriate batch sizes: 32-256 for GPU, 8-32 for CPU
- Enable CPU optimizations: use `features = ["cpu-optimized"]` for Intel MKL
- Memory management: call `network.zero_grad()` regularly to free unused memory
- Data loading: use parallel data loading for large datasets
- Mixed precision: enable f16 on supported GPUs for a ~2x speedup
## Feature Flags

| Feature | Description | Example |
|---|---|---|
| `default` | CPU-optimized backend | `nnl = "0.1.0"` |
| `cuda` | NVIDIA CUDA support | `features = ["cuda"]` |
| `vulkan` | Vulkan compute support | `features = ["vulkan"]` |
| `rocm` | AMD ROCm support (experimental) | `features = ["rocm"]` |
| `cpu-optimized` | Intel MKL/OpenBLAS acceleration | `features = ["cpu-optimized"]` |
| `all-backends` | All GPU backends | `features = ["all-backends"]` |
| `examples` | Example binaries | `features = ["examples"]` |
## Troubleshooting

### Common Issues

**CUDA not found**

```bash
# Install CUDA toolkit 11.0+
# Add to ~/.bashrc (adjust the path to your CUDA installation):
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
```

**Vulkan not available**

```bash
# Install your GPU vendor's Vulkan drivers
# Verify: vulkaninfo
```

**Slow CPU performance**

```toml
# Enable CPU optimizations
nnl = { version = "0.1.0", features = ["cpu-optimized"] }
```

**Out of memory on GPU**

- Reduce batch size
- Use gradient accumulation
- Enable mixed precision training
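Gradient accumulation, mentioned above, trades time for memory: run several small forward/backward passes and apply one optimizer step on the summed gradients, emulating a larger batch. A framework-agnostic sketch on a one-parameter linear model (none of these names are nnl API):

```rust
// Gradient accumulation sketch: average gradients over `accum_steps`
// micro-batches before taking one SGD step, emulating a batch that is
// `accum_steps` times larger without holding it in memory at once.

fn micro_batch_gradient(params: &[f64], batch: &[(f64, f64)]) -> Vec<f64> {
    // Gradient of mean squared error for the model y = w * x, params = [w].
    let w = params[0];
    let g = batch
        .iter()
        .map(|&(x, y)| 2.0 * (w * x - y) * x)
        .sum::<f64>()
        / batch.len() as f64;
    vec![g]
}

fn accumulate_and_step(params: &mut [f64], data: &[(f64, f64)], accum_steps: usize, lr: f64) {
    let chunk = data.len() / accum_steps;
    let mut grad_sum = vec![0.0; params.len()];
    for micro in data.chunks(chunk).take(accum_steps) {
        let g = micro_batch_gradient(params, micro);
        for (s, gi) in grad_sum.iter_mut().zip(g) {
            *s += gi; // accumulate instead of stepping immediately
        }
    }
    // One optimizer step on the averaged gradient.
    for (p, s) in params.iter_mut().zip(&grad_sum) {
        *p -= lr * s / accum_steps as f64;
    }
}

fn main() {
    // Data generated by y = 3x; start from w = 0.
    let data: Vec<(f64, f64)> = (1..=8).map(|i| (i as f64, 3.0 * i as f64)).collect();
    let mut params = vec![0.0];
    for _ in 0..200 {
        accumulate_and_step(&mut params, &data, 4, 0.01);
    }
    println!("learned w = {:.3}", params[0]);
}
```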
## API Documentation

For detailed API documentation, see [docs.rs/nnl](https://docs.rs/nnl).

Key modules:

- `tensor` - tensor operations and data structures
- `network` - neural network building and training
- `layers` - layer implementations and configurations
- `optimizers` - optimization algorithms
- `device` - device management and backend selection
## Contributing

We welcome contributions! Please:

- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Make your changes with tests
- Run `cargo test` and `cargo clippy`
- Submit a pull request

For major changes, please open an issue first to discuss the proposed changes.
### Development Setup
## Roadmap
- Distributed Training: Multi-GPU and multi-node support
- Mobile Deployment: ARM optimization and model quantization
- Web Assembly: Browser-based inference
- Model Zoo: Pre-trained models for common tasks
- Auto-ML: Neural architecture search
- Graph Optimization: Operator fusion and memory optimization
## License
This project is dual-licensed under either of:
- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT License (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.
## Acknowledgments

- Inspired by PyTorch and TensorFlow APIs
- Built on excellent Rust ecosystem crates: `ndarray`, `rayon`, `vulkano`, `cudarc`
- Thanks to the Rust ML community and all contributors