SciRS2 Autograd
Production-Ready Automatic Differentiation for Rust (v0.1.0)
A high-performance automatic differentiation library for SciRS2, providing functionality comparable to PyTorch/TensorFlow's autograd systems with native Rust performance and safety guarantees.
⚠️ SciRS2 POLICY Migration: This module is currently being updated to follow the SciRS2 POLICY: the migration from direct `rand::` and `ndarray::` usage to `scirs2-core` abstractions is in progress.
⨠Features
Core Automatic Differentiation
- Reverse-mode AD: Efficient gradient computation for machine learning workloads
- Dynamic Graphs: Runtime graph construction with flexible control flow support
- Higher-order Derivatives: Second and higher-order gradients with numerical stability
- Memory Optimization: Gradient checkpointing, memory pooling, and smart caching
Mathematical Operations
- Comprehensive Linear Algebra: Matrix decompositions (QR, LU, SVD, Cholesky) with gradients
- Matrix Functions: Inverse, determinant, exponential, logarithm, power operations
- Numerically Stable Implementations: Robust gradient computation for large matrices
- Broadcasting: NumPy-style tensor broadcasting for element-wise operations
Neural Network Infrastructure
- Activation Functions: ReLU variants, Sigmoid, Tanh, Softmax, Swish, GELU, Mish
- Loss Functions: MSE, cross-entropy, sparse categorical cross-entropy
- Convolution Layers: 2D convolutions, transposed convolutions, pooling operations
- Optimization: SGD, Adam, AdaGrad, AdamW with learning rate scheduling
Performance & Integration
- SIMD Acceleration: Vectorized operations for enhanced performance
- Parallel Processing: Multi-threaded computation with work-stealing thread pool
- BLAS Support: Optional acceleration with OpenBLAS, Intel MKL
- SciRS2 Integration: Seamless interoperability with the broader SciRS2 ecosystem
📦 Installation
Add to your Cargo.toml:
```toml
[dependencies]
scirs2-autograd = "0.1.0"
```
Optional Features
Enable performance optimizations and additional backends:
```toml
[dependencies]
scirs2-autograd = { version = "0.1.0", features = ["blas", "simd"] }
```
Available Features:
- `blas` - BLAS acceleration for linear algebra operations
- `openblas` - OpenBLAS backend
- `intel-mkl` - Intel MKL backend for maximum performance
- `simd` - SIMD acceleration for element-wise operations
🚀 Quick Start
Basic Automatic Differentiation
```rust
use scirs2_autograd as ag;
// Compute gradients of z = 2x² + 3y + 1 (fuller sketch below)
ag::run(|ctx: &mut ag::Context<f64>| {
    // build the graph and call tensor_ops::grad here
});
```
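A fuller version of the same computation is sketched below. It assumes the crate mirrors the rust-autograd-style interface used throughout this README (placeholders, `tensor_ops::grad`, `ctx.evaluator()`, and an `ag::ndarray` re-export); treat the exact calls as a sketch rather than a verified API.

```rust
use scirs2_autograd as ag;
use ag::tensor_ops as T;

ag::run(|ctx: &mut ag::Context<f64>| {
    let x = ctx.placeholder("x", &[]);
    let y = ctx.placeholder("y", &[]);
    let z = 2.0 * x * x + 3.0 * y + 1.0;

    // dz/dy = 3 (constant derivative, no feed required)
    let gy = &T::grad(&[z], &[y])[0];
    println!("{:?}", gy.eval(ctx));

    // dz/dx = 4x, evaluated at x = 2 by feeding the placeholder
    let gx = &T::grad(&[z], &[x])[0];
    let feed = ag::ndarray::arr0(2.0);
    println!("{:?}", ctx.evaluator().push(gx).feed(x, feed.view()).run());

    // d²z/dx² = 4 (higher-order derivative)
    let ggx = &T::grad(&[gx], &[x])[0];
    println!("{:?}", ggx.eval(ctx));
});
```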
Neural Network Training
```rust
use scirs2_autograd as ag;
use ag::optimizers::adam::Adam;
use ag::{ndarray_ext as array, tensor_ops as T};

// Build a 2-layer MLP for classification
let mut env = ag::VariableEnvironment::new();
let mut rng = array::ArrayRng::<f32>::default();

// Initialize network parameters (shapes shown are illustrative)
// Note: Use ndarray_ext::zeros for NdArray, not tensor_ops::zeros which returns Tensor
env.name("w1").set(rng.glorot_uniform(&[28 * 28, 128]));
env.name("b1").set(array::zeros(&[1, 128]));
env.name("w2").set(rng.glorot_uniform(&[128, 10]));
env.name("b2").set(array::zeros(&[1, 10]));

// Setup Adam optimizer
let adam = Adam::default("adam", env.default_namespace().current_var_ids(), &mut env);

// Training loop
for epoch in 0..100 {
    // one training step per epoch (see the sketch below)
}
```
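The loop body is elided above. A hedged sketch of one training step is shown below; it assumes the rust-autograd-style API (`ctx.placeholder`, `ctx.variable`, `tensor_ops`, `ag::Feeder`) and pre-loaded `x_batch` / `y_batch` `NdArray`s holding the current mini-batch, so treat the names and signatures as illustrative rather than verified.

```rust
for epoch in 0..100 {
    env.run(|ctx| {
        // Mini-batch placeholders: inputs and integer class labels
        let x = ctx.placeholder("x", &[-1, 28 * 28]);
        let y = ctx.placeholder("y", &[-1]);
        let (w1, b1) = (ctx.variable("w1"), ctx.variable("b1"));
        let (w2, b2) = (ctx.variable("w2"), ctx.variable("b2"));

        // Forward pass and mean cross-entropy loss
        let h = T::relu(T::matmul(x, w1) + b1);
        let logits = T::matmul(h, w2) + b2;
        let loss = T::reduce_mean(T::sparse_softmax_cross_entropy(logits, y), &[0], false);

        // Gradients w.r.t. all parameters, followed by an Adam step
        let grads = T::grad(&[loss], &[w1, b1, w2, b2]);
        let mut feeder = ag::Feeder::new();
        feeder.push(x, x_batch.view());
        feeder.push(y, y_batch.view());
        adam.update(&[w1, b1, w2, b2], &grads, ctx, feeder);
    });
}
```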
⚠️ Important API Notes
ndarray_ext vs tensor_ops Functions
There are two sets of array creation functions with different return types:
| Module | Function | Returns | Use For |
|---|---|---|---|
| `ndarray_ext::zeros` | `zeros(&[shape])` | `NdArray<T>` | Variable initialization, data storage |
| `tensor_ops::zeros` | `zeros(&shape, ctx)` | `Tensor<F>` | Computation graph operations |
Common Mistake:

```rust
// ❌ WRONG: tensor_ops::zeros returns a Tensor, but .set() expects an NdArray
use scirs2_autograd::tensor_ops::*;
env.name("w1").set(zeros(&[1, 128], ctx)); // Type error!

// ✅ CORRECT: use ndarray_ext::zeros for variable initialization
use scirs2_autograd::ndarray_ext;
env.name("w1").set(ndarray_ext::zeros(&[1, 128]));
```
Rule of Thumb:
- Use `ndarray_ext::*` for data (variable initialization, feeding values)
- Use `tensor_ops::*` for computations (inside `env.run(|ctx| { ... })`)
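To make the split concrete, here is a small hedged sketch: data is created as `NdArray`s with `ndarray_ext`, while everything inside `env.run` builds `Tensor`s with `tensor_ops`. The `ctx.variable` and `reduce_sum` calls follow the rust-autograd-style API assumed elsewhere in this README and are illustrative.

```rust
use scirs2_autograd as ag;
use ag::{ndarray_ext as array, tensor_ops as T};

let mut env = ag::VariableEnvironment::new();

// Data side: NdArray values created with ndarray_ext
env.name("w").set(array::zeros::<f32>(&[4, 2]));

// Computation side: Tensors built with tensor_ops inside env.run
env.run(|ctx| {
    let w = ctx.variable("w");
    let col_sums = T::reduce_sum(w, &[0], false); // Tensor, not NdArray
    println!("{:?}", col_sums.eval(ctx));
});
```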
🎯 Advanced Features
Mathematical Robustness
- Higher-Order Derivatives: Efficient Hessian computation for advanced optimization
- Numerical Stability: Carefully implemented gradients for matrix decompositions
- Large Matrix Support: Optimized algorithms for high-dimensional computations
- Custom Operations: Extensible framework for user-defined differentiable operations
Performance Engineering
- Memory Management: Smart gradient checkpointing reduces memory usage by 50-80%
- Computation Graph Optimization: Automatic fusion and simplification
- SIMD & Parallelization: Multi-core acceleration with work-stealing scheduler
- Zero-Copy Operations: Tensor views and in-place operations minimize allocations
Developer Experience
- Comprehensive Testing: 404+ tests ensure reliability and correctness
- Rich Debugging: Graph visualization and execution tracing tools
- Flexible APIs: Support for both eager and graph-based execution models
- SciRS2 Integration: Seamless interoperability across the scientific computing stack
Gradient Checkpointing
Gradient checkpointing is a memory optimization technique that trades additional computation time for reduced memory usage during backpropagation. This is especially useful for training large models under memory constraints.
How It Works
During standard backpropagation, all intermediate activations must be stored to compute gradients, which can lead to high memory usage in deep networks. Gradient checkpointing selectively discards intermediate activations during the forward pass and recomputes them during the backward pass as needed.
Benefits
- Significantly reduced memory usage (typically 50-80% reduction)
- Enables training of deeper/larger models that would otherwise not fit in memory
- Flexible strategies to balance memory usage vs. computation time
Usage Options
```rust
use scirs2_autograd as ag;
ag::run(|ctx| {
    // wrap memory-heavy subgraphs in checkpoint operations (sketch below)
});
```
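The checkpointing entry points themselves are not spelled out here, so the following is a hypothetical sketch: it assumes a `checkpoint`-style wrapper (stubbed out below as a placeholder, not a confirmed API) that marks a tensor's subgraph for recomputation instead of storage, and reuses the rust-autograd-style ops from the earlier examples. Consult the crate documentation for the real function names.

```rust
use scirs2_autograd as ag;
use ag::tensor_ops as T;
use ag::Tensor;

// Hypothetical stand-in for the crate's checkpoint operation: a real
// implementation would mark `t` so its value is dropped after the forward
// pass and recomputed on demand during backprop.
fn checkpoint<'g>(t: &Tensor<'g, f32>) -> Tensor<'g, f32> {
    *t
}

ag::run(|ctx: &mut ag::Context<f32>| {
    let x = ctx.placeholder("x", &[-1, 1024]);
    let w1 = T::zeros(&[1024, 1024], ctx); // dummy weights for the sketch
    let w2 = T::zeros(&[1024, 10], ctx);

    // Checkpoint the intermediate activation: instead of being stored, it is
    // recomputed from `x` when the backward pass needs it.
    let h = checkpoint(&T::relu(T::matmul(x, w1)));
    let y = T::matmul(h, w2);

    // Gradients flow through checkpointed nodes as usual.
    let _grads = T::grad(&[T::reduce_sum(y, &[0, 1], false)], &[w1, w2]);
});
```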
Profiling Checkpoint Performance
You can measure the memory savings and performance impact of your checkpointing strategy:
```rust
// Start tracking memory usage
start_tracking();

// Your model with checkpointing
// ... (model code with checkpoint operations)

// Evaluate performance: print the tracked statistics,
// e.g. memory saved and recomputation overhead
// println!("{:?}", ...);
// println!("{:?}", ...);

// Reset for next test
reset_statistics();
```
Optimization Strategies
- Basic Strategy: Checkpoint every N layers (e.g., every other layer)
- Adaptive Strategy: Use automatic thresholds based on tensor size
- Targeted Strategy: Manually checkpoint only the largest tensors
- Segment Strategy: Checkpoint entire computation segments together
📊 Performance & Reliability
- Test Coverage: 404 passing tests, 0 failures
- Memory Efficiency: Up to 80% reduction with gradient checkpointing
- Numerical Stability: Robust implementations for large-scale computations
- Performance: SIMD and multi-threading optimizations throughout
🤝 Contributing & Support
- Documentation: docs.rs/scirs2-autograd
- Repository: github.com/cool-japan/scirs
- Issues: Report bugs and request features on GitHub
- Community: Join discussions in the SciRS2 community
🚀 Production Readiness
SciRS2 Autograd v0.1.0 is a stable release and is production-ready:
- ✅ Stable API: No breaking changes expected before v1.0
- ✅ Comprehensive Testing: All core functionality thoroughly tested
- ✅ Performance Optimized: SIMD, parallelization, and memory optimizations
- ✅ Documentation Complete: Full API documentation with examples
- ✅ Integration Ready: Seamless SciRS2 ecosystem compatibility
License
This project is dual-licensed. You can choose to use either license; see the LICENSE file for details.