RusTorch 🚀
🌐 多言語ドキュメント / Multilingual Documentation
| Language | README | Jupyter Guide |
|---|---|---|
| 🇺🇸 English | 📖 Main | 📓 Jupyter |
| 🇫🇷 Français | 📖 Principal | 📓 Jupyter |
| 🇮🇹 Italiano | 📖 Principale | 📓 Jupyter |
| 🇪🇸 Español | 📖 Principal | 📓 Jupyter |
| 🇨🇳 中文 | 📖 主要 | 📓 Jupyter |
| 🇰🇷 한국어 | 📖 메인 | 📓 Jupyter |
| 🇩🇪 Deutsch | 📖 Hauptseite | 📓 Jupyter |
| 🇷🇺 Русский | 📖 Основной | 📓 Jupyter |
A production-ready deep learning library in Rust with PyTorch-like API, GPU acceleration, and enterprise-grade performance
RusTorch is a fully functional deep learning library that leverages Rust's safety and performance. The completed Phase 8 adds advanced tensor utilities with conditional operations, indexing, and statistical functions. It features comprehensive tensor operations, automatic differentiation, neural network layers, transformer architectures, multi-backend GPU acceleration (CUDA/Metal/OpenCL), advanced SIMD optimizations, enterprise-grade memory management, data validation and quality assurance, and comprehensive debug and logging systems.
✨ Features
- 🔥 Comprehensive Tensor Operations: Math operations, broadcasting, indexing, statistics, and Phase 8 advanced utilities
- 🤖 Transformer Architecture: Complete transformer implementation with multi-head attention
- 🧮 Matrix Decomposition: SVD, QR, eigenvalue decomposition with PyTorch compatibility
- 🧠 Automatic Differentiation: Tape-based computational graph for gradient computation
- 🚀 Dynamic Execution Engine: JIT compilation and runtime optimization
- 🏗️ Neural Network Layers: Linear, Conv1d/2d/3d, ConvTranspose, RNN/LSTM/GRU, BatchNorm, Dropout, and more
- ⚡ Cross-Platform Optimizations: SIMD (AVX2/SSE/NEON), platform-specific, and hardware-aware optimizations
- 🎮 GPU Integration: CUDA/Metal/OpenCL support with automatic device selection
- 🌐 WebAssembly Support: Complete browser ML with Neural Network layers, Computer Vision, and real-time inference
- 🎮 WebGPU Integration: Chrome-optimized GPU acceleration with CPU fallback for cross-browser compatibility
- 📁 Model Format Support: Safetensors, ONNX inference, PyTorch state dict compatibility
- ✅ Production Ready: 1060 tests passing, unified error handling system with RusTorchError
- 📐 Enhanced Mathematical Functions: Complete set of mathematical functions (exp, ln, sin, cos, tan, sqrt, abs, pow)
- 🔧 Advanced Operator Overloads: Full operator support for tensors with scalar operations and in-place assignments
- 📈 Advanced Optimizers: SGD, Adam, AdamW, RMSprop, AdaGrad with learning rate schedulers
- 🚀 Phase 2 Optimization Framework: NAdam, RAdam, Adamax, Enhanced L-BFGS with 500%+ performance boost
- ⚡ World-Class Performance: Adamax 33,632 steps/sec, RAdam 21,939 steps/sec, NAdam 18,976 steps/sec
- 🔍 Data Validation & Quality Assurance: Statistical analysis, anomaly detection, consistency checking, real-time monitoring
- 🐛 Comprehensive Debug & Logging: Structured logging, performance profiling, memory tracking, automated alerts
- 🎯 Phase 8 Tensor Utilities: Conditional operations (where, masked_select, masked_fill), indexing operations (gather, scatter, index_select), statistical operations (topk, kthvalue), and advanced utilities (unique, histogram); a usage sketch follows this list
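To make the Phase 8 utilities concrete, here is a rough usage sketch. Only the operation names (`topk`, `unique`) come from the feature list above; the exact signatures and return types are assumptions, so treat this as orientation and verify against the API documentation.

```rust
use rustorch::tensor::Tensor;

fn main() {
    // Illustrative only: operation names are taken from the feature list above,
    // but the exact signatures and return types are assumptions.
    let x = Tensor::from_vec(vec![3.0f32, 1.0, 4.0, 1.0, 5.0, 9.0], vec![6]);

    // Statistical utility: the k largest values and their indices.
    let (top_values, top_indices) = x.topk(3);

    // Advanced utility: deduplicated values.
    let unique_values = x.unique();

    println!("top values:  {:?}", top_values);
    println!("top indices: {:?}", top_indices);
    println!("unique:      {:?}", unique_values);
}
```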
For detailed features, see Features Documentation.
🚀 Quick Start
📓 For the complete Jupyter setup guide, see README_JUPYTER.md
Python Jupyter Lab Demo
📓 Complete Jupyter Setup Guide | Jupyter Guide (Japanese)
Standard CPU Demo
Launch RusTorch with Jupyter Lab in one command:
WebGPU Accelerated Demo
Launch RusTorch with WebGPU support for browser-based GPU acceleration:
Both scripts will:
- 📦 Create virtual environment automatically
- 🔧 Build RusTorch Python bindings
- 🚀 Launch Jupyter Lab with demo notebook
- 📍 Open demo notebook ready to run
WebGPU Features:
- 🌐 Browser-based GPU acceleration
- ⚡ High-performance matrix operations in browser
- 🔄 Automatic fallback to CPU when GPU unavailable
- 🎯 Chrome/Edge optimized (recommended browsers)
Rust Kernel for Jupyter
Launch native Rust kernel in Jupyter (evcxr_jupyter):
This will:
- 🦀 Install evcxr_jupyter Rust kernel
- 📓 Create Rust kernel demo notebook
- 🚀 Launch Jupyter with native Rust support
- 📍 Enable direct tensor operations in Rust (see the example cell below)
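Once the kernel is installed, a quick smoke test in a notebook cell looks roughly like this; `:dep` is the evcxr directive for adding a crate, and the first run takes a while because the dependency is compiled inside the kernel. The cell body itself is ordinary Rust.

```rust
:dep rustorch = "0.5.15"

// Ordinary Rust runs directly in the cell once the dependency has compiled.
let xs: Vec<f64> = (1..=5).map(|i| (i as f64).sqrt()).collect();
println!("square roots: {:?}", xs);
```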
Installation
Add this to your Cargo.toml:
```toml
[dependencies]
rustorch = "0.5.15"

# Optional features
[features]
default = ["linalg"]
linalg = ["rustorch/linalg"]        # Linear algebra operations (SVD, QR, eigenvalue)
cuda = ["rustorch/cuda"]
metal = ["rustorch/metal"]
opencl = ["rustorch/opencl"]
safetensors = ["rustorch/safetensors"]
onnx = ["rustorch/onnx"]
wasm = ["rustorch/wasm"]            # WebAssembly support for browser ML
webgpu = ["rustorch/webgpu"]        # Chrome-optimized WebGPU acceleration
```

To disable linalg features (avoid OpenBLAS/LAPACK dependencies):

```toml
[dependencies]
rustorch = { version = "0.5.15", default-features = false }
```
Basic Usage
```rust
use rustorch::tensor::Tensor;
use rustorch::error::RusTorchResult;
// (Import paths are indicative; see the API documentation for your version.)
```
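For orientation, a minimal sketch of basic tensor usage follows. It assumes constructors and methods along the lines of `Tensor::from_vec`, `exp`, and `matmul`, together with the operator overloads listed in the features section; the exact signatures and return types may differ, so check the API documentation before copying this verbatim.

```rust
use rustorch::tensor::Tensor;

fn main() {
    // Illustrative sketch: method names follow the features listed above,
    // but exact signatures and return types are assumptions.

    // Build two 2x2 tensors from flat data plus an explicit shape.
    let a = Tensor::from_vec(vec![1.0f32, 2.0, 3.0, 4.0], vec![2, 2]);
    let b = Tensor::from_vec(vec![5.0f32, 6.0, 7.0, 8.0], vec![2, 2]);

    // Element-wise arithmetic via the documented operator overloads.
    let sum = &a + &b;

    // Mathematical functions and matrix multiplication.
    let exp_a = a.exp();
    let product = a.matmul(&b);

    println!("a + b:  {:?}", sum);
    println!("exp(a): {:?}", exp_a);
    println!("a @ b:  {:?}", product);
}
```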
WebAssembly Usage
For browser-based ML applications:
```javascript
import init, * as rustorch from './pkg/rustorch.js';
```
WebGPU Acceleration (Chrome Optimized)
For Chrome browsers with WebGPU support:
```javascript
import init, * as rustorch from './pkg/rustorch.js';
```
Run examples:

```bash
# Examples live in the examples/ directory; run one with:
cargo run --example <example_name>

# Linear algebra examples additionally require the linalg feature:
cargo run --example <example_name> --features linalg
```

The examples cover basic functionality, machine learning, linear algebra, and mathematical functions; `<example_name>` is a placeholder for a file name from `examples/`.
For more examples, see Getting Started Guide and WebAssembly Guide.
📚 Documentation
- Getting Started - Basic usage and examples
- Features - Complete feature list and specifications
- Performance - Benchmarks and optimization details
- Architecture - System design and project structure
- Examples - Comprehensive code examples
- API Documentation - Detailed API reference
WebAssembly & Browser ML
- WebAssembly Guide - Complete WASM integration and API reference
- Jupyter WASM Guide - Step-by-step Jupyter Notebook setup (Japanese)
- Jupyter WASM Guide (English) - Step-by-step Jupyter Notebook setup
- WebGPU Integration - Chrome-optimized GPU acceleration
- Browser Compatibility - Cross-browser support matrix
- WASM Performance - Benchmarking and optimization strategies
Production & Operations
- GPU Acceleration Guide - GPU setup and usage
- Production Guide - Deployment and scaling
- Data Validation Guide - Quality assurance and validation
- Debug & Logging Guide - Comprehensive debugging tools
📊 Performance
🏆 Phase 2 Performance Revolution - Up to 580% improvement:
Advanced Optimizer Performance (Release Mode)
| Optimizer | Performance | Use Case | Status |
|---|---|---|---|
| Adamax | 33,632 steps/sec ⚡ | Sparse features, Embeddings | FASTEST |
| RAdam | 21,939 steps/sec 🚀 | General deep learning, Stability | RECOMMENDED |
| NAdam | 18,976 steps/sec ✨ | NLP, Fine-tuning | NLP-OPTIMIZED |
Core Operations Performance
| Operation | Performance | Details |
|---|---|---|
| SVD Decomposition | ~1ms (8x8 matrix) | ✅ LAPACK-based |
| QR Decomposition | ~24μs (8x8 matrix) | ✅ Fast decomposition |
| Eigenvalue | ~165μs (8x8 matrix) | ✅ Symmetric matrices |
| Complex FFT | 10-312μs (8-64 samples) | ✅ Cooley-Tukey optimized |
| Neural Network | 1-7s training | ✅ Boston housing demo |
| Activation Functions | <1μs | ✅ ReLU, Sigmoid, Tanh |
Phase 2 Architectural Benefits
- 🧹 50%+ code reduction through unified GenericAdamOptimizer
- 🔧 Consistent API across all Adam variants
- ⚡ Shared optimizations benefiting all optimizers
- 🛡️ Robust error handling with RusTorchResult (see the sketch below)
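Because fallible operations report errors through the single `RusTorchResult` type, application code can propagate failures with `?` instead of juggling per-module error enums. A minimal sketch of that pattern, assuming `RusTorchResult<T>` is the crate's Result alias in `rustorch::error` and that a linear-algebra call such as `svd` is fallible (both are assumptions here):

```rust
use rustorch::error::RusTorchResult;
use rustorch::tensor::Tensor;

// Every fallible step bubbles its error up through the same RusTorchResult
// type, so callers only ever handle one error type end to end.
fn decompose() -> RusTorchResult<()> {
    let a = Tensor::from_vec(vec![1.0f32, 2.0, 3.0, 4.0], vec![2, 2]);

    // Hypothetical fallible call for illustration; the real method name and
    // signature may differ (and requires the linalg feature).
    let _svd = a.svd()?;

    Ok(())
}

fn main() {
    if let Err(e) = decompose() {
        eprintln!("tensor operation failed: {e:?}");
    }
}
```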
Run benchmarks:

```bash
# Basic benchmarks (no external dependencies)
cargo bench --no-default-features

# Linear algebra benchmarks (requires linalg feature)
cargo bench --features linalg

# Phase 2 optimizer benchmarks (NEW!) and quick manual benchmarks:
# see the benches/ and examples/ directories for the concrete targets
```
For detailed performance analysis, see Performance Documentation.
⚠️ Version Notice - GPU Acceleration
Important notes regarding GPU functionality:
🚨 Yanked Versions (GPU regression issue)
The following versions have been yanked from crates.io because GPU acceleration did not work correctly in them:
❌ GPU unusable: v0.5.0 through v0.5.14 (15 versions in total)
✅ GPU acceleration restored: v0.5.15 and later
Details of the issue:
- Cause: GPU implementations regressed during the Phase 4 refactoring (commit: 782f6cb)
- Symptom: All GPU operations (Metal/CUDA/OpenCL) degraded to the CPU fallback
- Affected period: August 28, 2025 to September 3, 2025
- Fix: GPU acceleration fully restored in v0.5.15
Recommendations:
- 📌 Use v0.5.15 or later for GPU functionality
- 🔄 Update existing projects to `rustorch = "0.5.15"`
- 📋 For details, see the GPU Regression Analysis Report
🧪 Testing
1060 tests passing - Production-ready quality assurance with the unified RusTorchError error handling system.
```bash
# Run all tests (recommended for CI/development)
cargo test

# Run tests with linear algebra features
cargo test --features linalg

# Run doctests
cargo test --doc

# Test specific modules (substitute the module path)
cargo test <module_name>
```
🚀 Production Deployment
Docker
Docker-based deployment is supported for both standard production and GPU-enabled setups.
For complete deployment guide, see Production Guide.
🤝 Contributing
We welcome contributions! See areas where help is especially needed:
- 🎯 Special Functions Precision: Improve numerical accuracy
- ⚡ Performance Optimization: SIMD improvements, GPU optimization
- 🧪 Testing: More comprehensive test cases
- 📚 Documentation: Examples, tutorials, improvements
- 🌐 Platform Support: WebAssembly, mobile platforms
Development Setup
```bash
# Run tests
cargo test

# Check formatting
cargo fmt --all -- --check

# Run clippy
cargo clippy
```
License
Licensed under either of:
- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.