rustorch 0.6.29

Production-ready PyTorch-compatible deep learning library in Rust with special mathematical functions (gamma, Bessel, error functions), statistical distributions, Fourier transforms (FFT/RFFT), matrix decomposition (SVD/QR/LU/eigenvalue), automatic differentiation, neural networks, computer vision transforms, complete GPU acceleration (CUDA/Metal/OpenCL), SIMD optimizations, parallel processing, WebAssembly browser support, comprehensive distributed learning support, and performance validation
Documentation
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
# RusTorch 🚀

[![Crates.io](https://img.shields.io/crates/v/rustorch)](https://crates.io/crates/rustorch)
[![Documentation](https://docs.rs/rustorch/badge.svg)](https://docs.rs/rustorch)
[![License](https://img.shields.io/badge/license-MIT%2FApache--2.0-blue.svg)](https://github.com/JunSuzukiJapan/rustorch)
[![Tests](https://img.shields.io/badge/tests-1060%20passing-brightgreen.svg)](#testing)
[![Build](https://img.shields.io/badge/build-passing-brightgreen.svg)](#testing)

## 🌐 多言語ドキュメント / Multilingual Documentation

| Language | README | Jupyter Guide |
|----------|---------|---------------|
| 🇺🇸 [English](docs/i18n/en/README.md) | [📖 Main](docs/i18n/en/README.md) | [📓 Jupyter](docs/i18n/en/jupyter-guide.md) |
| 🇯🇵 [日本語](docs/i18n/ja/README.md) | [📖 メイン](docs/i18n/ja/README.md) | [📓 Jupyter](docs/i18n/ja/jupyter-guide.md) |
| 🇫🇷 [Français](docs/i18n/fr/README.md) | [📖 Principal](docs/i18n/fr/README.md) | [📓 Jupyter](docs/i18n/fr/jupyter-guide.md) |
| 🇮🇹 [Italiano](docs/i18n/it/README.md) | [📖 Principale](docs/i18n/it/README.md) | [📓 Jupyter](docs/i18n/it/jupyter-guide.md) |
| 🇪🇸 [Español](docs/i18n/es/README.md) | [📖 Principal](docs/i18n/es/README.md) | [📓 Jupyter](docs/i18n/es/jupyter-guide.md) |
| 🇨🇳 [中文](docs/i18n/zh/README.md) | [📖 主要](docs/i18n/zh/README.md) | [📓 Jupyter](docs/i18n/zh/jupyter-guide.md) |
| 🇰🇷 [한국어](docs/i18n/ko/README.md) | [📖 메인](docs/i18n/ko/README.md) | [📓 Jupyter](docs/i18n/ko/jupyter-guide.md) |
| 🇩🇪 [Deutsch](docs/i18n/de/README.md) | [📖 Hauptseite](docs/i18n/de/README.md) | [📓 Jupyter](docs/i18n/de/jupyter-guide.md) |
| 🇷🇺 [Русский](docs/i18n/ru/README.md) | [📖 Основной](docs/i18n/ru/README.md) | [📓 Jupyter](docs/i18n/ru/jupyter-guide.md) |
| 🇵🇹 [Português](docs/i18n/pt/README.md) | [📖 Principal](docs/i18n/pt/README.md) | [📓 Jupyter](docs/i18n/pt/jupyter-guide.md) |

---

**A production-ready deep learning library in Rust with PyTorch-like API, GPU acceleration, and enterprise-grade performance**  
**本番環境対応のRust製ディープラーニングライブラリ - PyTorchライクなAPI、GPU加速、エンタープライズグレードパフォーマンス**

RusTorch is a fully functional deep learning library that leverages Rust's safety and performance. **Phase 8 COMPLETED** brings advanced tensor utilities with **conditional operations, indexing, and statistical functions**. Features comprehensive tensor operations, automatic differentiation, neural network layers, transformer architectures, multi-backend GPU acceleration (CUDA/Metal/OpenCL/CoreML), advanced SIMD optimizations, enterprise-grade memory management, data validation & quality assurance, and comprehensive debug & logging systems.

## ✨ Features

- 🔥 **Comprehensive Tensor Operations**: Math operations, broadcasting, indexing, statistics, and Phase 8 advanced utilities
- 🤖 **Transformer Architecture**: Complete transformer implementation with multi-head attention
- 🧮 **Matrix Decomposition**: SVD, QR, eigenvalue decomposition with PyTorch compatibility
- 🧠 **Automatic Differentiation**: Tape-based computational graph for gradient computation
- 🚀 **Dynamic Execution Engine**: JIT compilation and runtime optimization
- 🏗️ **Neural Network Layers**: Linear, Conv1d/2d/3d, ConvTranspose, RNN/LSTM/GRU, BatchNorm, Dropout, and more
- ⚡ **Cross-Platform Optimizations**: SIMD (AVX2/SSE/NEON), platform-specific, and hardware-aware optimizations
- 🎮 **GPU Integration**: CUDA/Metal/OpenCL/CoreML support with automatic device selection
- 🌐 **WebAssembly Support**: Complete browser ML with Neural Network layers, Computer Vision, and real-time inference
- 🎮 **WebGPU Integration**: Chrome-optimized GPU acceleration with CPU fallback for cross-browser compatibility
- 📁 **Model Format Support**: Safetensors, ONNX inference, PyTorch state dict compatibility
- ✅ **Production Ready**: 1060 tests passing, unified error handling system with `RusTorchError`
- 📐 **Enhanced Mathematical Functions**: Complete set of mathematical functions (exp, ln, sin, cos, tan, sqrt, abs, pow)
- 🔧 **Advanced Operator Overloads**: Full operator support for tensors with scalar operations and in-place assignments
- 📈 **Advanced Optimizers**: SGD, Adam, AdamW, RMSprop, AdaGrad with learning rate schedulers
- 🚀 **Phase 2 Optimization Framework**: NAdam, RAdam, Adamax, Enhanced L-BFGS with 500%+ performance boost
- ⚡ **World-Class Performance**: Adamax 33,632 steps/sec, RAdam 21,939 steps/sec, NAdam 18,976 steps/sec
- 🔍 **Data Validation & Quality Assurance**: Statistical analysis, anomaly detection, consistency checking, real-time monitoring
- 🐛 **Comprehensive Debug & Logging**: Structured logging, performance profiling, memory tracking, automated alerts
- 🎯 **Phase 8 Tensor Utilities**: Conditional operations (where, masked_select, masked_fill), indexing operations (gather, scatter, index_select), statistical operations (topk, kthvalue), and advanced utilities (unique, histogram)

For detailed features, see [Features Documentation](docs/core/features.md).

## 🚀 Quick Start

### 📓 Interactive Jupyter Demos

- **インストール (Install)**
```bash
curl -sSL https://raw.githubusercontent.com/JunSuzukiJapan/rustorch/main/scripts/install_jupyter.sh | bash
```

- **起動 (Launch)**
```bash
rustorch-jupyter          # Global command / グローバルコマンド
```
**あるいは (OR)**
```bash
./start_jupyter_quick.sh  # Interactive menu / 対話式メニュー
```

- **コーディング�����!(Start coding!)**

**🎉 That's it! Your browser will open with Jupyter ready to use RusTorch!**  
**🎉 これで完了!ブラウザでJupyterが開き、RusTorchを使う準備完了!**

---

**🚀 Manual Setup** - Or choose specific demo type:

| Demo Type | Setup Command | Description |
|-----------|---------------|-------------|
| 🦀🐍 **Hybrid** | `./start_jupyter_hybrid.sh` | Python + Rust dual-kernel environment |
| 🐍 **Python** | `./start_jupyter.sh` | Standard CPU-based ML demos |
| ⚡ **WebGPU** | `./start_jupyter_webgpu.sh` | Browser GPU acceleration (Chrome) |
| 🦀 **Rust Kernel** | `./scripts/quick_start_rust_kernel.sh` | Native Rust in Jupyter |
| 🌐 **Online** | [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/JunSuzukiJapan/rustorch/main?urlpath=lab) | No setup needed - run in browser |

📚 **Detailed Setup**: [Complete Jupyter Guide](README_JUPYTER.md) | [日本語ガイド](docs/jupyter-wasm-guide.md)

### Installation

Add this to your `Cargo.toml`:

```toml
[dependencies]
rustorch = "0.6.29"

# Optional features
[features]
default = ["linalg"]
linalg = ["rustorch/linalg"]           # Linear algebra operations (SVD, QR, eigenvalue)
cuda = ["rustorch/cuda"]
metal = ["rustorch/metal"]
opencl = ["rustorch/opencl"]
coreml = ["rustorch/coreml"]
safetensors = ["rustorch/safetensors"]
onnx = ["rustorch/onnx"]
wasm = ["rustorch/wasm"]                # WebAssembly support for browser ML
webgpu = ["rustorch/webgpu"]            # Chrome-optimized WebGPU acceleration

# To disable linalg features (avoid OpenBLAS/LAPACK dependencies):
rustorch = { version = "0.6.29", default-features = false }
```

### Basic Usage

```rust
use rustorch::{tensor, tensor::Tensor};
use rustorch::optim::{SGD, WarmupScheduler, OneCycleLR, AnnealStrategy};
use rustorch::error::RusTorchResult;

fn main() -> RusTorchResult<()> {
    // Create tensors with convenient macro syntax
    let a = tensor!([[1, 2], [3, 4]]);
    let b = tensor!([[5, 6], [7, 8]]);

    // Basic operations with operator overloads
    let c = &a + &b;  // Element-wise addition
    let d = &a - &b;  // Element-wise subtraction
    let e = &a * &b;  // Element-wise multiplication
    let f = &a / &b;  // Element-wise division

    // Scalar operations
    let g = &a + 10.0;  // Add scalar to all elements
    let h = &a * 2.0;   // Multiply by scalar

    // Mathematical functions
    let exp_result = a.exp();   // Exponential function
    let ln_result = a.ln();     // Natural logarithm
    let sin_result = a.sin();   // Sine function
    let sqrt_result = a.sqrt(); // Square root

    // Matrix operations
    let matmul_result = a.matmul(&b);  // Matrix multiplication

    // Linear algebra operations (requires linalg feature)
    #[cfg(feature = "linalg")]
    {
        let svd_result = a.svd();       // SVD decomposition
        let qr_result = a.qr();         // QR decomposition
        let eig_result = a.eigh();      // Eigenvalue decomposition
    }

    // Advanced optimizers with learning rate scheduling
    let optimizer = SGD::new(0.01);
    let mut scheduler = WarmupScheduler::new(optimizer, 0.1, 5); // Warmup to 0.1 over 5 epochs

    // One-cycle learning rate policy
    let optimizer2 = SGD::new(0.01);
    let mut one_cycle = OneCycleLR::new(optimizer2, 1.0, 100, 0.3, AnnealStrategy::Cos);

    println!("Shape: {:?}", c.shape());
    println!("Result: {:?}", c.as_slice());

    Ok(())
}
```

### WebAssembly Usage

For browser-based ML applications:

```javascript
import init, * as rustorch from './pkg/rustorch.js';

async function browserML() {
    await init();
    
    // Neural network layers
    const linear = new rustorch.WasmLinear(784, 10, true);
    const conv = new rustorch.WasmConv2d(3, 32, 3, 1, 1, true);
    
    // Enhanced mathematical functions
    const gamma_result = rustorch.WasmSpecial.gamma_batch([1.5, 2.0, 2.5]);
    const bessel_result = rustorch.WasmSpecial.bessel_i_batch(0, [0.5, 1.0, 1.5]);
    
    // Statistical distributions
    const normal_dist = new rustorch.WasmDistributions();
    const samples = normal_dist.normal_sample_batch(100, 0.0, 1.0);
    
    // Optimizers for training
    const sgd = new rustorch.WasmOptimizer();
    sgd.sgd_init(0.01, 0.9); // learning_rate, momentum
    
    // Image processing
    const resized = rustorch.WasmVision.resize(image, 256, 256, 224, 224, 3);
    const normalized = rustorch.WasmVision.normalize(resized, [0.485, 0.456, 0.406], [0.229, 0.224, 0.225], 3);
    
    // Forward pass
    const predictions = conv.forward(normalized, 1, 224, 224);
    console.log('Browser ML predictions:', predictions);
}
```

### WebGPU Acceleration (Chrome Optimized)

For Chrome browsers with WebGPU support:

```javascript
import init, * as rustorch from './pkg/rustorch.js';

async function webgpuML() {
    await init();
    
    // Initialize WebGPU engine
    const webgpu = new rustorch.WebGPUSimple();
    await webgpu.initialize();
    
    // Check WebGPU support
    const supported = await webgpu.check_webgpu_support();
    if (supported) {
        console.log('🚀 WebGPU acceleration enabled');
        
        // High-performance tensor operations
        const a = [1, 2, 3, 4];
        const b = [5, 6, 7, 8];
        const result = webgpu.tensor_add_cpu(a, b);
        
        // Matrix multiplication with GPU optimization
        const matrix_a = new Array(256 * 256).fill(0).map(() => Math.random());
        const matrix_b = new Array(256 * 256).fill(0).map(() => Math.random());
        const matrix_result = webgpu.matrix_multiply_cpu(matrix_a, matrix_b, 256, 256, 256);
        
        console.log('WebGPU accelerated computation complete');
    } else {
        console.log('⚠️ WebGPU not supported, using CPU fallback');
    }
    
    // Interactive demo interface
    const demo = new rustorch.WebGPUSimpleDemo();
    demo.create_interface(); // Creates browser UI for testing
}
```

**Run examples:**
```bash
# Basic functionality examples
cargo run --example activation_demo --no-default-features
cargo run --example complex_tensor_demo --no-default-features  
cargo run --example neural_network_demo --no-default-features
cargo run --example autograd_demo --no-default-features

# Machine learning examples
cargo run --example boston_housing_regression --no-default-features
cargo run --example vision_pipeline_demo --no-default-features
cargo run --example embedding_demo --no-default-features

# Linear algebra examples (requires linalg feature)
cargo run --example eigenvalue_demo --features linalg
cargo run --example svd_demo --features linalg
cargo run --example matrix_decomposition_demo --features linalg

# Mathematical functions
cargo run --example special_functions_demo --no-default-features
cargo run --example fft_demo --no-default-features
```

For more examples, see [Getting Started Guide](docs/core/getting-started.md) and [WebAssembly Guide](docs/specialized/wasm/WASM_GUIDE.md).

## 📚 Documentation

- **[Getting Started](docs/core/getting-started.md)** - Basic usage and examples
- **[Features](docs/core/features.md)** - Complete feature list and specifications
- **[Performance](docs/guides/performance.md)** - Benchmarks and optimization details
- **[Architecture](docs/core/architecture.md)** - System design and project structure
- **[Examples](docs/guides/examples.md)** - Comprehensive code examples
- **[API Documentation](https://docs.rs/rustorch)** - Detailed API reference

### WebAssembly & Browser ML
- **[WebAssembly Guide](docs/specialized/wasm/WASM_GUIDE.md)** - Complete WASM integration and API reference
- **[Jupyter Guide](docs/guides/jupyter-guide.md)** - Step-by-step Jupyter Notebook setup
- **[WebGPU Integration](docs/specialized/wasm/webgpu-integration.md)** - Chrome-optimized GPU acceleration
- **[Browser Compatibility](docs/specialized/compatibility/BROWSER_COMPATIBILITY.md)** - Cross-browser support matrix
- **[WASM Performance](docs/specialized/wasm/WASM_PERFORMANCE.md)** - Benchmarking and optimization strategies

### Production & Operations
- **[GPU Acceleration Guide](docs/specialized/gpu/GPU_ACCELERATION_GUIDE.md)** - GPU setup and usage
- **[Production Guide](docs/guides/production.md)** - Deployment and scaling

## 📊 Performance

**🏆 Phase 2 Performance Revolution - Up to 580% improvement:**

### Advanced Optimizer Performance (Release Mode)
| Optimizer | Performance | Use Case | Status |
|-----------|-------------|----------|--------|
| **Adamax** | **33,632 steps/sec** ⚡ | Sparse features, Embeddings | **FASTEST** |
| **RAdam** | **21,939 steps/sec** 🚀 | General deep learning, Stability | **RECOMMENDED** |
| **NAdam** | **18,976 steps/sec** ✨ | NLP, Fine-tuning | **NLP-OPTIMIZED** |

### Core Operations Performance
| Operation | Performance | Details |
|-----------|-------------|---------|
| **SVD Decomposition** | ~1ms (8x8 matrix) | ✅ LAPACK-based |
| **QR Decomposition** | ~24μs (8x8 matrix) | ✅ Fast decomposition |
| **Eigenvalue** | ~165μs (8x8 matrix) | ✅ Symmetric matrices |
| **Complex FFT** | 10-312μs (8-64 samples) | ✅ Cooley-Tukey optimized |
| **Neural Network** | 1-7s training | ✅ Boston housing demo |
| **Activation Functions** | <1μs | ✅ ReLU, Sigmoid, Tanh |

### Phase 2 Architectural Benefits
- **🧹 50%+ code reduction** through unified GenericAdamOptimizer
- **🔧 Consistent API** across all Adam variants
- **⚡ Shared optimizations** benefiting all optimizers
- **🛡️ Robust error handling** with RusTorchResult<T>

**Run benchmarks:**
```bash
# Basic benchmarks (no external dependencies)
cargo bench --no-default-features

# Linear algebra benchmarks (requires linalg feature) 
cargo bench --features linalg

# Phase 2 optimizer benchmarks (NEW!)
cargo run --bin quick_optimizer_bench --release

# Quick manual benchmark
cargo run --bin manual_quick_bench
```

For detailed performance analysis, see [Performance Documentation](docs/guides/performance.md).


## 🧪 Testing

**1060 tests passing** - Production-ready quality assurance with unified `RusTorchError` error handling system.

```bash
# Run all tests (recommended for CI/development)
cargo test --no-default-features

# Run tests with linear algebra features
cargo test --features linalg

# Run doctests
cargo test --doc --no-default-features

# Test specific modules
cargo test --no-default-features tensor::
cargo test --no-default-features complex::
```

## 🚀 Production Deployment

### Docker
```bash
# Production deployment
docker build -t rustorch:latest .
docker run -it rustorch:latest

# GPU-enabled deployment
docker build -f docker/Dockerfile.gpu -t rustorch:gpu .
docker run --gpus all -it rustorch:gpu
```

For complete deployment guide, see [Production Guide](docs/guides/production.md).

## 🤝 Contributing

We welcome contributions! See areas where help is especially needed:

- **🎯 Special Functions Precision**: Improve numerical accuracy
- **⚡ Performance Optimization**: SIMD improvements, GPU optimization  
- **🧪 Testing**: More comprehensive test cases
- **📚 Documentation**: Examples, tutorials, improvements
- **🌐 Platform Support**: WebAssembly, mobile platforms

### Development Setup

```bash
git clone https://github.com/JunSuzukiJapan/rustorch.git
cd rustorch

# Run tests
cargo test --all-features

# Check formatting
cargo fmt --check

# Run clippy
cargo clippy --all-targets --all-features
```

## 🔒 API Stability / API安定性

**English**: Starting from v0.6.0, RusTorch has reached a stable API milestone. We are committed to maintaining backward compatibility and will avoid major breaking changes in the near future. This ensures a reliable foundation for production applications and long-term projects.

**日本語**: v0.6.0以降、RusTorchは安定したAPIマイルストーンに達しました。後方互換性を維持することをお約束し、近い将来において大規模な破壊的変更は行いません。これにより、本番アプリケーションと長期プロジェクトに信頼性の高い基盤を提供します。

## License

Licensed under either of:

 * Apache License, Version 2.0 ([LICENSE-APACHE](LICENSE-APACHE) or http://www.apache.org/licenses/LICENSE-2.0)
 * MIT license ([LICENSE-MIT](LICENSE-MIT) or http://opensource.org/licenses/MIT)

at your option.