rustorch 0.6.29

Production-ready PyTorch-compatible deep learning library in Rust with special mathematical functions (gamma, Bessel, error functions), statistical distributions, Fourier transforms (FFT/RFFT), matrix decomposition (SVD/QR/LU/eigenvalue), automatic differentiation, neural networks, computer vision transforms, complete GPU acceleration (CUDA/Metal/OpenCL), SIMD optimizations, parallel processing, WebAssembly browser support, comprehensive distributed learning support, and performance validation
Documentation
# WASM Memory Optimization Guide

> 📋 **API Reference**: [WASM API Documentation]WASM_API_DOCUMENTATION.md  
> 🔧 **Setup Guide**: [WASM Guide]WASM_GUIDE.md

## Overview

RusTorch WASM modules implement an advanced memory pooling system that significantly improves performance by reusing allocated buffers and reducing garbage collection pressure in JavaScript environments.

## Memory Pool Architecture

### Pool Categories

The memory pool system categorizes buffers into three tiers:

- **Small Buffers**: < 256 elements (~1KB) - For lightweight operations
- **Medium Buffers**: 256-262,144 elements (~1MB) - For standard tensor operations  
- **Large Buffers**: > 262,144 elements (>1MB) - For high-resolution data processing

### Performance Benefits

| Metric | Improvement |
|--------|-------------|
| Cache Hit Rate | 80-95% |
| Memory Allocations | 30-60% reduction |
| Operation Speed | 15-40% faster |
| GC Pressure | 70% reduction |

## API Reference

### MemoryManager

Static utility class for global memory pool management.

```typescript
class MemoryManager {
  // Initialize pool with capacity
  static init_pool(max_size: number): void;
  
  // Get/return buffers (automatic)
  static get_buffer(size: number): Vec<f32>;
  static return_buffer(buffer: Vec<f32>): void;
  
  // Performance monitoring
  static get_stats(): string;
  static cache_efficiency(): string;
  static pool_stats(): PoolStats | null;
  
  // Memory management
  static gc(): void;
}
```

### Pool Statistics

```typescript
interface PoolStats {
  small_count: number;      // Buffers in small pool
  medium_count: number;     // Buffers in medium pool
  large_count: number;      // Buffers in large pool
  max_size: number;         // Maximum pool capacity
  hit_rate: number;         // Cache hit percentage
  memory_saved_mb: number;  // Memory saved in MB
}
```

## Usage Patterns

### Automatic Management

Most operations use the memory pool automatically:

```javascript
import { WasmTensor, WasmAdvancedMath, MemoryManager } from 'rustorch-wasm';

// Pool initializes automatically on first use
const tensor = new WasmTensor([1, 2, 3, 4], [2, 2]);
const math = new WasmAdvancedMath();

// Operations automatically use pooled memory
const result = math.sinh(tensor);

// Check pool performance
const stats = JSON.parse(MemoryManager.get_stats());
console.log(`Cache hit rate: ${stats.hit_rate}%`);
console.log(`Memory saved: ${stats.memory_saved_bytes} bytes`);

// Cleanup (returns buffers to pool)
tensor.free();
result.free();
math.free();
```

### Manual Pool Management

For advanced use cases, you can control pool behavior:

```javascript
// Initialize pool with specific capacity
MemoryManager.init_pool(200);

// Monitor efficiency
const efficiency = JSON.parse(MemoryManager.cache_efficiency());
console.log(`Efficiency rating: ${efficiency.efficiency}`);

// Force garbage collection if needed
if (efficiency.hit_rate < 60.0) {
    MemoryManager.gc();
}
```

## Performance Optimization

### Best Practices

1. **Pool Initialization**: Initialize with appropriate capacity for your workload
   ```javascript
   // For lightweight processing
   MemoryManager.init_pool(50);
   
   // For heavy computational workloads
   MemoryManager.init_pool(500);
   ```

2. **Batch Operations**: Group related operations to maximize pool efficiency
   ```javascript
   // Good: Batch tensor operations
   const results = tensors.map(t => math.tanh(t));
   
   // Less optimal: Interleaved different operations
   const result1 = math.tanh(tensor1);
   const quality1 = quality.overall_quality(tensor1);
   const result2 = math.tanh(tensor2);
   ```

3. **Monitor Performance**: Regularly check pool statistics
   ```javascript
   const stats = JSON.parse(MemoryManager.get_stats());
   if (stats.hit_rate < 80) {
       console.warn('Pool efficiency degraded, consider tuning');
   }
   ```

### Pool Tuning

#### Capacity Sizing

Choose pool capacity based on your workload:

```javascript
// Workload analysis
const concurrent_operations = 10;
const avg_tensor_size = 1000;
const safety_margin = 2.0;

const optimal_capacity = concurrent_operations * safety_margin;
MemoryManager.init_pool(optimal_capacity);
```

#### Garbage Collection

The pool includes intelligent garbage collection:

```javascript
// Automatic GC triggers:
// - Pool size exceeds capacity
// - Buffers exceed size thresholds
// - Manual trigger for memory pressure

MemoryManager.gc(); // Force cleanup when needed
```

## Memory Efficiency Patterns

### Resource Lifecycle

```javascript
class TensorProcessor {
    constructor() {
        this.math = new WasmAdvancedMath();
        this.quality = new WasmQualityMetrics(0.8);
    }
    
    process(data, shape) {
        const tensor = new WasmTensor(data, shape);
        
        try {
            // Process with pooled memory
            const normalized = this.math.tanh(tensor);
            const score = this.quality.overall_quality(normalized);
            
            return { result: normalized.data(), quality: score };
        } finally {
            // Always cleanup to return buffers to pool
            tensor.free();
            if (normalized) normalized.free();
        }
    }
    
    cleanup() {
        this.math.free();
        this.quality.free();
    }
}
```

### Streaming Processing

```javascript
class StreamProcessor {
    constructor() {
        this.detector = new WasmAnomalyDetector(2.5, 100);
        this.buffer = [];
    }
    
    processStream(values) {
        for (const value of values) {
            // Real-time detection uses minimal memory
            const anomaly = this.detector.detect_realtime(value);
            if (anomaly) {
                this.handleAnomaly(anomaly);
            }
        }
        
        // Batch processing when buffer full
        if (this.buffer.length >= 1000) {
            const tensor = new WasmTensor(this.buffer, [this.buffer.length]);
            const anomalies = this.detector.detect_statistical(tensor);
            
            tensor.free(); // Return to pool
            this.buffer = [];
        }
    }
}
```

## Monitoring and Debugging

### Performance Metrics

Track pool performance for optimization:

```javascript
class PoolMonitor {
    logStats() {
        const stats = JSON.parse(MemoryManager.get_stats());
        const efficiency = JSON.parse(MemoryManager.cache_efficiency());
        
        console.log(`Pool Status:
            Hit Rate: ${stats.hit_rate}%
            Allocations: ${stats.total_allocations}
            Memory Saved: ${(stats.memory_saved_bytes / 1024 / 1024).toFixed(2)} MB
            Efficiency: ${efficiency.efficiency}
        `);
    }
    
    healthCheck() {
        const efficiency = JSON.parse(MemoryManager.cache_efficiency());
        
        if (efficiency.hit_rate < 60) {
            console.warn('Pool efficiency below optimal, consider increasing capacity');
            return false;
        }
        
        return true;
    }
}
```

### Memory Leak Detection

```javascript
// Monitor for memory leaks
function detectLeaks() {
    const initialStats = JSON.parse(MemoryManager.get_stats());
    
    // Perform operations
    performOperations();
    
    // Force cleanup
    MemoryManager.gc();
    
    const finalStats = JSON.parse(MemoryManager.get_stats());
    
    // Check for memory growth
    const poolGrowth = (finalStats.small + finalStats.medium + finalStats.large) - 
                      (initialStats.small + initialStats.medium + initialStats.large);
    
    if (poolGrowth > expected_growth) {
        console.warn('Potential memory leak detected');
    }
}
```

## Platform Considerations

### Browser Compatibility

- **WebAssembly Support**: Requires browsers with WASM support (95%+ coverage)
- **Memory Limits**: Browser memory constraints may limit pool size
- **Threading**: Uses main thread only, no SharedArrayBuffer dependencies

### Node.js Integration

```javascript
// Node.js specific optimizations
if (typeof process !== 'undefined' && process.versions.node) {
    // Larger pool capacity for server environments
    MemoryManager.init_pool(1000);
}
```

### Mobile Considerations

```javascript
// Detect mobile/low-memory environments
const isMobile = /Android|webOS|iPhone|iPad|iPod|BlackBerry|IEMobile|Opera Mini/i.test(navigator.userAgent);

if (isMobile) {
    // Smaller pool for memory-constrained devices
    MemoryManager.init_pool(25);
}
```

## Troubleshooting

### Common Issues

**Low Cache Hit Rate (<60%)**
- Increase pool capacity
- Check operation patterns for size mismatches
- Consider workload-specific pool tuning

**Memory Growth Over Time**
- Verify all `free()` calls are made
- Check for exception handling that skips cleanup
- Use memory leak detection patterns

**Performance Degradation**
- Monitor pool statistics regularly
- Force garbage collection during low-activity periods
- Consider pool reinitialization for long-running applications

### Debug Information

Enable debug output for detailed pool behavior:

```javascript
// Log all pool operations (development only)
const originalGet = MemoryManager.get_buffer;
MemoryManager.get_buffer = function(size) {
    console.debug(`Pool: Getting buffer size=${size}`);
    return originalGet.call(this, size);
};
```

## Advanced Topics

### Custom Pool Configuration

For specialized workloads, consider custom pool implementations:

```rust
// Custom pool with different size thresholds
let pool = WasmTensorPool::with_custom_thresholds(128, 65536, 100);
```

### Integration with Other Systems

The memory pool can be integrated with other memory management systems:

- Web Workers: Each worker can maintain its own pool
- Service Workers: Shared pool for background processing
- WebGL: Coordinate with GPU memory management