Performance optimization utilities for neural networks
This module provides comprehensive performance optimizations for neural network operations, including SIMD acceleration, memory-efficient processing, and parallel execution. It is organized into three focused submodules:
- [simd] - SIMD-accelerated operations for vectorized computations
- [memory] - Memory-efficient processing and optimization capabilities
- [threading] - Thread pool management, profiling, and distributed training
§Quick Start
§SIMD Operations
use scirs2_neural::performance::simd::SIMDOperations;
use ndarray::Array;
let input = Array::ones((1000, 512)).into_dyn();
let result = SIMDOperations::simd_relu_f32(&input.view());
§Memory-Efficient Processing
use scirs2_neural::performance::memory::MemoryEfficientProcessor;
let processor = MemoryEfficientProcessor::new(Some(256), Some(1024)); // chunk_size, max_memory_mb
// Process large tensors in manageable chunks
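The processor bounds peak memory by walking large tensors in fixed-size chunks. The crate's own chunking entry points are not shown in the snippet above, so the following is only a minimal standalone sketch of the idea using plain ndarray iteration; the Axis(0) split, the chunk size, and the per-chunk ReLU are illustrative assumptions, not MemoryEfficientProcessor's actual API.
use ndarray::{Array, Axis};
// Illustrative only: process a large tensor 256 rows at a time so peak memory
// scales with one chunk rather than the full array. This mirrors the idea behind
// MemoryEfficientProcessor, not its real interface.
let data = Array::<f32, _>::ones((10_000, 512));
let mut processed_rows = 0;
for chunk in data.axis_chunks_iter(Axis(0), 256) {
    let _activated = chunk.mapv(|x| x.max(0.0)); // stand-in for real per-chunk work
    processed_rows += chunk.nrows();
}
assert_eq!(processed_rows, 10_000);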
§Thread Pool Management
use scirs2_neural::performance::threading::ThreadPoolManager;
use ndarray::Array;
let a = Array::ones((100, 200)).into_dyn();
let b = Array::ones((200, 150)).into_dyn();
let pool = ThreadPoolManager::new(Some(8)).unwrap(); // num_threads
let result = pool.parallel_matmul(&a, &b).unwrap();
assert_eq!(result.shape(), &[100, 150]);
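Conceptually, a parallel matmul of this kind splits the work across worker threads. The standalone sketch below shows the general idea using rayon and plain ndarray; it illustrates the technique only and is not ThreadPoolManager's actual implementation.
use ndarray::{concatenate, Array2, Axis};
use rayon::prelude::*;
// Conceptual sketch only: split the left operand into row blocks, multiply each
// block against the right operand on a separate thread, then stack the partial results.
let lhs = Array2::<f32>::ones((100, 200));
let rhs = Array2::<f32>::ones((200, 150));
let row_blocks: Vec<_> = lhs.axis_chunks_iter(Axis(0), 25).collect();
let partials: Vec<Array2<f32>> = row_blocks.par_iter().map(|rows| rows.dot(&rhs)).collect();
let views: Vec<_> = partials.iter().map(|m| m.view()).collect();
let result = concatenate(Axis(0), &views).unwrap();
assert_eq!(result.shape(), &[100, 150]);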
§Unified Performance Optimization
use scirs2_neural::performance::PerformanceOptimizer;
use ndarray::Array;
let a = Array::ones((100, 200)).into_dyn();
let b = Array::ones((200, 150)).into_dyn();
let mut optimizer = PerformanceOptimizer::new(
    Some(256),  // chunk_size
    Some(1024), // max_memory_mb
    Some(8),    // num_threads
    true,       // enable_profiling
).unwrap();
let result = optimizer.optimized_matmul(&a, &b).unwrap();
optimizer.profiler().print_summary();
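Continuing the example above, a rough wall-clock comparison against a plain ndarray matmul shows what the optimized path buys on a given machine. This is a sketch, not a rigorous benchmark; only optimized_matmul from the snippet above is assumed.
use ndarray::Ix2;
use std::time::Instant;
// Single-run timing only; no warm-up or statistical averaging.
let t0 = Instant::now();
let _optimized = optimizer.optimized_matmul(&a, &b).unwrap();
let optimized_time = t0.elapsed();
let a2 = a.clone().into_dimensionality::<Ix2>().unwrap();
let b2 = b.clone().into_dimensionality::<Ix2>().unwrap();
let t1 = Instant::now();
let _baseline = a2.dot(&b2);
let baseline_time = t1.elapsed();
println!("optimized: {optimized_time:?}, plain ndarray: {baseline_time:?}");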
Re-exports§
pub use simd::SIMDOperations;
pub use memory::MemoryEfficientProcessor;
pub use memory::MemoryMonitor;
pub use memory::MemoryPool;
pub use memory::MemoryPoolStats;
pub use memory::MemorySettings;
pub use memory::MemoryStats;
pub use memory::OptimizationCapabilities;
pub use memory::SIMDStats;
pub use threading::distributed::CommunicationBackend;
pub use threading::distributed::DistributedConfig;
pub use threading::distributed::DistributedManager;
pub use threading::distributed::DistributedStats;
pub use threading::distributed::DistributedStrategy;
pub use threading::distributed::GradientSyncMethod;
pub use threading::distributed::ProcessInfo;
pub use threading::PerformanceProfiler;
pub use threading::ProfilingStats;
pub use threading::ThreadPoolManager;
pub use threading::ThreadPoolStats;
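Because these items are re-exported at the performance module level, the common entry points can be imported without spelling out the submodule paths:
// Equivalent to importing from performance::simd, performance::memory, and performance::threading.
use scirs2_neural::performance::{
    MemoryEfficientProcessor, PerformanceProfiler, SIMDOperations, ThreadPoolManager,
};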
Modules§
- memory - Memory-efficient processing for neural networks
- simd - SIMD-accelerated operations for neural networks
- threading - Threading and parallel processing for neural networks
Structs§
- BenchmarkResults - Benchmark results for different optimization strategies
- PerformanceOptimizer - Unified performance optimization manager
- PerformanceStats - Comprehensive performance statistics