Performance optimization utilities for neural networks
This module provides comprehensive performance optimizations for neural network operations, including SIMD acceleration, memory-efficient processing, and parallel execution. It is organized into three focused submodules:
- [simd] - SIMD-accelerated operations for vectorized computations
- [memory] - Memory-efficient processing and optimization capabilities
- [threading] - Thread pool management, profiling, and distributed training
§Quick Start
§SIMD Operations
use scirs2_neural::performance::simd::SIMDOperations;
use ndarray::Array;
let input = Array::ones((1000, 512)).into_dyn();
let result = SIMDOperations::simd_relu_f32(&input.view());
§Memory-Efficient Processing
use scirs2_neural::performance::memory::MemoryEfficientProcessor;
let processor = MemoryEfficientProcessor::new(Some(256), Some(1024)); // chunk_size, max_memory_mb
// Process large tensors in manageable chunks
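The processor bounds peak memory by walking large tensors in fixed-size chunks. The crate's own chunking entry points are not shown in the snippet above, so the following is only a minimal standalone sketch of the idea using plain ndarray iteration; the Axis(0) split, the chunk size, and the per-chunk ReLU are illustrative assumptions, not MemoryEfficientProcessor's actual API.
use ndarray::{Array, Axis};
// Illustrative only: process a large tensor 256 rows at a time so peak memory
// scales with one chunk rather than the full array. This mirrors the idea behind
// MemoryEfficientProcessor, not its real interface.
let data = Array::<f32, _>::ones((10_000, 512));
let mut processed_rows = 0;
for chunk in data.axis_chunks_iter(Axis(0), 256) {
    let _activated = chunk.mapv(|x| x.max(0.0)); // stand-in for real per-chunk work
    processed_rows += chunk.nrows();
}
assert_eq!(processed_rows, 10_000);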
§Thread Pool Management
use scirs2_neural::performance::threading::ThreadPoolManager;
use ndarray::Array;
let a = Array::ones((100, 200)).into_dyn();
let b = Array::ones((200, 150)).into_dyn();
let pool = ThreadPoolManager::new(Some(8)).unwrap(); // num_threads
let result = pool.parallel_matmul(&a, &b).unwrap();
assert_eq!(result.shape(), &[100, 150]);
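Conceptually, a parallel matmul of this kind splits the work across worker threads. The standalone sketch below shows the general idea using rayon and plain ndarray; it illustrates the technique only and is not ThreadPoolManager's actual implementation.
use ndarray::{concatenate, Array2, Axis};
use rayon::prelude::*;
// Conceptual sketch only: split the left operand into row blocks, multiply each
// block against the right operand on a separate thread, then stack the partial results.
let lhs = Array2::<f32>::ones((100, 200));
let rhs = Array2::<f32>::ones((200, 150));
let row_blocks: Vec<_> = lhs.axis_chunks_iter(Axis(0), 25).collect();
let partials: Vec<Array2<f32>> = row_blocks.par_iter().map(|rows| rows.dot(&rhs)).collect();
let views: Vec<_> = partials.iter().map(|m| m.view()).collect();
let result = concatenate(Axis(0), &views).unwrap();
assert_eq!(result.shape(), &[100, 150]);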
§Unified Performance Optimization
use scirs2_neural::performance::PerformanceOptimizer;
use ndarray::Array;
let a = Array::ones((100, 200)).into_dyn();
let b = Array::ones((200, 150)).into_dyn();
let mut optimizer = PerformanceOptimizer::new(
    Some(256),  // chunk_size
    Some(1024), // max_memory_mb
    Some(8),    // num_threads
    true,       // enable_profiling
).unwrap();
let result = optimizer.optimized_matmul(&a, &b).unwrap();
optimizer.profiler().print_summary();
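Continuing the example above, a rough wall-clock comparison against a plain ndarray matmul shows what the optimized path buys on a given machine. This is a sketch, not a rigorous benchmark; only optimized_matmul from the snippet above is assumed.
use ndarray::Ix2;
use std::time::Instant;
// Single-run timing only; no warm-up or statistical averaging.
let t0 = Instant::now();
let _optimized = optimizer.optimized_matmul(&a, &b).unwrap();
let optimized_time = t0.elapsed();
let a2 = a.clone().into_dimensionality::<Ix2>().unwrap();
let b2 = b.clone().into_dimensionality::<Ix2>().unwrap();
let t1 = Instant::now();
let _baseline = a2.dot(&b2);
let baseline_time = t1.elapsed();
println!("optimized: {optimized_time:?}, plain ndarray: {baseline_time:?}");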
Re-exports§
pub use simd::SIMDOperations;
pub use memory::MemoryEfficientProcessor;
pub use memory::MemoryMonitor;
pub use memory::MemoryPool;
pub use memory::MemoryPoolStats;
pub use memory::MemorySettings;
pub use memory::MemoryStats;
pub use memory::OptimizationCapabilities;
pub use memory::SIMDStats;
pub use threading::distributed::CommunicationBackend;
pub use threading::distributed::DistributedConfig;
pub use threading::distributed::DistributedManager;
pub use threading::distributed::DistributedStats;
pub use threading::distributed::DistributedStrategy;
pub use threading::distributed::GradientSyncMethod;
pub use threading::distributed::ProcessInfo;
pub use threading::PerformanceProfiler;
pub use threading::ProfilingStats;
pub use threading::ThreadPoolManager;
pub use threading::ThreadPoolStats;
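Because these items are re-exported at the performance module level, the common entry points can be imported without spelling out the submodule paths:
// Equivalent to importing from performance::simd, performance::memory, and performance::threading.
use scirs2_neural::performance::{
    MemoryEfficientProcessor, PerformanceProfiler, SIMDOperations, ThreadPoolManager,
};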
Modules§
- memory - Memory-efficient processing for neural networks
- simd - SIMD-accelerated operations for neural networks
- threading - Threading and parallel processing for neural networks
Structs§
- BenchmarkResults - Benchmark results for different optimization strategies
- PerformanceOptimizer - Unified performance optimization manager
- PerformanceStats - Comprehensive performance statistics