Performance Optimization Utilities
This module provides performance optimization utilities for model inference, including batch processing, memory optimization, and caching strategies.
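To illustrate the batch-processing idea, here is a minimal, self-contained sketch of an accumulate-and-flush batcher. The `SimpleBatcher` name and its API are hypothetical, not this module's `BatchProcessor` (which likely also handles timeouts and padding); the sketch only shows the core logic of collecting requests until a batch is full.

```rust
/// Minimal batch accumulator: queues inference inputs and hands back
/// a full batch once `max_batch` requests have been collected.
struct SimpleBatcher {
    max_batch: usize,
    pending: Vec<Vec<f32>>, // queued input tensors (flattened)
}

impl SimpleBatcher {
    fn new(max_batch: usize) -> Self {
        Self { max_batch, pending: Vec::new() }
    }

    /// Queue one request; returns `Some(batch)` when the batch is full,
    /// leaving the internal queue empty for the next batch.
    fn push(&mut self, input: Vec<f32>) -> Option<Vec<Vec<f32>>> {
        self.pending.push(input);
        if self.pending.len() >= self.max_batch {
            Some(std::mem::take(&mut self.pending))
        } else {
            None
        }
    }
}
```

Batching like this amortizes per-call overhead (kernel launches, memory transfers) across many requests, which is the usual motivation for a batch processor in inference serving.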
Structs§
- AdvancedPerformanceOptimizer - Advanced performance optimizer with workload analysis
- BatchProcessor - Batch processor for efficient inference
- BatchStatistics - Batch processing statistics
- CacheStatistics - Cache statistics
- CachedTensor
- DynamicBatchManager - Dynamic batch manager
- GpuCacheStatistics - Comprehensive GPU cache statistics
- GpuMemoryChunk
- GpuMemoryOptimizer - GPU memory optimizer with intelligent recommendations
- GpuMemoryPool - Advanced GPU memory management
- GpuMemoryStats
- GpuOptimizationRecommendations - GPU memory optimization recommendations
- GpuTensorCache - Advanced GPU tensor caching with memory-aware eviction
- LruCache - LRU cache implementation for tensors
- MemoryOptimizer - Memory optimization utilities
- PerformanceConfig - Configuration for performance optimization
- PerformanceMonitor - Performance monitoring utilities
- PerformanceStatistics - Performance statistics
- WorkloadAnalysis - Workload analysis summary
- WorkloadMetrics - Workload metrics for optimization analysis
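The LruCache entry above is described only as an LRU cache for tensors, so its actual API is not shown here. The following is a hedged, self-contained sketch of the least-recently-used eviction policy such a cache implements, using a hypothetical `TinyLru` type (a `VecDeque` tracks recency; the front is the eviction candidate):

```rust
use std::collections::{HashMap, VecDeque};

/// Tiny LRU cache mapping string keys to tensors (here, flat Vec<f32>).
struct TinyLru {
    capacity: usize,
    order: VecDeque<String>,      // recency order; back = most recent
    map: HashMap<String, Vec<f32>>,
}

impl TinyLru {
    fn new(capacity: usize) -> Self {
        Self { capacity, order: VecDeque::new(), map: HashMap::new() }
    }

    /// Look up a tensor, marking the key as most recently used on a hit.
    fn get(&mut self, key: &str) -> Option<&Vec<f32>> {
        if self.map.contains_key(key) {
            self.order.retain(|k| k != key);
            self.order.push_back(key.to_string());
            self.map.get(key)
        } else {
            None
        }
    }

    /// Insert a tensor, evicting the least recently used entry if full.
    fn put(&mut self, key: String, tensor: Vec<f32>) {
        if self.map.contains_key(&key) {
            self.order.retain(|k| *k != key);
        } else if self.map.len() >= self.capacity {
            // Evict the least recently used key (front of the deque).
            if let Some(lru) = self.order.pop_front() {
                self.map.remove(&lru);
            }
        }
        self.order.push_back(key.clone());
        self.map.insert(key, tensor);
    }
}
```

A production tensor cache (such as the memory-aware GpuTensorCache listed above) would typically evict by byte size rather than entry count, but the recency bookkeeping is the same idea.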
Enums§
- BatchingStrategy - Dynamic batching strategy
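The variants of BatchingStrategy are not documented in this listing, so the following sketch is illustrative only: a hypothetical `Strategy` enum that picks a batch size from the current queue depth, which is the usual shape of a dynamic batching policy.

```rust
/// Illustrative dynamic batching policy (variant names are assumptions,
/// not this module's actual BatchingStrategy variants).
enum Strategy {
    /// Always use the same batch size.
    Fixed(usize),
    /// Batch whatever is queued, clamped to a [min, max] range.
    Adaptive { min: usize, max: usize },
}

impl Strategy {
    /// Choose a batch size given the number of currently queued requests.
    fn batch_size(&self, queued: usize) -> usize {
        match self {
            Strategy::Fixed(n) => *n,
            Strategy::Adaptive { min, max } => queued.clamp(*min, *max),
        }
    }
}
```

The adaptive variant trades latency for throughput at runtime: under light load it issues small batches immediately, and under heavy load it grows batches up to the configured maximum.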