Module memory

Memory management for sparse inference.

This module provides weight quantization and neuron caching for efficient memory usage during inference.
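The two ideas named above can be illustrated with a minimal sketch. It is not this module's actual API: the types `Int8Weights` and `HotColdTracker`, their methods, the symmetric int8 scheme, and the count-based hot threshold are all assumptions chosen for illustration. The module's real `QuantizedWeights` and `NeuronCache` may work differently.

```rust
use std::collections::HashMap;

/// Hypothetical symmetric int8 quantization of a weight slice.
/// Each f32 weight is stored as an i8 plus one shared scale factor,
/// which cuts storage from 4 bytes to roughly 1 byte per weight.
struct Int8Weights {
    values: Vec<i8>,
    scale: f32,
}

impl Int8Weights {
    /// Quantize by mapping the largest magnitude to 127.
    fn quantize(weights: &[f32]) -> Self {
        let max_abs = weights.iter().fold(0.0f32, |m, w| m.max(w.abs()));
        // Avoid dividing by zero when every weight is zero.
        let scale = if max_abs > 0.0 { max_abs / 127.0 } else { 1.0 };
        let values = weights
            .iter()
            .map(|w| (w / scale).round().clamp(-127.0, 127.0) as i8)
            .collect();
        Self { values, scale }
    }

    /// Recover an approximation of the original weight at index `i`.
    fn dequantize(&self, i: usize) -> f32 {
        self.values[i] as f32 * self.scale
    }
}

/// Hypothetical hot/cold tracker: a neuron counts as "hot" once it has
/// been recorded as active at least `hot_threshold` times.
struct HotColdTracker {
    counts: HashMap<usize, u64>,
    hot_threshold: u64,
}

impl HotColdTracker {
    fn new(hot_threshold: u64) -> Self {
        Self { counts: HashMap::new(), hot_threshold }
    }

    /// Record one activation of the given neuron index.
    fn record(&mut self, neuron: usize) {
        *self.counts.entry(neuron).or_insert(0) += 1;
    }

    /// Neurons that were never recorded are treated as cold.
    fn is_hot(&self, neuron: usize) -> bool {
        self.counts.get(&neuron).copied().unwrap_or(0) >= self.hot_threshold
    }
}
```

In a sparse-inference setting, a tracker like this could decide which neurons keep their weights resident in fast memory, while quantization shrinks the footprint of whatever is stored. The rounding error per weight in this scheme is bounded by half of `scale`.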

Structs

CacheStats
Cache statistics.
NeuronCache
Neuron activation cache for hot/cold management.
QuantizedWeights
Quantized weight storage for reduced memory usage.