Expand description
Per-layer K/V cache for autoregressive decode (Whisper, Qwen, Gemma, …).
Structs§
- Layer
KvCache - Layer-wise past K/V tensors in row-major
[past_len * kv_dim]layout per layer.
Per-layer K/V cache for autoregressive decode (Whisper, Qwen, Gemma, …).
[past_len * kv_dim] layout per layer.