Module kv_cache


KV Cache module for efficient autoregressive generation

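The KV cache stores the key and value states of tokens that have already been processed. During autoregressive generation, each decoding step then computes keys and values only for the newest token and reuses the cached ones, instead of recomputing them for the whole sequence, which significantly improves inference speed.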
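
To make the idea concrete, here is a minimal sketch of a single-head cache. It is not the `KVCache` struct documented below: `SketchKvCache`, `append`, `seq_len`, and `attend` are hypothetical names chosen for illustration, and the flat `Vec<f32>` layout is an assumption rather than this module's actual storage.

```rust
/// Hypothetical single-head cache for illustration only;
/// NOT the `KVCache` struct exported by this module.
struct SketchKvCache {
    head_dim: usize,
    keys: Vec<f32>,   // flattened [seq_len, head_dim]
    values: Vec<f32>, // flattened [seq_len, head_dim]
}

impl SketchKvCache {
    fn new(head_dim: usize) -> Self {
        assert!(head_dim > 0, "head_dim must be non-zero");
        Self { head_dim, keys: Vec::new(), values: Vec::new() }
    }

    /// Append the key and value vectors of one newly generated token.
    /// Earlier tokens are never recomputed; they stay in the buffers.
    fn append(&mut self, key: &[f32], value: &[f32]) {
        assert_eq!(key.len(), self.head_dim);
        assert_eq!(value.len(), self.head_dim);
        self.keys.extend_from_slice(key);
        self.values.extend_from_slice(value);
    }

    /// Number of token positions currently cached.
    fn seq_len(&self) -> usize {
        self.keys.len() / self.head_dim
    }

    /// Scaled dot-product attention of one query over all cached positions.
    fn attend(&self, query: &[f32]) -> Vec<f32> {
        assert_eq!(query.len(), self.head_dim);
        let scale = 1.0 / (self.head_dim as f32).sqrt();
        // One score per cached key.
        let scores: Vec<f32> = self
            .keys
            .chunks(self.head_dim)
            .map(|k| k.iter().zip(query).map(|(a, b)| a * b).sum::<f32>() * scale)
            .collect();
        // Numerically stable softmax.
        let max = scores.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
        let exps: Vec<f32> = scores.iter().map(|s| (s - max).exp()).collect();
        let denom: f32 = exps.iter().sum();
        // Weighted sum of the cached values.
        let mut out = vec![0.0; self.head_dim];
        for (w, v) in exps.iter().zip(self.values.chunks(self.head_dim)) {
            for (o, x) in out.iter_mut().zip(v) {
                *o += (w / denom) * x;
            }
        }
        out
    }
}
```

In a decoding loop under these assumptions, each step would call `append` once with the new token's key and value and then `attend` with its query, so the per-step projection cost stays constant while only the attention sum grows with sequence length.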

Structs

KVCache
KV Cache for caching key and value states during generation
PagedKVCache
Paged KV Cache for vLLM-style memory management
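
The paged approach can be sketched roughly as follows: cache memory is split into fixed-size blocks, and each sequence keeps a block table mapping its logical positions to physical blocks drawn from a shared free pool. This is only an illustration of the bookkeeping, not the `PagedKVCache` API; `SketchPagedCache`, `append_token`, `free_sequence`, and the `u64` sequence ids are all hypothetical, and the actual tensor storage is omitted.

```rust
use std::collections::HashMap;

/// Hypothetical block-table bookkeeping for illustration only;
/// NOT the `PagedKVCache` struct exported by this module.
struct SketchPagedCache {
    block_size: usize,                       // tokens per physical block
    free_blocks: Vec<usize>,                 // ids of unused physical blocks
    block_tables: HashMap<u64, Vec<usize>>,  // sequence id -> physical block ids
    seq_lens: HashMap<u64, usize>,           // sequence id -> cached token count
}

impl SketchPagedCache {
    fn new(num_blocks: usize, block_size: usize) -> Self {
        assert!(block_size > 0, "block_size must be non-zero");
        Self {
            block_size,
            // Reversed so that `pop` hands out block 0 first.
            free_blocks: (0..num_blocks).rev().collect(),
            block_tables: HashMap::new(),
            seq_lens: HashMap::new(),
        }
    }

    /// Reserve a slot for one more token of `seq`.
    /// Returns `(physical_block, offset_in_block)`, or `None` if the pool is exhausted.
    fn append_token(&mut self, seq: u64) -> Option<(usize, usize)> {
        let len = self.seq_lens.get(&seq).copied().unwrap_or(0);
        let offset = len % self.block_size;
        if offset == 0 {
            // The last block is full (or the sequence is new): take a fresh block.
            let block = self.free_blocks.pop()?;
            self.block_tables.entry(seq).or_default().push(block);
        }
        let block = *self.block_tables.get(&seq)?.last()?;
        self.seq_lens.insert(seq, len + 1);
        Some((block, offset))
    }

    /// Return all blocks of a finished sequence to the free pool.
    fn free_sequence(&mut self, seq: u64) {
        if let Some(blocks) = self.block_tables.remove(&seq) {
            self.free_blocks.extend(blocks);
        }
        self.seq_lens.remove(&seq);
    }

    /// Number of physical blocks still available.
    fn free_block_count(&self) -> usize {
        self.free_blocks.len()
    }
}
```

Because blocks are allocated only when a sequence actually grows into them, memory is not reserved up front for a maximum sequence length, and blocks released by one sequence can be reused immediately by another.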