Module kv_cache

KV-Cache abstraction with handle semantics and block management

This module provides a sentence-handle-based abstraction for KV cache management. It supports both contiguous and paged attention patterns with zero-copy operations.

Structs

AllocationRequest: KV cache allocation request
BlockTable: Block table for mapping logical to physical cache blocks
CacheConfig: Cache configuration
CacheGcStats: Garbage collection statistics
CacheHandleStats: Statistics for an individual cache handle
CacheManagerStats: Cache manager statistics
CompressionStats: Cache compression statistics
LruEvictionPolicy: Least Recently Used eviction policy
MemoryPressureThresholds: Memory pressure threshold configuration
PrefixCacheConfig: Prefix caching configuration
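
The logical-to-physical mapping that BlockTable describes can be illustrated with a minimal sketch. The type `SketchBlockTable`, its fields, and its methods are hypothetical names invented for this example; the actual `BlockTable` API is not shown on this page and may differ.

```rust
/// Hypothetical paged block table (illustration only, not the crate's `BlockTable`).
#[derive(Debug)]
pub struct SketchBlockTable {
    /// Number of token slots per block (assumed fixed for all blocks).
    block_size: usize,
    /// Index = logical block number, value = physical block id.
    logical_to_physical: Vec<usize>,
}

impl SketchBlockTable {
    pub fn new(block_size: usize) -> Self {
        assert!(block_size > 0, "block_size must be non-zero");
        Self { block_size, logical_to_physical: Vec::new() }
    }

    /// Map the next logical block onto the given physical block.
    pub fn push_block(&mut self, physical_block: usize) {
        self.logical_to_physical.push(physical_block);
    }

    /// Translate a token position into (physical block id, offset within block).
    /// Returns `None` when the position lies beyond the mapped blocks.
    pub fn locate(&self, token_pos: usize) -> Option<(usize, usize)> {
        let logical = token_pos / self.block_size;
        let offset = token_pos % self.block_size;
        self.logical_to_physical.get(logical).map(|&p| (p, offset))
    }
}
```

With a block size of 16 and physical blocks 7 and 2 pushed in that order, token position 17 would resolve to block 2 at offset 1. Because the sequence only sees logical indices, physical blocks need not be contiguous, which is the property paged attention relies on.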

Enums

MemoryPressure: Memory pressure levels for adaptive management
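
As a rough illustration of how pressure levels and thresholds might interact, the sketch below classifies a usage ratio into a level. The variant names, the threshold fields, and the `classify` function are all assumptions made for this example; the real `MemoryPressure` variants and `MemoryPressureThresholds` fields are not listed on this page.

```rust
/// Hypothetical pressure levels (the real `MemoryPressure` variants may differ).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum SketchPressure {
    Low,
    Medium,
    High,
    Critical,
}

/// Hypothetical thresholds, expressed as fractions of capacity in [0.0, 1.0].
#[derive(Debug, Clone, Copy)]
pub struct SketchThresholds {
    pub medium: f64,
    pub high: f64,
    pub critical: f64,
}

/// Classify current usage against the thresholds.
/// A zero-capacity cache is treated as critical, since nothing can be allocated.
pub fn classify(used_bytes: u64, capacity_bytes: u64, t: &SketchThresholds) -> SketchPressure {
    if capacity_bytes == 0 {
        return SketchPressure::Critical;
    }
    let usage = used_bytes as f64 / capacity_bytes as f64;
    if usage >= t.critical {
        SketchPressure::Critical
    } else if usage >= t.high {
        SketchPressure::High
    } else if usage >= t.medium {
        SketchPressure::Medium
    } else {
        SketchPressure::Low
    }
}
```

A manager could use the returned level to decide how aggressively to evict or compress, for example evicting only at `High` and above.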

Traits

AdvancedKvCacheManager: Advanced KV cache capabilities
BlockAllocator: Block-based cache allocator
CacheEvictionPolicy: Cache eviction strategies
KvCacheHandle: KV cache handle providing access to cached key-value states
KvCacheManager: KV cache manager for allocation and lifecycle management
MultiDeviceCacheManager: Multi-device cache manager supporting GPU/CPU hierarchies
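
To make the relationship between an eviction-policy trait and an LRU implementation concrete, here is a minimal sketch. The trait `SketchEvictionPolicy`, its two methods, and the `SketchLru` type are hypothetical; the actual method signatures of `CacheEvictionPolicy` and `LruEvictionPolicy` are not shown on this page.

```rust
use std::collections::VecDeque;

/// Hypothetical eviction-policy interface (not the crate's `CacheEvictionPolicy`).
pub trait SketchEvictionPolicy {
    /// Record that the cache entry with this id was just used.
    fn record_access(&mut self, handle_id: u64);
    /// Pick and remove the next entry to evict, if any.
    fn select_victim(&mut self) -> Option<u64>;
}

/// Simple LRU ordering: front = least recently used, back = most recently used.
#[derive(Debug, Default)]
pub struct SketchLru {
    order: VecDeque<u64>,
}

impl SketchEvictionPolicy for SketchLru {
    fn record_access(&mut self, handle_id: u64) {
        // Linear scan keeps the sketch short; a production policy would
        // likely pair a hash map with a linked list for O(1) updates.
        if let Some(pos) = self.order.iter().position(|&id| id == handle_id) {
            self.order.remove(pos);
        }
        self.order.push_back(handle_id);
    }

    fn select_victim(&mut self) -> Option<u64> {
        self.order.pop_front()
    }
}
```

Keeping the policy behind a trait lets a manager swap LRU for another strategy without changing its allocation code.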
