§Ferrum KV Cache
MVP KV cache management implementation for the Ferrum inference stack.
This crate provides block-based KV cache management, implementing the
interfaces defined in ferrum-interfaces::kv_cache.
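As a rough illustration of what block-based management means, the sketch below maps token positions to fixed-size physical blocks. It is a minimal, self-contained example, not this crate's implementation: `SimpleBlockTable`, `blocks_needed`, and the block size of 16 used in the usage note are all hypothetical names and values chosen for illustration.

```rust
/// Hypothetical sketch of a block table; NOT the crate's actual `BlockTable`.
/// Index `i` of `physical` is logical block `i`; the value is a physical block id.
#[derive(Debug)]
pub struct SimpleBlockTable {
    block_size: usize,
    physical: Vec<u32>,
}

impl SimpleBlockTable {
    /// Create an empty table whose blocks each hold `block_size` tokens.
    pub fn new(block_size: usize) -> Self {
        assert!(block_size > 0, "block_size must be non-zero");
        Self { block_size, physical: Vec::new() }
    }

    /// Append a physical block as the next logical block.
    pub fn push_block(&mut self, physical_id: u32) {
        self.physical.push(physical_id);
    }

    /// Total number of token slots covered by the mapped blocks.
    pub fn capacity(&self) -> usize {
        self.physical.len() * self.block_size
    }

    /// Translate a token position into (physical block id, offset in block).
    /// Returns `None` when the position lies beyond the mapped blocks.
    pub fn locate(&self, token_pos: usize) -> Option<(u32, usize)> {
        let logical = token_pos / self.block_size;
        let offset = token_pos % self.block_size;
        self.physical.get(logical).map(|&id| (id, offset))
    }
}

/// Number of blocks required to hold `num_tokens` tokens (rounded up).
pub fn blocks_needed(num_tokens: usize, block_size: usize) -> usize {
    num_tokens.div_ceil(block_size)
}
```

With a block size of 16 and physical blocks 7 and 3 pushed in that order, token 17 resolves to block 3 at offset 1, and a 33-token sequence needs three blocks.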
Re-exports§
Modules§
Structs§
- AllocationRequest - KV cache allocation request
- BlockTable - Block table for mapping logical to physical cache blocks
- CacheConfig - Cache configuration
- CacheHandleStats - Statistics for individual cache handle
- CacheManagerStats - Cache manager statistics
- CacheStats - Cache statistics
- KvManagerConfig - Internal KV Cache manager configuration
- LruEvictionPolicy - Least Recently Used eviction policy
- PrefixCacheConfig - Prefix caching configuration
- RequestId - Request identifier
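For readers unfamiliar with least-recently-used eviction, the idea can be sketched as follows. This is a generic illustration under stated assumptions, not the crate's `LruEvictionPolicy`: `SimpleLru`, `touch`, and `evict` are hypothetical names, and a logical clock stands in for whatever recency tracking the real policy uses.

```rust
use std::collections::HashMap;

/// Hypothetical LRU tracker; NOT the crate's `LruEvictionPolicy`.
/// Each block id maps to the logical time of its most recent use.
pub struct SimpleLru {
    clock: u64,
    last_used: HashMap<u32, u64>,
}

impl SimpleLru {
    /// Create an empty tracker.
    pub fn new() -> Self {
        Self { clock: 0, last_used: HashMap::new() }
    }

    /// Record that `block` was just used (inserting it if unseen).
    pub fn touch(&mut self, block: u32) {
        self.clock += 1;
        self.last_used.insert(block, self.clock);
    }

    /// Remove and return the block with the oldest last-use time, if any.
    pub fn evict(&mut self) -> Option<u32> {
        let victim = self
            .last_used
            .iter()
            .min_by_key(|entry| *entry.1) // smallest tick = least recently used
            .map(|(block, _)| *block)?;
        self.last_used.remove(&victim);
        Some(victim)
    }
}
```

The linear scan in `evict` keeps the sketch short; a production policy would more likely pair the map with an ordered structure so that eviction does not cost O(n).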
Enums§
- DataType - Data type for tensors
- Device - Device type for computation
- FerrumError - Main error type for Ferrum operations
Traits§
- CacheEvictionPolicy - Cache eviction strategies
- KvCacheHandleInterface - KV cache handle providing access to cached key-value states
- KvCacheManagerInterface - KV cache manager for allocation and lifecycle management
Functions§
- default_manager - Default KV cache manager factory
Type Aliases§
- Result - Result type used throughout Ferrum