Mock LLM scheduler and KV manager for testing.
This crate provides a mock implementation of an LLM scheduler that simulates KV cache management, request scheduling, and token generation timing without requiring actual GPU resources or a full distributed runtime.
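To make the idea concrete, here is a minimal sketch of what simulating KV block accounting, request admission, and token timing without a GPU could look like. Every name in it (`MockKvManager`, `MockScheduler`, `Request`, `step`, `ms_per_step`) is hypothetical and chosen for illustration; none of them is taken from this crate's API. The sketch also assumes a simple policy that may differ from the crate's: FIFO admission, blocks reserved up front for the prompt plus the maximum output length, and a fixed simulated latency per decode step.

```rust
use std::collections::VecDeque;

/// Hypothetical fixed-capacity KV block pool (not this crate's actual type).
struct MockKvManager {
    block_size: usize,  // tokens per KV block
    free_blocks: usize, // blocks currently unallocated
}

impl MockKvManager {
    fn new(total_blocks: usize, block_size: usize) -> Self {
        Self { block_size, free_blocks: total_blocks }
    }

    /// Blocks needed to hold `tokens` tokens (ceiling division).
    fn blocks_for(&self, tokens: usize) -> usize {
        (tokens + self.block_size - 1) / self.block_size
    }

    /// Reserve `n` blocks; returns false if the pool cannot satisfy it.
    fn try_allocate(&mut self, n: usize) -> bool {
        if n > self.free_blocks {
            return false;
        }
        self.free_blocks -= n;
        true
    }

    fn release(&mut self, n: usize) {
        self.free_blocks += n;
    }
}

/// Hypothetical request description.
struct Request {
    id: u32,
    prompt_tokens: usize,
    max_new_tokens: usize,
}

/// A request that has been admitted and holds KV blocks.
struct Running {
    id: u32,
    blocks: usize,
    remaining: usize, // tokens still to generate
}

/// Hypothetical scheduler that advances a simulated clock instead of a GPU.
struct MockScheduler {
    kv: MockKvManager,
    waiting: VecDeque<Request>,
    running: Vec<Running>,
    clock_ms: f64,    // simulated elapsed time
    ms_per_step: f64, // assumed fixed cost of one decode step
}

impl MockScheduler {
    fn new(kv: MockKvManager, ms_per_step: f64) -> Self {
        Self { kv, waiting: VecDeque::new(), running: Vec::new(), clock_ms: 0.0, ms_per_step }
    }

    fn submit(&mut self, req: Request) {
        self.waiting.push_back(req);
    }

    /// One simulated decode step: admit waiting requests in FIFO order while
    /// blocks are available, then generate one token per running request.
    /// Returns the ids of requests that finished during this step.
    fn step(&mut self) -> Vec<u32> {
        loop {
            let need = match self.waiting.front() {
                Some(r) => self.kv.blocks_for(r.prompt_tokens + r.max_new_tokens),
                None => break,
            };
            if !self.kv.try_allocate(need) {
                break; // head of the queue does not fit; wait for blocks to free up
            }
            let r = self.waiting.pop_front().unwrap();
            self.running.push(Running { id: r.id, blocks: need, remaining: r.max_new_tokens });
        }

        self.clock_ms += self.ms_per_step;

        let mut finished = Vec::new();
        let kv = &mut self.kv;
        self.running.retain_mut(|r| {
            r.remaining = r.remaining.saturating_sub(1);
            if r.remaining == 0 {
                kv.release(r.blocks); // return blocks to the pool on completion
                finished.push(r.id);
                false
            } else {
                true
            }
        });
        finished
    }
}
```

Driving such a scheduler amounts to calling `submit` for each request and then calling `step` in a loop until both queues are empty, reading `clock_ms` afterwards to get the simulated total latency.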
Modules
- cache
- Cache data structures for KV block management.
- common
- Shared components used across all engine implementations.
- kv_manager
- Pluggable KV cache block managers.
- scheduler
- Engine-specific scheduling implementations.