Crate ferrum_testkit

Mock components for testing the ferrum inference engine without a GPU.

Provides MockTensor, MockModelExecutor, MockKvCacheManager, MockTokenizer, and MockSampler. All are hardware-independent, so the scheduling and orchestration layers can be tested on any platform.

Re-exports§

pub use paged_executor::PagedAttentionExecutor;
pub use paged_executor::PagedExecutorConfig;

Modules§

paged_executor: Model executor that uses PagedAttention KV cache.

Structs§

MockKvCacheHandle: Mock KV cache handle — tracks block metadata without allocating real memory.
MockKvCacheManager: Mock KV cache manager — tracks allocations in memory, simulates block limits.
MockModelExecutor: Mock model executor that simulates prefill/decode with configurable latency. No model weights, no GPU — pure async simulation.
MockSampler: Greedy sampler — always picks the token with highest logit. Deterministic, no temperature or top-k.
MockTensor: A mock tensor that stores shape and optional f32 data. No GPU, no Candle — pure Rust.
MockTensorFactory: Mock tensor factory implementing TensorFactory without any ML backend.
MockTokenizer: Mock tokenizer: splits on whitespace, assigns sequential token IDs. EOS token is vocab_size - 1.
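
The behaviour described for MockSampler and MockTokenizer can be sketched in standalone Rust. This is only an illustration of the two rules stated above (greedy argmax over logits; whitespace splitting with sequential IDs and EOS at vocab_size - 1), not the crate's implementation. The names greedy_pick, WhitespaceTokenizer, encode, and eos_token_id are hypothetical, and the tie-breaking rule, NaN handling, and ID wrap-around are assumptions that the summaries above do not specify.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a greedy sampler: returns the index of the
/// highest logit. Assumptions: ties go to the lowest index, NaN values are
/// skipped, and an empty slice yields None.
fn greedy_pick(logits: &[f32]) -> Option<u32> {
    let mut best: Option<(usize, f32)> = None;
    for (i, &v) in logits.iter().enumerate() {
        if v.is_nan() {
            continue;
        }
        match best {
            // Keep the earlier index unless the new value is strictly larger.
            Some((_, b)) if v <= b => {}
            _ => best = Some((i, v)),
        }
    }
    best.map(|(i, _)| i as u32)
}

/// Hypothetical whitespace tokenizer: each new word gets the next sequential
/// ID, and the last ID in the vocabulary is reserved for EOS.
struct WhitespaceTokenizer {
    vocab_size: u32,
    ids: HashMap<String, u32>,
}

impl WhitespaceTokenizer {
    fn new(vocab_size: u32) -> Self {
        assert!(vocab_size >= 2, "need at least one word ID plus EOS");
        Self { vocab_size, ids: HashMap::new() }
    }

    /// EOS is vocab_size - 1, as stated in the MockTokenizer summary.
    fn eos_token_id(&self) -> u32 {
        self.vocab_size - 1
    }

    /// Splits on whitespace and maps each word to its ID.
    /// Assumption: once the vocabulary is exhausted, IDs wrap around so that
    /// the EOS ID is never handed out to a word.
    fn encode(&mut self, text: &str) -> Vec<u32> {
        let eos = self.eos_token_id();
        let mut out = Vec::new();
        for word in text.split_whitespace() {
            let next = self.ids.len() as u32 % eos;
            let id = *self.ids.entry(word.to_string()).or_insert(next);
            out.push(id);
        }
        out
    }
}
```

With a vocabulary of 100, encoding "the cat saw the dog" in this sketch gives the IDs 0, 1, 2, 0, 3, and the EOS ID is 99. Because both pieces are deterministic, a test of the scheduling layer built on the same rules can assert on exact token sequences rather than on distributions.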
