Crate ferrum_testkit

Testing utilities for the ferrum inference engine.

  • Mocks: MockModelExecutor, MockSampler, MockKvCacheManager, MockTokenizer
  • Configurable: ConfigurableModelExecutor (specific token sequences, EOS)
  • Bench: BenchmarkResult, percentile calculation, JSON output
  • Paged: PagedAttentionExecutor with real paged KV cache

All components are hardware-independent (CPU-only, no GPU required).

Re-exports

pub use paged_executor::PagedAttentionExecutor;
pub use paged_executor::PagedExecutorConfig;

Modules

bench
Benchmark result types and utilities.
paged_executor
Model executor that uses PagedAttention KV cache.
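The bench module is described as covering percentile calculation. The sketch below shows one common way to compute a percentile over latency samples, using the nearest-rank method. The function name `percentile` and the choice of nearest-rank are assumptions made for illustration; the crate's own implementation may interpolate or use a different signature.

```rust
/// Nearest-rank percentile over latency samples.
/// Hypothetical helper for illustration, not the crate's API.
/// Returns None for an empty slice or when `p` is outside 0..=100.
fn percentile(samples: &[f64], p: f64) -> Option<f64> {
    if samples.is_empty() || !(0.0..=100.0).contains(&p) {
        return None;
    }
    // Sort a copy so the caller's data is left untouched.
    let mut sorted = samples.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    // Nearest-rank: rank = ceil(p / 100 * n), clamped to at least 1.
    let rank = ((p / 100.0) * sorted.len() as f64).ceil() as usize;
    Some(sorted[rank.max(1) - 1])
}
```

With five samples, the 50th percentile under this method is the third-smallest value, and the 100th percentile is the maximum.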

Structs

ConfigurableModelExecutor
Model executor that produces a configurable sequence of tokens.
MockKvCacheHandle
Mock KV cache handle — tracks block metadata without allocating real memory.
MockKvCacheManager
Mock KV cache manager — tracks allocations in memory, simulates block limits.
MockModelExecutor
Mock model executor that simulates prefill/decode with configurable latency. No model weights, no GPU — pure async simulation.
MockSampler
Greedy sampler: always picks the token with the highest logit. Deterministic; no temperature or top-k.
MockTensor
A mock tensor that stores shape and optional f32 data. No GPU, no Candle — pure Rust.
MockTensorFactory
Mock tensor factory implementing TensorFactory without any ML backend.
MockTokenizer
Mock tokenizer: splits on whitespace and assigns sequential token IDs. The EOS token ID is vocab_size - 1.
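To make the MockTokenizer and MockSampler descriptions concrete, here is a minimal standalone sketch of the two behaviors: whitespace splitting with sequential IDs and a reserved EOS ID, and greedy argmax over logits. `SketchTokenizer`, `encode`, `eos_token_id`, and `greedy_pick` are hypothetical names, not the crate's types or methods. Assigning IDs in order of first appearance, and wrapping them to stay below the EOS ID, are assumptions; the source does not specify either.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a whitespace tokenizer; not the crate's MockTokenizer.
struct SketchTokenizer {
    vocab_size: u32,
    ids: HashMap<String, u32>,
}

impl SketchTokenizer {
    fn new(vocab_size: u32) -> Self {
        assert!(vocab_size >= 2, "need room for at least one word and EOS");
        Self { vocab_size, ids: HashMap::new() }
    }

    /// The last ID in the vocabulary is reserved for EOS.
    fn eos_token_id(&self) -> u32 {
        self.vocab_size - 1
    }

    /// Splits on whitespace; each new word gets the next sequential ID.
    /// Assumption: IDs wrap modulo the EOS ID so no word collides with EOS.
    fn encode(&mut self, text: &str) -> Vec<u32> {
        let eos = self.eos_token_id();
        text.split_whitespace()
            .map(|word| {
                let next = self.ids.len() as u32 % eos;
                *self.ids.entry(word.to_string()).or_insert(next)
            })
            .collect()
    }
}

/// Greedy sampling: index of the largest logit, or None for empty input.
/// With `max_by`, ties resolve to the last maximal index.
fn greedy_pick(logits: &[f32]) -> Option<u32> {
    logits
        .iter()
        .enumerate()
        .max_by(|a, b| a.1.total_cmp(b.1))
        .map(|(index, _)| index as u32)
}
```

Because neither piece involves randomness, repeated runs over the same input give the same token IDs, which is the property that makes mocks like these useful in deterministic tests.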