abyo-speculate 0.5.0

Pure Rust Speculative Decoding library for local LLMs — vanilla SD + Medusa, Qwen2 + Llama, batch-1 optimised
//! KV-cache primitives with rollback support.
//!
//! Speculative decoding requires the target model to *evaluate* `k` draft tokens
//! in one pass, then *roll back* its KV cache to the position of the first
//! rejected token. candle's stock KV-cache implementation
//! (`candle_nn::kv_cache`) supports append-and-truncate; this module wraps the
//! SD-specific snapshot/restore lifecycle so callers don't have to manage it.

pub mod rollback;

pub use rollback::{KvSnapshot, RollbackCache};
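
The snapshot/restore lifecycle described in the module docs can be illustrated with a small standalone sketch. This is not the crate's implementation: `ToyCache`, `ToySnapshot`, and `demo_rollback` are hypothetical names invented for illustration, the cache stores plain `f32` values rather than tensors, and the real `KvSnapshot` and `RollbackCache` types may look quite different.

```rust
/// Hypothetical snapshot: records only the committed sequence length.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ToySnapshot {
    len: usize,
}

/// Hypothetical toy cache: one key and one value (plain f32) per position.
#[derive(Debug, Default)]
pub struct ToyCache {
    keys: Vec<f32>,
    values: Vec<f32>,
}

impl ToyCache {
    /// Number of cached positions.
    pub fn len(&self) -> usize {
        self.keys.len()
    }

    /// True when no positions are cached.
    pub fn is_empty(&self) -> bool {
        self.keys.is_empty()
    }

    /// Append the key/value entry for one new position.
    pub fn append(&mut self, key: f32, value: f32) {
        self.keys.push(key);
        self.values.push(value);
    }

    /// Record the current length before speculative entries are appended.
    pub fn snapshot(&self) -> ToySnapshot {
        ToySnapshot { len: self.len() }
    }

    /// Keep the first `accepted` entries appended after `snap`; drop the rest.
    pub fn rollback(&mut self, snap: ToySnapshot, accepted: usize) {
        // Clamp so that an over-large `accepted` never extends the cache.
        let keep = (snap.len + accepted).min(self.len());
        self.keys.truncate(keep);
        self.values.truncate(keep);
    }
}

/// Run one toy verification step and return the cache length afterwards.
pub fn demo_rollback() -> usize {
    let mut cache = ToyCache::default();

    // Committed prefix of 3 positions.
    for i in 0..3 {
        cache.append(i as f32, i as f32);
    }
    let snap = cache.snapshot();

    // The target model evaluates k = 4 draft tokens, appending 4 entries.
    for i in 0..4 {
        cache.append(10.0 + i as f32, 10.0 + i as f32);
    }

    // Assume the first rejection is at draft index 2, so 2 drafts are accepted.
    cache.rollback(snap, 2);
    cache.len()
}
```

Calling `demo_rollback()` leaves the 3 committed positions plus the 2 accepted draft positions in the cache. Storing only a length in the snapshot works here because entries are only ever appended; a cache that overwrites entries in place (for example a rotating window) would presumably need to save more state.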