
Module sequence_pool


SSM runtime bridge — polymorphic sequence-state pool.

§Overview

OxiLLaMa supports two categories of model architectures:

  1. Attention-based (LLaMA, Qwen3, Mistral, Gemma, Phi, …): per-sequence state is the KV cache (a contiguous K/V buffer per layer).
  2. SSM-based (Mamba-2, …): per-sequence state is a set of per-layer recurrent hidden vectors; there is no KV cache.

The SequencePool enum abstracts over both kinds via the oxillama_arch::common::sequence_state::SequenceState trait. The engine picks the right pool variant at load time by examining the loaded architecture; both variants expose the same alloc / release / slot interface so the rest of the engine stays arch-agnostic.
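The dispatch-enum idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the crate's actual code: `Pool`, `ArchKind`, `SketchError`, `Slots`, `KvBackend`, `SsmBackend`, and `for_arch` are hypothetical names, and both toy backends share trivial slot bookkeeping so that only the dispatch pattern is on display.

```rust
/// Hypothetical error type for this sketch (not the crate's `PoolError`).
#[derive(Debug, PartialEq, Eq)]
enum SketchError {
    Exhausted,
    InvalidSlot(usize),
}

/// Hypothetical architecture tag examined at load time.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ArchKind {
    Attention,
    Ssm,
}

/// Slot bookkeeping shared by both toy backends.
struct Slots {
    live: Vec<bool>,
}

impl Slots {
    fn new(capacity: usize) -> Self {
        Self { live: vec![false; capacity] }
    }

    /// Claims the lowest-numbered free slot.
    fn alloc(&mut self) -> Result<usize, SketchError> {
        let id = self
            .live
            .iter()
            .position(|l| !*l)
            .ok_or(SketchError::Exhausted)?;
        self.live[id] = true;
        Ok(id)
    }

    /// Frees a slot; fails if the index is out of range or not live.
    fn release(&mut self, id: usize) -> Result<(), SketchError> {
        if self.live.get(id).copied() != Some(true) {
            return Err(SketchError::InvalidSlot(id));
        }
        self.live[id] = false;
        Ok(())
    }
}

struct KvBackend(Slots);
struct SsmBackend(Slots);

/// One enum, two backends, one interface for the rest of the engine.
enum Pool {
    Kv(KvBackend),
    Ssm(SsmBackend),
}

impl Pool {
    /// Picks the variant from the architecture kind.
    fn for_arch(arch: ArchKind, capacity: usize) -> Self {
        match arch {
            ArchKind::Attention => Pool::Kv(KvBackend(Slots::new(capacity))),
            ArchKind::Ssm => Pool::Ssm(SsmBackend(Slots::new(capacity))),
        }
    }

    fn alloc(&mut self) -> Result<usize, SketchError> {
        match self {
            Pool::Kv(b) => b.0.alloc(),
            Pool::Ssm(b) => b.0.alloc(),
        }
    }

    fn release(&mut self, id: usize) -> Result<(), SketchError> {
        match self {
            Pool::Kv(b) => b.0.release(id),
            Pool::Ssm(b) => b.0.release(id),
        }
    }

    fn is_ssm(&self) -> bool {
        matches!(self, Pool::Ssm(_))
    }
}
```

Callers only ever see `alloc` and `release` on the enum, so code above this layer does not need to branch on the architecture.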

§Design notes

  • Slots are identified by a usize index (same as Sequence::slot_id).
  • A slot is “live” when it holds a Box<dyn SequenceState>.
  • On release the state is reset (zeroed) and returned to the free pool.
  • Neither variant interacts with the KV cache from kv_cache/mod.rs; the KV-based pool manages its own separate per-slot state.
  • The SSM state pool owns the Box<dyn SequenceState> objects outright; the KV-based pool keeps a KvCachePool from which page indices are lent.
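The slot lifecycle in these notes (a slot is live while it holds a boxed state; release resets the state and returns it to the free pool) might look roughly like the sketch below. Everything here is an assumption made for illustration: `State` stands in for the crate's `SequenceState` trait, whose real methods are not shown on this page, and `Recurrent`, `FreeListPool`, and their methods are hypothetical.

```rust
/// Hypothetical stand-in for the crate's `SequenceState` trait.
trait State {
    fn reset(&mut self);
    fn is_zeroed(&self) -> bool;
    fn step(&mut self, x: f32);
}

/// Toy recurrent state: one hidden vector.
struct Recurrent {
    hidden: Vec<f32>,
}

impl State for Recurrent {
    fn reset(&mut self) {
        self.hidden.iter_mut().for_each(|h| *h = 0.0);
    }

    fn is_zeroed(&self) -> bool {
        self.hidden.iter().all(|h| *h == 0.0)
    }

    fn step(&mut self, x: f32) {
        for h in &mut self.hidden {
            *h += x;
        }
    }
}

/// Pool that owns its boxed states outright.
struct FreeListPool {
    /// `Some` means the slot is live.
    slots: Vec<Option<Box<dyn State>>>,
    /// Zeroed states waiting to be handed out again.
    spare: Vec<Box<dyn State>>,
}

impl FreeListPool {
    fn new(capacity: usize, hidden_dim: usize) -> Self {
        Self {
            slots: (0..capacity).map(|_| None).collect(),
            spare: (0..capacity)
                .map(|_| Box::new(Recurrent { hidden: vec![0.0; hidden_dim] }) as Box<dyn State>)
                .collect(),
        }
    }

    /// Moves a spare state into the first empty slot and returns its index.
    fn alloc(&mut self) -> Option<usize> {
        let id = self.slots.iter().position(|s| s.is_none())?;
        let state = self.spare.pop()?;
        self.slots[id] = Some(state);
        Some(id)
    }

    /// Mutable access to a live slot's state.
    fn slot(&mut self, id: usize) -> Option<&mut Box<dyn State>> {
        self.slots.get_mut(id)?.as_mut()
    }

    /// Resets the state and returns it to the spare list.
    /// Returns `false` if the slot was not live.
    fn release(&mut self, id: usize) -> bool {
        match self.slots.get_mut(id).and_then(|s| s.take()) {
            Some(mut state) => {
                state.reset();
                self.spare.push(state);
                true
            }
            None => false,
        }
    }
}
```

Resetting on release rather than on allocation means a reused slot never exposes the previous sequence's hidden state, at the cost of doing the zeroing even if the slot is never used again.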

§Thread safety

SequencePool is not Send + Sync by itself; it is intended to be owned by a single-threaded engine or wrapped in a Mutex by the caller.
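As a rough illustration of the "wrapped in a Mutex by the caller" option, the sketch below shares a toy pool between threads through `Arc<Mutex<_>>`. `ToyPool` and `alloc_from_threads` are hypothetical names. Note one assumption in particular: `std::sync::Mutex<T>` can only be shared across threads when `T` is `Send`, which holds for this toy type; whether the real pool's contents satisfy that is not stated on this page.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical pool with no internal synchronization.
struct ToyPool {
    live: Vec<bool>,
}

impl ToyPool {
    fn new(capacity: usize) -> Self {
        Self { live: vec![false; capacity] }
    }

    /// Claims the lowest-numbered free slot, if any.
    fn alloc(&mut self) -> Option<usize> {
        let id = self.live.iter().position(|l| !*l)?;
        self.live[id] = true;
        Some(id)
    }
}

/// Spawns `workers` threads that each try to allocate one slot,
/// and returns the sorted list of slot ids that were handed out.
fn alloc_from_threads(capacity: usize, workers: usize) -> Vec<usize> {
    // The caller, not the pool, supplies the lock.
    let pool = Arc::new(Mutex::new(ToyPool::new(capacity)));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let pool = Arc::clone(&pool);
            // Each allocation happens while holding the mutex.
            thread::spawn(move || pool.lock().unwrap().alloc())
        })
        .collect();

    let mut ids: Vec<usize> = handles
        .into_iter()
        .filter_map(|h| h.join().unwrap())
        .collect();
    ids.sort_unstable();
    ids
}
```

Because every `alloc` runs under the lock, no two threads can receive the same slot index, and requests beyond capacity simply get `None`.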

Structs§

SequenceSlot
A live sequence slot in the SsmStatePool.
SsmStatePool
A free-list pool of SequenceSlots for SSM-based models.

Enums§

PoolError
Errors produced by pool operations.
SequencePool
Dispatch-enum over the two pool backends.

Type Aliases§

PoolResult
Convenience alias.