Skip to main content

stack_block_tables

Function stack_block_tables 

Source
pub fn stack_block_tables<F: Fn(&str) -> Vec<u32>>(
    items: &[(String, Vec<u32>, usize, bool)],
    max_blocks_per_seq: usize,
    lookup: F,
) -> Vec<u32>
Expand description

Pack per-(seq, layer-0) page indices into the dense [num_seqs, max_blocks_per_seq] layout that the varlen attention kernel reads. Layer indexing is “first layer’s block table” because in ferrum’s paged-KV layout every layer shares the same block-table list (the layer-specific data lives inside each KV pool; the table itself is per-sequence).

lookup returns the block-indices slice for each item’s cache_id; the caller wires this to its model’s kv_caches.get(cid).