pub struct RopeMultiBufferPack {
    pub params_buf: MlxBuffer,
    pub rope_params_buf: MlxBuffer,
    pub sections_buf: MlxBuffer,
}
Pre-built triple of small parameter buffers for a rope_multi dispatch.
Held in the per-thread [ROPE_PACK_CACHE] so that callers issuing
repeated dispatches with a stable shape skip the per-call allocation
of all three buffers (~208 µs/token measured on M5 Max in
cfa-20260426-adr015-wave2a-p3aprime). The main case is the
qwen35 / qwen36 decode hot path, where head_dim, rope_dim, n_heads,
seq_len=1, freq_base, mode, and sections are identical on every step.
Outside the decode path, where seq_len varies, the cache gains one
entry per distinct seq_len value seen and reuses that entry whenever
the value recurs.
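The caching pattern described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the crate's actual implementation: `FakeBuffer` stands in for `MlxBuffer`, and `Pack`, `PackKey`, `PACK_CACHE`, `get_or_build`, `build_count`, and the byte layout of each buffer are all hypothetical names and choices made for the example. The key holds every value the description lists as stable across decode steps, so a lookup hits only when all of them match.

```rust
use std::cell::{Cell, RefCell};
use std::collections::HashMap;
use std::rc::Rc;

/// Hypothetical stand-in for `MlxBuffer`: it simply owns some bytes.
#[derive(Debug)]
pub struct FakeBuffer(pub Vec<u8>);

/// Mirrors the three-field shape of `RopeMultiBufferPack`.
#[derive(Debug)]
pub struct Pack {
    pub params_buf: FakeBuffer,
    pub rope_params_buf: FakeBuffer,
    pub sections_buf: FakeBuffer,
}

/// Every value that must match for a cached pack to be reusable.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct PackKey {
    pub head_dim: u32,
    pub rope_dim: u32,
    pub n_heads: u32,
    pub seq_len: u32,
    /// `f32::to_bits` of freq_base, since `f32` is neither `Eq` nor `Hash`.
    pub freq_base_bits: u32,
    pub mode: u32,
    pub sections: Vec<u32>,
}

thread_local! {
    // One cache per thread, so no locking is needed on the hot path.
    static PACK_CACHE: RefCell<HashMap<PackKey, Rc<Pack>>> =
        RefCell::new(HashMap::new());
    // Counts how many times the three buffers were actually allocated.
    static BUILD_COUNT: Cell<usize> = Cell::new(0);
}

/// Serialize u32 words as little-endian bytes (assumed layout).
fn le_bytes(words: &[u32]) -> Vec<u8> {
    words.iter().flat_map(|w| w.to_le_bytes()).collect()
}

/// The slow path: allocate all three buffers for this key.
fn build_pack(key: &PackKey) -> Pack {
    BUILD_COUNT.with(|c| c.set(c.get() + 1));
    Pack {
        params_buf: FakeBuffer(le_bytes(&[
            key.head_dim,
            key.rope_dim,
            key.n_heads,
            key.seq_len,
        ])),
        rope_params_buf: FakeBuffer(le_bytes(&[key.freq_base_bits, key.mode])),
        sections_buf: FakeBuffer(le_bytes(&key.sections)),
    }
}

/// Return the cached pack for `key`, building it on first use.
pub fn get_or_build(key: &PackKey) -> Rc<Pack> {
    PACK_CACHE.with(|cache| {
        let mut cache = cache.borrow_mut();
        if let Some(pack) = cache.get(key) {
            return Rc::clone(pack);
        }
        let pack = Rc::new(build_pack(key));
        cache.insert(key.clone(), Rc::clone(&pack));
        pack
    })
}

/// Number of slow-path builds performed on the current thread.
pub fn build_count() -> usize {
    BUILD_COUNT.with(|c| c.get())
}
```

In this sketch a decode loop that calls `get_or_build` with the same key on every step allocates once and then only clones an `Rc`. A call with a different seq_len adds a second entry instead of evicting the first.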
Fields§
params_buf: MlxBuffer
rope_params_buf: MlxBuffer
sections_buf: MlxBuffer
Auto Trait Implementations§
impl Freeze for RopeMultiBufferPack
impl RefUnwindSafe for RopeMultiBufferPack
impl Send for RopeMultiBufferPack
impl Sync for RopeMultiBufferPack
impl Unpin for RopeMultiBufferPack
impl UnsafeUnpin for RopeMultiBufferPack
impl UnwindSafe for RopeMultiBufferPack
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.