Struct MlxBufferPool

pub struct MlxBufferPool { /* private fields */ }

Arena-style buffer pool that reuses Metal buffer allocations.

§Design

Buffers are bucketed by their allocated size, rounded up to the next power of two. This reduces fragmentation at the cost of occasionally over-allocating by up to 2x.
release() returns a single buffer to the pool; reset() returns all outstanding buffers handed out since the last reset.
The MlxDevice is passed in at every alloc() call (rather than stored in the pool). This keeps the pool free of lifetime parameters so it can be embedded in any owner struct (e.g. the per-decode-token DecodeBuffers cache in hf2q’s qwen35 forward path).
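
The power-of-two bucketing described above can be illustrated with a small standalone sketch. The helper name bucket_size is hypothetical and is not part of this crate; the sketch only assumes that the bucket is the next power of two at or above the requested byte length, and that a zero-length request is treated as one byte.

```rust
/// Hypothetical helper (not part of MlxBufferPool's public API):
/// round a requested byte length up to its power-of-two bucket.
fn bucket_size(byte_len: usize) -> usize {
    // Assumption: a zero-byte request is clamped to 1 so it still maps to a bucket.
    byte_len.max(1).next_power_of_two()
}

fn main() {
    for req in [1usize, 4096, 4097, 6000, 8192] {
        let bucket = bucket_size(req);
        // The ratio shows the over-allocation factor, which stays below 2x.
        println!(
            "request {:>5} B -> bucket {:>5} B ({:.2}x)",
            req,
            bucket,
            bucket as f64 / req as f64
        );
    }
}
```

A request that is already a power of two lands in its own bucket with no waste, while a request one byte above a power of two is the worst case and wastes just under half of its bucket.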

§Why an arena reset matters

In the per-decode-token hot path, each token allocates ~1750 Metal buffers for scratch / intermediate / parameter storage across attention, FFN, and linear-attention layers. Direct MlxDevice::alloc_buffer() calls hit Metal’s allocator each time (5-30 µs each); pooling reuses the underlying metal::Buffer objects across token boundaries so steady-state allocation cost amortizes to near zero. See ADR-012 §Optimize / Task #15 for the MoE dwq46 0.90× parity gap that motivated this work.
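
The amortization argument can be made concrete with a toy model. Everything in the sketch below (ToyDevice, ToyPool, simulate) is a hypothetical stand-in written for illustration, not the crate's implementation: buffers are modeled as reference-counted byte vectors, and the device merely counts how often the "real" allocator is hit. As in MlxBufferPool, the device is passed to each alloc call rather than stored in the pool.

```rust
use std::cell::Cell;
use std::collections::HashMap;
use std::rc::Rc;

/// Stand-in for a GPU device; counts calls to the underlying allocator.
struct ToyDevice {
    raw_allocs: Cell<usize>,
}

impl ToyDevice {
    fn alloc_raw(&self, size: usize) -> Rc<Vec<u8>> {
        self.raw_allocs.set(self.raw_allocs.get() + 1);
        Rc::new(vec![0u8; size])
    }
}

/// Toy arena pool: free lists keyed by power-of-two bucket size,
/// plus a list of every buffer handed out since the last reset.
#[derive(Default)]
struct ToyPool {
    free: HashMap<usize, Vec<Rc<Vec<u8>>>>,
    in_use: Vec<Rc<Vec<u8>>>,
}

impl ToyPool {
    /// The device is borrowed per call, so the pool needs no lifetime parameter.
    fn alloc(&mut self, device: &ToyDevice, byte_len: usize) -> Rc<Vec<u8>> {
        let bucket = byte_len.max(1).next_power_of_two();
        // Reuse a free buffer from the bucket if there is one, else hit the device.
        let buf = self
            .free
            .get_mut(&bucket)
            .and_then(|list| list.pop())
            .unwrap_or_else(|| device.alloc_raw(bucket));
        self.in_use.push(Rc::clone(&buf));
        buf
    }

    /// Return every outstanding buffer to its free list in one step.
    fn reset(&mut self) {
        for buf in self.in_use.drain(..) {
            self.free.entry(buf.len()).or_default().push(buf);
        }
    }
}

/// Run `tokens` decode steps, each making `allocs_per_token` scratch allocations,
/// and report how many times the device allocator was actually called.
fn simulate(tokens: usize, allocs_per_token: usize) -> usize {
    let device = ToyDevice { raw_allocs: Cell::new(0) };
    let mut pool = ToyPool::default();
    for _ in 0..tokens {
        for i in 0..allocs_per_token {
            let _scratch = pool.alloc(&device, 1024 + i);
        }
        pool.reset(); // token boundary
    }
    device.raw_allocs.get()
}

fn main() {
    println!("raw allocations over 8 tokens: {}", simulate(8, 100));
}
```

In this model only the first token pays for device allocations; every later token is served entirely from the free lists, provided each token requests the same set of sizes.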

Implementations§

impl MlxBufferPool

pub fn new() -> Self

pub fn alloc( &mut self, device: &MlxDevice, byte_len: usize, dtype: DType, shape: Vec<usize>, ) -> Result<MlxBuffer>

pub fn alloc_batch<I>( &mut self, device: &MlxDevice, requests: I, ) -> Result<Vec<MlxBuffer>> where I: IntoIterator<Item = (usize, DType, Vec<usize>)>,

pub fn release(&mut self, buffer: MlxBuffer)

pub fn reset(&mut self)

§Caller contract

pub fn register_existing( &mut self, device: &MlxDevice, buffer: &MlxBuffer, ) -> Result<()>

§Why this exists

§Ownership semantics

§HF2Q_NO_RESIDENCY=1 escape hatch

§Idempotence

§Errors

pub fn free_count(&self) -> usize

pub fn free_bytes(&self) -> usize

pub fn in_use_count(&self) -> usize

pub fn clear(&mut self)

Trait Implementations§

impl Default for MlxBufferPool

fn default() -> Self

impl Drop for MlxBufferPool

fn drop(&mut self)

fn pin_drop(self: Pin<&mut Self>)

Auto Trait Implementations§

impl Freeze for MlxBufferPool

impl RefUnwindSafe for MlxBufferPool

impl Send for MlxBufferPool

impl Sync for MlxBufferPool

impl Unpin for MlxBufferPool

impl UnsafeUnpin for MlxBufferPool

impl UnwindSafe for MlxBufferPool

Blanket Implementations§

impl<T> Any for T where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

impl<T> Borrow<T> for T where T: ?Sized,

fn borrow(&self) -> &T

impl<T> BorrowMut<T> for T where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

impl<T> From<T> for T

fn from(t: T) -> T

impl<T, U> Into<U> for T where U: From<T>,

fn into(self) -> U

impl<T, U> TryFrom<U> for T where U: Into<T>,

type Error = Infallible

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

impl<T, U> TryInto<U> for T where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>
