pub struct MlxBuffer { /* private fields */ }Expand description
A Metal GPU buffer annotated with element dtype and tensor shape.
On Apple Silicon the underlying memory is unified — contents_ptr() gives
direct CPU access without any copy or transfer.
§Thread Safety
MlxBuffer is Send + Sync because the inner metal::Buffer is.
§Residency-set lifecycle
Buffers produced by MlxDevice::alloc_buffer
on a residency-enabled device carry a shared
Arc<MlxBufferStorage> that owns the residency-set
reference and runs removeAllocation: (deferred — flushed at the next
CommandEncoder::commit* boundary) when the last clone is dropped.
Mirrors llama.cpp’s ggml-metal-device.m:1378-1382 pattern: batch
addAllocation: calls in a loop, commit ONCE.
Implementations§
Source§impl MlxBuffer
impl MlxBuffer
Sourcepub fn from_raw(inner: MetalBuffer, dtype: DType, shape: Vec<usize>) -> Self
pub fn from_raw(inner: MetalBuffer, dtype: DType, shape: Vec<usize>) -> Self
Create a new MlxBuffer wrapping an already-allocated Metal buffer.
§When to use
Use this to wrap Metal buffers obtained from external frameworks (e.g.
candle’s MetalStorage::buffer()) for zero-copy interop on Apple
Silicon unified memory. Both frameworks see the same physical memory.
§Safety contract
The caller must ensure that inner remains valid for the lifetime of
the returned MlxBuffer. If the buffer was obtained from another
framework, the caller must ensure that framework does not deallocate
the buffer while this MlxBuffer exists.
The returned buffer carries no residency-set guard — pool / external
callers that want residency tracking should go through
MlxDevice::alloc_buffer or
MlxBufferPool::register_existing.
Sourcepub fn slice_view(&self, byte_offset: u64, n_elements: usize) -> Self
pub fn slice_view(&self, byte_offset: u64, n_elements: usize) -> Self
Create a zero-copy slice view of this buffer.
Returns a new MlxBuffer that shares the same underlying Metal buffer
but starts at byte_offset bytes from the beginning and contains
n_elements elements of type dtype. No data is copied.
The slice view shares the parent’s residency-set guard via the
Arc<MlxBufferStorage>, so it does NOT trigger a second
addAllocation: and does NOT deregister the parent on drop.
When this view is bound to a kernel, the encoder passes the byte offset
to Metal’s setBuffer:offset:atIndex:, so the kernel sees only the
slice region.
§Panics
Panics if byte_offset + n_elements * dtype.size_of() > self.inner.length().
Sourcepub fn element_count(&self) -> usize
pub fn element_count(&self) -> usize
Number of elements (product of shape dimensions, or byte_len / dtype.size_of()).
Sourcepub fn contents_ptr(&self) -> *mut c_void
pub fn contents_ptr(&self) -> *mut c_void
Raw pointer to the buffer contents (CPU-accessible on Apple Silicon).
§Safety
The caller must ensure proper synchronization — do not read while a GPU command buffer that writes this buffer is in flight.
Sourcepub fn metal_buffer(&self) -> &MetalBuffer
pub fn metal_buffer(&self) -> &MetalBuffer
Reference to the underlying metal::Buffer for passing to the encoder.
Sourcepub fn byte_offset(&self) -> u64
pub fn byte_offset(&self) -> u64
Byte offset into the underlying Metal buffer (zero for non-slice buffers).
When passing this buffer to a Metal kernel via setBuffer:offset:atIndex:,
use this offset so the kernel sees only the intended sub-region.
Sourcepub fn as_slice<T: Pod>(&self) -> Result<&[T]>
pub fn as_slice<T: Pod>(&self) -> Result<&[T]>
View the buffer contents as a typed slice.
Returns an error if the buffer byte length is not an exact multiple of
size_of::<T>().
§Safety contract
The caller must ensure:
Tmatches the actual element type stored in the buffer.- No GPU command buffer that writes this buffer is currently in flight.
Sourcepub fn as_mut_slice<T: Pod>(&mut self) -> Result<&mut [T]>
pub fn as_mut_slice<T: Pod>(&mut self) -> Result<&mut [T]>
View the buffer contents as a mutable typed slice.
Same safety contract as as_slice, plus: the caller
must ensure exclusive access (no other references to this buffer’s memory
exist).
Sourcepub fn with_shape(&self, shape: Vec<usize>) -> Result<Self, MlxError>
pub fn with_shape(&self, shape: Vec<usize>) -> Result<Self, MlxError>
Produce a zero-copy clone of this buffer with a new logical shape.
The cloned buffer shares the same underlying Metal allocation
and residency-set guard via Arc::clone on the storage; only
the per-handle shape metadata is replaced. Both handles
continue to alias the same GPU memory — writes through one are
observed by the other.
Validates shape.iter().product() == self.element_count()
and self.dtype == dtype (dtype unchanged). Useful for
implementing zero-copy view/reshape ops in autograd tapes.
ADR-020 iter-13c: tape view op dependency.
Trait Implementations§
Source§impl Clone for MlxBuffer
impl Clone for MlxBuffer
Source§fn clone(&self) -> Self
fn clone(&self) -> Self
Increment the storage’s Arc ref-count and wrap it in a new
MlxBuffer. Both the original and the clone refer to the same
underlying GPU allocation AND share the residency-set membership
guard — no data is copied, no double-registration occurs.
This is safe because metal::Buffer wraps an MTLBuffer Objective-C
object whose lifetime is managed by ARC; Arc::clone increments the
Rust-side refcount, and the inner MlxBufferStorage Drop runs once
when the last clone is released.
1.0.0 (const: unstable) · Source§fn clone_from(&mut self, source: &Self)
fn clone_from(&mut self, source: &Self)
source. Read more