pub struct MlxDevice { /* private fields */ }
Wraps a Metal device and its command queue.
§Thread Safety
MlxDevice is Send + Sync — you can share it across threads. The
underlying Metal device and command queue are thread-safe on Apple Silicon.
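Because the type is Send + Sync, one plausible usage pattern is to wrap a single device in an Arc and hand clones to worker threads. The sketch below uses a hypothetical stand-in struct (FakeDevice) instead of the real MlxDevice so that it compiles on any platform; the assert_send_sync helper is likewise an assumption, shown as one way to check the bounds at compile time.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for `MlxDevice`; the real type wraps a Metal device.
#[derive(Debug)]
struct FakeDevice {
    name: String,
}

/// Compiles only if `T` is `Send + Sync`.
fn assert_send_sync<T: Send + Sync>() {}

/// Share one device across `n` threads and collect what each worker reports.
fn share_across_threads(device: Arc<FakeDevice>, n: usize) -> Vec<String> {
    assert_send_sync::<FakeDevice>();
    let handles: Vec<_> = (0..n)
        .map(|i| {
            // Each worker gets its own reference-counted handle to the same device.
            let dev = Arc::clone(&device);
            thread::spawn(move || format!("worker {i} sees {}", dev.name))
        })
        .collect();
    // Join in spawn order so the output order is deterministic.
    handles
        .into_iter()
        .map(|h| h.join().expect("worker panicked"))
        .collect()
}
```

With the real type, the same shape would apply to an Arc<MlxDevice>, assuming each thread creates its own encoders from the shared device.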
Implementations§
impl MlxDevice
pub fn new() -> Result<Self>
Initialize the Metal GPU device and create a command queue.
Returns Err(MlxError::DeviceNotFound) if no Metal device is available
(e.g. running on a non-Apple-Silicon machine or in a headless Linux VM).
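A caller may want to treat a missing Metal device as a recoverable condition instead of a fatal one. The following is a minimal sketch of that idea, not the crate's own code: MlxError here is a hypothetical mirror containing only the DeviceNotFound variant named above plus an invented Other variant, and Backend and pick_backend are illustrative names.

```rust
/// Hypothetical mirror of the crate's error type; only `DeviceNotFound`
/// is taken from the docs, `Other` is invented for this sketch.
#[derive(Debug, PartialEq)]
enum MlxError {
    DeviceNotFound,
    Other(String),
}

/// Illustrative backend choice made by the caller.
#[derive(Debug, PartialEq)]
enum Backend {
    Gpu,
    Cpu,
}

/// Decide which backend to use from the outcome of device initialization.
fn pick_backend<D>(init: Result<D, MlxError>) -> Result<Backend, MlxError> {
    match init {
        Ok(_device) => Ok(Backend::Gpu),
        // No Metal device: degrade to a CPU path instead of aborting.
        Err(MlxError::DeviceNotFound) => Ok(Backend::Cpu),
        // Anything else is unexpected and is passed up to the caller.
        Err(other) => Err(other),
    }
}
```

In real code the argument would be the value returned by MlxDevice::new(); the generic parameter D only exists so the sketch does not need a Metal device.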
pub fn command_encoder(&self) -> Result<CommandEncoder>
Create a CommandEncoder for batching GPU dispatches.
The encoder wraps a fresh Metal command buffer from the device’s command
queue. Encode one or more kernel dispatches, then call
CommandEncoder::commit_and_wait to submit and block until completion.
ADR-015 iter8e (Phase 3b): the encoder is bound to the device’s
residency set, so every commit* boundary flushes the deferred
add/remove staging. This results in one [set commit] per command
buffer (CB) submission instead of one per allocation. When residency
sets are disabled (HF2Q_NO_RESIDENCY=1, or macOS < 15), the binding
is None and the flush is a no-op.
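The batching idea can be illustrated with a toy model. This is not the crate's implementation and touches no Metal API: ToyResidencySet, its fields, and commit_boundary are all hypothetical names, and skipping the commit when nothing is staged is an assumption of the sketch. It only shows how staging changes and applying them at a commit boundary reduces several per-allocation commits to one.

```rust
use std::collections::HashSet;

/// Toy residency set: tracks resident buffer ids and counts commits.
#[derive(Default)]
struct ToyResidencySet {
    resident: HashSet<u64>,
    pending_add: Vec<u64>,
    pending_remove: Vec<u64>,
    commits: usize,
}

impl ToyResidencySet {
    /// Record an addition without committing yet.
    fn stage_add(&mut self, id: u64) {
        self.pending_add.push(id);
    }

    /// Record a removal without committing yet.
    fn stage_remove(&mut self, id: u64) {
        self.pending_remove.push(id);
    }

    /// Apply all staged changes with a single commit; nothing staged means no commit.
    fn flush(&mut self) {
        if self.pending_add.is_empty() && self.pending_remove.is_empty() {
            return;
        }
        for id in self.pending_add.drain(..) {
            self.resident.insert(id);
        }
        for id in self.pending_remove.drain(..) {
            self.resident.remove(&id);
        }
        self.commits += 1;
    }
}

/// Stand-in for an encoder's commit boundary.
/// `None` models the case where residency sets are disabled.
fn commit_boundary(binding: Option<&mut ToyResidencySet>) {
    if let Some(set) = binding {
        set.flush();
    }
}
```

In this model, three staged additions and one staged removal cost a single commit at the next boundary, and a boundary with a None binding does nothing.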
pub fn encoder_session(&self) -> Result<Option<EncoderSession>>
Create an EncoderSession (ADR-019 Phase 0b iter89e2-A — bare
struct) for one transformer stage’s worth of GPU work.
Gated on HF2Q_ENCODER_SESSION=1 (default OFF). When the gate is
unset, returns Ok(None) so callers can fall back to
Self::command_encoder without an extra conditional. When set,
returns Ok(Some(EncoderSession)) carrying a fresh
CommandEncoder — same construction path as command_encoder(),
just wrapped in the session shell.
In iter89e2-A no production code path consumes this method; it
exists so the env-gate has a callable factory and the lifecycle
tests have a public entry point. Phase 1+ migrations
(forward_gpu.rs, gpu_full_attn.rs, gpu_delta_net.rs) opt in
per-call site.
§Errors
Propagates any error from the underlying EncoderSession::new.
That call is currently infallible past metal-rs’s new_command_buffer;
the Result is preserved for future-proofing.
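The Ok(None) fallback described above lends itself to a single match at the call site. The sketch below models it with hypothetical stand-ins: FakeDevice replaces MlxDevice, a plain bool (session_gate) replaces the HF2Q_ENCODER_SESSION environment gate, String replaces the crate's error type, and StageEncoder and begin_stage are illustrative names not found in the crate.

```rust
/// Hypothetical stand-ins for the real types, which wrap Metal objects.
struct CommandEncoder;

#[allow(dead_code)]
struct EncoderSession {
    encoder: CommandEncoder,
}

/// Stand-in device; `session_gate` models HF2Q_ENCODER_SESSION=1 as a plain bool.
struct FakeDevice {
    session_gate: bool,
}

impl FakeDevice {
    fn command_encoder(&self) -> Result<CommandEncoder, String> {
        Ok(CommandEncoder)
    }

    /// Gate off: Ok(None). Gate on: a session wrapping a fresh encoder.
    fn encoder_session(&self) -> Result<Option<EncoderSession>, String> {
        if self.session_gate {
            Ok(Some(EncoderSession { encoder: self.command_encoder()? }))
        } else {
            Ok(None)
        }
    }
}

/// Whichever encoder flavor the caller ended up with.
#[allow(dead_code)]
enum StageEncoder {
    Session(EncoderSession),
    Plain(CommandEncoder),
}

impl StageEncoder {
    fn label(&self) -> &'static str {
        match self {
            StageEncoder::Session(_) => "session",
            StageEncoder::Plain(_) => "plain",
        }
    }
}

/// Try the session path first and fall back to a plain encoder on Ok(None).
fn begin_stage(dev: &FakeDevice) -> Result<StageEncoder, String> {
    match dev.encoder_session()? {
        Some(session) => Ok(StageEncoder::Session(session)),
        None => Ok(StageEncoder::Plain(dev.command_encoder()?)),
    }
}
```

The caller never inspects the gate itself; the Option returned by the factory carries that decision.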
pub fn alloc_buffer(&self, byte_len: usize, dtype: DType, shape: Vec<usize>) -> Result<MlxBuffer>
Allocate a new GPU buffer with StorageModeShared.
§Arguments
byte_len: Size of the buffer in bytes. Must be > 0.
dtype: Element data type for metadata tracking.
shape: Tensor dimensions for metadata tracking.
§Errors
Returns MlxError::InvalidArgument if byte_len is zero.
Returns MlxError::BufferAllocationError if Metal cannot allocate.
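Since byte_len is passed separately from dtype and shape, a caller typically has to compute it. The helper below is a sketch under stated assumptions: a dense, unpadded layout, and a hypothetical three-variant DType with a size_in_bytes method (the real DType may have different variants and no such method). It returns None for the zero-length case that alloc_buffer rejects, and also on arithmetic overflow.

```rust
/// Hypothetical subset of element types; the real `DType` may differ.
#[derive(Clone, Copy)]
enum DType {
    F32,
    F16,
    U8,
}

impl DType {
    /// Size of one element in bytes.
    fn size_in_bytes(self) -> usize {
        match self {
            Self::F32 => 4,
            Self::F16 => 2,
            Self::U8 => 1,
        }
    }
}

/// Byte length of a dense tensor, or None if it is zero or overflows usize.
fn byte_len_for(dtype: DType, shape: &[usize]) -> Option<usize> {
    // An empty shape is treated as a scalar (one element).
    let elems = shape
        .iter()
        .try_fold(1usize, |acc, &dim| acc.checked_mul(dim))?;
    let bytes = elems.checked_mul(dtype.size_in_bytes())?;
    if bytes == 0 {
        None
    } else {
        Some(bytes)
    }
}
```

Checking for zero before calling alloc_buffer lets the caller handle empty tensors explicitly instead of receiving MlxError::InvalidArgument.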
pub fn metal_device(&self) -> &DeviceRef
Borrow the underlying metal::Device for direct Metal API calls
(e.g. kernel compilation in KernelRegistry).
pub fn metal_queue(&self) -> &CommandQueue
Borrow the underlying metal::CommandQueue.
pub fn residency_sets_enabled(&self) -> bool
Return whether this device has an active Metal residency set.