Function dispatch_strided_copy_f32 

pub fn dispatch_strided_copy_f32(
    encoder: &mut CommandEncoder,
    registry: &mut KernelRegistry,
    device: &DeviceRef,
    src: &MlxBuffer,
    dst: &MlxBuffer,
    params: &StridedCopyParams,
) -> Result<()>

Dispatch a strided copy operation on the GPU.

Copies a 2D strided tensor into a contiguous row-major layout, so that dst[row * cols + col] = src[row * stride_row + col * stride_col].
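
The indexing rule above can be checked on the CPU with a small reference loop. This is a minimal sketch, not the Metal kernel: `strided_copy_reference` is a hypothetical helper that works on plain slices, and it assumes strides are measured in f32 elements rather than bytes, which this page does not state.

```rust
/// CPU reference for the documented copy rule (illustrative only).
/// Assumption: `stride_row` and `stride_col` are element strides, not byte strides.
fn strided_copy_reference(
    src: &[f32],
    rows: usize,
    cols: usize,
    stride_row: usize,
    stride_col: usize,
) -> Vec<f32> {
    // The destination is contiguous and row-major: rows * cols elements.
    let mut dst = vec![0.0f32; rows * cols];
    for row in 0..rows {
        for col in 0..cols {
            // dst[row * cols + col] = src[row * stride_row + col * stride_col]
            dst[row * cols + col] = src[row * stride_row + col * stride_col];
        }
    }
    dst
}
```

For example, a 2x3 row-major matrix `[1, 2, 3, 4, 5, 6]` viewed as its 3x2 transpose has `stride_row = 1` and `stride_col = 3`; calling `strided_copy_reference(&src, 3, 2, 1, 3)` materializes that view as `[1, 4, 2, 5, 3, 6]`.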

§Arguments

  • encoder - Command encoder to record the dispatch into.
  • registry - Kernel registry (must have strided_copy_f32 registered).
  • device - Metal device for pipeline compilation.
  • src - Source buffer (f32, strided layout).
  • dst - Destination buffer (f32, contiguous output).
  • params - Copy parameters (rows, cols, strides).

§Errors

Returns MlxError::InvalidArgument if any dimension is 0 or if either buffer is too small for the requested copy.
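
One way to reason about "too small" is to compare each buffer's length against the highest index the copy touches. The sketch below is an assumption about what such a check could look like, not the crate's actual validation: `check_strided_copy` is a hypothetical helper, lengths and strides are assumed to be in f32 elements, and a `String` error stands in for `MlxError::InvalidArgument`.

```rust
/// Hypothetical pre-dispatch check mirroring the documented error conditions.
/// All lengths and strides are assumed to be in f32 elements.
fn check_strided_copy(
    src_len: usize,
    dst_len: usize,
    rows: usize,
    cols: usize,
    stride_row: usize,
    stride_col: usize,
) -> Result<(), String> {
    if rows == 0 || cols == 0 {
        return Err("rows and cols must be non-zero".to_string());
    }
    // Highest source index read: the last row and last column of the view.
    let max_src_index = (rows - 1) * stride_row + (cols - 1) * stride_col;
    if src_len <= max_src_index {
        return Err(format!(
            "src holds {src_len} elements but index {max_src_index} is read"
        ));
    }
    // The contiguous output needs exactly rows * cols elements.
    if dst_len < rows * cols {
        return Err(format!(
            "dst holds {dst_len} elements but {} are written",
            rows * cols
        ));
    }
    Ok(())
}
```

Running a check like this before recording the dispatch surfaces a bad shape as an error on the CPU side instead of an out-of-bounds access on the GPU.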