pub fn dispatch_cast_bf16_to_f32_with_encoder(
    encoder: &mut CommandEncoder,
    registry: &mut KernelRegistry,
    device: &DeviceRef,
    input: &MlxBuffer,
    output: &MlxBuffer,
    n_elements: u32,
) -> Result<()>
Cast bf16 to f32 using an externally-provided encoder (no commit).
Encodes the cast_bf16_to_f32 kernel into the given encoder without
committing or waiting. Use this to chain the cast into a mega-encoder
alongside other GPU work, avoiding CPU round-trips.
§Arguments
- encoder - Command encoder to record the dispatch into.
- registry - Kernel registry (mutable for lazy pipeline compilation).
- device - Metal device for pipeline compilation.
- input - Input buffer (bf16).
- output - Output buffer (f32, pre-allocated with n_elements * 4 bytes).
- n_elements - Number of elements to cast.
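For checking results on the host, the cast can be mirrored on the CPU. The sketch below is not the Metal kernel; it assumes the kernel performs the standard exact widening, in which a bf16 value is the upper 16 bits of the corresponding IEEE 754 f32. The helper names `bf16_bits_to_f32` and `cast_bf16_to_f32_reference` are hypothetical and not part of this crate.

```rust
/// CPU reference for one element (hypothetical helper, not a crate API).
/// A bf16 bit pattern becomes an f32 by placing it in the upper 16 bits
/// and zero-filling the lower 16 mantissa bits.
fn bf16_bits_to_f32(bits: u16) -> f32 {
    f32::from_bits((bits as u32) << 16)
}

/// Widen a slice of raw bf16 bit patterns into f32 values.
/// The output holds one 4-byte f32 per 2-byte bf16 input element,
/// matching the n_elements * 4 bytes required of the output buffer.
fn cast_bf16_to_f32_reference(input: &[u16]) -> Vec<f32> {
    input.iter().copied().map(bf16_bits_to_f32).collect()
}
```

Because the widening only appends zero bits, it is lossless, so a GPU result can be compared against this reference bit for bit under the assumption stated above.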
§Errors
Returns MlxError::InvalidArgument if n_elements is zero or buffers are
too small.
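A caller can run the same kind of check before encoding to fail early. The following is a minimal sketch, assuming 2 bytes per bf16 input element and 4 bytes per f32 output element; the `CastArgError` type and `check_cast_args` function are hypothetical stand-ins, and the crate itself reports these conditions as MlxError::InvalidArgument. The order of the checks is also an assumption.

```rust
/// Hypothetical error type for this sketch; the crate uses
/// MlxError::InvalidArgument instead.
#[derive(Debug, PartialEq)]
enum CastArgError {
    ZeroElements,
    InputTooSmall { needed: u64, actual: u64 },
    OutputTooSmall { needed: u64, actual: u64 },
}

/// Validate element count and buffer byte lengths for a bf16 -> f32 cast.
/// Sizes are computed in u64 so that n_elements * 4 cannot overflow.
fn check_cast_args(
    n_elements: u32,
    input_bytes: usize,
    output_bytes: usize,
) -> Result<(), CastArgError> {
    if n_elements == 0 {
        return Err(CastArgError::ZeroElements);
    }
    let n = n_elements as u64;
    let (need_in, need_out) = (n * 2, n * 4); // bf16 = 2 bytes, f32 = 4 bytes
    let (have_in, have_out) = (input_bytes as u64, output_bytes as u64);
    if have_in < need_in {
        return Err(CastArgError::InputTooSmall { needed: need_in, actual: have_in });
    }
    if have_out < need_out {
        return Err(CastArgError::OutputTooSmall { needed: need_out, actual: have_out });
    }
    Ok(())
}
```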