pub fn dispatch_cast_f32_to_bf16_with_encoder(
    encoder: &mut CommandEncoder,
    registry: &mut KernelRegistry,
    device: &DeviceRef,
    input: &MlxBuffer,
    output: &MlxBuffer,
    n_elements: u32,
) -> Result<()>
Cast f32 to bf16 using an externally-provided encoder (no commit).
Encodes the cast_f32_to_bf16 kernel into the given encoder without
committing or waiting. Use this to chain the cast into a mega-encoder
alongside other GPU work, avoiding CPU round-trips.
§Arguments
- encoder - Command encoder to record the dispatch into.
- registry - Kernel registry (mutable for lazy pipeline compilation).
- device - Metal device for pipeline compilation.
- input - Input buffer (f32).
- output - Output buffer (bf16, pre-allocated with n_elements * 2 bytes).
- n_elements - Number of elements to cast.
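The output sizing and element format described above can be checked on the CPU with a small reference routine. The following is a minimal sketch, not the kernel's implementation: the helper names (`bf16_output_bytes`, `f32_to_bf16_bits`, `bf16_bits_to_f32`) are hypothetical, and it assumes round-to-nearest-even, since the rounding mode used by the GPU kernel is not stated here.

```rust
/// Size in bytes of one bf16 element.
const BF16_BYTES: usize = 2;

/// Minimum output buffer size in bytes for `n_elements` bf16 values
/// (the documented `n_elements * 2`).
fn bf16_output_bytes(n_elements: u32) -> usize {
    n_elements as usize * BF16_BYTES
}

/// CPU reference conversion of one f32 to raw bf16 bits.
/// Assumption: round-to-nearest, ties to even; the GPU kernel may round differently.
fn f32_to_bf16_bits(x: f32) -> u16 {
    let bits = x.to_bits();
    if x.is_nan() {
        // Keep the sign and exponent, and set a mantissa bit so that dropping
        // the low 16 bits can never turn a NaN into an infinity.
        return ((bits >> 16) as u16) | 0x0040;
    }
    // Adding 0x7FFF plus the lowest kept bit rounds to nearest, ties to even.
    let rounding_bias = 0x7FFF + ((bits >> 16) & 1);
    (bits.wrapping_add(rounding_bias) >> 16) as u16
}

/// Widen raw bf16 bits back to f32 (exact: bf16 is the upper half of an f32).
fn bf16_bits_to_f32(b: u16) -> f32 {
    f32::from_bits((b as u32) << 16)
}
```

A reference like this is mainly useful in tests: read the output buffer back after the encoder has been committed elsewhere, then compare each element against the CPU result, allowing for a one-unit difference in the last place if the kernel truncates instead of rounding.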
§Errors
Returns MlxError::InvalidArgument if n_elements is zero or buffers are
too small.
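Callers that want to catch these conditions before encoding can mirror the documented checks themselves. The sketch below is an illustration under stated assumptions: `CastArgError` is a stand-in for the crate's error type, `validate_cast_args` is a hypothetical helper, and the buffer sizes are passed as plain byte counts because the `MlxBuffer` API is not shown here. The crate's actual checks may differ in detail.

```rust
/// Stand-in for the crate's error type; only the variant named in the docs is modelled.
#[derive(Debug, PartialEq)]
enum CastArgError {
    InvalidArgument(String),
}

/// Hypothetical pre-flight check mirroring the documented error conditions:
/// zero elements, or a buffer too small for the requested element count.
fn validate_cast_args(
    n_elements: u32,
    input_bytes: usize,
    output_bytes: usize,
) -> Result<(), CastArgError> {
    if n_elements == 0 {
        return Err(CastArgError::InvalidArgument(
            "n_elements must be non-zero".to_string(),
        ));
    }
    let n = n_elements as usize;
    // f32 input needs 4 bytes per element; checked to stay safe on 32-bit targets.
    let need_in = n.checked_mul(4).ok_or_else(|| {
        CastArgError::InvalidArgument("input size overflows usize".to_string())
    })?;
    // bf16 output needs 2 bytes per element; cannot overflow if n * 4 did not.
    let need_out = n * 2;
    if input_bytes < need_in {
        return Err(CastArgError::InvalidArgument(format!(
            "input buffer has {input_bytes} bytes, need {need_in}"
        )));
    }
    if output_bytes < need_out {
        return Err(CastArgError::InvalidArgument(format!(
            "output buffer has {output_bytes} bytes, need {need_out}"
        )));
    }
    Ok(())
}
```

Validating up front matters more with an externally provided encoder than with a self-committing call, because a failure discovered after several dispatches have been recorded leaves the caller to decide what to do with the partially built encoder.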