Function dispatch_cast_f32_to_bf16_with_encoder 

pub fn dispatch_cast_f32_to_bf16_with_encoder(
    encoder: &mut CommandEncoder,
    registry: &mut KernelRegistry,
    device: &DeviceRef,
    input: &MlxBuffer,
    output: &MlxBuffer,
    n_elements: u32,
) -> Result<()>

Cast f32 to bf16 using an externally provided encoder (no commit).

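Encodes the cast_f32_to_bf16 kernel into the given encoder without committing the command buffer or waiting for completion. Use this to chain the cast into a single shared encoder (a "mega-encoder") alongside other GPU work, avoiding CPU round-trips between dispatches. The caller is responsible for committing the encoder once all work has been recorded.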
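For readers who want to check the kernel's output on the CPU, the following is a minimal reference sketch of an f32 to bf16 conversion. The helper names `f32_to_bf16_bits` and `bf16_bits_to_f32` are hypothetical and not part of this crate, and round-to-nearest-even is an assumption: this page does not state which rounding mode the GPU kernel uses, so results may differ in the last bit.

```rust
/// CPU reference for an f32 -> bf16 cast, returning the raw 16-bit pattern.
/// Hypothetical helper, not part of this crate.
/// ASSUMPTION: round-to-nearest-even; the GPU kernel's rounding mode is not documented here.
fn f32_to_bf16_bits(x: f32) -> u16 {
    let bits = x.to_bits();
    if x.is_nan() {
        // Keep the sign and set a mantissa bit so the result stays NaN
        // instead of collapsing to infinity when the low bits are dropped.
        return ((bits >> 16) as u16) | 0x0040;
    }
    // bf16 keeps the top 16 bits of the f32 pattern. Adding this bias before
    // the shift rounds to nearest, with ties going to the even result.
    let rounding_bias = 0x7FFF + ((bits >> 16) & 1);
    (bits.wrapping_add(rounding_bias) >> 16) as u16
}

/// Widens a bf16 bit pattern back to f32 (exact, no rounding needed).
fn bf16_bits_to_f32(b: u16) -> f32 {
    f32::from_bits((b as u32) << 16)
}
```

Values that fit in bf16's 8 significant bits, such as `1.0` or `-2.5`, survive the round trip unchanged; other values are rounded to the nearest representable bf16.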

§Arguments

  • encoder - Command encoder to record the dispatch into.
  • registry - Kernel registry (mutable for lazy pipeline compilation).
  • device - Metal device for pipeline compilation.
  • input - Input buffer (f32).
  • output - Output buffer (bf16, pre-allocated with n_elements * 2 bytes).
  • n_elements - Number of elements to cast.

§Errors

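Returns MlxError::InvalidArgument if n_elements is zero, or if either buffer is too small to hold n_elements elements (n_elements * 4 bytes for the f32 input, n_elements * 2 bytes for the bf16 output).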
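The preconditions above can be checked on the caller's side before encoding. The sketch below only illustrates the size arithmetic: `CastArgError` and `check_cast_args` are hypothetical names standing in for the crate's own validation, and the buffer sizes are passed as plain byte counts because this page does not show how `MlxBuffer` exposes its length.

```rust
/// Hypothetical error type standing in for the crate's `MlxError::InvalidArgument`.
#[derive(Debug, PartialEq)]
enum CastArgError {
    ZeroElements,
    InputTooSmall { required: u64, actual: u64 },
    OutputTooSmall { required: u64, actual: u64 },
}

/// Checks the documented preconditions given each buffer's length in bytes.
/// Hypothetical helper; the crate's real checks may differ in detail.
fn check_cast_args(
    input_bytes: u64,
    output_bytes: u64,
    n_elements: u32,
) -> Result<(), CastArgError> {
    if n_elements == 0 {
        return Err(CastArgError::ZeroElements);
    }
    // Widen to u64 so the byte counts cannot overflow for any u32 element count.
    let n = n_elements as u64;
    let required_in = n * 4; // f32 is 4 bytes per element
    let required_out = n * 2; // bf16 is 2 bytes per element
    if input_bytes < required_in {
        return Err(CastArgError::InputTooSmall { required: required_in, actual: input_bytes });
    }
    if output_bytes < required_out {
        return Err(CastArgError::OutputTooSmall { required: required_out, actual: output_bytes });
    }
    Ok(())
}
```

For example, casting 1024 elements needs an input buffer of at least 4096 bytes and an output buffer of at least 2048 bytes.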