Function M_captureModelSync 

pub unsafe extern "C" fn M_captureModelSync(
    context: *const M_RuntimeContext,
    initializedModel: *mut M_AsyncModel,
    graphKeys: *const u64,
    numGraphKeys: usize,
    inputs: *const *mut M_AsyncTensor,
    numInputs: usize,
    numOutputs: *mut usize,
    status: *mut M_Status,
) -> *mut *mut M_AsyncTensor

Captures model execution into a device graph for later replay.

This records model execution as a device graph (e.g. a CUDA graph) that can be replayed with M_replayModelSync() for faster repeated execution. The captured graph is associated with the provided keys for later lookup.

Graph keys identify the captured graph. A single key is broadcast to all capture devices; multiple keys provide one key per device.
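The key rule above can be sketched in plain Rust. This is an illustration only: `keys_per_device` and `num_devices` are hypothetical names, not part of the API, and treating any other key count as invalid is an assumption, since the documentation does not say how a mismatch is handled.

```rust
/// Hypothetical helper (not part of the API): expands a graph key slice to
/// one key per capture device, following the rule described above.
fn keys_per_device(graph_keys: &[u64], num_devices: usize) -> Option<Vec<u64>> {
    match graph_keys.len() {
        // A single key is broadcast to every capture device.
        1 => Some(vec![graph_keys[0]; num_devices]),
        // Multiple keys must supply exactly one key per device.
        n if n == num_devices => Some(graph_keys.to_vec()),
        // Assumption: any other count is rejected in this sketch.
        _ => None,
    }
}
```

For example, `keys_per_device(&[7], 3)` yields three copies of key 7, while `keys_per_device(&[1, 2], 3)` yields `None`.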

The returned output tensors are updated in-place when the graph is replayed. Keep them alive and read from them after each M_replayModelSync() call to get the latest results.

@param context The runtime context, from M_newRuntimeContext().
@param initializedModel The model to capture, from M_initModel().
@param graphKeys Array of uint64_t graph keys. Pass one key to broadcast to all devices, or one key per device.
@param numGraphKeys Number of graph keys in the array.
@param inputs Array of input tensors in model input order.
@param numInputs Number of input tensors.
@param numOutputs Receives the number of output tensors on success.
@param status The status used to report errors in the case of failures.

@returns A malloc-allocated array of M_AsyncTensor pointers, one per model output. The caller owns both the array and each tensor. Free each tensor with M_freeTensor() and the array itself with free(). Returns NULL on failure, with an error in status.
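The ownership rule in @returns could be wrapped in a small cleanup helper along the following lines. This is a minimal sketch, not the library's own code: `release_outputs` is a hypothetical name, and the per-tensor deallocator is passed in as `free_tensor` (standing in for M_freeTensor()) so the sketch compiles without linking the runtime. It assumes a platform where the C library's `free()` is linked by default.

```rust
use std::ffi::c_void;

extern "C" {
    // C standard library deallocator, matching the malloc-allocated array.
    fn free(ptr: *mut c_void);
}

/// Hypothetical helper (not part of the API): frees every tensor in the
/// returned array, then the array itself.
///
/// `free_tensor` stands in for M_freeTensor(). The caller must pass the
/// pointer and count exactly as received from the capture call, and must not
/// use the array or its tensors afterwards.
unsafe fn release_outputs<T>(
    outputs: *mut *mut T,
    num_outputs: usize,
    free_tensor: unsafe extern "C" fn(*mut T),
) {
    // A NULL array signals failure; there is nothing to release.
    if outputs.is_null() {
        return;
    }
    for i in 0..num_outputs {
        let tensor = *outputs.add(i);
        if !tensor.is_null() {
            free_tensor(tensor);
        }
    }
    // The array was allocated with malloc, so it is released with free().
    free(outputs as *mut c_void);
}
```

Because the output tensors are updated in place on replay, a helper like this should run only after the last M_replayModelSync() call whose results are needed.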