

Function launch_copy_perpendicular_ref 

pub fn launch_copy_perpendicular_ref<R: Runtime>(
    client: &ComputeClient<R>,
    input: TensorBinding<R>,
    output: TensorBinding<R>,
    dtype: StorageType,
)

Launches the perpendicular contiguous kernel.

Use this when the input tensor’s memory layout places the stride-1 dimension (the vectorized dimension) somewhere other than the last dimension. The kernel speeds up the copy by using hardware vectorization (Vectors) together with an in-register transpose.
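
The idea can be illustrated with a small CPU-side sketch. It is not the kernel's implementation and does not use this crate's API: the names `unit_stride_dim`, `needs_perpendicular_copy`, `copy_perpendicular_2d`, and `TILE` are hypothetical, and the local tile array only stands in for the registers a GPU kernel would use. It assumes a 2D `f32` tensor whose first dimension has stride 1 and whose output is row-major contiguous.

```rust
/// Hypothetical tile edge length; a real kernel would pick this from the vector width.
const TILE: usize = 4;

/// Returns the index of the first dimension whose stride is 1, if any.
fn unit_stride_dim(strides: &[usize]) -> Option<usize> {
    strides.iter().position(|&s| s == 1)
}

/// True when a stride-1 dimension exists and it is not the last dimension,
/// i.e. the case the description above refers to.
fn needs_perpendicular_copy(strides: &[usize]) -> bool {
    match unit_stride_dim(strides) {
        Some(dim) => dim + 1 != strides.len(),
        None => false,
    }
}

/// CPU reference: copies a strided 2D input into a row-major contiguous output.
/// Assumes `strides[0] == 1`, so reads are contiguous along rows of the tile
/// and writes are contiguous along columns of the output.
fn copy_perpendicular_2d(input: &[f32], shape: [usize; 2], strides: [usize; 2]) -> Vec<f32> {
    let [rows, cols] = shape;
    let mut out = vec![0.0; rows * cols];

    for r0 in (0..rows).step_by(TILE) {
        for c0 in (0..cols).step_by(TILE) {
            // Clamp the tile at the tensor edges.
            let tile_rows = TILE.min(rows - r0);
            let tile_cols = TILE.min(cols - c0);
            let mut tile = [[0.0f32; TILE]; TILE];

            // Load: the inner loop walks the input's stride-1 dimension.
            for c in 0..tile_cols {
                for r in 0..tile_rows {
                    tile[c][r] = input[(r0 + r) * strides[0] + (c0 + c) * strides[1]];
                }
            }

            // Store transposed: the inner loop walks the output's last dimension.
            for r in 0..tile_rows {
                for c in 0..tile_cols {
                    out[(r0 + r) * cols + (c0 + c)] = tile[c][r];
                }
            }
        }
    }
    out
}
```

For a shape `[2, 3]` tensor with strides `[1, 2]` (column-major), `needs_perpendicular_copy` returns `true`, and the copy reorders the buffer so that each logical row becomes contiguous in the output.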