launch_into_contiguous_perpendicular_ref

Function launch_into_contiguous_perpendicular_ref 

pub fn launch_into_contiguous_perpendicular_ref<R: Runtime>(
    client: &ComputeClient<R>,
    input: &TensorHandleRef<'_, R>,
    output: &TensorHandleRef<'_, R>,
    dtype: StorageType,
) -> Result<(), LaunchError>

Launches the perpendicular contiguous kernel.

This is used when the input tensor's stride-1 dimension (the one that can be vectorized) is not its last dimension. The kernel optimizes the copy by using hardware vectorization (Lines) and an in-register transpose.
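
The idea behind the description above can be illustrated on the CPU. The following is a minimal sketch, not the kernel this function launches: `copy_perpendicular` and `LINE` are hypothetical names, a `[f32; LINE]` array stands in for a vectorized Line, and the input is assumed to be a 2D tensor with strides `[1, rows]` whose dimensions are both divisible by `LINE`. Each tile is read along the input's stride-1 axis, transposed in a small local array, and written along the output's stride-1 axis.

```rust
/// Hypothetical line width: stands in for one vectorized load/store of 4 elements.
const LINE: usize = 4;

/// Copies a `rows x cols` tensor whose stride-1 dimension is the FIRST one
/// (strides = [1, rows]) into a contiguous row-major buffer (strides = [cols, 1]).
/// Illustration only; assumes both dimensions are divisible by `LINE`.
fn copy_perpendicular(input: &[f32], rows: usize, cols: usize) -> Vec<f32> {
    assert_eq!(input.len(), rows * cols);
    assert!(rows % LINE == 0 && cols % LINE == 0);
    let mut output = vec![0.0f32; rows * cols];

    for r0 in (0..rows).step_by(LINE) {
        for c0 in (0..cols).step_by(LINE) {
            // Load LINE lines. Each line holds LINE consecutive elements along
            // the input's stride-1 axis, so tile[j][i] is element (r0 + i, c0 + j).
            let mut tile = [[0.0f32; LINE]; LINE];
            for (j, line) in tile.iter_mut().enumerate() {
                let start = (c0 + j) * rows + r0;
                line.copy_from_slice(&input[start..start + LINE]);
            }

            // Transpose the tile locally (the "in-register" step).
            let mut transposed = [[0.0f32; LINE]; LINE];
            for i in 0..LINE {
                for j in 0..LINE {
                    transposed[i][j] = tile[j][i];
                }
            }

            // Store LINE lines. Each line now holds LINE consecutive elements
            // along the output's stride-1 axis (the last dimension).
            for (i, line) in transposed.iter().enumerate() {
                let start = (r0 + i) * cols + c0;
                output[start..start + LINE].copy_from_slice(line);
            }
        }
    }
    output
}
```

In this sketch both the reads and the writes touch `LINE` contiguous elements at a time; a naive element-by-element copy would have to stride through either the input or the output.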