Function cusparseSpMMOp 

pub unsafe extern "C" fn cusparseSpMMOp(
    plan: cusparseSpMMOpPlan_t,
    externalBuffer: *mut c_void,
) -> cusparseStatus_t

NOTE 1: NVRTC and nvJitLink are not currently available on Arm64 Android platforms.

NOTE 2: The routine does not support Android and Tegra platforms except Judy (sm87).

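Experimental: The function multiplies a sparse matrix matA by a dense matrix matB using custom operators:

$$ C'_{ij} = \text{epilogue}\left( \bigoplus_{k} \operatorname{op}(A)_{ik} \otimes \operatorname{op}(B)_{kj},\; C_{ij} \right) $$

where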

  • op(A) is a sparse matrix of size $m \times k$
  • op(B) is a dense matrix of size $k \times n$
  • C is a dense matrix of size $m \times n$
  • $\oplus$, $\otimes$, and $\text{epilogue}$ are custom add, mul, and epilogue operators respectively.

Also, for matrices A and B:

$$ \operatorname{op}(A) = \begin{cases} A & \text{if } op(A) = \text{CUSPARSE_OPERATION_NON_TRANSPOSE} \\ A^T & \text{if } op(A) = \text{CUSPARSE_OPERATION_TRANSPOSE} \end{cases} $$

$$ \operatorname{op}(B) = \begin{cases} B & \text{if } op(B) = \text{CUSPARSE_OPERATION_NON_TRANSPOSE} \\ B^T & \text{if } op(B) = \text{CUSPARSE_OPERATION_TRANSPOSE} \end{cases} $$

Only opA == CUSPARSE_OPERATION_NON_TRANSPOSE is currently supported.
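To make the role of the three operators concrete, here is a minimal CPU reference sketch in plain Rust. It is not how cuSPARSE implements the routine and it does not call the library. The `Csr` type, the function name `spmm_op_reference`, and the explicit `identity` argument used to seed the reduction are all hypothetical. The sketch also assumes a non-transposed A, row-major B and C, and an epilogue that receives the accumulated value first and the existing entry of C second.

```rust
/// Minimal CSR matrix (hypothetical helper type, not part of any cuSPARSE binding).
struct Csr<T> {
    rows: usize,
    row_offsets: Vec<usize>, // length rows + 1
    col_indices: Vec<usize>, // length nnz
    values: Vec<T>,          // length nnz
}

/// CPU reference for a custom-operator SpMM.
/// `b` (k x n) and `c` (m x n) are dense, row-major, with `n` columns.
/// `identity` seeds the add_op reduction (an assumption of this sketch).
fn spmm_op_reference<T: Copy>(
    a: &Csr<T>,
    b: &[T],
    c: &mut [T],
    n: usize,
    identity: T,
    add_op: impl Fn(T, T) -> T,
    mul_op: impl Fn(T, T) -> T,
    epilogue: impl Fn(T, T) -> T,
) {
    for i in 0..a.rows {
        for j in 0..n {
            let mut acc = identity;
            // Only the stored entries of row i contribute to the reduction.
            for p in a.row_offsets[i]..a.row_offsets[i + 1] {
                let k = a.col_indices[p];
                acc = add_op(acc, mul_op(a.values[p], b[k * n + j]));
            }
            // Combine the reduced value with the existing entry of C.
            c[i * n + j] = epilogue(acc, c[i * n + j]);
        }
    }
}

/// Runs the reference with (+, *) and with (min, +) on a small example.
fn demo() -> (Vec<i32>, Vec<i32>) {
    // A = [[1, 0, 2], [0, 3, 0]] stored as CSR.
    let a: Csr<i32> = Csr {
        rows: 2,
        row_offsets: vec![0, 2, 3],
        col_indices: vec![0, 2, 1],
        values: vec![1, 2, 3],
    };
    // B = [[1, 2], [3, 4], [5, 6]], row-major.
    let b: Vec<i32> = vec![1, 2, 3, 4, 5, 6];

    // Ordinary arithmetic; the epilogue adds the old C entry.
    let mut c_plus_times = vec![10, 20, 30, 40];
    spmm_op_reference(&a, &b, &mut c_plus_times, 2, 0,
        |x, y| x + y, |x, y| x * y, |acc, old| acc + old);

    // Min-plus semiring; the epilogue ignores the old C entry.
    let mut c_min_plus = vec![0; 4];
    spmm_op_reference(&a, &b, &mut c_min_plus, 2, i32::MAX,
        |x: i32, y: i32| x.min(y), |x, y| x + y, |acc, _old| acc);

    (c_plus_times, c_min_plus)
}
```

Calling `demo()` yields `[21, 34, 39, 52]` for ordinary arithmetic and `[2, 3, 6, 7]` for the min-plus case. Swapping the closures is the CPU analogue of supplying different add, mul, and epilogue operators to the routine.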

The function cusparseSpMMOp_createPlan() returns the size of the workspace and the compiled kernel needed by cusparseSpMMOp().
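Since the call takes only a plan and a workspace pointer and reports failure through its return status, a thin wrapper that turns the status into a `Result` can be convenient. The following is a sketch under stated assumptions, not the binding's API: the type aliases are local stand-ins so the snippet compiles without CUDA (a real binding may define `cusparseStatus_t` as an enum), `run_spmm_op` and the mock functions are hypothetical, and the only library fact relied on is that `CUSPARSE_STATUS_SUCCESS` is 0.

```rust
use std::ffi::c_void;
use std::ptr;

// Local stand-ins for the binding's types (assumptions of this sketch).
#[allow(non_camel_case_types)]
type cusparseStatus_t = u32;
#[allow(non_camel_case_types)]
type cusparseSpMMOpPlan_t = *mut c_void;
const CUSPARSE_STATUS_SUCCESS: cusparseStatus_t = 0;

/// Same shape as the documented signature of `cusparseSpMMOp`.
type SpMMOpFn =
    unsafe extern "C" fn(cusparseSpMMOpPlan_t, *mut c_void) -> cusparseStatus_t;

/// Calls `f` and maps any non-success status to `Err`.
///
/// # Safety
/// `plan` must be a valid plan and `workspace` must point to a buffer of at
/// least the size reported when the plan was created.
unsafe fn run_spmm_op(
    f: SpMMOpFn,
    plan: cusparseSpMMOpPlan_t,
    workspace: *mut c_void,
) -> Result<(), cusparseStatus_t> {
    match unsafe { f(plan, workspace) } {
        CUSPARSE_STATUS_SUCCESS => Ok(()),
        status => Err(status),
    }
}

// Mock entry points standing in for the real routine (hypothetical).
unsafe extern "C" fn mock_ok(
    _plan: cusparseSpMMOpPlan_t,
    _buf: *mut c_void,
) -> cusparseStatus_t {
    CUSPARSE_STATUS_SUCCESS
}

unsafe extern "C" fn mock_fail(
    _plan: cusparseSpMMOpPlan_t,
    _buf: *mut c_void,
) -> cusparseStatus_t {
    3 // arbitrary non-zero status used only for the demonstration
}

/// Exercises the wrapper with both mocks.
fn demo_calls() -> (Result<(), cusparseStatus_t>, Result<(), cusparseStatus_t>) {
    // SAFETY: the mocks ignore both pointers, so null is acceptable here.
    unsafe {
        (
            run_spmm_op(mock_ok, ptr::null_mut(), ptr::null_mut()),
            run_spmm_op(mock_fail, ptr::null_mut(), ptr::null_mut()),
        )
    }
}
```

With a real binding, the function item `cusparseSpMMOp` would be passed in place of the mocks, together with the plan and a device buffer of the size reported at plan creation.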

The operators must have the following signature and return type

<computetype> is one of float, double, cuComplex, cuDoubleComplex, or int.

cusparseSpMMOp supports the following sparse matrix formats:

cusparseSpMMOp supports the following index type for representing the sparse matrix matA:

cusparseSpMMOp supports the following data types:

Uniform-precision computation:

Mixed-precision computation:

cusparseSpMMOp supports the following algorithms:

| Algorithm | Notes |
|---|---|
| CUSPARSE_SPMM_OP_ALG_DEFAULT | Default algorithm for any sparse matrix format |

Performance notes:

  • Row-major layout provides higher performance than column-major.
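For reference, the two dense layouts differ only in how an element (i, j) maps to a linear offset. The sketch below uses hypothetical names (`Order`, `dense_index`, `row_step`) and a leading dimension `ld`. It illustrates the layouts themselves and makes no claim about why the library performs better with one of them.

```rust
/// Dense storage order (hypothetical enum for illustration).
#[derive(Clone, Copy)]
enum Order {
    Row,
    Col,
}

/// Linear offset of element (i, j) for a matrix with leading dimension `ld`.
fn dense_index(order: Order, i: usize, j: usize, ld: usize) -> usize {
    match order {
        Order::Row => i * ld + j, // consecutive columns of one row are adjacent
        Order::Col => j * ld + i, // consecutive rows of one column are adjacent
    }
}

/// Distance in elements between (i, j) and (i, j + 1).
fn row_step(order: Order, ld: usize) -> usize {
    dense_index(order, 0, 1, ld) - dense_index(order, 0, 0, ld)
}
```

For a 3 x 4 matrix, element (1, 2) sits at offset 6 in row-major order (ld = 4) and at offset 7 in column-major order (ld = 3). Walking along a row moves one element at a time in row-major order and `ld` elements at a time in column-major order.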

cusparseSpMMOp() has the following properties:

  • The routine requires extra storage
  • The routine supports asynchronous execution
  • The routine provides deterministic (bit-wise) results for each run
  • The routine allows the indices of matA to be unsorted

cusparseSpMMOp() supports the following optimizations:

  • CUDA graph capture
  • Hardware Memory Compression

For a code example, please visit cuSPARSE Library Samples - cusparseSpMMOp.