pub unsafe extern "C" fn cusparseSpMMOp(
    plan: cusparseSpMMOpPlan_t,
    externalBuffer: *mut c_void,
) -> cusparseStatus_t
NOTE 1: NVRTC and nvJitLink are not currently available on Arm64 Android platforms.
NOTE 2: The routine does not support Android and Tegra platforms except Judy (sm87).
Experimental: The function performs the multiplication of a sparse matrix matA and a dense matrix matB with custom operators.

$$
C'_{ij} = \text{epilogue}\left(\sum_{k}^{\oplus} \operatorname{op}(A)_{ik} \otimes \operatorname{op}(B)_{kj},\; C_{ij}\right)
$$

where

- op(A) is a sparse matrix of size $m \times k$
- op(B) is a dense matrix of size $k \times n$
- C is a dense matrix of size $m \times n$
- $\oplus$, $\otimes$, and $\text{epilogue}$ are custom add, mul, and epilogue operators respectively.
Also, for matrix A and B:

$$
\operatorname{op}(A) =
\begin{cases}
A & \text{if } op(A) = \text{CUSPARSE\_OPERATION\_NON\_TRANSPOSE} \\
A^T & \text{if } op(A) = \text{CUSPARSE\_OPERATION\_TRANSPOSE}
\end{cases}
$$

$$
\operatorname{op}(B) =
\begin{cases}
B & \text{if } op(B) = \text{CUSPARSE\_OPERATION\_NON\_TRANSPOSE} \\
B^T & \text{if } op(B) = \text{CUSPARSE\_OPERATION\_TRANSPOSE}
\end{cases}
$$
Only opA == CUSPARSE_OPERATION_NON_TRANSPOSE is currently supported.
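The semantics described above can be illustrated with a small CPU reference written in plain Rust. This is only a sketch and does not call cuSPARSE. It assumes the operation has the form $C'_{ij} = \text{epilogue}(\bigoplus_k A_{ik} \otimes B_{kj},\ C_{ij})$ with non-transposed A and B, that A is stored in CSR, and that B and C are dense row-major arrays. The names `spmm_op_reference` and `demo_plus_times` are hypothetical, and the handling of rows without stored entries (C is left unchanged) is an assumption rather than documented library behavior.

```rust
/// CPU reference sketch: C'[i][j] = epilogue(add-reduce_k mul(A[i][k], B[k][j]), C[i][j]).
/// Assumptions (not taken from the cuSPARSE documentation): A is stored in CSR,
/// B and C are dense row-major, and rows of A with no stored entries leave the
/// corresponding row of C untouched.
#[allow(clippy::too_many_arguments)]
pub fn spmm_op_reference<T: Copy>(
    m: usize,               // rows of A and C
    n: usize,               // columns of B and C
    row_offsets: &[usize],  // CSR row offsets, length m + 1
    col_indices: &[usize],  // CSR column indices, one per stored entry
    values: &[T],           // CSR values, one per stored entry
    b: &[T],                // dense k x n matrix, row-major
    c: &mut [T],            // dense m x n matrix, row-major, updated in place
    add_op: impl Fn(T, T) -> T,
    mul_op: impl Fn(T, T) -> T,
    epilogue: impl Fn(T, T) -> T,
) {
    for i in 0..m {
        for j in 0..n {
            // Reduce the products of the stored entries of row i with add_op.
            let mut acc: Option<T> = None;
            for p in row_offsets[i]..row_offsets[i + 1] {
                let prod = mul_op(values[p], b[col_indices[p] * n + j]);
                acc = Some(match acc {
                    Some(partial) => add_op(partial, prod),
                    None => prod,
                });
            }
            // Combine the reduced value with the existing C entry.
            if let Some(sum) = acc {
                c[i * n + j] = epilogue(sum, c[i * n + j]);
            }
        }
    }
}

/// Demo: ordinary (+, *) operators with an epilogue that adds the old C value.
pub fn demo_plus_times() -> Vec<f64> {
    // A = [[1, 0, 2], [0, 3, 0]] in CSR form.
    let row_offsets = [0usize, 2, 3];
    let col_indices = [0usize, 2, 1];
    let values = [1.0f64, 2.0, 3.0];
    // B = [[1, 2], [3, 4], [5, 6]], row-major.
    let b = [1.0f64, 2.0, 3.0, 4.0, 5.0, 6.0];
    let mut c = vec![10.0f64, 20.0, 30.0, 40.0];
    spmm_op_reference(
        2,
        2,
        &row_offsets,
        &col_indices,
        &values,
        &b,
        c.as_mut_slice(),
        |x: f64, y: f64| x + y,
        |x: f64, y: f64| x * y,
        |sum: f64, old: f64| sum + old,
    );
    c
}
```

Passing `f64::min` as `add_op` and addition as `mul_op` turns the same loop into a min-plus product, which is the kind of substitution the custom operators are meant to allow.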
The function cusparseSpMMOp_createPlan() returns the size of the workspace and the compiled kernel needed by cusparseSpMMOp().
The operators must have the following signature and return type
<computetype> is one of float, double, cuComplex, cuDoubleComplex, or int.
cusparseSpMMOp supports the following sparse matrix formats:
cusparseSpMMOp supports the following index type for representing the sparse matrix matA:
- 32-bit indices (cusparseIndexType_t::CUSPARSE_INDEX_32I)
- 64-bit indices (cusparseIndexType_t::CUSPARSE_INDEX_64I)
cusparseSpMMOp supports the following data types:
Uniform-precision computation:
| A/B/C/computeType |
|---|
| cudaDataType_t::CUDA_R_32F |
| cudaDataType_t::CUDA_R_64F |
| cudaDataType_t::CUDA_C_32F |
| cudaDataType_t::CUDA_C_64F |
Mixed-precision computation:
cusparseSpMMOp supports the following algorithms:
| Algorithm | Notes |
|---|---|
| CUSPARSE_SPMM_OP_ALG_DEFAULT | Default algorithm for any sparse matrix format |
Performance notes:
- Row-major layout provides higher performance than column-major.
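As a reminder of what the two layouts mean for the dense operands, the sketch below computes the linear offset of an element under each ordering and repacks a column-major buffer into row-major form on the host. It is illustrative only: the `Order` enum, `dense_offset`, and `col_to_row_major` are hypothetical helpers, not part of the cuSPARSE bindings, and whether a repacking step pays off depends on the workload.

```rust
/// Hypothetical stand-in for a dense-matrix storage order.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Order {
    Row,
    Col,
}

/// Linear offset of element (row, col) given a leading dimension `ld`.
/// For row-major data `ld` is the row stride; for column-major data it is
/// the column stride.
pub fn dense_offset(order: Order, row: usize, col: usize, ld: usize) -> usize {
    match order {
        Order::Row => row * ld + col,
        Order::Col => col * ld + row,
    }
}

/// Repack a tightly packed column-major `rows x cols` matrix as row-major.
pub fn col_to_row_major(src: &[f32], rows: usize, cols: usize) -> Vec<f32> {
    let mut dst = vec![0.0f32; rows * cols];
    for r in 0..rows {
        for c in 0..cols {
            dst[dense_offset(Order::Row, r, c, cols)] =
                src[dense_offset(Order::Col, r, c, rows)];
        }
    }
    dst
}
```

In row-major storage, consecutive elements of a row are adjacent in memory, whereas in column-major storage they are `ld` elements apart.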
cusparseSpMMOp() has the following properties:
- The routine requires extra storage
- The routine supports asynchronous execution
- Provides deterministic (bit-wise) results for each run
- The routine allows the indices of matA to be unsorted
cusparseSpMMOp() supports the following optimizations:
- CUDA graph capture
- Hardware Memory Compression
For a code example, please visit cuSPARSE Library Samples - cusparseSpMMOp.