pub unsafe extern "C" fn cusparseSbsrmv(
    handle: cusparseHandle_t,
    dirA: cusparseDirection_t,
    transA: cusparseOperation_t,
    mb: c_int,
    nb: c_int,
    nnzb: c_int,
    alpha: *const f32,
    descrA: cusparseMatDescr_t,
    bsrSortedValA: *const f32,
    bsrSortedRowPtrA: *const c_int,
    bsrSortedColIndA: *const c_int,
    blockDim: c_int,
    x: *const f32,
    beta: *const f32,
    y: *mut f32,
) -> cusparseStatus_t
This function performs the matrix-vector operation
$$
y = \alpha \ast \operatorname{op}(A) \ast x + \beta \ast y
$$
where $A$ is an $(mb \ast blockDim) \times (nb \ast blockDim)$ sparse matrix defined in BSR storage format by the three arrays bsrVal, bsrRowPtr, and bsrColInd; x and y are vectors; $\alpha$ and $\beta$ are scalars; and:
$$
\operatorname{op}(A) =
\begin{cases}
A & \text{if } trans = \text{CUSPARSE_OPERATION_NON_TRANSPOSE} \\
A^T & \text{if } trans = \text{CUSPARSE_OPERATION_TRANSPOSE} \\
A^H & \text{if } trans = \text{CUSPARSE_OPERATION_CONJUGATE_TRANSPOSE}
\end{cases}
$$
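To make the BSR layout and the operation concrete, here is a minimal CPU reference sketch in plain Rust. It is not the library's implementation: `bsrmv_ref` and `BlockDir` are hypothetical names introduced only for illustration, the sketch assumes zero-based indices and the non-transpose case, and `BlockDir` merely mirrors the row-major versus column-major choice that `dirA` expresses.

```rust
/// Storage order of the entries inside each block.
/// Hypothetical stand-in for the role played by `dirA`.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum BlockDir {
    Row,
    Column,
}

/// CPU reference for y = alpha * A * x + beta * y, with A stored in BSR
/// format (zero-based indices, op(A) = A only). Illustrative sketch.
pub fn bsrmv_ref(
    dir: BlockDir,
    mb: usize,
    alpha: f32,
    val: &[f32],
    row_ptr: &[usize],
    col_ind: &[usize],
    block_dim: usize,
    x: &[f32],
    beta: f32,
    y: &mut [f32],
) {
    let block_size = block_dim * block_dim;
    for bi in 0..mb {
        // Partial sums for the block_dim scalar rows of block row `bi`.
        let mut acc = vec![0.0f32; block_dim];
        for k in row_ptr[bi]..row_ptr[bi + 1] {
            let bj = col_ind[k];
            let block = &val[k * block_size..(k + 1) * block_size];
            for r in 0..block_dim {
                for c in 0..block_dim {
                    // Pick entry (r, c) of the block according to its layout.
                    let a = match dir {
                        BlockDir::Row => block[r * block_dim + c],
                        BlockDir::Column => block[c * block_dim + r],
                    };
                    acc[r] += a * x[bj * block_dim + c];
                }
            }
        }
        for r in 0..block_dim {
            let i = bi * block_dim + r;
            y[i] = alpha * acc[r] + beta * y[i];
        }
    }
}
```

Each nonzero block `k` occupies `block_dim * block_dim` consecutive entries of `val`, `col_ind[k]` gives its block column, and `row_ptr[bi]..row_ptr[bi + 1]` delimits the blocks of block row `bi`. Such a reference can be handy for checking GPU results on small inputs.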
bsrmv() has the following properties:
- The routine requires no extra storage.
- The routine supports asynchronous execution.
- The routine supports CUDA graph capture.
Several comments on bsrmv():
- Only `blockDim > 1` is supported.
- Only `cusparseOperation_t::CUSPARSE_OPERATION_NON_TRANSPOSE` is supported, that is, $y = \alpha \ast A \ast x + \beta \ast y$.
- Only `cusparseMatrixType_t::CUSPARSE_MATRIX_TYPE_GENERAL` is supported.
- The size of vector `x` should be at least $(nb \ast blockDim)$, and the size of vector `y` should be at least $(mb \ast blockDim)$; otherwise, the kernel may return `cusparseStatus_t::CUSPARSE_STATUS_EXECUTION_FAILED` because of an out-of-bounds array access.
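The constraints above can be checked on the host before the call is made. The following is a small sketch under stated assumptions: `check_bsrmv_args` is a hypothetical helper, not part of cuSPARSE or of these bindings, and it covers only the block-size and vector-length requirements listed above.

```rust
/// Hypothetical pre-flight check for a bsrmv call (not a cuSPARSE API).
/// Returns a message describing the first violated constraint.
pub fn check_bsrmv_args(
    mb: usize,
    nb: usize,
    block_dim: usize,
    x_len: usize,
    y_len: usize,
) -> Result<(), String> {
    // Only blockDim > 1 is supported.
    if block_dim <= 1 {
        return Err(format!("blockDim must be > 1, got {block_dim}"));
    }
    // x needs at least nb * blockDim elements.
    let need_x = nb
        .checked_mul(block_dim)
        .ok_or_else(|| "nb * blockDim overflows".to_string())?;
    // y needs at least mb * blockDim elements.
    let need_y = mb
        .checked_mul(block_dim)
        .ok_or_else(|| "mb * blockDim overflows".to_string())?;
    if x_len < need_x {
        return Err(format!("x has {x_len} elements, needs at least {need_x}"));
    }
    if y_len < need_y {
        return Err(format!("y has {y_len} elements, needs at least {need_y}"));
    }
    Ok(())
}
```

Running such a check first turns a possible out-of-bounds access on the device into an ordinary host-side error.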
For example, suppose the user has a matrix in CSR format and wants to try bsrmv(). The following code demonstrates how to use the csr2bsr() conversion and the bsrmv() multiplication in single precision.