Function cusparseSbsrxmv 

pub unsafe extern "C" fn cusparseSbsrxmv(
    handle: cusparseHandle_t,
    dirA: cusparseDirection_t,
    transA: cusparseOperation_t,
    sizeOfMask: c_int,
    mb: c_int,
    nb: c_int,
    nnzb: c_int,
    alpha: *const f32,
    descrA: cusparseMatDescr_t,
    bsrSortedValA: *const f32,
    bsrSortedMaskPtrA: *const c_int,
    bsrSortedRowPtrA: *const c_int,
    bsrSortedEndPtrA: *const c_int,
    bsrSortedColIndA: *const c_int,
    blockDim: c_int,
    x: *const f32,
    beta: *const f32,
    y: *mut f32,
) -> cusparseStatus_t

This function performs a bsrmv and a mask operation:

$$ y(\text{mask}) = (\alpha \ast \operatorname{op}(A) \ast x + \beta \ast y)(\text{mask}) $$

where $A$ is an $(mb \ast blockDim) \times (nb \ast blockDim)$ sparse matrix that is defined in BSRX storage format by the four arrays bsrVal, bsrRowPtr, bsrEndPtr, and bsrColInd; x and y are vectors; $\alpha$ and $\beta$ are scalars; and:

$$ \operatorname{op}(A) = \begin{cases} A & \text{if } transA = \text{CUSPARSE_OPERATION_NON_TRANSPOSE} \\ A^T & \text{if } transA = \text{CUSPARSE_OPERATION_TRANSPOSE} \\ A^H & \text{if } transA = \text{CUSPARSE_OPERATION_CONJUGATE_TRANSPOSE} \end{cases} $$

The mask operation is defined by the array bsrMaskPtr, which contains the indices of the block rows of $y$ to be updated. If block row $i$ is not listed in bsrMaskPtr, then bsrxmv() does not touch block row $i$ of $A$ and $y$.

For example, consider the $2 \times 3$ block matrix $A$:

$$ A = \begin{bmatrix} A_{11} & A_{12} & 0 \\ A_{21} & A_{22} & A_{23} \end{bmatrix} $$

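and its one-based BSR format (three-vector form) is:

$$ \begin{aligned} \text{bsrVal} &= [A_{11}\ \ A_{12}\ \ A_{21}\ \ A_{22}\ \ A_{23}] \\ \text{bsrRowPtr} &= [1\ \ 3\ \ 6] \\ \text{bsrColInd} &= [1\ \ 2\ \ 1\ \ 2\ \ 3] \end{aligned} $$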

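Suppose we want to do the following bsrmv operation on a matrix $\bar{A}$ which is slightly different from $A$:

$$ \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} := \alpha \ast \left( \bar{A} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & A_{22} & 0 \end{bmatrix} \right) \ast \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} + \begin{bmatrix} y_1 \\ \beta \ast y_2 \end{bmatrix} $$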

We don’t need to create another BSR format for the new matrix $\bar{A}$. It is enough to keep bsrVal and bsrColInd unchanged, modify bsrRowPtr, and add an additional array bsrEndPtr that points one past the last nonzero block in each row of $\bar{A}$.

For example, the following bsrRowPtr and bsrEndPtr can represent matrix $\bar{A}$:

$$ \begin{aligned} \text{bsrRowPtr} &= [1\ \ 4] \\ \text{bsrEndPtr} &= [1\ \ 5] \end{aligned} $$
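To make the role of the two pointer arrays concrete, here is a minimal Rust sketch that lists which block columns each block row selects. It treats every row as the half-open range from bsrRowPtr to bsrEndPtr, using one-based positions as in the example. The helper `selected_block_columns` is hypothetical and is not part of cuSPARSE or of these bindings.

```rust
/// For each block row i, returns the one-based block-column indices found in
/// the half-open range [row_ptr[i], end_ptr[i]) of `col_ind`.
/// All positions and indices are one-based, matching the example in the text.
fn selected_block_columns(
    row_ptr: &[usize],
    end_ptr: &[usize],
    col_ind: &[usize],
) -> Vec<Vec<usize>> {
    row_ptr
        .iter()
        .zip(end_ptr)
        .map(|(&start, &end)| {
            // Subtract 1 to turn a one-based position into a slice index.
            (start..end).map(|k| col_ind[k - 1]).collect()
        })
        .collect()
}
```

As an assumed illustration, take one-based bsrColInd = [1 2 1 2 3], bsrRowPtr = [1 4], and bsrEndPtr = [1 5]. Calling `selected_block_columns(&[1, 4], &[1, 5], &[1, 2, 1, 2, 3])` then yields an empty first block row and a second block row containing only block column 2. This is the sparsity pattern of a matrix whose only nonzero block is $A_{22}$.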

Further, we can use a mask operator (specified by the array bsrMaskPtr) to update only particular block rows of $y$. Because $y_{1}$ is never changed, only block row 2 needs to be updated; in this case, bsrMaskPtr = [2] and sizeOfMask = 1.

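The mask operator is equivalent to the following operation:

$$ \begin{bmatrix} ? \\ y_2 \end{bmatrix} := \alpha \ast \begin{bmatrix} ? & ? & ? \\ 0 & A_{22} & 0 \end{bmatrix} \ast \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} + \beta \ast \begin{bmatrix} ? \\ y_2 \end{bmatrix} $$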

If a block row is not present in bsrMaskPtr, then no calculation is performed on that row, and the corresponding value in y is unmodified. The question mark “?” is used to indicate row blocks not in bsrMaskPtr.

In this case, the first row block is not present in bsrMaskPtr, so bsrRowPtr[0] and bsrEndPtr[0] are not touched either.
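The masked semantics described above can be sketched as a CPU reference in Rust. This is an illustration under stated assumptions, not the cuSPARSE implementation. The function `bsrxmv_reference` is hypothetical, it handles only the non-transpose case, it uses zero-based indices, and it assumes each block is stored row-major (in cuSPARSE the block layout is selected by `dirA`).

```rust
/// CPU reference for y(mask) = (alpha * A * x + beta * y)(mask).
/// Assumptions: non-transpose only, zero-based indices, and
/// `block_dim x block_dim` blocks stored row-major in `val`.
fn bsrxmv_reference(
    block_dim: usize,
    alpha: f32,
    val: &[f32],
    mask_ptr: &[usize],
    row_ptr: &[usize],
    end_ptr: &[usize],
    col_ind: &[usize],
    x: &[f32],
    beta: f32,
    y: &mut [f32],
) {
    let block_len = block_dim * block_dim;
    // Only block rows listed in the mask are read or written; for any other
    // block row, row_ptr[i], end_ptr[i], and y are never accessed.
    for &i in mask_ptr {
        for r in 0..block_dim {
            let mut acc = 0.0f32;
            // Visit the blocks in the half-open range [row_ptr[i], end_ptr[i]).
            for k in row_ptr[i]..end_ptr[i] {
                let block = &val[k * block_len..(k + 1) * block_len];
                let x_start = col_ind[k] * block_dim;
                let x_block = &x[x_start..x_start + block_dim];
                for c in 0..block_dim {
                    acc += block[r * block_dim + c] * x_block[c];
                }
            }
            let yi = i * block_dim + r;
            y[yi] = alpha * acc + beta * y[yi];
        }
    }
}
```

With `block_dim = 1` and a zero-based mask of `[1]`, the entries `row_ptr[0]` and `end_ptr[0]` can hold arbitrary values, because the first block row is skipped entirely and `y[0]` keeps its previous value.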

bsrxmv() has the following properties:

  • The routine requires no extra storage.
  • The routine supports asynchronous execution.
  • The routine supports CUDA graph capture.

A couple of comments on bsrxmv():