Skip to main content

cusparseSbsrsm2_solve

Function cusparseSbsrsm2_solve 

Source
pub unsafe extern "C" fn cusparseSbsrsm2_solve(
    handle: cusparseHandle_t,
    dirA: cusparseDirection_t,
    transA: cusparseOperation_t,
    transXY: cusparseOperation_t,
    mb: c_int,
    n: c_int,
    nnzb: c_int,
    alpha: *const f32,
    descrA: cusparseMatDescr_t,
    bsrSortedVal: *const f32,
    bsrSortedRowPtr: *const c_int,
    bsrSortedColInd: *const c_int,
    blockSize: c_int,
    info: bsrsm2Info_t,
    B: *const f32,
    ldb: c_int,
    X: *mut f32,
    ldx: c_int,
    policy: cusparseSolvePolicy_t,
    pBuffer: *mut c_void,
) -> cusparseStatus_t
Expand description

This function performs the solve phase of the solution of a sparse triangular linear system:

A is an (mb*blockDim)x(mb*blockDim) sparse matrix that is defined in BSR storage format by the three arrays bsrValA, bsrRowPtrA, and bsrColIndA); B and X are the right-hand-side and the solution matrices; $\alpha$ is a scalar, and

image9

and

image6

Only op(A)=A is supported.

op(B) and op(X) must be performed in the same way. In other words, if op(B)=B, op(X)=X.

The block of BSR format is of size blockDim*blockDim, stored as column-major or row-major as determined by parameter dirA, which is either cusparseDirection_t::CUSPARSE_DIRECTION_ROW or cusparseDirection_t::CUSPARSE_DIRECTION_COLUMN. The matrix type must be cusparseMatrixType_t::CUSPARSE_MATRIX_TYPE_GENERAL, and the fill mode and diagonal type are ignored. Function bsrsm02_solve() can support an arbitrary blockDim.

This function may be executed multiple times for a given matrix and a particular operation type.

This function requires the buffer size returned by bsrsm2_bufferSize(). The address of pBuffer must be multiple of 128 bytes. If it is not, cusparseStatus_t::CUSPARSE_STATUS_INVALID_VALUE is returned.

Although bsrsm2_solve() can be done without level information, the user still needs to be aware of consistency. If bsrsm2_analysis() is called with policy cusparseSolvePolicy_t::CUSPARSE_SOLVE_POLICY_USE_LEVEL, bsrsm2_solve() can be run with or without levels. On the other hand, if bsrsm2_analysis() is called with cusparseSolvePolicy_t::CUSPARSE_SOLVE_POLICY_NO_LEVEL, bsrsm2_solve() can only accept cusparseSolvePolicy_t::CUSPARSE_SOLVE_POLICY_NO_LEVEL; otherwise, cusparseStatus_t::CUSPARSE_STATUS_INVALID_VALUE is returned.

Function bsrsm02_solve() has the same behavior as bsrsv02_solve(), reporting the first numerical zero, including a structural zero. The user must call cusparseXbsrsm2_query_zero_pivot() to know where the numerical zero is.

The motivation of transpose(X) is to improve the memory access of matrix X. The computational pattern of transpose(X) with matrix X in column-major order is equivalent to X with matrix X in row-major order.

In-place is supported and requires that B and X point to the same memory block, and ldb=ldx.

The function supports the following properties if pBuffer != NULL:

  • The routine requires no extra storage.
  • The routine supports asynchronous execution.
  • The routine supports CUDA graph capture.