Function cusparseCbsrsm2_analysis 

pub unsafe extern "C" fn cusparseCbsrsm2_analysis(
    handle: cusparseHandle_t,
    dirA: cusparseDirection_t,
    transA: cusparseOperation_t,
    transXY: cusparseOperation_t,
    mb: c_int,
    n: c_int,
    nnzb: c_int,
    descrA: cusparseMatDescr_t,
    bsrSortedVal: *const cuComplex,
    bsrSortedRowPtr: *const c_int,
    bsrSortedColInd: *const c_int,
    blockSize: c_int,
    info: bsrsm2Info_t,
    policy: cusparseSolvePolicy_t,
    pBuffer: *mut c_void,
) -> cusparseStatus_t

This function performs the analysis phase of bsrsm2(), a new solver for the sparse triangular linear system op(A) * op(X) = $\alpha$ op(B).

A is an (mb*blockDim)x(mb*blockDim) sparse matrix that is defined in BSR storage format by the three arrays bsrValA, bsrRowPtrA, and bsrColIndA (named bsrSortedVal, bsrSortedRowPtr, and bsrSortedColInd in the signature above); B and X are the right-hand-side and the solution matrices; $\alpha$ is a scalar; and

$\operatorname{op}(A) = A \quad \text{if } transA = \text{CUSPARSE_OPERATION_NON_TRANSPOSE}$

and:

$$
\operatorname{op}(X) = \begin{cases}
X & \text{if } transXY = \text{CUSPARSE_OPERATION_NON_TRANSPOSE} \\
X^T & \text{if } transXY = \text{CUSPARSE_OPERATION_TRANSPOSE} \\
X^H & \text{if } transXY = \text{CUSPARSE_OPERATION_CONJUGATE_TRANSPOSE (not supported)}
\end{cases}
$$

and the same operation op is applied to both B and X.

Each block in the BSR format has size blockDim*blockDim (blockDim is passed as the blockSize parameter) and is stored in column-major or row-major order, as determined by parameter dirA, which is either cusparseDirection_t::CUSPARSE_DIRECTION_ROW or cusparseDirection_t::CUSPARSE_DIRECTION_COLUMN. The matrix type must be cusparseMatrixType_t::CUSPARSE_MATRIX_TYPE_GENERAL, and the fill mode and diagonal type are ignored.
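The block layout described above can be made concrete with a small host-side sketch. It assumes zero-based indices and that the values of the k-th stored block occupy blockDim*blockDim consecutive entries of the value array. The names `BlockOrder`, `block_entry_index`, and `scalar_dim` are hypothetical helpers for illustration and are not part of cuSPARSE or of these bindings.

```rust
/// Storage order of the entries inside one dense block.
/// Hypothetical enum that mirrors the two cusparseDirection_t values.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum BlockOrder {
    RowMajor,
    ColumnMajor,
}

/// Position in the value array of entry (r, c) of the k-th stored block.
/// Assumes each block occupies block_dim * block_dim consecutive values.
pub fn block_entry_index(
    k: usize,
    r: usize,
    c: usize,
    block_dim: usize,
    order: BlockOrder,
) -> usize {
    assert!(r < block_dim && c < block_dim, "entry lies outside the block");
    let base = k * block_dim * block_dim;
    match order {
        BlockOrder::RowMajor => base + r * block_dim + c,
        BlockOrder::ColumnMajor => base + c * block_dim + r,
    }
}

/// Number of scalar rows (and columns) of A: mb * block_dim.
pub fn scalar_dim(mb: usize, block_dim: usize) -> usize {
    mb * block_dim
}
```

With blockDim = 2, entry (0, 1) of the second stored block (k = 1) sits at index 5 in row-major order and at index 6 in column-major order, which is why dirA must match the way the value array was filled.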

It is expected that this function will be executed only once for a given matrix and a particular operation type.

This function requires a buffer of the size returned by bsrsm2_bufferSize(). The address of pBuffer must be a multiple of 128 bytes. If it is not, cusparseStatus_t::CUSPARSE_STATUS_INVALID_VALUE is returned.
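A caller that carves pBuffer out of a larger allocation may want to verify the 128-byte requirement before calling the analysis. The following is a minimal sketch of such a check; `BUFFER_ALIGNMENT`, `is_buffer_aligned`, and `is_ptr_aligned` are hypothetical names introduced here, and the check only inspects the numeric address without dereferencing the pointer.

```rust
/// Alignment that the text above requires for the address of pBuffer.
/// Hypothetical constant, not exported by the bindings.
pub const BUFFER_ALIGNMENT: usize = 128;

/// Returns true when `addr` is a multiple of 128 bytes.
pub fn is_buffer_aligned(addr: usize) -> bool {
    addr % BUFFER_ALIGNMENT == 0
}

/// Convenience wrapper for a raw pointer such as the one passed as pBuffer.
/// The pointer is only converted to an integer; it is never dereferenced.
pub fn is_ptr_aligned<T>(ptr: *const T) -> bool {
    is_buffer_aligned(ptr as usize)
}
```

Checking on the host side lets the caller report a clearer error than the generic CUSPARSE_STATUS_INVALID_VALUE, which can have other causes as well.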

Function bsrsm2_analysis() reports a structural zero and computes the level information stored in the opaque structure info. The level information can expose more parallelism during a triangular solve. However, bsrsm2_solve() can be run without level information. To disable level information, the user needs to specify the policy of the triangular solver as cusparseSolvePolicy_t::CUSPARSE_SOLVE_POLICY_NO_LEVEL.
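The idea behind level information can be illustrated with textbook level scheduling on the block sparsity pattern. This sketch is not the algorithm cuSPARSE uses internally (the info structure is opaque); it assumes a lower-triangular pattern, zero-based indices, and a hypothetical helper name `lower_levels`.

```rust
/// Assigns a level to every block row of a lower-triangular BSR pattern.
/// Row i depends on every row j < i for which block (i, j) is stored, so its
/// level is one more than the largest level among those rows.
/// Rows that share a level are independent and could be solved in parallel.
/// Hypothetical host-side illustration, not the cuSPARSE implementation.
pub fn lower_levels(row_ptr: &[usize], col_ind: &[usize]) -> Vec<usize> {
    let mb = row_ptr.len().saturating_sub(1);
    let mut level = vec![0usize; mb];
    for i in 0..mb {
        for &j in &col_ind[row_ptr[i]..row_ptr[i + 1]] {
            if j < i {
                level[i] = level[i].max(level[j] + 1);
            }
        }
    }
    level
}
```

For a 4x4 block pattern with rows {0}, {1}, {0, 2}, {1, 2, 3}, the first two rows have no dependencies and land in level 0, row 2 lands in level 1, and row 3 in level 2. A solve without level information would instead have to discover these dependencies while it runs.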

Function bsrsm2_analysis() always reports the first structural zero, even if the parameter policy is cusparseSolvePolicy_t::CUSPARSE_SOLVE_POLICY_NO_LEVEL. Moreover, no structural zero is reported if cusparseDiagType_t::CUSPARSE_DIAG_TYPE_UNIT is specified, even if block A(j,j) is missing for some j. The user must call cusparseXbsrsm2_query_zero_pivot() to find out where the structural zero is.
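To make the notion of a structural zero concrete, the following host-side sketch scans a BSR pattern for the first block row j whose diagonal block A(j,j) is not stored. It assumes zero-based indices and is only an illustration of the rule stated above; `first_structural_zero` is a hypothetical helper and does not replace cusparseXbsrsm2_query_zero_pivot().

```rust
/// Returns the first block row j whose diagonal block A(j, j) is absent from
/// the pattern, or None if every diagonal block is stored.
/// When `unit_diagonal` is true the diagonal is implied, so nothing is
/// reported, matching the behaviour described for CUSPARSE_DIAG_TYPE_UNIT.
/// Hypothetical host-side illustration, not the cuSPARSE implementation.
pub fn first_structural_zero(
    row_ptr: &[usize],
    col_ind: &[usize],
    unit_diagonal: bool,
) -> Option<usize> {
    if unit_diagonal {
        return None;
    }
    let mb = row_ptr.len().saturating_sub(1);
    (0..mb).find(|&j| !col_ind[row_ptr[j]..row_ptr[j + 1]].contains(&j))
}
```

For a pattern with rows {0}, {0}, {2}, block row 1 has no diagonal block, so the function returns `Some(1)`; with `unit_diagonal` set it returns `None` for the same pattern.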

If bsrsm2_analysis() reports a structural zero, the solve returns a numerical zero at the same position as the structural zero, but the resulting X is meaningless.

  • This function requires temporary extra storage that is allocated internally.
  • The routine supports asynchronous execution if the Stream Ordered Memory Allocator is available.
  • The routine supports CUDA graph capture if the Stream Ordered Memory Allocator is available.