
cusparseCbsrilu02_analysis

Function cusparseCbsrilu02_analysis 

pub unsafe extern "C" fn cusparseCbsrilu02_analysis(
    handle: cusparseHandle_t,
    dirA: cusparseDirection_t,
    mb: c_int,
    nnzb: c_int,
    descrA: cusparseMatDescr_t,
    bsrSortedVal: *mut cuComplex,
    bsrSortedRowPtr: *const c_int,
    bsrSortedColInd: *const c_int,
    blockDim: c_int,
    info: bsrilu02Info_t,
    policy: cusparseSolvePolicy_t,
    pBuffer: *mut c_void,
) -> cusparseStatus_t

This function performs the analysis phase of the incomplete-LU factorization with 0 fill-in and no pivoting.

$A \approx LU$

A is an (mb*blockDim)×(mb*blockDim) sparse matrix defined in BSR storage format by the three arrays bsrValA, bsrRowPtrA, and bsrColIndA (the parameters bsrSortedVal, bsrSortedRowPtr, and bsrSortedColInd in the signature above). Each block in BSR format is of size blockDim×blockDim and is stored in column-major or row-major order, as determined by the parameter dirA, which is either cusparseDirection_t::CUSPARSE_DIRECTION_COLUMN or cusparseDirection_t::CUSPARSE_DIRECTION_ROW. The matrix type must be cusparseMatrixType_t::CUSPARSE_MATRIX_TYPE_GENERAL; the fill mode and diagonal type are ignored.
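To make the block layout concrete, the following host-side sketch computes where a single element of a stored block lives in the value array. The enum `BlockOrder` and the function `bsr_value_offset` are hypothetical names introduced for illustration, not part of the binding, and the sketch assumes zero-based indexing.

```rust
/// Element order inside a block; a hypothetical stand-in for cusparseDirection_t.
#[derive(Clone, Copy)]
enum BlockOrder {
    RowMajor,
    ColumnMajor,
}

/// Offset into the value array of element (r, c) of the k-th stored block.
/// Assumes zero-based indices and dense blocks of block_dim * block_dim elements.
fn bsr_value_offset(k: usize, r: usize, c: usize, block_dim: usize, order: BlockOrder) -> usize {
    let base = k * block_dim * block_dim;
    match order {
        BlockOrder::RowMajor => base + r * block_dim + c,
        BlockOrder::ColumnMajor => base + c * block_dim + r,
    }
}

fn main() {
    // mb = 2 block rows, blockDim = 2, nnzb = 3 stored blocks: (0,0), (0,1), (1,1).
    let row_ptr: [usize; 3] = [0, 2, 3];
    let col_ind: [usize; 3] = [0, 1, 1];
    let block_dim = 2;

    // The last row pointer equals the number of stored blocks.
    assert_eq!(row_ptr[row_ptr.len() - 1], col_ind.len());

    // The matrix dimension is mb * blockDim.
    let n = (row_ptr.len() - 1) * block_dim;
    println!("A is {}x{}", n, n);

    // Element (0, 1) of the second stored block under each ordering.
    println!("row-major offset:    {}", bsr_value_offset(1, 0, 1, block_dim, BlockOrder::RowMajor));
    println!("column-major offset: {}", bsr_value_offset(1, 0, 1, block_dim, BlockOrder::ColumnMajor));
}
```

The same element lands at a different offset depending on the ordering, which is why dirA must match how the value array was filled.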

This function requires a buffer whose size is returned by bsrilu02_bufferSize(). The address of pBuffer must be a multiple of 128 bytes. If it is not, cusparseStatus_t::CUSPARSE_STATUS_INVALID_VALUE is returned.
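The alignment requirement can be checked before the call. The sketch below is a minimal host-side illustration; `is_aligned_128` and `align_up_128` are hypothetical helpers, not functions of this crate or of cuSPARSE. They only inspect or round the numeric address, so they apply equally to a device pointer that is about to be passed as pBuffer.

```rust
/// Required alignment of pBuffer in bytes, as stated above.
const BUFFER_ALIGN: usize = 128;

/// Returns true if the address of `ptr` is a multiple of 128 bytes.
/// The pointer is never dereferenced.
fn is_aligned_128<T>(ptr: *const T) -> bool {
    (ptr as usize) % BUFFER_ALIGN == 0
}

/// Rounds an address up to the next multiple of 128 bytes.
/// Useful when carving pBuffer out of a larger allocation.
fn align_up_128(addr: usize) -> usize {
    (addr + BUFFER_ALIGN - 1) & !(BUFFER_ALIGN - 1)
}

/// Host-side stand-in for a suitably aligned allocation.
#[repr(align(128))]
struct Aligned([u8; 256]);

fn main() {
    let buf = Aligned([0u8; 256]);
    let p = buf.0.as_ptr();

    // The start of the aligned struct passes the check.
    assert!(is_aligned_128(p));
    // One byte further in, the address no longer qualifies.
    assert!(!is_aligned_128(p.wrapping_add(1)));
    // Rounding the misaligned address up restores the alignment.
    assert!(align_up_128(p.wrapping_add(1) as usize) % BUFFER_ALIGN == 0);
}
```

A check like this matters mostly when pBuffer is an offset into a shared workspace rather than the start of its own allocation.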

Function bsrilu02_analysis() reports a structural zero and computes level information stored in the opaque structure info. The level information can expose more parallelism during incomplete LU factorization. However, bsrilu02() can also run without level information. To disable level information, the user needs to specify the parameter policy of both bsrilu02_analysis() and bsrilu02() as cusparseSolvePolicy_t::CUSPARSE_SOLVE_POLICY_NO_LEVEL.
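The idea behind level information can be sketched with generic level scheduling on the block sparsity pattern. This is an illustration of the general technique, not a description of what cuSPARSE computes internally; the function `block_row_levels` is hypothetical, and zero-based indices are assumed. A block row depends on every earlier block row j for which block (i, j) is stored, and block rows that share a level have no dependencies on each other.

```rust
/// Assigns a level to each block row of a BSR pattern.
/// Block row i depends on block row j when j < i and block (i, j) is stored.
/// Rows with the same level can be processed in parallel.
fn block_row_levels(row_ptr: &[usize], col_ind: &[usize]) -> Vec<usize> {
    let mb = row_ptr.len() - 1;
    let mut level = vec![0usize; mb];
    for i in 0..mb {
        for k in row_ptr[i]..row_ptr[i + 1] {
            let j = col_ind[k];
            if j < i {
                // Row i must come after every row it depends on.
                level[i] = level[i].max(level[j] + 1);
            }
        }
    }
    level
}

fn main() {
    // Pattern with 4 block rows:
    //   row 0: (0,0)
    //   row 1: (1,1)
    //   row 2: (2,0) (2,2)
    //   row 3: (3,1) (3,2) (3,3)
    let row_ptr = [0, 1, 2, 4, 7];
    let col_ind = [0, 1, 0, 2, 1, 2, 3];

    let levels = block_row_levels(&row_ptr, &col_ind);
    // Rows 0 and 1 are independent; row 2 waits for row 0; row 3 waits for row 2.
    assert_eq!(levels, vec![0, 0, 1, 2]);
    println!("levels: {:?}", levels);
}
```

The fewer levels a pattern has relative to mb, the more block rows can be factored concurrently.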

Function bsrilu02_analysis() always reports the first structural zero, even when the parameter policy is cusparseSolvePolicy_t::CUSPARSE_SOLVE_POLICY_NO_LEVEL. To find out where the structural zero is, the user must call cusparseXbsrilu02_zeroPivot.

It is the user’s choice whether to call bsrilu02() if bsrilu02_analysis() reports a structural zero. In this case, the user can still call bsrilu02(), which will return a numerical zero at the same position as the structural zero. However, the result is meaningless.
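As an illustration of what a structural zero means, the sketch below scans a BSR pattern on the host for the first block row without a stored diagonal block. It assumes that a structural zero corresponds to a missing block A(j,j) and that indices are zero-based. The function `first_structural_zero` is hypothetical; in real use the position is obtained from cusparseXbsrilu02_zeroPivot after the analysis.

```rust
/// Returns the first block row whose diagonal block is absent from the
/// BSR pattern, or None if every block row stores its diagonal block.
fn first_structural_zero(row_ptr: &[usize], col_ind: &[usize]) -> Option<usize> {
    let mb = row_ptr.len().saturating_sub(1);
    (0..mb).find(|&i| !col_ind[row_ptr[i]..row_ptr[i + 1]].contains(&i))
}

fn main() {
    // Block row 1 stores only block (1,0), so its diagonal block is missing.
    let row_ptr = [0, 2, 3, 4];
    let col_ind = [0, 1, 0, 2];
    assert_eq!(first_structural_zero(&row_ptr, &col_ind), Some(1));

    // With the diagonal block (1,1) stored instead, no structural zero is found.
    let col_ind_ok = [0, 1, 1, 2];
    assert_eq!(first_structural_zero(&row_ptr, &col_ind_ok), None);
}
```

A host-side pre-check like this can flag an unusable pattern before any device work is launched, but it does not replace the status reported by the library.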

  • This function requires temporary extra storage that is allocated internally.
  • The routine supports asynchronous execution if the Stream Ordered Memory Allocator is available.
  • The routine supports CUDA graph capture if the Stream Ordered Memory Allocator is available.