cusparseZcsr2gebsr

Function cusparseZcsr2gebsr 

pub unsafe extern "C" fn cusparseZcsr2gebsr(
    handle: cusparseHandle_t,
    dirA: cusparseDirection_t,
    m: c_int,
    n: c_int,
    descrA: cusparseMatDescr_t,
    csrSortedValA: *const cuDoubleComplex,
    csrSortedRowPtrA: *const c_int,
    csrSortedColIndA: *const c_int,
    descrC: cusparseMatDescr_t,
    bsrSortedValC: *mut cuDoubleComplex,
    bsrSortedRowPtrC: *mut c_int,
    bsrSortedColIndC: *mut c_int,
    rowBlockDim: c_int,
    colBlockDim: c_int,
    pBuffer: *mut c_void,
) -> cusparseStatus_t

This function converts a sparse matrix A in CSR format (that is defined by arrays csrValA, csrRowPtrA, and csrColIndA) into a sparse matrix C in general BSR format (that is defined by the three arrays bsrValC, bsrRowPtrC, and bsrColIndC).

The matrix A is an m×n sparse matrix and matrix C is a (mb*rowBlockDim)×(nb*colBlockDim) sparse matrix, where mb = (m+rowBlockDim-1)/rowBlockDim (integer division) is the number of block rows of C, and nb = (n+colBlockDim-1)/colBlockDim is the number of block columns of C.
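The dimension arithmetic above can be sketched on the host in plain Rust. This is only an illustration: `BlockGrid` and `block_grid` are hypothetical names introduced here and are not part of the bindings.

```rust
/// Block-grid dimensions of the general BSR matrix C.
/// Hypothetical helper type, not part of the cuSPARSE bindings.
#[derive(Debug, PartialEq)]
pub struct BlockGrid {
    /// Number of block rows of C.
    pub mb: i32,
    /// Number of block columns of C.
    pub nb: i32,
    /// Row count of C after padding, mb * rowBlockDim.
    pub padded_rows: i32,
    /// Column count of C after padding, nb * colBlockDim.
    pub padded_cols: i32,
}

/// Computes mb and nb by rounding up with integer division.
pub fn block_grid(m: i32, n: i32, row_block_dim: i32, col_block_dim: i32) -> BlockGrid {
    assert!(m >= 0 && n >= 0, "matrix dimensions must be non-negative");
    assert!(row_block_dim > 0 && col_block_dim > 0, "block dimensions must be positive");
    let mb = (m + row_block_dim - 1) / row_block_dim;
    let nb = (n + col_block_dim - 1) / col_block_dim;
    BlockGrid {
        mb,
        nb,
        padded_rows: mb * row_block_dim,
        padded_cols: nb * col_block_dim,
    }
}
```

For a 5×7 matrix with 2×3 blocks, `block_grid(5, 7, 2, 3)` gives mb = 3 and nb = 3, so C is stored as a 6×9 matrix.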

Each block of C is of size rowBlockDim×colBlockDim. If m is not a multiple of rowBlockDim or n is not a multiple of colBlockDim, zeros are filled in.
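To make the zero filling concrete, here is a minimal host-side reference conversion in plain Rust. It is a sketch under several assumptions and not the library's implementation: indices are zero-based, values are `f64` rather than `cuDoubleComplex`, and each block is stored row-major, which is one possible block layout (the real API selects the layout through `dirA`). The names `GeBsr` and `csr_to_gebsr` are hypothetical.

```rust
use std::collections::BTreeMap;

/// General BSR matrix with zero-based indices and row-major blocks.
/// Hypothetical type for illustration only.
#[derive(Debug, PartialEq)]
pub struct GeBsr {
    pub row_ptr: Vec<usize>,
    pub col_ind: Vec<usize>,
    pub val: Vec<f64>,
}

/// CPU reference conversion from CSR to general BSR.
pub fn csr_to_gebsr(
    m: usize,
    csr_val: &[f64],
    csr_row_ptr: &[usize],
    csr_col_ind: &[usize],
    row_block_dim: usize,
    col_block_dim: usize,
) -> GeBsr {
    assert!(row_block_dim > 0 && col_block_dim > 0);
    assert_eq!(csr_row_ptr.len(), m + 1);
    let mb = (m + row_block_dim - 1) / row_block_dim;
    let block_len = row_block_dim * col_block_dim;
    let mut out = GeBsr { row_ptr: vec![0], col_ind: Vec::new(), val: Vec::new() };
    for block_row in 0..mb {
        // Dense blocks of this block row, keyed (and so sorted) by block column.
        let mut blocks: BTreeMap<usize, Vec<f64>> = BTreeMap::new();
        let first_row = block_row * row_block_dim;
        // The last block row may be partial when m is not a multiple of row_block_dim.
        let last_row = usize::min(first_row + row_block_dim, m);
        for row in first_row..last_row {
            for k in csr_row_ptr[row]..csr_row_ptr[row + 1] {
                let col = csr_col_ind[k];
                // A new block starts out as all zeros; untouched entries stay zero.
                let block = blocks
                    .entry(col / col_block_dim)
                    .or_insert_with(|| vec![0.0; block_len]);
                let r = row - first_row;
                let c = col % col_block_dim;
                block[r * col_block_dim + c] = csr_val[k];
            }
        }
        for (block_col, block) in blocks {
            out.col_ind.push(block_col);
            out.val.extend(block);
        }
        out.row_ptr.push(out.col_ind.len());
    }
    out
}
```

With a 3×3 input and 2×2 blocks, neither dimension is a multiple of the block size, so the blocks touching the last row and column carry padding zeros in addition to the zeros inside the original matrix.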

The implementation adopts a two-step approach to do the conversion. First, the user allocates bsrRowPtrC of mb+1 elements and uses function cusparseXcsr2gebsrNnz to determine the number of nonzero block columns per block row. Second, the user gathers nnzb (the number of nonzero block columns of matrix C) from either nnzb=*nnzTotalDevHostPtr or nnzb=bsrRowPtrC[mb]-bsrRowPtrC[0], and allocates bsrValC of nnzb*rowBlockDim*colBlockDim elements and bsrColIndC of nnzb integers. Finally, function cusparse[S|D|C|Z]csr2gebsr() is called to complete the conversion.
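The counting step and the resulting allocation sizes can be illustrated with a host-side sketch in plain Rust. It mirrors what the text says cusparseXcsr2gebsrNnz determines, but it is not the library routine: it assumes zero-based indices, and `gebsr_nnz` and `gebsr_storage` are hypothetical names.

```rust
use std::collections::BTreeSet;

/// Counts nonzero block columns per block row of a zero-based CSR matrix.
/// Returns (bsr_row_ptr, nnzb), where bsr_row_ptr has mb + 1 entries.
pub fn gebsr_nnz(
    m: usize,
    csr_row_ptr: &[usize],
    csr_col_ind: &[usize],
    row_block_dim: usize,
    col_block_dim: usize,
) -> (Vec<usize>, usize) {
    assert!(row_block_dim > 0 && col_block_dim > 0);
    assert_eq!(csr_row_ptr.len(), m + 1);
    let mb = (m + row_block_dim - 1) / row_block_dim;
    let mut bsr_row_ptr = Vec::with_capacity(mb + 1);
    bsr_row_ptr.push(0);
    for block_row in 0..mb {
        let first_row = block_row * row_block_dim;
        let last_row = usize::min(first_row + row_block_dim, m);
        // Distinct block columns touched by any row of this block row.
        let mut block_cols = BTreeSet::new();
        for row in first_row..last_row {
            for k in csr_row_ptr[row]..csr_row_ptr[row + 1] {
                block_cols.insert(csr_col_ind[k] / col_block_dim);
            }
        }
        let prev = bsr_row_ptr[block_row];
        bsr_row_ptr.push(prev + block_cols.len());
    }
    let nnzb = bsr_row_ptr[mb] - bsr_row_ptr[0];
    (bsr_row_ptr, nnzb)
}

/// Element counts to allocate: (length of bsrValC, length of bsrColIndC).
pub fn gebsr_storage(nnzb: usize, row_block_dim: usize, col_block_dim: usize) -> (usize, usize) {
    (nnzb * row_block_dim * col_block_dim, nnzb)
}
```

For a 4×6 matrix with nonzeros at (0,0), (0,4), (1,1) and (3,5) and 2×3 blocks, the sketch yields a block row pointer of [0, 2, 3] and nnzb = 3, so bsrValC needs 18 elements and bsrColIndC needs 3.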

The user must obtain the size of the buffer required by csr2gebsr() by calling csr2gebsr_bufferSize(), allocate the buffer, and pass the buffer pointer to csr2gebsr().

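The general procedure is as follows:

  • Call csr2gebsr_bufferSize() to obtain the required buffer size, then allocate pBuffer.
  • Allocate bsrRowPtrC of mb+1 elements and call cusparseXcsr2gebsrNnz to determine the number of nonzero block columns per block row.
  • Obtain nnzb, then allocate bsrValC of nnzb*rowBlockDim*colBlockDim elements and bsrColIndC of nnzb integers.
  • Call cusparse[S|D|C|Z]csr2gebsr() with pBuffer to complete the conversion.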

The routine cusparseXcsr2gebsrNnz has the following properties:

  • The routine requires no extra storage.
  • The routine supports asynchronous execution if the Stream Ordered Memory Allocator is available.
  • The routine supports CUDA graph capture if the Stream Ordered Memory Allocator is available.

The routine cusparse<t>csr2gebsr() has the following properties:

  • The routine requires no extra storage if pBuffer != NULL.
  • The routine supports asynchronous execution.
  • The routine supports CUDA graph capture.