Function cusparseSgebsr2gebsr

pub unsafe extern "C" fn cusparseSgebsr2gebsr(
    handle: cusparseHandle_t,
    dirA: cusparseDirection_t,
    mb: c_int,
    nb: c_int,
    nnzb: c_int,
    descrA: cusparseMatDescr_t,
    bsrSortedValA: *const f32,
    bsrSortedRowPtrA: *const c_int,
    bsrSortedColIndA: *const c_int,
    rowBlockDimA: c_int,
    colBlockDimA: c_int,
    descrC: cusparseMatDescr_t,
    bsrSortedValC: *mut f32,
    bsrSortedRowPtrC: *mut c_int,
    bsrSortedColIndC: *mut c_int,
    rowBlockDimC: c_int,
    colBlockDimC: c_int,
    pBuffer: *mut c_void,
) -> cusparseStatus_t

This function converts a sparse matrix in general BSR format that is defined by the three arrays bsrValA, bsrRowPtrA, and bsrColIndA into a sparse matrix in another general BSR format that is defined by arrays bsrValC, bsrRowPtrC, and bsrColIndC.
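To make the three-array layout concrete, here is a minimal host-side sketch of a general BSR matrix with an element lookup. The `GeBsr` type and its field names are hypothetical and not part of cuSPARSE. It assumes zero-based indices, and it assumes the `row_major_blocks` flag plays the role of the `dirA` argument (row-major or column-major storage inside each block).

```rust
/// Hypothetical host-side view of a general BSR matrix (not a cuSPARSE type).
struct GeBsr {
    mb: usize,              // number of block rows
    row_block_dim: usize,   // rows per block
    col_block_dim: usize,   // columns per block
    row_major_blocks: bool, // assumption: mirrors the dirA argument
    val: Vec<f32>,          // nnzb * row_block_dim * col_block_dim values
    row_ptr: Vec<usize>,    // mb + 1 offsets into col_ind (zero-based here)
    col_ind: Vec<usize>,    // nnzb block-column indices
}

impl GeBsr {
    /// Returns element (i, j), or 0.0 if no stored block covers it.
    fn get(&self, i: usize, j: usize) -> f32 {
        let (bi, li) = (i / self.row_block_dim, i % self.row_block_dim);
        let (bj, lj) = (j / self.col_block_dim, j % self.col_block_dim);
        assert!(bi < self.mb, "row index out of range");
        let block_len = self.row_block_dim * self.col_block_dim;
        // Scan the stored blocks of block row bi for block column bj.
        for k in self.row_ptr[bi]..self.row_ptr[bi + 1] {
            if self.col_ind[k] == bj {
                let offset = if self.row_major_blocks {
                    li * self.col_block_dim + lj
                } else {
                    lj * self.row_block_dim + li
                };
                return self.val[k * block_len + offset];
            }
        }
        0.0
    }
}
```

With a single 2*3 block stored at block column 1, `get(0, 3)` reads the first value of that block, and any position outside the block reads as zero.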

If rowBlockDimA=1 and colBlockDimA=1, cusparse\[S|D|C|Z\]gebsr2gebsr() is the same as cusparse\[S|D|C|Z\]csr2gebsr().

If rowBlockDimC=1 and colBlockDimC=1, cusparse\[S|D|C|Z\]gebsr2gebsr() is the same as cusparse\[S|D|C|Z\]gebsr2csr().

A is an m*n sparse matrix where m(=mb*rowBlockDimA) is the number of rows of A, and n(=nb*colBlockDimA) is the number of columns of A. The general BSR format of A contains nnzb(=bsrRowPtrA\[mb\] - bsrRowPtrA\[0\]) nonzero blocks. The matrix C is also in general BSR format, with a different block size, rowBlockDimC*colBlockDimC. If m is not a multiple of rowBlockDimC, or n is not a multiple of colBlockDimC, the trailing blocks are padded with zeros. The number of block rows of C is mc(=(m+rowBlockDimC-1)/rowBlockDimC). The number of block columns of C is nc(=(n+colBlockDimC-1)/colBlockDimC). The number of nonzero blocks of C is nnzc.
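The dimension arithmetic above can be sketched in plain Rust. The helper names `ceil_div` and `target_block_dims` are hypothetical; they only restate the formulas for m, n, mc, and nc and do not call cuSPARSE.

```rust
/// Number of blocks of size `block_dim` needed to cover `len` rows or columns.
fn ceil_div(len: usize, block_dim: usize) -> usize {
    (len + block_dim - 1) / block_dim
}

/// Returns (m, n, mc, nc) for a conversion from blocks of size
/// row_block_dim_a * col_block_dim_a to row_block_dim_c * col_block_dim_c.
fn target_block_dims(
    mb: usize,
    nb: usize,
    row_block_dim_a: usize,
    col_block_dim_a: usize,
    row_block_dim_c: usize,
    col_block_dim_c: usize,
) -> (usize, usize, usize, usize) {
    let m = mb * row_block_dim_a; // rows of A
    let n = nb * col_block_dim_a; // columns of A
    let mc = ceil_div(m, row_block_dim_c); // block rows of C
    let nc = ceil_div(n, col_block_dim_c); // block columns of C
    (m, n, mc, nc)
}
```

For example, with mb=2, nb=3 and 2*2 blocks in A, converting to 3*4 blocks gives m=4, n=6, mc=2, and nc=2; the last block row and block column of C are partly zero padding.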

The implementation adopts a two-step approach to do the conversion. First, the user allocates bsrRowPtrC of mc+1 elements and calls cusparseXgebsr2gebsrNnz() to determine the number of nonzero block columns per block row of matrix C. Second, the user obtains nnzc (the number of nonzero block columns of matrix C) from either (nnzc=*nnzTotalDevHostPtr) or (nnzc=bsrRowPtrC\[mc\]-bsrRowPtrC\[0\]) and allocates bsrValC of nnzc*rowBlockDimC*colBlockDimC elements and bsrColIndC of nnzc integers. Finally, the function cusparse\[S|D|C|Z\]gebsr2gebsr() is called to complete the conversion.
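As an illustration of what the first step computes, the following is a hypothetical CPU reference for the structure pass, not the cuSPARSE implementation. It assumes zero-based indices and assumes that every position inside a stored block of A counts as structurally nonzero, so a block of C is kept whenever it overlaps a stored block of A. The function name `gebsr2gebsr_structure` is made up for this sketch.

```rust
use std::collections::BTreeSet;

/// Hypothetical CPU reference for the structure pass.
/// Returns (row_ptr_c, col_ind_c) for target blocks of size rc * cc.
fn gebsr2gebsr_structure(
    mb: usize,
    row_ptr_a: &[usize],
    col_ind_a: &[usize],
    (ra, ca): (usize, usize), // rowBlockDimA, colBlockDimA
    (rc, cc): (usize, usize), // rowBlockDimC, colBlockDimC
) -> (Vec<usize>, Vec<usize>) {
    let m = mb * ra;
    let mc = (m + rc - 1) / rc;
    // One sorted set of block-column indices per block row of C.
    let mut rows: Vec<BTreeSet<usize>> = vec![BTreeSet::new(); mc];
    for bi in 0..mb {
        // Block rows of C overlapped by block row bi of A.
        let (r0, r1) = (bi * ra / rc, ((bi + 1) * ra - 1) / rc);
        for k in row_ptr_a[bi]..row_ptr_a[bi + 1] {
            let bj = col_ind_a[k];
            // Block columns of C overlapped by block column bj of A.
            let (c0, c1) = (bj * ca / cc, ((bj + 1) * ca - 1) / cc);
            for r in r0..=r1 {
                for c in c0..=c1 {
                    rows[r].insert(c);
                }
            }
        }
    }
    // Prefix sums give the row pointer; nnzc = row_ptr_c[mc] - row_ptr_c[0].
    let mut row_ptr_c = vec![0usize; mc + 1];
    let mut col_ind_c = Vec::new();
    for (r, set) in rows.iter().enumerate() {
        col_ind_c.extend(set.iter().copied());
        row_ptr_c[r + 1] = col_ind_c.len();
    }
    (row_ptr_c, col_ind_c)
}
```

Given the returned row pointer, nnzc is `row_ptr_c[mc] - row_ptr_c[0]`, and the value array for C would need `nnzc * rc * cc` elements, matching the allocation sizes described above.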

The user must call gebsr2gebsr_bufferSize() to determine the size of the buffer required by gebsr2gebsr(), allocate the buffer, and pass the buffer pointer to gebsr2gebsr().

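The general procedure is: call gebsr2gebsr_bufferSize() and allocate pBuffer; allocate bsrRowPtrC and call cusparseXgebsr2gebsrNnz(); obtain nnzc and allocate bsrValC and bsrColIndC; then call cusparse\[S|D|C|Z\]gebsr2gebsr().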

The routines require no extra storage if pBuffer != NULL.
The routines support asynchronous execution if the Stream Ordered Memory Allocator is available.
The routines do not support CUDA graph capture.
