Struct IrsParams

pub struct IrsParams { /* private fields */ }

Implementations§

impl IrsParams

pub fn create() -> Result<Self>

Creates and initializes the parameter structure for IRS solvers such as xgesv and xgels.

The returned parameter structure can be reused across calls to the same IRS solver or to different IRS solvers.

In CUDA 10.2, the behavior was different and a new parameter structure was required for each IRS solve call.

You can also reconfigure the parameters between solves, but only after the previous IRS call has completed.

§Errors

Returns an error if cuSOLVER cannot allocate the required resources or does not return a valid handle.

pub fn set_refinement_solver(&mut self, refinement: IrsRefinement) -> Result<()>

Sets the refinement solver used by IRS operations such as xgesv and xgels.

Configure the refinement algorithm before the first IRS solve. Newly created IrsParams do not set one by default.

The supported values are described below.

IrsRefinement::NotSet: Solver is not set. The IRS solver returns an error if this value is used.

IrsRefinement::None: No refinement solver; the IRS solver performs a factorization followed by a solve without any refinement. For example, if the IRS solver is xgesv, this is equivalent to an xgesv solve without refinement, with the factorization carried out in the lowest configured precision. If both the main and lowest precision are PrecisionType::R64F, the solve is effectively performed in f64.

IrsRefinement::Classical: Classical iterative refinement solver. Similar to the one used in LAPACK routines.

IrsRefinement::Gmres: GMRES (Generalized Minimal Residual) based iterative refinement solver. Recent studies have shown that GMRES, used as a refinement solver, can outperform classical iterative refinement. This is the setting recommended on the basis of cuSOLVER experimentation.

IrsRefinement::ClassicalGmres: Classical iterative refinement solver that uses GMRES (Generalized Minimal Residual) internally to solve the correction equation at each iteration. The classical refinement iteration is the outer iteration, and GMRES is the inner iteration. If the tolerance of the inner GMRES is set very low, for example near machine precision, the outer classical refinement performs only one iteration and this option behaves like IrsRefinement::Gmres.

IrsRefinement::GmresGmres: GMRES-based iterative refinement solver that uses another GMRES solve internally for the preconditioned system.
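To make the idea behind IrsRefinement::Classical concrete, here is a minimal CPU-only sketch of mixed-precision classical iterative refinement: factor once in f32 (standing in for the lowest precision), then repeatedly compute the residual in f64 (standing in for the main precision) and correct the solution with the cheap factors. The function name `refine_solve` and its signature are hypothetical; this illustrates the general technique and is not how cuSOLVER or this wrapper implements it.

```rust
/// Solves `a * x = b` with an f32 LU factorization and f64 classical refinement.
/// Hypothetical helper for illustration only. Returns the solution and the number
/// of corrections applied, or `None` on a zero pivot or if `max_iters` residual
/// checks pass without reaching `tol`.
fn refine_solve(a: &[Vec<f64>], b: &[f64], tol: f64, max_iters: usize) -> Option<(Vec<f64>, usize)> {
    let n = b.len();
    // Factor once in the low precision (f32), with partial pivoting.
    let mut lu: Vec<Vec<f32>> = a.iter().map(|row| row.iter().map(|&v| v as f32).collect()).collect();
    let mut perm: Vec<usize> = (0..n).collect();
    for k in 0..n {
        let p = (k..n).max_by(|&i, &j| lu[i][k].abs().total_cmp(&lu[j][k].abs()))?;
        if lu[p][k] == 0.0 {
            return None;
        }
        lu.swap(k, p);
        perm.swap(k, p);
        for i in k + 1..n {
            let factor = lu[i][k] / lu[k][k];
            lu[i][k] = factor; // store the L multiplier in place
            for j in k + 1..n {
                let u = lu[k][j];
                lu[i][j] -= factor * u;
            }
        }
    }
    // Solve with the f32 factors: permute, forward substitution, back substitution.
    let lu_solve = |rhs: &[f64]| -> Vec<f64> {
        let mut y: Vec<f32> = perm.iter().map(|&p| rhs[p] as f32).collect();
        for i in 0..n {
            for j in 0..i {
                let t = lu[i][j] * y[j];
                y[i] -= t;
            }
        }
        for i in (0..n).rev() {
            for j in i + 1..n {
                let t = lu[i][j] * y[j];
                y[i] -= t;
            }
            y[i] /= lu[i][i];
        }
        y.into_iter().map(f64::from).collect()
    };
    let mut x = lu_solve(b);
    for iter in 0..max_iters {
        // Residual r = b - A x, computed in the main precision (f64).
        let r: Vec<f64> = (0..n)
            .map(|i| b[i] - (0..n).map(|j| a[i][j] * x[j]).sum::<f64>())
            .collect();
        let rnrm = r.iter().fold(0.0_f64, |m, v| m.max(v.abs()));
        if rnrm < tol {
            return Some((x, iter));
        }
        // Correction: solve A d = r with the cheap factors, then update x.
        let d = lu_solve(&r);
        for i in 0..n {
            x[i] += d[i];
        }
    }
    None
}
```

The expensive step (the factorization) runs once in the low precision, while each refinement step costs only a matrix-vector product and two triangular solves. The GMRES-based variants replace the plain correction step with a Krylov solve preconditioned by the same low-precision factors.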

§Errors

Returns an error if cuSOLVER rejects the parameter structure.

pub fn set_main_precision(&mut self, precision: PrecisionType) -> Result<()>

Sets the main precision for the Iterative Refinement Solver (IRS).

The main precision is the type of the input and output data. Configure both the main and lowest precision before the first IRS solve. Those values are not inferred when the parameter structure is created because they depend on the input/output data type and the requested solver configuration. You can set them independently or together with IrsParams::set_solver_precisions.

§Errors

Returns an error if cuSOLVER rejects the parameter structure.

pub fn set_lowest_precision(&mut self, precision: PrecisionType) -> Result<()>

Sets the lowest precision that the IRS solver may use.

The lowest precision is the minimum compute precision used during the LU factorization process.

Configure both the main and lowest precision before the first IRS solve. They are not inferred when the parameter structure is created because they depend on the input and output data types and the requested solver configuration.

The lowest precision usually determines the achievable speedup: the ratio between the performance of the lowest precision and that of the main precision gives an approximate upper bound. The actual speedup depends on many factors, but for large matrices it is often tied to the performance ratio of large GEMM-like kernels. For instance, if the main precision is real double precision (PrecisionType::R64F) and the lowest precision is PrecisionType::R32F, a speedup of at most about 2x is expected for large problem sizes; with PrecisionType::R16F, expect 3x-4x. When choosing the lowest precision, take into account the number of right-hand sides, the matrix size, and the convergence rate.

§Errors

Returns an error if cuSOLVER rejects the parameter structure.

pub fn set_solver_precisions( &mut self, main_precision: PrecisionType, lowest_precision: PrecisionType, ) -> Result<()>

Sets both the main and lowest precision for the Iterative Refinement Solver (IRS).

The main precision is the precision of the input and output data. The lowest precision is the minimum compute precision used during the LU factorization process.

Configure both values before the first IRS solve. They are not inferred when creating the parameter structure because they depend on the input and output data types and the requested solver configuration.

Convenience wrapper around IrsParams::set_main_precision and IrsParams::set_lowest_precision. The supported combinations of main and lowest precision are listed in the table below.

The lowest precision usually determines the achievable speedup: the ratio between the performance of the lowest precision and that of the main precision gives an approximate upper bound. The actual speedup depends on many factors, but for large matrices it is often tied to the performance ratio of large GEMM-like kernels. For instance, if the main precision is real double precision (PrecisionType::R64F) and the lowest precision is PrecisionType::R32F, a speedup of at most about 2x is expected for large problem sizes; with PrecisionType::R16F, expect 3x-4x. When choosing the lowest precision, take into account the number of right-hand sides, the matrix size, and the convergence rate.

Supported input/output data types and lowest precisions for the IRS solver

Input/output data type (that is, the main precision)	Supported values for the lowest precision
`PrecisionType::C64F`	`PrecisionType::C64F`, `PrecisionType::C32F`, `PrecisionType::C16F`, `PrecisionType::C16Bf`, `PrecisionType::CTf32`
`PrecisionType::C32F`	`PrecisionType::C32F`, `PrecisionType::C16F`, `PrecisionType::C16Bf`, `PrecisionType::CTf32`
`PrecisionType::R64F`	`PrecisionType::R64F`, `PrecisionType::R32F`, `PrecisionType::R16F`, `PrecisionType::R16Bf`, `PrecisionType::RTf32`
`PrecisionType::R32F`	`PrecisionType::R32F`, `PrecisionType::R16F`, `PrecisionType::R16Bf`, `PrecisionType::RTf32`
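
The table can be encoded as a small lookup for validating a precision pair before configuring the solver. The sketch below uses a local `Precision` enum as a hypothetical stand-in that mirrors the PrecisionType variants named in the table; `is_supported` is likewise an illustrative helper, not part of this crate.

```rust
/// Hypothetical stand-in mirroring the `PrecisionType` variants named in the table.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Precision {
    R64F,
    R32F,
    R16F,
    R16Bf,
    RTf32,
    C64F,
    C32F,
    C16F,
    C16Bf,
    CTf32,
}

/// Returns true if `lowest` is listed in the table for the given `main` precision.
fn is_supported(main: Precision, lowest: Precision) -> bool {
    use Precision::*;
    match main {
        C64F => matches!(lowest, C64F | C32F | C16F | C16Bf | CTf32),
        C32F => matches!(lowest, C32F | C16F | C16Bf | CTf32),
        R64F => matches!(lowest, R64F | R32F | R16F | R16Bf | RTf32),
        R32F => matches!(lowest, R32F | R16F | R16Bf | RTf32),
        // Reduced precisions never appear as a main precision in the table.
        _ => false,
    }
}
```

Two patterns fall out of the table: real and complex types never mix, and the lowest precision is never wider than the main precision.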

§Errors

Returns an error if cuSOLVER rejects the parameter structure.

pub fn set_tolerance(&mut self, tolerance: f64) -> Result<()>

Sets the tolerance for the refinement solver. By default, the tolerance is chosen so that every right-hand side (RHS) satisfies:

RNRM < SQRT(N)*XNRM*ANRM*EPS*BWDMAX

where

RNRM is the infinity norm of the residual
XNRM is the infinity norm of the solution
ANRM is the infinity operator norm of the matrix A
N is the order of the matrix A
EPS is the machine epsilon for the input/output data type, matching LAPACK xLAMCH('Epsilon')
BWDMAX is fixed to 1.0

Use this to set the tolerance to a lower or higher value. The tolerance value is always stored in real double precision, regardless of the input and output data type.
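
As a worked illustration of the default criterion above, the following CPU sketch evaluates SQRT(N)*XNRM*ANRM*EPS*BWDMAX for a single right-hand side and checks the residual against it. All function names are hypothetical, and EPS is passed in by the caller; for f64 data the usage note assumes it equals `f64::EPSILON / 2.0`, which is my understanding of what LAPACK DLAMCH('Epsilon') returns under round-to-nearest.

```rust
/// Infinity norm of a vector: the largest absolute entry.
fn vec_inf_norm(v: &[f64]) -> f64 {
    v.iter().fold(0.0_f64, |m, x| m.max(x.abs()))
}

/// Infinity operator norm of a row-major matrix: the largest absolute row sum.
fn mat_inf_norm(a: &[Vec<f64>]) -> f64 {
    a.iter()
        .map(|row| row.iter().map(|x| x.abs()).sum::<f64>())
        .fold(0.0_f64, f64::max)
}

/// Evaluates SQRT(N) * XNRM * ANRM * EPS * BWDMAX with BWDMAX fixed to 1.0.
fn default_bound(a: &[Vec<f64>], x: &[f64], eps: f64) -> f64 {
    const BWDMAX: f64 = 1.0;
    (x.len() as f64).sqrt() * vec_inf_norm(x) * mat_inf_norm(a) * eps * BWDMAX
}

/// Checks RNRM < bound for one right-hand side, with r = b - A x.
fn meets_default_tolerance(a: &[Vec<f64>], x: &[f64], b: &[f64], eps: f64) -> bool {
    let r: Vec<f64> = a
        .iter()
        .zip(b)
        .map(|(row, bi)| bi - row.iter().zip(x).map(|(aij, xj)| aij * xj).sum::<f64>())
        .collect();
    vec_inf_norm(&r) < default_bound(a, x, eps)
}
```

For f64 data one would call `meets_default_tolerance(&a, &x, &b, f64::EPSILON / 2.0)`. Because the bound scales with the norms of A and the solution, it is a relative (backward-error style) criterion rather than an absolute one.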

§Errors

Returns an error if cuSOLVER rejects the parameter structure.

pub fn set_inner_tolerance(&mut self, tolerance: f64) -> Result<()>

Sets the tolerance for the inner refinement solver when the refinement solver consists of two levels, for example IrsRefinement::ClassicalGmres or IrsRefinement::GmresGmres. Ignored for one-level refinement solvers such as IrsRefinement::Classical or IrsRefinement::Gmres. The default value is 1e-4.

For example, if the refinement solver is IrsRefinement::ClassicalGmres, the inner GMRES solver iterates until it reaches this tolerance at each outer iteration of the classical refinement solver. The tolerance value is always stored in real double precision, regardless of the input and output data type.

§Errors

Returns an error if cuSOLVER rejects the parameter structure.

pub fn set_max_iterations(&mut self, max_iterations: i32) -> Result<()>

Sets the total number of allowed refinement iterations before the solver stops. The total is the sum of the outer and inner iterations. Inner iterations are meaningful when a two-level refinement solver is configured. The default value is 50.

§Errors

Returns an error if cuSOLVER rejects the parameter structure.

pub fn set_max_inner_iterations(&mut self, max_iterations: i32) -> Result<()>

Sets the maximum number of iterations allowed for the inner refinement solver (MaxItersInner). Ignored for one-level refinement solvers such as IrsRefinement::Classical or IrsRefinement::Gmres. The inner refinement solver stops when it reaches either the inner tolerance or MaxItersInner. The default value is 50. MaxItersInner cannot be larger than MaxIters, the value set by IrsParams::set_max_iterations, because MaxIters is the total number of allowed iterations. If IrsParams::set_max_iterations is called after this method, it takes priority and overwrites MaxItersInner with min(MaxIters, MaxItersInner).
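
The interaction between the two limits can be modelled in a few lines. The `IterationCaps` type below is a hypothetical stand-in that only mimics the ordering rules described above (defaults of 50, rejection of an inner cap above the total, and clamping when the total is lowered afterwards); it is not the wrapper's or cuSOLVER's implementation.

```rust
/// Hypothetical model of the two iteration caps described above.
#[derive(Debug)]
struct IterationCaps {
    max_iters: i32,
    max_iters_inner: i32,
}

impl IterationCaps {
    /// Both caps default to 50, as documented.
    fn new() -> Self {
        Self { max_iters: 50, max_iters_inner: 50 }
    }

    /// Rejects an inner cap that exceeds the total cap.
    fn set_max_inner_iterations(&mut self, inner: i32) -> Result<(), String> {
        if inner > self.max_iters {
            return Err(format!("inner cap {inner} exceeds total cap {}", self.max_iters));
        }
        self.max_iters_inner = inner;
        Ok(())
    }

    /// Setting the total cap afterwards clamps the inner cap to min(total, inner).
    fn set_max_iterations(&mut self, total: i32) {
        self.max_iters = total;
        self.max_iters_inner = self.max_iters_inner.min(total);
    }
}
```

In this model, raising the inner cap above 50 requires raising the total cap first, while lowering the total cap silently lowers the inner cap with it.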

§Errors

Returns an error if max_iterations is larger than MaxIters, or if cuSOLVER rejects the parameter structure.

pub fn max_iterations(&self) -> Result<i32>

Returns the current maximum-iteration setting (MaxIters) stored in this parameter structure. This reflects the current parameter configuration and is distinct from IrsInfos::max_iterations, which returns the maximum number of iterations allowed for one particular IRS solver call. The parameter structure can be reused across many IRS solver calls, and its MaxIters value can change between calls; the Infos structure describes a single call and cannot be reused for other calls.

§Errors

Returns an error if cuSOLVER rejects the parameter structure.

pub fn enable_fallback(&mut self) -> Result<()>

Enables fallback to the main precision if the Iterative Refinement Solver (IRS) fails to converge. If the IRS solver fails to converge, it returns a non-convergence code such as niter < 0. With fallback disabled, it returns the non-convergent solution as-is. With fallback enabled, it falls back to the main precision, which is the input/output data precision, and solves the problem again from scratch. This fallback is the default behavior.

§Errors

Returns an error if cuSOLVER rejects the parameter structure.

pub fn disable_fallback(&mut self) -> Result<()>

Disables fallback to the main precision if the Iterative Refinement Solver (IRS) fails to converge. If the IRS solver fails to converge, it returns a non-convergence code such as niter < 0. With fallback disabled, the returned solution is whatever the refinement solver reached before returning. Disabling fallback does not guarantee that the solution is accurate. Re-enable fallback with IrsParams::enable_fallback.
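
The control flow that the fallback switch selects can be sketched as follows. This is a hypothetical model, assuming only what is stated above: a negative niter signals non-convergence, and fallback re-solves from scratch in the main precision. The names `IrsAttempt` and `solve_with_fallback` are illustrative and do not exist in this crate.

```rust
/// Hypothetical outcome of one IRS attempt; a negative `niter` signals non-convergence.
struct IrsAttempt {
    x: Vec<f64>,
    niter: i32,
}

/// Returns the solution and whether the main-precision fallback ran.
fn solve_with_fallback(
    fallback_enabled: bool,
    irs_attempt: impl FnOnce() -> IrsAttempt,
    main_precision_solve: impl FnOnce() -> Vec<f64>,
) -> (Vec<f64>, bool) {
    let attempt = irs_attempt();
    if attempt.niter >= 0 {
        // Refinement converged: the mixed-precision result is returned directly.
        return (attempt.x, false);
    }
    if fallback_enabled {
        // Non-convergence with fallback enabled: solve again from scratch.
        (main_precision_solve(), true)
    } else {
        // Non-convergence with fallback disabled: return the iterate as-is.
        (attempt.x, false)
    }
}
```

With fallback disabled, the caller is responsible for checking the iteration count before trusting the returned solution.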

§Errors

Returns an error if cuSOLVER rejects the parameter structure.

pub fn as_raw(&self) -> cusolverDnIRSParams_t

pub unsafe fn from_raw(handle: cusolverDnIRSParams_t) -> Result<Self>

Takes ownership of a raw cuSOLVER IRS params handle.

§Safety

handle must be a valid cusolverDnIRSParams_t created by cuSOLVER. The returned wrapper takes ownership and will destroy it with cusolverDnIRSParamsDestroy; no other owner may destroy or keep using it.
