kryst
High-performance Krylov subspace and preconditioned iterative solvers for dense and sparse linear systems, with advanced preconditioning strategies and automated parameter optimization.
Features
Iterative Solvers
- Krylov Methods: CG, PCG, GMRES, FGMRES, BiCGStab, CGS, QMR, TFQMR, MINRES, CGNR
- Direct Methods: LU and QR factorization via PREONLY solver type
- Parallel Support: Shared-memory (Rayon) and distributed-memory (MPI) parallelism
Preconditioners
Basic Preconditioners
- Jacobi: Diagonal scaling preconditioner
- Block Jacobi: Block-wise diagonal preconditioning
- SOR/SSOR: Successive Over-Relaxation methods
- None: No preconditioning (identity)
Incomplete Factorizations
- ILU(0): Zero fill-in incomplete LU factorization
- ILU(k): Incomplete LU with k levels of fill-in
- ILUT: Threshold-based incomplete LU factorization
- ILUTP: ILUT with partial pivoting
- ILUP: Incomplete LU with partial pivoting
Advanced Preconditioners
- Chebyshev: Enhanced polynomial preconditioning with eigenvalue estimation
- AMG: Algebraic Multigrid with configurable smoothing parameters
- ASM: Additive Schwarz Method (domain decomposition)
- Approximate Inverse: SPAI-type approximate inverse preconditioners
Composite Preconditioning
- PC-Chaining: Sequential application of multiple preconditioners via the pc_chain option
- Enhanced Chebyshev: Matrix-aware polynomial preconditioning with automatic eigenvalue estimation
- Smoothed AMG: Configurable pre- and post-smoothing parameters (amg_nu_pre, amg_nu_post)
Monitoring & Automation
- Iteration Monitoring: Real-time convergence tracking with IterationMonitor
- Parameter Tuning: Automated optimization with ParameterTuner and grid search
- Data Export: CSV output for convergence analysis with enable_csv_logging()
- Performance Metrics: Comprehensive timing and convergence rate analysis
Architecture
- PETSc-style API: Unified KSP context for runtime solver selection
- Command-line Options: Complete options database with 50+ parameters
- Trait-based Design: Extensible for custom matrices and preconditioners
- Memory Efficiency: In-place operations and configurable workspace management
- High Performance: Optimized inner kernels with SIMD and parallelization
- Matrix-Free Operators: Shell matrices for callback-based MatVec operations
- Setup Reuse: Two-phase API with preconditioner and workspace recycling
- CSR utilities: zero-copy row_ptr/col_idx/values access and sparse kernels (spgemm, CSR Galerkin triple product)
Installation
Add to your Cargo.toml:
[dependencies]
kryst = "1.0"
Feature Flags
[features]
default = ["rayon", "logging"] # Shared-memory parallelism + monitoring
rayon = ["dep:rayon"] # Rayon-based parallel execution
mpi = ["dep:mpi"] # Distributed-memory parallelism via MPI
logging = ["dep:log"] # Iteration monitoring and profiling
Quick Start
Basic Usage with KspContext (Recommended)
use ;
use PcType;
use DenseOp;
use Mat;
use Arc;
// Create a 100×100 test system
let n = 100;
let mat = from_fn;
let a = new;
let rhs = vec!;
let mut solution = vec!;
// Configure solver and preconditioner
let mut ksp = new;
ksp.set_type?
.set_pc_type?
.set_operators;
ksp.rtol = 1e-8;
ksp.maxits = 1000;
// Setup once then solve
ksp.setup?;
let stats = ksp.solve?;
println!;
Explicit Setup and Reuse
Reuse factorization and workspace across multiple solves by calling setup() once:
let mut ksp = new;
ksp.set_type?
.set_pc_type?
.set_operators;
ksp.setup?; // perform factorization and allocate workspace
for rhs in rhs_set.iter
Advanced Features: Composite Preconditioning
use KspContext;
use ;
let mut ksp_opts = default;
ksp_opts.ksp_type = Some;
let mut pc_opts = default;
pc_opts.pc_chain = Some;
pc_opts.chebyshev_degree = Some;
let mut ksp = new;
ksp.set_from_options?
.set_operators;
ksp.setup?;
let stats = ksp.solve?;
Enhanced AMG with Smoothing
use ;
use PcType;
use PcOptions;
let mut pc_opts = default;
pc_opts.amg_levels = Some;
pc_opts.amg_strength_threshold = Some;
pc_opts.amg_nu_pre = Some; // Pre-smoothing steps
pc_opts.amg_nu_post = Some; // Post-smoothing steps
let mut ksp = new;
ksp.set_type?
.set_pc_type?
.set_operators;
ksp.setup?;
let stats = ksp.solve?;
Iteration Monitoring and Analysis
use ;
use Duration;
// Monitor convergence behavior
let mut monitor = new;
// In practice, integrate monitor with solver iteration callbacks
// Automated parameter tuning
let mut tuner = new;
tuner.set_solver_types;
tuner.set_pc_types;
tuner.set_tolerances;
tuner.set_max_config_time;
let = tuner.tune_parameters.unwrap;
println!;
Command-line Interface (PETSc-style)
use ;
use KspContext;
// Parse command-line options
let args: = args.collect;
let = parse_all_options?;
// Configure from options
let mut ksp = new;
ksp.set_from_all_options?
.set_operators;
ksp.setup?;
let stats = ksp.solve?;
Run your program with PETSc-style options:
# Basic solver configuration
# Direct solvers
# Advanced preconditioning
# Show all available options
Supported Command-line Options
KSP (Krylov Solver) Options
- -ksp_type <solver> - Solver type: cg, pcg, gmres, fgmres, bicgstab, cgs, qmr, tfqmr, minres, cgnr, preonly
- -ksp_rtol <float> - Relative convergence tolerance (default: 1e-5)
- -ksp_atol <float> - Absolute convergence tolerance (default: 1e-50)
- -ksp_dtol <float> - Divergence tolerance (default: 1e5)
- -ksp_max_it <int> - Maximum number of iterations (default: 10000)
- -ksp_gmres_restart <int> - GMRES restart parameter (default: 50)
- -ksp_pc_side <side> - Preconditioning side: left, right, symmetric
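The three tolerances interact, so a small sketch may help. It assumes the convention of PETSc's default convergence test (converged when the residual norm drops below max(rtol * ‖r₀‖, atol), diverged when it exceeds dtol * ‖r₀‖); the source does not confirm that kryst uses exactly this rule. The `Status` enum, `Tolerances` struct, and `check_convergence` function are hypothetical names for illustration, not part of the kryst API.

```rust
/// Outcome of one convergence check (hypothetical type, not kryst's).
#[derive(Debug, PartialEq)]
pub enum Status {
    Converged,
    Diverged,
    MaxIterations,
    Continue,
}

/// Tolerances named after the -ksp_* options, using the defaults listed above.
pub struct Tolerances {
    pub rtol: f64,
    pub atol: f64,
    pub dtol: f64,
    pub max_it: usize,
}

impl Default for Tolerances {
    fn default() -> Self {
        Tolerances { rtol: 1e-5, atol: 1e-50, dtol: 1e5, max_it: 10_000 }
    }
}

/// Assumed PETSc-style test on the initial residual norm `r0_norm`
/// and the current residual norm `r_norm`.
pub fn check_convergence(tol: &Tolerances, r0_norm: f64, r_norm: f64, iter: usize) -> Status {
    if r_norm <= (tol.rtol * r0_norm).max(tol.atol) {
        Status::Converged
    } else if r_norm >= tol.dtol * r0_norm {
        Status::Diverged
    } else if iter >= tol.max_it {
        Status::MaxIterations
    } else {
        Status::Continue
    }
}
```

With the defaults, the relative test dominates unless the initial residual is already tiny, which is why atol is set so low.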
PC (Preconditioner) Options
Basic Preconditioner Options
- -pc_type <pc> - Preconditioner type: jacobi, blockjacobi, sor, none
Incomplete Factorization Options
- -pc_type <pc> - ILU variants: ilu0, ilu, ilut, ilutp, ilup
- -pc_ilu_levels <int> - ILU fill levels (default: 0)
- -pc_ilut_drop_tol <float> - ILUT drop tolerance (default: 1e-3)
- -pc_ilut_max_fill <int> - ILUT maximum fill per row (default: 10)
Enhanced Preconditioner Options
- -pc_type chebyshev - Enhanced Chebyshev with eigenvalue estimation
- -chebyshev_degree <int> - Polynomial degree (default: 3)
- -pc_type amg - Algebraic multigrid with smoothing control
- -amg_levels <int> - Number of AMG levels (default: 4)
- -amg_strength_threshold <float> - Strong connection threshold (default: 0.25)
- -amg_nu_pre <int> - Pre-smoothing steps (default: 1)
- -amg_nu_post <int> - Post-smoothing steps (default: 1)
Composite Preconditioning Options
- -pc_chain <string> - Sequential preconditioner chain (e.g., "jacobi,chebyshev")
- -pc_type asm - Additive Schwarz Method
- -pc_type approxinv - Approximate inverse preconditioner
Direct Solver Options
- -pc_type lu - Direct LU factorization via SuperLU
- -pc_type qr - Direct QR factorization
Domain Decomposition Options
- -asm_overlap <int> - ASM subdomain overlap (default: 1)
- -asm_type <type> - ASM variant: restrict, interpolate, basic
Usage Examples
# Enhanced Chebyshev preconditioning
# AMG with custom smoothing
# Composite preconditioning (PC-chaining)
# High-accuracy direct solve
# BiCGStab with threshold ILU
# GMRES with additive Schwarz
Monitoring and Automation
Iteration Monitoring
Track solver convergence with real-time monitoring:
use IterationMonitor;
use ;
use PcType;
use ;
use Duration;
// Create and configure monitor
let mut monitor = new;
monitor.enable_csv_logging.unwrap;
// Configure solver with monitoring callback
let monitor_ref = new;
let monitor_clone = clone;
let mut ksp = new;
ksp.set_type?
.set_pc_type?
.set_operators;
// Add monitoring callback
ksp.add_monitor;
// Solve with monitoring
ksp.setup?;
let stats = ksp.solve?;
// Analyze convergence
if let Ok = monitor_ref.lock
Automated Parameter Tuning
Optimize solver/preconditioner combinations automatically:
use ;
use SolverType;
use PcType;
use Duration;
let mut tuner = new;
// Configure search space
tuner.set_solver_types
.set_pc_types
.set_tolerances
.set_max_config_time;
// Add PC-chain configurations for composite preconditioning
tuner.add_pc_chains;
// Run automated tuning
let = tuner.tune_parameters.unwrap;
println!;
println!;
println!;
println!;
if let Some = &best_config.pc_chain
println!;
// Export results for further analysis
tuner.export_results.unwrap;
let summary = tuner.get_summary;
println!;
Advanced Monitoring Features
use IterationMonitor;
use Duration;
let mut monitor = new;
monitor.start_solve;
// Record some iterations
monitor.record_iteration;
monitor.record_iteration;
monitor.record_iteration;
// Mark convergence
monitor.mark_converged;
// Get detailed statistics
let stats = monitor.get_statistics;
println!;
println!;
println!;
println!;
println!;
// Check recent convergence behavior
if let Some = monitor.recent_convergence_rate
// Set up real-time monitoring callbacks
let mut ksp = new;
ksp.add_monitor;
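To make the notion of a recent convergence rate concrete, here is a minimal standalone sketch of one common definition: the geometric mean of the per-iteration residual reduction over the last few iterations. The function `convergence_factor` and its `window` parameter are hypothetical; how IterationMonitor actually computes its rate is not stated in this README.

```rust
/// Average residual reduction factor per iteration over the last `window`
/// iterations: (r_k / r_{k-window})^(1/window).
/// Values below 1.0 mean the residual is shrinking.
/// Returns None when there is not enough history or the data is unusable.
pub fn convergence_factor(residuals: &[f64], window: usize) -> Option<f64> {
    if window == 0 || residuals.len() < window + 1 {
        return None;
    }
    let last = residuals[residuals.len() - 1];
    let start = residuals[residuals.len() - 1 - window];
    if start <= 0.0 || last < 0.0 {
        return None;
    }
    Some((last / start).powf(1.0 / window as f64))
}
```

A factor of 0.1 means each iteration gains roughly one decimal digit of accuracy; a factor near 1.0 suggests stagnation and a stronger preconditioner may be worth trying.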
Profiling and Performance Analysis
Enable detailed timing and performance information:
[dependencies]
kryst = { version = "1.0", features = ["logging"] }
Run with environment variables for detailed profiling:
# Trace-level logging shows detailed stage timing
RUST_LOG=trace
# Debug-level shows major operations
RUST_LOG=debug
# Info-level shows high-level progress
RUST_LOG=info
Profiling output includes:
- KSPSetup: Preconditioner setup and workspace allocation timing
- KSPSolve: Complete solve time breakdown
- PCSetup: Individual preconditioner setup timing
- WorkspaceAllocation: Memory allocation timing
- MatVec: Matrix-vector product timing
- PCApply: Preconditioner application timing
Solver Algorithms
Krylov Methods
- CG: Conjugate Gradient for symmetric positive definite systems
- PCG: Preconditioned Conjugate Gradient
- GMRES: Generalized Minimal Residual with restart
- FGMRES: Flexible GMRES for variable preconditioning
- BiCGStab: BiConjugate Gradient Stabilized for nonsymmetric systems
- CGS: Conjugate Gradient Squared
- QMR: Quasi-Minimal Residual method
- TFQMR: Transpose-Free QMR
- MINRES: Minimal Residual for symmetric indefinite systems
- CGNR: Conjugate Gradient on the Normal Equations
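As a reference point for the methods listed above, the following is a textbook sketch of unpreconditioned Conjugate Gradient written against a matvec closure. It is independent of kryst and is not its implementation; the function names `conjugate_gradient` and `dot` are chosen only for this example.

```rust
/// Dot product of two equally sized slices.
fn dot(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Solve A x = b for symmetric positive definite A, given only y = A x.
/// Stops when ‖r‖ <= rtol * ‖b‖ or after `max_it` iterations.
/// Returns the approximate solution and the number of iterations performed.
pub fn conjugate_gradient<F>(matvec: F, b: &[f64], rtol: f64, max_it: usize) -> (Vec<f64>, usize)
where
    F: Fn(&[f64]) -> Vec<f64>,
{
    let mut x = vec![0.0; b.len()];
    let mut r = b.to_vec(); // residual for the zero initial guess
    let mut p = r.clone(); // first search direction
    let mut rs_old = dot(&r, &r);
    let stop = rtol * rs_old.sqrt();

    for k in 0..max_it {
        if rs_old.sqrt() <= stop {
            return (x, k);
        }
        let ap = matvec(&p);
        let alpha = rs_old / dot(&p, &ap); // step length along p
        for (xi, pi) in x.iter_mut().zip(&p) {
            *xi += alpha * pi;
        }
        for (ri, api) in r.iter_mut().zip(&ap) {
            *ri -= alpha * api;
        }
        let rs_new = dot(&r, &r);
        let beta = rs_new / rs_old; // makes the next direction A-conjugate
        for (pi, ri) in p.iter_mut().zip(&r) {
            *pi = ri + beta * *pi;
        }
        rs_old = rs_new;
    }
    (x, max_it)
}
```

Because the solver only needs the closure, the same routine works for dense, sparse, and matrix-free operators.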
Direct Methods
- PREONLY: Single-step direct solve using LU or QR factorization
- Supports both -pc_type lu and -pc_type qr
- Ideal for well-conditioned systems where direct methods are preferred
Preconditioner Details
Basic Preconditioners
- Jacobi: Diagonal scaling M⁻¹ = diag(A)⁻¹
- Block Jacobi: Block-wise diagonal preconditioning with configurable block sizes
- SOR/SSOR: Successive Over-Relaxation with configurable relaxation parameter
- None: Identity preconditioning (no preconditioning)
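The Jacobi formula M⁻¹ = diag(A)⁻¹ amounts to one division per unknown. A minimal sketch on a dense row-major matrix follows; `jacobi_setup` and `jacobi_apply` are illustrative names, not kryst functions.

```rust
/// Setup phase: store the reciprocal of each diagonal entry of A.
/// Fails if a diagonal entry is exactly zero.
pub fn jacobi_setup(a: &[Vec<f64>]) -> Result<Vec<f64>, String> {
    a.iter()
        .enumerate()
        .map(|(i, row)| {
            let d = row[i];
            if d == 0.0 {
                Err(format!("zero diagonal entry at row {i}"))
            } else {
                Ok(1.0 / d)
            }
        })
        .collect()
}

/// Apply phase: z = diag(A)^{-1} r, an elementwise scaling.
pub fn jacobi_apply(inv_diag: &[f64], r: &[f64]) -> Vec<f64> {
    inv_diag.iter().zip(r).map(|(d, ri)| d * ri).collect()
}
```

Splitting setup from apply mirrors the two-phase usage described earlier: the reciprocals are computed once and reused in every iteration.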
Incomplete Factorizations
- ILU(0): Zero fill-in incomplete LU factorization
- ILU(k): Incomplete LU with k levels of fill-in
- ILUT: ILU with threshold-based dropping strategy
- ILUTP: ILUT with partial pivoting for numerical stability
- ILUP: Incomplete LU with partial pivoting
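To illustrate what "zero fill-in" means for ILU(0), the sketch below runs Gaussian elimination but only updates entries that are nonzero in the original matrix. It uses dense storage purely for readability (a practical implementation would operate on CSR), and `ilu0` is a hypothetical helper, not kryst's routine.

```rust
/// ILU(0) of a square matrix stored densely.
/// On success, the strict lower triangle holds L (unit diagonal implied)
/// and the upper triangle holds U. Returns None on a zero pivot.
pub fn ilu0(a: &[Vec<f64>]) -> Option<Vec<Vec<f64>>> {
    let n = a.len();
    // Sparsity pattern of A; updates outside it (fill-in) are discarded.
    let pattern: Vec<Vec<bool>> = a
        .iter()
        .map(|row| row.iter().map(|&v| v != 0.0).collect())
        .collect();
    let mut lu = a.to_vec();

    for i in 1..n {
        for k in 0..i {
            if !pattern[i][k] {
                continue;
            }
            let pivot = lu[k][k];
            if pivot == 0.0 {
                return None;
            }
            let factor = lu[i][k] / pivot;
            lu[i][k] = factor; // entry of L
            for j in (k + 1)..n {
                if pattern[i][j] {
                    let ukj = lu[k][j];
                    lu[i][j] -= factor * ukj;
                }
            }
        }
    }
    Some(lu)
}
```

For a tridiagonal matrix no fill-in occurs, so ILU(0) coincides with the exact LU factorization; ILU(k) and ILUT relax the pattern restriction by level or by magnitude.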
Advanced Preconditioners
Enhanced Chebyshev
Enhanced polynomial preconditioning implementation based on eigenvalue estimation:
use Chebyshev;
use PcOptions;
use PcType;
// Enhanced Chebyshev with automatic eigenvalue estimation
let mut pc_opts = default;
pc_opts.chebyshev_degree = Some; // Higher degree for better approximation
ksp.set_pc_type?;
Features:
- Matrix-aware: Automatic eigenvalue bound estimation using power iteration
- Configurable Degree: Polynomial degree optimization (default: 3, range: 1-20)
- Storage Efficient: Reuses matrix storage for eigenvalue computation
- Robust: Handles near-singular matrices with adaptive bounds
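The power iteration mentioned above can be sketched in a few lines. This is a generic textbook version that estimates the dominant eigenvalue from a matvec closure; the function name, the fixed iteration count, and the start vector are assumptions made for this example and say nothing about kryst's internals.

```rust
/// Dot product of two equally sized slices.
fn dot(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Estimate the dominant eigenvalue of a symmetric operator of size `n`
/// by power iteration with a Rayleigh quotient.
pub fn estimate_lambda_max<F>(matvec: F, n: usize, iterations: usize) -> f64
where
    F: Fn(&[f64]) -> Vec<f64>,
{
    if n == 0 {
        return 0.0;
    }
    // Deterministic, non-constant start vector, normalized to unit length.
    let mut v: Vec<f64> = (0..n).map(|i| 1.0 + i as f64 / n as f64).collect();
    let v_norm = dot(&v, &v).sqrt();
    v.iter_mut().for_each(|x| *x /= v_norm);

    let mut lambda = 0.0;
    for _ in 0..iterations {
        let w = matvec(&v);
        lambda = dot(&v, &w); // Rayleigh quotient, since ‖v‖ = 1
        let w_norm = dot(&w, &w).sqrt();
        if w_norm == 0.0 {
            break;
        }
        v = w.iter().map(|x| x / w_norm).collect();
    }
    lambda
}
```

Chebyshev preconditioning needs an upper bound on the spectrum, so estimates like this are usually inflated by a small safety factor before use.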
Enhanced AMG
Advanced Algebraic Multigrid with configurable smoothing:
use Amg;
use PcOptions;
use PcType;
// Enhanced AMG with smoothing control
let mut pc_opts = default;
pc_opts.amg_levels = Some; // Multigrid levels
pc_opts.amg_strength_threshold = Some; // Strong connection threshold
pc_opts.amg_nu_pre = Some; // Pre-smoothing steps
pc_opts.amg_nu_post = Some; // Post-smoothing steps
ksp.set_pc_type?;
Features:
- Smoothed Multigrid: Configurable pre- and post-smoothing parameters
- Adaptive Coarsening: Automatic grid hierarchy construction based on strength
- Strength Threshold: Customizable strong connection criteria (default: 0.25)
- Flexible Smoothing: Separate control of pre/post smoothing iterations
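The strength threshold is easiest to understand through the classical criterion: entry j is strongly connected to row i when |a_ij| >= theta * max over k != i of |a_ik|. The sketch below applies that rule to a dense matrix. It assumes the absolute-value variant of the classical definition, which may differ from the exact rule kryst applies, and `strong_connections` is a hypothetical helper.

```rust
/// For each row i, list the columns j != i that are strongly connected
/// under the threshold `theta` (for example 0.25).
pub fn strong_connections(a: &[Vec<f64>], theta: f64) -> Vec<Vec<usize>> {
    a.iter()
        .enumerate()
        .map(|(i, row)| {
            // Largest off-diagonal magnitude in this row.
            let max_off = row
                .iter()
                .enumerate()
                .filter(|(j, _)| *j != i)
                .map(|(_, v)| v.abs())
                .fold(0.0_f64, f64::max);
            row.iter()
                .enumerate()
                .filter(|(j, v)| *j != i && max_off > 0.0 && v.abs() >= theta * max_off)
                .map(|(j, _)| j)
                .collect()
        })
        .collect()
}
```

Raising theta keeps fewer connections and typically produces more aggressive coarsening; lowering it keeps more connections and a denser coarse hierarchy.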
Composite Preconditioning
PC-chaining allows sequential application of multiple preconditioners:
use ;
// Example 1: Jacobi + Chebyshev combination
let mut pc_opts = default;
pc_opts.pc_chain = Some;
pc_opts.chebyshev_degree = Some;
ksp.set_from_options?;
// Example 2: Multi-stage preconditioning
let mut pc_opts = default;
pc_opts.pc_chain = Some;
ksp.set_from_options?;
// Example 3: Domain decomposition + multigrid
let mut pc_opts = default;
pc_opts.pc_chain = Some;
pc_opts.amg_nu_pre = Some;
ksp.set_from_options?;
Features:
- Flexible Combinations: Mix any preconditioner types in sequence
- Automatic Setup: Transparent handling of composite preconditioner construction
- Parameter Inheritance: Specialized parameters apply to respective stages
- Performance Tuning: Optimize combinations via ParameterTuner
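One way to read "sequential application" is that each stage acts on the output of the previous one, z = M_k⁻¹ ... M_1⁻¹ r. The sketch below implements that interpretation with boxed closures. Both the interpretation and the names (`PcApply`, `PcChain`, `diagonal_scaling`) are assumptions for illustration; the README does not specify how kryst composes the stages.

```rust
/// A preconditioner application: maps a residual r to z = M^{-1} r.
pub type PcApply = Box<dyn Fn(&[f64]) -> Vec<f64>>;

/// Hypothetical chain that feeds the output of each stage into the next.
pub struct PcChain {
    stages: Vec<PcApply>,
}

impl PcChain {
    pub fn new(stages: Vec<PcApply>) -> Self {
        PcChain { stages }
    }

    /// Apply all stages in order; an empty chain acts as the identity.
    pub fn apply(&self, r: &[f64]) -> Vec<f64> {
        self.stages.iter().fold(r.to_vec(), |z, stage| stage(&z[..]))
    }
}

/// Example stage: elementwise scaling by a stored inverse diagonal.
pub fn diagonal_scaling(inv_diag: Vec<f64>) -> PcApply {
    Box::new(move |r: &[f64]| inv_diag.iter().zip(r).map(|(d, ri)| d * ri).collect())
}
```

Note that a chain whose stages change between iterations is a variable preconditioner, which is the situation flexible methods such as FGMRES are designed for.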
Domain Decomposition
- ASM: Additive Schwarz Method with configurable overlap
- Approximate Inverse: SPAI-type sparse approximate inverse
Performance Features
Parallelization
- Shared Memory: Rayon-based parallel execution for matrix operations and preconditioner application
- Distributed Memory: MPI support for distributed linear algebra operations (via mpi feature)
- SIMD Optimization: Leverages hardware acceleration through optimized inner kernels via faer
- Parallel Preconditioners: Thread-safe preconditioner application with work stealing
Memory Management
- In-place Operations: Minimizes memory allocations during iteration
- Workspace Reuse: Preallocated workspace vectors for Krylov methods
- Block Operations: Efficient cache usage through blocked algorithms
- Sparse Patterns: Memory-efficient storage for sparse matrices and preconditioners
Algorithm Optimizations
- Eigenvalue Estimation: Fast power iteration for Chebyshev eigenvalue bounds
- Adaptive Restart: GMRES restart optimization based on convergence behavior
- Early Termination: Configurable stopping criteria with multiple tolerance options
- Matrix Preprocessing: Reordering and scaling for improved conditioning
Matrix Support
Dense Matrices
- Full support via faer::Mat<T> integration
- Optimized BLAS-level operations
- Support for f32, f64 precision
- Efficient dense matrix-vector products
Sparse Matrices
- Custom CSR format implementation
- Efficient sparse matrix-vector products
- Pattern-based optimization for preconditioners
- Memory-efficient storage with configurable sparsity patterns
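For readers unfamiliar with CSR, the following self-contained sketch shows the three-array layout (row_ptr, col_idx, values) and the sparse matrix-vector product built on it. The `Csr` struct here is a stand-in written for this example; kryst's own CSR type and its method names may differ.

```rust
/// Minimal CSR matrix: row i owns the entries in
/// col_idx[row_ptr[i]..row_ptr[i + 1]] and the matching slice of values.
pub struct Csr {
    pub row_ptr: Vec<usize>,
    pub col_idx: Vec<usize>,
    pub values: Vec<f64>,
}

impl Csr {
    /// Compute y = A x, visiting only the stored nonzeros.
    pub fn matvec(&self, x: &[f64]) -> Vec<f64> {
        self.row_ptr
            .windows(2)
            .map(|w| {
                let (start, end) = (w[0], w[1]);
                self.col_idx[start..end]
                    .iter()
                    .zip(&self.values[start..end])
                    .map(|(&j, &v)| v * x[j])
                    .sum::<f64>()
            })
            .collect()
    }
}
```

Each output row is independent of the others, which is what makes this kernel straightforward to parallelize across rows.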
Matrix-Free Methods
- Trait-based MatVec interface for custom matrix implementations
- Support for implicit matrix representations
- Easy integration of matrix-free operators
- Efficient for PDE discretizations and other structured problems
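A matrix-free operator applies its stencil directly without ever storing matrix entries. The sketch below does this for a 1D Laplacian with the stencil [-1, 2, -1] and zero Dirichlet boundaries. The `LinearOperator` trait is defined locally for this example and is only a stand-in; the actual signature of kryst's MatVec trait is not shown in this README.

```rust
/// Stand-in operator trait for this example (not kryst's MatVec).
pub trait LinearOperator {
    fn size(&self) -> usize;
    /// Compute y = A x.
    fn apply(&self, x: &[f64], y: &mut [f64]);
}

/// 1D Laplacian on `n` interior points, never assembled as a matrix.
pub struct Laplacian1d {
    pub n: usize,
}

impl LinearOperator for Laplacian1d {
    fn size(&self) -> usize {
        self.n
    }

    fn apply(&self, x: &[f64], y: &mut [f64]) {
        assert_eq!(x.len(), self.n);
        assert_eq!(y.len(), self.n);
        for i in 0..self.n {
            // Neighbors outside the domain are zero (Dirichlet boundary).
            let left = if i > 0 { x[i - 1] } else { 0.0 };
            let right = if i + 1 < self.n { x[i + 1] } else { 0.0 };
            y[i] = 2.0 * x[i] - left - right;
        }
    }
}
```

The operator costs O(n) time and no matrix storage, which is the main appeal for stencil-based PDE discretizations.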
Examples and Demonstrations
The library includes comprehensive demonstration programs:
Basic Usage Examples
# Options and CLI interface demonstration
# Direct solver usage
# Matrix market file demonstration
Advanced Feature Examples
# Convergence behavior analysis
# Iteration monitoring demonstration
# HYPRE-style ILU demonstration
# MPI parallel examples (requires MPI)
Note: Matrix Market example files (*.mtx) are excluded from the published crate to stay within size limits. The matrix_market_demo example will auto-generate test data if example files are not found.
Command-line Examples
# Enhanced Chebyshev preconditioning
# AMG with custom smoothing parameters
# Composite preconditioning with PC-chaining
# High-precision direct solve
# Complex preconditioner combinations
Benchmarks and Performance
Performance benchmarks are available via:
Benchmark categories include:
- Solver Comparison: GMRES vs BiCGStab vs CG performance on various problems
- Preconditioner Effectiveness: Impact of different preconditioners on convergence
- Direct vs Iterative: Performance comparison for different problem sizes
- Parallel Scaling: Shared-memory (Rayon) and distributed-memory (MPI) performance
- Phase III Features: PC-chaining and enhanced preconditioning performance
- Memory Usage: Workspace allocation and memory efficiency analysis
Sample benchmark results (varies by system and problem):
solver_comparison/gmres time: 45.2 ms (convergence: 23 iterations)
solver_comparison/bicgstab time: 38.7 ms (convergence: 31 iterations)
solver_comparison/cg time: 22.1 ms (convergence: 18 iterations)
pc_effectiveness/jacobi time: 156 ms (convergence: 89 iterations)
pc_effectiveness/amg time: 67.3 ms (convergence: 12 iterations)
pc_chaining/jacobi+cheby time: 43.8 ms (convergence: 15 iterations)
Custom Extensions
Custom Solvers
use ;
Custom Preconditioners
use ;
Matrix-Free Operators
use MatVec;
use KError;
// Usage with KspContext
use Arc;
let laplacian = new;
let mut ksp = new;
ksp.set_type?
.set_pc_type?
.set_operators;
// Can use matrix-free operator directly
let rhs = vec!;
let mut sol = vec!;
ksp.setup?;
let stats = ksp.solve?;
Documentation and Resources
- API Documentation - Complete API reference with examples
- Repository - Source code, issues, and discussions
- Examples Directory - Comprehensive demonstration programs
- Benchmarks - Performance comparison suite
- Phase III/IV Summary - Advanced preconditioning and automation features
Mathematical References
- Saad, Y. (2003). Iterative Methods for Sparse Linear Systems, 2nd Edition. SIAM.
- Barrett, R. et al. (1994). Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods. SIAM.
- Trefethen, L.N. & Bau, D. (1997). Numerical Linear Algebra. SIAM.
- Briggs, W.L., Henson, V.E. & McCormick, S.F. (2000). A Multigrid Tutorial, 2nd Edition. SIAM.
Software References
- PETSc Documentation: https://petsc.org/release/documentation/
- Trilinos Documentation: https://trilinos.github.io/
Testing and Validation
Run the comprehensive test suite:
# All tests
# Specific test categories
# Integration tests
# With specific features
# Performance testing
Test Coverage
- Unit Tests: 200+ individual component tests across solvers, preconditioners, and utilities
- Integration Tests: End-to-end validation including monitor integration and parameter tuning
- Options Tests: CLI parsing and configuration validation
- Feature Tests: Advanced functionality validation (PC-chaining, monitoring, tuning)
- Performance Tests: Benchmark validation and regression testing
Migration Guide
From Version 0.x to 1.0
New Features:
- Enhanced Chebyshev preconditioner with eigenvalue estimation
- AMG with configurable pre/post smoothing parameters
- PC-chaining for composite preconditioning
- Iteration monitoring and automated parameter tuning
- Expanded CLI options (50+ parameters)
Breaking Changes:
- None! Version 1.0 maintains full backward compatibility
Recommended Upgrades:
// Old approach
ksp.set_pc_type?;
// Enhanced approach (optional)
let mut pc_opts = default;
pc_opts.chebyshev_degree = Some;
ksp.set_pc_type?;
New Monitoring Capabilities:
// Add iteration monitoring
use IterationMonitor;
let mut monitor = new;
ksp.add_monitor;
// Add automated parameter tuning
use ParameterTuner;
let mut tuner = new;
let = tuner.tune_parameters.unwrap;
License
This project is licensed under the MIT License - see the LICENSE file for details.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
Development Setup
- Clone the repository
- Install Rust (stable toolchain recommended)
- Optional: Install MPI for distributed features (Ubuntu/Debian or macOS)
- Run tests and benchmarks
Areas for Contribution
High Priority
- GPU Acceleration: CUDA/OpenCL backends for matrix operations
- Additional Solvers: LOBPCG, IDR(s), BiCGStab(l) variants
- Matrix Formats: Coordinate (COO), block sparse (BSR) formats
- Performance: SIMD optimizations, better cache utilization
Medium Priority
- Multigrid Variants: Classical AMG, smoothed aggregation
- Eigenvalue Solvers: Integration with Krylov eigenvalue methods
- Nonlinear Solvers: Newton-Krylov, JFNK methods
- Adaptive Methods: Adaptive restart, dynamic tolerance adjustment
Lower Priority
- Complex Arithmetic: Complex-valued linear systems support
- Mixed Precision: fp16/fp32/fp64 combinations for accuracy/performance tradeoffs
- Advanced I/O: HDF5, NetCDF matrix I/O support
- Visualization: Integration with plotting libraries for convergence analysis
Code Style and Standards
- Follow Rust standard formatting: cargo fmt
- Ensure clippy compliance: cargo clippy
- Add comprehensive tests for new features
- Include benchmark tests for performance-critical code
- Document public APIs with examples
- Follow semantic versioning for releases
Pull Request Process
- Fork the repository
- Create a feature branch: git checkout -b feature/amazing-feature
- Make your changes and add tests
- Ensure all tests pass: cargo test
- Run formatting and linting: cargo fmt && cargo clippy
- Commit your changes: git commit -m 'Add amazing feature'
- Push to the branch: git push origin feature/amazing-feature
- Open a Pull Request with a clear description
kryst provides a comprehensive, high-performance linear algebra toolkit for the Rust ecosystem, with particular focus on iterative methods for large-scale scientific computing applications. The library combines the mathematical rigor of established numerical libraries like PETSc with the safety and performance characteristics of Rust, making it ideal for research, scientific computing, and production applications requiring robust linear system solvers.