kryst
High-performance Krylov subspace and preconditioned iterative solvers for dense and sparse linear systems, with advanced preconditioning strategies and automated parameter optimization.
Features
Iterative Solvers
- Krylov Methods: CG, PCG, GMRES, FGMRES, BiCGStab, CGS, QMR, TFQMR, MINRES, CGNR
- Direct Methods: LU and QR factorization via PREONLY solver type
- Parallel Support: Shared-memory (Rayon) and distributed-memory (MPI) parallelism
Preconditioners
Basic Preconditioners
- Jacobi: Diagonal scaling preconditioner
- Block Jacobi: Block-wise diagonal preconditioning
- SOR/SSOR: Successive Over-Relaxation methods
- None: No preconditioning (identity)
Incomplete Factorizations
- ILU(0): Zero fill-in incomplete LU factorization
- ILU(k): Incomplete LU with k levels of fill-in
- ILUT: Threshold-based incomplete LU factorization
- ILUTP: ILUT with partial pivoting
- ILUP: Incomplete LU with partial pivoting
Advanced Preconditioners
- Chebyshev: Enhanced polynomial preconditioning with eigenvalue estimation
- AMG: Algebraic Multigrid with configurable smoothing parameters
- ASM: Additive Schwarz Method (domain decomposition)
- Approximate Inverse: SPAI-type approximate inverse preconditioners
Composite Preconditioning (Phase III)
- PC-Chaining: Sequential application of multiple preconditioners
- Enhanced Chebyshev: Matrix-aware polynomial preconditioning with automatic eigenvalue bounds
- Smoothed AMG: Configurable pre- and post-smoothing parameters
Monitoring & Automation (Phase IV)
- Iteration Monitoring: Real-time convergence tracking and analysis
- Parameter Tuning: Automated optimization with grid search
- Data Export: CSV/JSON output for external analysis
- Performance Metrics: Time-based optimization with configurable timeouts
Architecture
- PETSc-style API: Unified KSP context for runtime solver selection
- Command-line Options: Complete options database with 50+ parameters
- Trait-based Design: Extensible for custom matrices and preconditioners
- Memory Efficiency: In-place operations and configurable workspace management
- High Performance: Optimized inner kernels with SIMD and parallelization
Installation
Add to your Cargo.toml:
[dependencies]
kryst = "1.0"
Feature Flags
[features]
default = ["rayon", "logging"] # Shared-memory parallelism + monitoring
rayon = ["dep:rayon"] # Rayon-based parallel execution
mpi = ["dep:mpi"] # Distributed-memory parallelism via MPI
logging = ["dep:log"] # Iteration monitoring and profiling
Quick Start
Basic Usage with KspContext (Recommended)
use ;
use Mat;
// Create a 100x100 test system
let n = 100;
let matrix = from_fn;
let rhs = vec!;
let mut solution = vec!;
// Configure solver and preconditioner
let mut ksp = new;
ksp.set_type.unwrap
.set_pc_type.unwrap;
ksp.rtol = 1e-8;
ksp.maxits = 1000;
// Solve the system
ksp.setup.unwrap;
let stats = ksp.solve.unwrap;
println!;
Advanced Features: Composite Preconditioning
use ;
let mut ksp = new;
ksp.set_type.unwrap;
// Use PC-chaining for composite preconditioning
let mut pc_opts = default;
pc_opts.pc_chain = Some;
pc_opts.chebyshev_degree = Some;
ksp.set_pc_options;
ksp.setup.unwrap;
let stats = ksp.solve.unwrap;
Enhanced AMG with Smoothing
use ;
let mut ksp = new;
ksp.set_type.unwrap
.set_pc_type.unwrap;
// Configure AMG smoothing parameters
let mut pc_opts = default;
pc_opts.amg_levels = Some;
pc_opts.amg_strength_threshold = Some;
pc_opts.amg_nu_pre = Some; // Pre-smoothing steps
pc_opts.amg_nu_post = Some; // Post-smoothing steps
ksp.set_pc_options;
ksp.setup.unwrap;
let stats = ksp.solve.unwrap;
Iteration Monitoring and Analysis
use ;
use std::time::Duration;
// Monitor convergence behavior
let mut monitor = new;
// In practice, integrate monitor with solver iteration callbacks
// Automated parameter tuning
let mut tuner = new;
tuner.set_solver_types;
tuner.set_pc_types;
tuner.set_tolerances;
tuner.set_max_config_time;
let = tuner.tune_parameters.unwrap;
println!;
Command-line Interface (PETSc-style)
use ;
// Parse command-line options
let args: Vec<String> = std::env::args().collect();
let = parse_all_options.unwrap;
// Configure from options
let mut ksp = new;
ksp.set_from_all_options.unwrap;
ksp.setup.unwrap;
let stats = ksp.solve.unwrap;
Run your program with PETSc-style options:
# Basic solver configuration
# Direct solvers
# Advanced preconditioning
# Show all available options
Supported Command-line Options
KSP (Krylov Solver) Options
- -ksp_type <solver> - Solver type: cg, pcg, gmres, fgmres, bicgstab, cgs, qmr, tfqmr, minres, cgnr, preonly
- -ksp_rtol <float> - Relative convergence tolerance (default: 1e-6)
- -ksp_atol <float> - Absolute convergence tolerance (default: 1e-12)
- -ksp_dtol <float> - Divergence tolerance (default: 1e3)
- -ksp_max_it <int> - Maximum number of iterations (default: 1000)
- -ksp_gmres_restart <int> - GMRES restart parameter (default: 50)
- -ksp_pc_side <side> - Preconditioning side: left, right, symmetric
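The three tolerances above are often combined the way PETSc's default convergence test does it: stop successfully once the residual norm falls below max(rtol * ||r0||, atol), and give up once it grows past dtol * ||r0||. The sketch below assumes kryst follows that convention, which this README does not confirm; `Status` and `check_convergence` are hypothetical names used only for illustration.

```rust
/// Outcome of a single convergence check.
#[derive(Debug, PartialEq)]
enum Status {
    Converged,
    Diverged,
    Continue,
}

/// PETSc-style default test (an assumption, not taken from kryst's source):
/// converged when ||r|| <= max(rtol * ||r0||, atol),
/// diverged  when ||r|| >= dtol * ||r0||.
fn check_convergence(rnorm: f64, rnorm0: f64, rtol: f64, atol: f64, dtol: f64) -> Status {
    if rnorm <= (rtol * rnorm0).max(atol) {
        Status::Converged
    } else if rnorm >= dtol * rnorm0 {
        Status::Diverged
    } else {
        Status::Continue
    }
}
```

With the defaults listed above (rtol 1e-6, atol 1e-12, dtol 1e3) and an initial residual norm of 1.0, a residual of 1e-7 counts as converged and a residual of 2e3 as diverged.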
PC (Preconditioner) Options
Basic Preconditioner Options
- -pc_type <pc> - Preconditioner type: jacobi, blockjacobi, sor, none
Incomplete Factorization Options
- -pc_type <pc> - ILU variants: ilu0, ilu, ilut, ilutp, ilup
- -pc_ilu_levels <int> - ILU fill levels (default: 0)
- -pc_ilut_drop_tol <float> - ILUT drop tolerance (default: 1e-3)
- -pc_ilut_max_fill <int> - ILUT maximum fill per row (default: 10)
Enhanced Preconditioner Options (Phase III)
- -pc_type chebyshev - Enhanced Chebyshev with eigenvalue estimation
- -chebyshev_degree <int> - Polynomial degree (default: 3)
- -pc_type amg - Algebraic multigrid with smoothing control
- -amg_levels <int> - Number of AMG levels (default: 4)
- -amg_strength_threshold <float> - Strong connection threshold (default: 0.25)
- -amg_nu_pre <int> - Pre-smoothing steps (default: 1)
- -amg_nu_post <int> - Post-smoothing steps (default: 1)
Composite Preconditioning Options
- -pc_chain <string> - Sequential preconditioner chain (e.g., "jacobi,chebyshev")
- -pc_type asm - Additive Schwarz Method
- -pc_type approxinverse - Approximate inverse preconditioner
Domain Decomposition Options
- -asm_overlap <int> - ASM subdomain overlap (default: 1)
- -asm_type <type> - ASM variant: restrict, interpolate, basic
Usage Examples
# Enhanced Chebyshev preconditioning
# AMG with custom smoothing
# Composite preconditioning (PC-chaining)
# High-accuracy direct solve
# BiCGStab with threshold ILU
# GMRES with additive Schwarz
Monitoring and Automation
Iteration Monitoring (Phase IV)
Track solver convergence with real-time monitoring:
use ;
// Create and configure monitor
let mut monitor = new;
// Configure solver
let mut ksp = new;
ksp.set_type.unwrap
.set_pc_type.unwrap;
// Solve with monitoring
ksp.setup.unwrap;
let stats = ksp.solve.unwrap;
// Record iteration data (integrate with solver callbacks in practice)
for i in 0..stats.iterations
// Analyze convergence
let stats = monitor.get_statistics;
println!;
println!;
// Export data for analysis
monitor.write_to_csv.unwrap;
Automated Parameter Tuning (Phase IV)
Optimize solver/preconditioner combinations automatically:
use ;
use std::time::Duration;
let mut tuner = new;
// Configure search space
tuner.set_solver_types;
tuner.set_pc_types;
tuner.set_tolerances;
tuner.set_max_config_time;
// Add PC-chain configurations for composite preconditioning
tuner.add_pc_chain_config;
tuner.add_pc_chain_config;
// Run automated tuning
let = tuner.tune_parameters.unwrap;
println!;
println!;
println!;
println!;
if let Some = &best_config.pc_chain
// Export results for further analysis
tuner.export_results.unwrap;
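The idea behind grid-search tuning can be sketched independently of kryst's tuner. The version below is a standalone illustration, not the library's implementation: `Trial` and `grid_search` are hypothetical names, and the caller supplies a `run` closure that returns the measured solve time for one configuration, or `None` if the solve failed. Configurations that exceed the per-configuration time limit are discarded.

```rust
use std::time::Duration;

/// One candidate configuration together with its measured solve time.
#[derive(Debug, Clone, PartialEq)]
struct Trial {
    solver: String,
    pc: String,
    rtol: f64,
    elapsed: Duration,
}

/// Exhaustive grid search over (solver, pc, rtol) triples.
/// `run` reports the solve time of one configuration, or None on failure.
/// Returns the fastest configuration that stayed within `max_config_time`.
fn grid_search<F>(
    solvers: &[&str],
    pcs: &[&str],
    rtols: &[f64],
    max_config_time: Duration,
    mut run: F,
) -> Option<Trial>
where
    F: FnMut(&str, &str, f64) -> Option<Duration>,
{
    let mut best: Option<Trial> = None;
    for &solver in solvers {
        for &pc in pcs {
            for &rtol in rtols {
                // Skip failed solves and configurations over the time limit.
                let elapsed = match run(solver, pc, rtol) {
                    Some(t) if t <= max_config_time => t,
                    _ => continue,
                };
                let improved = match &best {
                    Some(b) => elapsed < b.elapsed,
                    None => true,
                };
                if improved {
                    best = Some(Trial {
                        solver: solver.to_string(),
                        pc: pc.to_string(),
                        rtol,
                        elapsed,
                    });
                }
            }
        }
    }
    best
}
```

In real use the closure would configure a solver, time one solve, and return the elapsed time. The search cost grows with the product of the three list lengths, which is why a per-configuration timeout matters.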
Advanced Monitoring Features
// Detect convergence stagnation
if monitor.detect_stagnation
// Get convergence rate analysis
let rate_analysis = monitor.analyze_convergence_rate;
match rate_analysis
// Set up real-time monitoring callbacks
let mut ksp = new;
ksp.add_monitor;
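Stagnation detection and convergence-rate analysis both reduce to simple checks on the recorded residual history. The following is a standalone sketch under assumed definitions, not kryst's monitor: `is_stagnating` flags a run whose residual shrank by less than a given fraction over a window of iterations, and `convergence_rate` reports the average per-iteration contraction factor. Both function names and the window/threshold parameters are hypothetical.

```rust
/// Returns true when the residual shrank by less than `min_reduction`
/// (as a fraction of its earlier value) over the last `window` iterations.
fn is_stagnating(residuals: &[f64], window: usize, min_reduction: f64) -> bool {
    if window == 0 || residuals.len() <= window {
        return false; // not enough history yet
    }
    let last = residuals[residuals.len() - 1];
    let earlier = residuals[residuals.len() - 1 - window];
    if earlier <= 0.0 {
        return false; // already at zero residual, nothing left to reduce
    }
    (earlier - last) / earlier < min_reduction
}

/// Average contraction factor per iteration: the geometric mean of
/// r_{k+1} / r_k over the whole history. Values well below 1 mean fast
/// convergence; values near 1 mean slow convergence.
fn convergence_rate(residuals: &[f64]) -> Option<f64> {
    if residuals.len() < 2 {
        return None;
    }
    let first = residuals[0];
    let last = residuals[residuals.len() - 1];
    if first <= 0.0 || last <= 0.0 {
        return None;
    }
    let steps = (residuals.len() - 1) as f64;
    Some((last / first).powf(1.0 / steps))
}
```

A history such as 1.0, 0.5, 0.4999, 0.4998 is flagged as stagnating with a window of 2 and a 1% threshold, while a history that drops by a factor of ten each step is not.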
Profiling and Performance Analysis
Enable detailed timing and performance information:
[dependencies]
kryst = { version = "1.0", features = ["logging"] }
Run with environment variables for detailed profiling:
# Trace-level logging shows detailed stage timing
RUST_LOG=trace
# Debug-level shows major operations
RUST_LOG=debug
# Info-level shows high-level progress
RUST_LOG=info
Profiling output includes:
- KSPSetup: Preconditioner setup and workspace allocation timing
- KSPSolve: Complete solve time breakdown
- PCSetup: Individual preconditioner setup timing
- WorkspaceAllocation: Memory allocation timing
- MatVec: Matrix-vector product timing
- PCApply: Preconditioner application timing
Solver Algorithms
Krylov Methods
- CG: Conjugate Gradient for symmetric positive definite systems
- PCG: Preconditioned Conjugate Gradient
- GMRES: Generalized Minimal Residual with restart
- FGMRES: Flexible GMRES for variable preconditioning
- BiCGStab: BiConjugate Gradient Stabilized for nonsymmetric systems
- CGS: Conjugate Gradient Squared
- QMR: Quasi-Minimal Residual method
- TFQMR: Transpose-Free QMR
- MINRES: Minimal Residual for symmetric indefinite systems
- CGNR: Conjugate Gradient on the Normal Equations
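As a concrete reference point for the simplest method in this list, here is a minimal, standalone Conjugate Gradient sketch for a dense symmetric positive definite matrix stored as rows. It is the textbook algorithm, not kryst's implementation, and the function names are chosen only for this example.

```rust
/// Dense matrix-vector product y = A x.
fn matvec(a: &[Vec<f64>], x: &[f64]) -> Vec<f64> {
    a.iter()
        .map(|row| row.iter().zip(x).map(|(aij, xj)| aij * xj).sum::<f64>())
        .collect()
}

/// Euclidean inner product.
fn dot(u: &[f64], v: &[f64]) -> f64 {
    u.iter().zip(v).map(|(a, b)| a * b).sum()
}

/// Unpreconditioned CG for a symmetric positive definite `a`, starting from
/// x = 0. Stops when ||r|| <= rtol * ||b|| or after `max_it` iterations.
/// Returns the approximate solution and the number of iterations performed.
fn conjugate_gradient(a: &[Vec<f64>], b: &[f64], rtol: f64, max_it: usize) -> (Vec<f64>, usize) {
    let n = b.len();
    let mut x = vec![0.0; n];
    let mut r = b.to_vec(); // residual r = b - A*0 = b
    let mut p = r.clone(); // first search direction
    let mut rs_old = dot(&r, &r);
    let b_norm = rs_old.sqrt();
    for k in 0..max_it {
        if rs_old.sqrt() <= rtol * b_norm {
            return (x, k);
        }
        let ap = matvec(a, &p);
        let alpha = rs_old / dot(&p, &ap); // step length along p
        for i in 0..n {
            x[i] += alpha * p[i];
            r[i] -= alpha * ap[i];
        }
        let rs_new = dot(&r, &r);
        let beta = rs_new / rs_old; // keeps directions A-conjugate
        for i in 0..n {
            p[i] = r[i] + beta * p[i];
        }
        rs_old = rs_new;
    }
    (x, max_it)
}
```

For the 2x2 system with rows (4, 1) and (1, 3) and right-hand side (1, 2), the exact solution is (1/11, 7/11). PCG differs only in applying a preconditioner to the residual before forming the search direction.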
Direct Methods
- PREONLY: Single-step direct solve using LU or QR factorization
- Supports both -pc_type lu and -pc_type qr
- Ideal for well-conditioned systems where direct methods are preferred
Preconditioner Details
Basic Preconditioners
- Jacobi: Diagonal scaling, M⁻¹ = diag(A)⁻¹
- Block Jacobi: Block-wise diagonal preconditioning with configurable block sizes
- SOR/SSOR: Successive Over-Relaxation with configurable relaxation parameter
- None: Identity preconditioning (no preconditioning)
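Applying these basic preconditioners amounts to a few lines each. The sketch below is standalone and uses dense row storage for brevity. It shows the Jacobi formula above and a single forward SOR sweep, which solves (D/ω + L) z = r where D is the diagonal and L the strictly lower triangle of A. The function names are illustrative and are not kryst's API.

```rust
/// Jacobi preconditioner: z = diag(A)^{-1} r.
fn jacobi_apply(a: &[Vec<f64>], r: &[f64]) -> Vec<f64> {
    r.iter().enumerate().map(|(i, ri)| ri / a[i][i]).collect()
}

/// One forward SOR sweep used as a preconditioner: solves
/// (D/omega + L) z = r by forward substitution. omega = 1 gives Gauss-Seidel.
fn sor_apply(a: &[Vec<f64>], r: &[f64], omega: f64) -> Vec<f64> {
    let n = r.len();
    let mut z = vec![0.0; n];
    for i in 0..n {
        // Contribution of the already-computed entries z[0..i].
        let lower: f64 = (0..i).map(|j| a[i][j] * z[j]).sum();
        z[i] = omega * (r[i] - lower) / a[i][i];
    }
    z
}
```

Both routines assume a nonzero diagonal. SSOR would follow the forward sweep with a matching backward sweep over the upper triangle.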
Incomplete Factorizations
- ILU(0): Zero fill-in incomplete LU factorization
- ILU(k): Incomplete LU with k levels of fill-in
- ILUT: ILU with threshold-based dropping strategy
- ILUTP: ILUT with partial pivoting for numerical stability
- ILUP: Incomplete LU with partial pivoting
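To make "zero fill-in" concrete, here is a minimal standalone ILU(0) sketch. It stores the matrix densely for readability (a real implementation would work on a sparse format) and is not taken from kryst. Gaussian elimination proceeds as usual, except that any position that was zero in the original matrix is never updated, so the factors keep the sparsity pattern of A.

```rust
/// In-place ILU(0). On return the strict lower triangle holds L (unit
/// diagonal implied) and the upper triangle, including the diagonal, holds U.
/// Positions that are zero in the input are left untouched (no fill-in).
fn ilu0(a: &mut [Vec<f64>]) {
    let n = a.len();
    // Freeze the sparsity pattern before elimination changes any values.
    let pattern: Vec<Vec<bool>> = a
        .iter()
        .map(|row| row.iter().map(|&v| v != 0.0).collect())
        .collect();
    for i in 1..n {
        for k in 0..i {
            if !pattern[i][k] {
                continue;
            }
            let pivot = a[k][k];
            a[i][k] /= pivot; // multiplier l_ik
            let lik = a[i][k];
            for j in (k + 1)..n {
                if pattern[i][j] {
                    let ukj = a[k][j];
                    a[i][j] -= lik * ukj;
                }
            }
        }
    }
}
```

For a tridiagonal matrix, exact LU produces no fill-in, so ILU(0) coincides with the exact factorization. ILU(k) and ILUT relax the fixed pattern by allowing fill up to a level or above a drop tolerance.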
Advanced Preconditioners
Enhanced Chebyshev (Phase III)
- Matrix-aware: Automatic eigenvalue bound estimation using power iteration
- Configurable Degree: Polynomial degree optimization
- Storage Efficient: Reuses matrix storage for eigenvalue computation
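Power iteration, mentioned above as the source of the eigenvalue bounds, can be sketched in a few lines. This standalone version uses a dense matrix and a fixed iteration count, and it estimates only the largest-magnitude eigenvalue. How kryst derives the lower bound is not stated here; a common heuristic is to take a fixed fraction of the upper estimate, but that is an assumption.

```rust
/// Estimates the largest-magnitude eigenvalue of `a` by power iteration,
/// returning the Rayleigh quotient v^T A v from the final step.
fn power_iteration(a: &[Vec<f64>], iters: usize) -> f64 {
    let n = a.len();
    let mut v = vec![1.0 / (n as f64).sqrt(); n]; // unit-norm start vector
    let mut lambda = 0.0;
    for _ in 0..iters {
        // w = A v
        let w: Vec<f64> = a
            .iter()
            .map(|row| row.iter().zip(&v).map(|(aij, vj)| aij * vj).sum::<f64>())
            .collect();
        // Rayleigh quotient; valid because v has unit norm.
        lambda = w.iter().zip(&v).map(|(wi, vi)| wi * vi).sum::<f64>();
        let norm = w.iter().map(|wi| wi * wi).sum::<f64>().sqrt();
        if norm == 0.0 {
            break; // A v = 0: v lies in the null space
        }
        v = w.into_iter().map(|wi| wi / norm).collect();
    }
    lambda
}
```

For the symmetric matrix with rows (4, 1) and (1, 3), the estimate approaches (7 + √5)/2. Convergence slows when the two largest eigenvalues are close in magnitude, so a safety factor on the estimate is common in practice.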
Enhanced AMG (Phase III)
- Smoothed Multigrid: Configurable pre- and post-smoothing parameters
- Adaptive Coarsening: Automatic grid hierarchy construction
- Strength Threshold: Customizable strong connection criteria
Composite Preconditioning (Phase III)
- PC-Chaining: Sequential application of multiple preconditioners
- Flexible Combinations: Mix any preconditioner types (e.g., "jacobi,amg,chebyshev")
- Automatic Setup: Transparent handling of composite preconditioner construction
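One plausible reading of "sequential application" is that each stage receives the output of the previous one, giving z = M_k⁻¹ ... M_2⁻¹ M_1⁻¹ r. The sketch below illustrates that composition with a hypothetical `Precond` trait and two toy stages. None of these types are kryst's, and the library may combine stages differently.

```rust
/// Hypothetical minimal preconditioner interface (not kryst's actual trait).
trait Precond {
    fn apply(&self, r: &[f64]) -> Vec<f64>;
}

/// Jacobi scaling with a precomputed inverse diagonal.
struct Jacobi {
    inv_diag: Vec<f64>,
}

impl Precond for Jacobi {
    fn apply(&self, r: &[f64]) -> Vec<f64> {
        r.iter().zip(&self.inv_diag).map(|(ri, di)| ri * di).collect()
    }
}

/// Uniform scaling, standing in for an arbitrary second stage.
struct Scale(f64);

impl Precond for Scale {
    fn apply(&self, r: &[f64]) -> Vec<f64> {
        r.iter().map(|ri| ri * self.0).collect()
    }
}

/// A chain is itself a preconditioner: it feeds the output of each stage
/// into the next, in the order the stages were listed.
struct Chain {
    stages: Vec<Box<dyn Precond>>,
}

impl Precond for Chain {
    fn apply(&self, r: &[f64]) -> Vec<f64> {
        self.stages.iter().fold(r.to_vec(), |z, pc| pc.apply(&z))
    }
}
```

Because `Chain` implements the same trait as its stages, chains can be nested or passed anywhere a single preconditioner is expected.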
Domain Decomposition
- ASM: Additive Schwarz Method with configurable overlap
- Approximate Inverse: SPAI-type sparse approximate inverse
Performance Features
Parallelization
- Shared Memory: Rayon-based parallel execution for matrix operations and preconditioner application
- Distributed Memory: MPI support for distributed linear algebra operations (via mpi feature)
- SIMD Optimization: Leverages hardware acceleration through optimized inner kernels via faer
- Parallel Preconditioners: Thread-safe preconditioner application with work stealing
Memory Management
- In-place Operations: Minimizes memory allocations during iteration
- Workspace Reuse: Preallocated workspace vectors for Krylov methods
- Block Operations: Efficient cache usage through blocked algorithms
- Sparse Patterns: Memory-efficient storage for sparse matrices and preconditioners
Algorithm Optimizations
- Eigenvalue Estimation: Fast power iteration for Chebyshev eigenvalue bounds
- Adaptive Restart: GMRES restart optimization based on convergence behavior
- Early Termination: Configurable stopping criteria with multiple tolerance options
- Matrix Preprocessing: Reordering and scaling for improved conditioning
Matrix Support
Dense Matrices
- Full support via faer::Mat<T> integration
- Optimized BLAS-level operations
- Support for f32, f64 precision
- Efficient dense matrix-vector products
Sparse Matrices
- Custom CSR format implementation
- Efficient sparse matrix-vector products
- Pattern-based optimization for preconditioners
- Memory-efficient storage with configurable sparsity patterns
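For readers unfamiliar with CSR, the sketch below shows the standard three-array layout and the matrix-vector product built on it. It is a generic illustration; the struct and field names are assumptions and need not match kryst's own CSR type.

```rust
/// Compressed Sparse Row storage.
struct Csr {
    nrows: usize,
    row_ptr: Vec<usize>, // length nrows + 1; row i owns entries row_ptr[i]..row_ptr[i + 1]
    col_idx: Vec<usize>, // column index of each stored entry
    values: Vec<f64>,    // value of each stored entry
}

impl Csr {
    /// Sparse matrix-vector product y = A x, touching only stored entries.
    fn matvec(&self, x: &[f64]) -> Vec<f64> {
        (0..self.nrows)
            .map(|i| {
                (self.row_ptr[i]..self.row_ptr[i + 1])
                    .map(|k| self.values[k] * x[self.col_idx[k]])
                    .sum::<f64>()
            })
            .collect()
    }
}
```

The 3x3 matrix with rows (4, 0, 1), (0, 3, 0), (2, 0, 5) is stored as row_ptr = [0, 2, 3, 5], col_idx = [0, 2, 1, 0, 2], values = [4, 1, 3, 2, 5]. Each output row is independent, which is what makes the product easy to parallelize.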
Matrix-Free Methods
- Trait-based MatVec interface for custom matrix implementations
- Support for implicit matrix representations
- Easy integration of matrix-free operators
- Efficient for PDE discretizations and other structured problems
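A matrix-free operator only needs to know how to compute y = A x. The sketch below applies the 1D Laplacian stencil (-1, 2, -1) with zero boundary values without storing any matrix entries. The `LinearOperator` trait is a hypothetical stand-in; the signature of kryst's real `MatVec` trait may differ.

```rust
/// Hypothetical operator trait; kryst's actual `MatVec` signature may differ.
trait LinearOperator {
    /// Writes A x into y.
    fn apply(&self, x: &[f64], y: &mut [f64]);
}

/// 1D Laplacian with stencil [-1, 2, -1] and homogeneous Dirichlet
/// boundaries, applied on the fly.
struct Laplacian1d {
    n: usize,
}

impl LinearOperator for Laplacian1d {
    fn apply(&self, x: &[f64], y: &mut [f64]) {
        for i in 0..self.n {
            // Neighbors outside the domain are treated as zero.
            let left = if i > 0 { x[i - 1] } else { 0.0 };
            let right = if i + 1 < self.n { x[i + 1] } else { 0.0 };
            y[i] = 2.0 * x[i] - left - right;
        }
    }
}
```

Storage is O(1) regardless of problem size, and a Krylov solver written against such a trait never needs to know whether the operator is a stored matrix or a stencil.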
Examples and Demonstrations
The library includes comprehensive demonstration programs:
Basic Usage Examples
# Options and CLI interface demonstration
# Direct solver usage
# Setup and workspace reuse patterns
Advanced Feature Examples
# Convergence behavior analysis
# Matrix market file I/O (auto-generates test data if files not found)
# Iteration monitoring demonstration
# MPI parallel examples (requires MPI)
Note: Large Matrix Market example files (*.mtx) are excluded from the published crate to stay within size limits. The matrix_market_demo example will auto-generate test data if the example files are not found. For the complete Matrix Market example files, clone the repository from GitHub.
Command-line Examples
# Enhanced Chebyshev preconditioning
# AMG with custom smoothing parameters
# Composite preconditioning with PC-chaining
# High-precision direct solve
# Complex preconditioner combinations
Benchmarks and Performance
Performance benchmarks are available via:
Benchmark categories include:
- Solver Comparison: GMRES vs BiCGStab vs CG performance on various problems
- Preconditioner Effectiveness: Impact of different preconditioners on convergence
- Direct vs Iterative: Performance comparison for different problem sizes
- Parallel Scaling: Shared-memory (Rayon) and distributed-memory (MPI) performance
- Phase III Features: PC-chaining and enhanced preconditioning performance
- Memory Usage: Workspace allocation and memory efficiency analysis
Sample benchmark results (these vary by system and problem):
solver_comparison/gmres time: 45.2 ms (convergence: 23 iterations)
solver_comparison/bicgstab time: 38.7 ms (convergence: 31 iterations)
solver_comparison/cg time: 22.1 ms (convergence: 18 iterations)
pc_effectiveness/jacobi time: 156 ms (convergence: 89 iterations)
pc_effectiveness/amg time: 67.3 ms (convergence: 12 iterations)
pc_chaining/jacobi+cheby time: 43.8 ms (convergence: 15 iterations)
Custom Extensions
Custom Solvers
use ;
Custom Preconditioners
use ;
Matrix-Free Operators
use ;
// Usage with KspContext
let laplacian = LaplacianOperator ;
let mut ksp = new;
ksp.set_type.unwrap
.set_pc_type.unwrap;
// Can use matrix-free operator directly
// ksp.setup(&laplacian, 1000).unwrap();
Documentation and Resources
- API Documentation - Complete API reference with examples
- Repository - Source code, issues, and discussions
- Examples Directory - Comprehensive demonstration programs
- Benchmarks - Performance comparison suite
- Phase III/IV Summary - Advanced preconditioning and automation features
Mathematical References
- Saad, Y. (2003). Iterative Methods for Sparse Linear Systems, 2nd Edition. SIAM.
- Barrett, R. et al. (1994). Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods. SIAM.
- Trefethen, L.N. & Bau, D. (1997). Numerical Linear Algebra. SIAM.
- Briggs, W.L., Henson, V.E. & McCormick, S.F. (2000). A Multigrid Tutorial, 2nd Edition. SIAM.
Software References
- PETSc Documentation: https://petsc.org/release/documentation/
- Trilinos Documentation: https://trilinos.github.io/
Testing and Validation
Run the comprehensive test suite:
# All tests
# Specific test categories
# Integration tests
# With specific features
# Performance testing
Test Coverage
- Unit Tests: 148+ individual component tests
- Integration Tests: End-to-end solver and preconditioner validation
- Options Tests: CLI parsing and configuration validation
- Phase Tests: Advanced feature validation (PC-chaining, monitoring, tuning)
- Performance Tests: Benchmark validation and regression testing
Migration Guide
From Version 0.x to 1.0
New Features:
- Enhanced Chebyshev preconditioner with eigenvalue estimation
- AMG with configurable pre/post smoothing parameters
- PC-chaining for composite preconditioning
- Iteration monitoring and automated parameter tuning
- Expanded CLI options (50+ parameters)
Breaking Changes:
- None! Version 1.0 maintains full backward compatibility
Recommended Upgrades:
// Old approach
ksp.set_pc_type.unwrap;
// Enhanced approach (optional)
let mut pc_opts = default;
pc_opts.chebyshev_degree = Some;
ksp.set_pc_options;
License
This project is licensed under the MIT License - see the LICENSE file for details.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
Development Setup
- Clone the repository:
- Install Rust (stable toolchain recommended):
- Optional: Install MPI for distributed features (Ubuntu/Debian, macOS):
- Run tests and benchmarks:
Areas for Contribution
High Priority
- GPU Acceleration: CUDA/OpenCL backends for matrix operations
- Additional Solvers: LOBPCG, IDR(s), BiCGStab(l) variants
- Matrix Formats: Coordinate (COO), block sparse (BSR) formats
- Performance: SIMD optimizations, better cache utilization
Medium Priority
- Multigrid Variants: Classical AMG, smoothed aggregation
- Eigenvalue Solvers: Integration with Krylov eigenvalue methods
- Nonlinear Solvers: Newton-Krylov, JFNK methods
- Adaptive Methods: Adaptive restart, dynamic tolerance adjustment
Lower Priority
- Complex Arithmetic: Complex-valued linear systems support
- Mixed Precision: fp16/fp32/fp64 combinations for accuracy/performance tradeoffs
- Advanced I/O: HDF5, NetCDF matrix I/O support
- Visualization: Integration with plotting libraries for convergence analysis
Code Style and Standards
- Follow Rust standard formatting: cargo fmt
- Ensure clippy compliance: cargo clippy
- Add comprehensive tests for new features
- Include benchmark tests for performance-critical code
- Document public APIs with examples
- Follow semantic versioning for releases
Pull Request Process
- Fork the repository
- Create a feature branch: git checkout -b feature/amazing-feature
- Make your changes and add tests
- Ensure all tests pass: cargo test
- Run formatting and linting: cargo fmt && cargo clippy
- Commit your changes: git commit -m 'Add amazing feature'
- Push to the branch: git push origin feature/amazing-feature
- Open a Pull Request with a clear description
kryst is a linear algebra toolkit for the Rust ecosystem, focused on iterative methods for large-scale scientific computing. It pairs a PETSc-style solver interface with Rust's safety and performance characteristics, and targets research, scientific computing, and production applications that need robust linear system solvers.