cp2k-rs 0.2.0

# CP2K-RS: Rust and Python bindings for CP2K

Rust bindings for the [CP2K](https://www.cp2k.org/) quantum chemistry package, with Python bindings via PyO3.

The `build-cp2k` feature compiles CP2K and all its key numerical dependencies (OpenBLAS, FFTW3, ScaLAPACK, DBCSR) **from source** and links them **statically** into the resulting binary. No pre-installed BLAS, LAPACK, FFTW3, or ScaLAPACK libraries are required on the build or target machine.

## Requirements

- Rust toolchain (install via [rustup](https://rustup.rs/))
- C and Fortran compilers: `gcc`, `gfortran`
- MPI with Fortran wrapper: `mpicc`, `mpifort` (OpenMPI or MPICH)
- CMake ≥ 3.16
- `curl` (used to download the FFTW3 release tarball during build)
- `git` (used to clone CP2K, DBCSR, ScaLAPACK)
- Python 3.8+ (for Python bindings)
- zlib (`zlib-devel` / `zlib1g-dev`)

BLAS, LAPACK, FFTW3, ScaLAPACK, libxc, and libint2 are **not** required as system packages — BLAS/LAPACK/FFTW3/ScaLAPACK are built and linked statically by `build.rs`, and libxc/libint2 are disabled in the CP2K build.
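
Before starting the long first build, it can help to confirm that the tools listed above are on `PATH`. The following is a minimal sketch using only the Rust standard library. `find_on_path` is a hypothetical helper, not part of cp2k-rs, and it only checks that a file with the tool's name exists. It does not verify the executable bit or tool versions, so the CMake ≥ 3.16 requirement is not checked.

```rust
use std::env;
use std::path::PathBuf;

/// Return the first `PATH` entry that contains a file named `tool`.
/// Hypothetical helper for illustration; checks existence only.
fn find_on_path(tool: &str) -> Option<PathBuf> {
    let path = env::var_os("PATH")?;
    env::split_paths(&path)
        .map(|dir| dir.join(tool))
        .find(|candidate| candidate.is_file())
}

fn main() {
    // Build tools named in the Requirements list.
    let tools = ["gcc", "gfortran", "mpicc", "mpifort", "cmake", "curl", "git"];
    let missing: Vec<&str> = tools
        .iter()
        .copied()
        .filter(|tool| find_on_path(tool).is_none())
        .collect();

    if missing.is_empty() {
        println!("all build tools found on PATH");
    } else {
        println!("missing from PATH: {}", missing.join(", "));
    }
}
```

If `mpicc` or `mpifort` is reported missing on Fedora-like systems, the MPI environment module has probably not been loaded in the current shell.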

### Installing Dependencies on Fedora/RHEL/Rocky Linux

```bash
sudo dnf install gcc gcc-gfortran openmpi-devel cmake make git curl \
    zlib-devel python3-devel python3-pip environment-modules

module load mpi/openmpi-x86_64
```

### Installing Dependencies on Ubuntu/Debian

```bash
sudo apt update
sudo apt install build-essential gfortran libopenmpi-dev cmake git curl \
    zlib1g-dev python3-dev python3-pip
```

## Building

> **First build takes 60–90 minutes** — it compiles OpenBLAS, FFTW3, ScaLAPACK, DBCSR, and CP2K from source. Subsequent builds are fast due to skip guards (each dependency is only rebuilt if its output archive is missing).
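
The skip-guard idea can be sketched as follows. This illustrates the pattern only and is not the crate's actual `build.rs`: the function name `needs_build`, the `target/deps-out/lib` directory, and the archive file names are assumptions made for the example.

```rust
use std::path::Path;

/// A dependency needs building only when its output archive is absent.
fn needs_build(archive: &Path) -> bool {
    !archive.is_file()
}

fn main() {
    // Hypothetical output location and archive names, for illustration only.
    let lib_dir = Path::new("target/deps-out").join("lib");
    for name in ["libfftw3.a", "libscalapack.a", "libdbcsr.a"] {
        let archive = lib_dir.join(name);
        if needs_build(&archive) {
            println!("{name}: archive missing, would build from source");
        } else {
            println!("{name}: archive present, skipping");
        }
    }
}
```

A guard of this kind looks only at whether the archive exists, not at whether its sources changed, so under that assumption a rebuild of one dependency is forced by deleting its archive.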

### Rust build (with MPI and extended DFT properties)

```bash
module load mpi/openmpi-x86_64   # required for mpicc / mpifort
cargo build --release --features mpi,extended,build-cp2k
```

### Python wheel (for PyPI / local install)

```bash
pip install maturin
module load mpi/openmpi-x86_64
maturin build --release    # features are set in pyproject.toml
pip install target/wheels/cp2k_rs-*.whl
```

### Python development install

```bash
module load mpi/openmpi-x86_64
maturin develop --release --features python,mpi,extended,build-cp2k
```

### Build control

| Environment variable | Effect |
|----------------------|--------|
| `NUM_JOBS=N`         | Parallel compile jobs (default: number of CPUs) |
| `CARGO_TARGET_DIR=/path` | Change build output directory (useful when `/tmp` is too small) |
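
As a rough illustration of the `NUM_JOBS` default described in the table, the sketch below resolves a job count from an optional string and falls back to the number of available CPUs. `resolve_jobs` is a hypothetical name, and treating unparsable or zero values like an unset variable is an assumption; the real `build.rs` may behave differently.

```rust
use std::env;
use std::thread;

/// Resolve the parallel job count from an optional `NUM_JOBS` value.
/// Falls back to the CPU count when the value is absent, unparsable, or zero.
fn resolve_jobs(raw: Option<&str>) -> usize {
    raw.and_then(|s| s.trim().parse::<usize>().ok())
        .filter(|&n| n > 0)
        .unwrap_or_else(|| {
            thread::available_parallelism()
                .map(|n| n.get())
                .unwrap_or(1)
        })
}

fn main() {
    let raw = env::var("NUM_JOBS").ok();
    let jobs = resolve_jobs(raw.as_deref());
    println!("would compile with {jobs} parallel jobs");
}
```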

Building CP2K and its dependencies requires approximately **8 GB of disk space** in the build target directory.

## Features

| Feature | Description |
|---------|-------------|
| `mpi` | MPI parallel support (required for all real calculations) |
| `build-cp2k` | Build CP2K + all numerical dependencies from source |
| `extended` | Extended DFT interface: HOMO/LUMO, SCF info, Mulliken charges |
| `python` | Python bindings via PyO3 |

## Static linking

The following libraries are compiled from source and linked **statically**:

| Library | Version | Notes |
|---------|---------|-------|
| OpenBLAS | 0.3.31 | BLAS + LAPACK; via `openblas-src` crate |
| FFTW3 | 3.3.10 | Includes OpenMP variant (`libfftw3_omp`); built via Autotools |
| ScaLAPACK | 2.2.0 | Reference-ScaLAPACK; BLACS bundled |
| DBCSR | latest | Sparse matrix library for CP2K |
| CP2K | latest | Quantum chemistry engine |

Dynamic dependencies that remain (unavoidable):

- `libmpi.so` / `libmpi_mpifh.so`: the MPI library is specific to the host's MPI implementation and interconnect
- `libgfortran.so`: Fortran runtime
- `libgomp.so`: OpenMP runtime
- `libgcc_s.so`, `libquadmath.so`: GCC support libraries
- `libc.so`, `libm.so`: standard C and math libraries

The Python wheel bundles `libgfortran`, `libquadmath`, and `libgcc_s` so that the wheel is self-contained. MPI must be provided by the host environment.

## Using in Rust

```rust
use cp2k_rs::{init, finalize, ForceEnv};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    init()?;

    let mut force_env = ForceEnv::new("input.inp", "output.out")?;
    force_env.calc_energy_force()?;

    let energy = force_env.get_potential_energy()?;
    let forces = force_env.get_forces()?;

    println!("Energy: {:.10} Ha", energy);
    println!("Forces: {:?}", forces.as_slice().unwrap());

    finalize()?;
    Ok(())
}
```
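
For post-processing it is often convenient to view forces per atom. The sketch below assumes the Rust `get_forces` result has the same flat `3*natom` layout that the Python section documents, ordered atom by atom (x, y, z of the first atom, then the second, and so on). Both points are assumptions, since this README does not state them for the Rust API. `per_atom` and `max_norm` are hypothetical helpers that operate on a plain `&[f64]`, such as the slice obtained from `forces.as_slice()`.

```rust
/// Group a flat `[x0, y0, z0, x1, ...]` buffer into per-atom `[x, y, z]` triples.
/// Returns `None` when the length is not a multiple of three.
fn per_atom(flat: &[f64]) -> Option<Vec<[f64; 3]>> {
    if flat.len() % 3 != 0 {
        return None;
    }
    Some(flat.chunks_exact(3).map(|c| [c[0], c[1], c[2]]).collect())
}

/// Largest Euclidean norm among the per-atom vectors (0.0 for an empty list).
fn max_norm(vectors: &[[f64; 3]]) -> f64 {
    vectors
        .iter()
        .map(|v| (v[0] * v[0] + v[1] * v[1] + v[2] * v[2]).sqrt())
        .fold(0.0, f64::max)
}

fn main() {
    // Placeholder numbers for two atoms, not output from a CP2K run.
    let flat = [0.0, 0.0, 0.3, 0.0, 0.4, 0.0];
    let forces = per_atom(&flat).expect("length must be a multiple of 3");
    for (i, f) in forces.iter().enumerate() {
        println!("atom {i}: {f:?}");
    }
    println!("max |F| = {}", max_norm(&forces));
}
```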

### MPI-parallel Rust example

```bash
module load mpi/openmpi-x86_64
cargo build --release --example rust_mpi_example --features mpi,extended,build-cp2k

export CP2K_DATA_DIR="$(find target/release/build -name "data" -type d | head -n 1)"
export OMP_NUM_THREADS=2
mpirun -np 4 -x CP2K_DATA_DIR -x OMP_NUM_THREADS \
    ./target/release/examples/rust_mpi_example
```

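The `-x` flag (Open MPI syntax) is required to propagate `CP2K_DATA_DIR` and `OMP_NUM_THREADS` to all MPI ranks. Other launchers, such as MPICH's `mpiexec`, use different options for exporting environment variables.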
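
When the launch command is assembled programmatically, the same argument list can be built in a few lines. This is a sketch for Open MPI's `mpirun` only; `mpirun_args` is a hypothetical helper and not part of cp2k-rs.

```rust
/// Build the argument list for Open MPI's `mpirun`:
/// `-np <nproc>`, one `-x <NAME>` pair per exported variable, then the program.
fn mpirun_args(nproc: usize, exported: &[&str], program: &str) -> Vec<String> {
    let mut args = vec!["-np".to_string(), nproc.to_string()];
    for name in exported {
        args.push("-x".to_string());
        args.push(name.to_string());
    }
    args.push(program.to_string());
    args
}

fn main() {
    let args = mpirun_args(
        4,
        &["CP2K_DATA_DIR", "OMP_NUM_THREADS"],
        "./target/release/examples/rust_mpi_example",
    );
    // Print the command instead of running it; pass `args` to
    // `std::process::Command::new("mpirun")` to actually launch.
    println!("mpirun {}", args.join(" "));
}
```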

### Extended DFT interface

With the `extended` feature, `ForceEnv` gains additional methods:

```rust
// Returns (homo, lumo, homo_idx, lumo_idx)
let (homo, lumo, homo_idx, lumo_idx) = force_env.get_homo_lumo(1)?;

// Returns (nsteps, converged, energy_change)
let (nsteps, converged, delta_e) = force_env.get_scf_info()?;

// Mulliken charges, number of electrons, number of MOs
let charges = force_env.get_mulliken_charges()?;
```
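
A common follow-up is the HOMO-LUMO gap in electronvolts. The sketch below assumes that the orbital energies are returned in Hartree, matching the energy unit printed in the basic example; this README does not state the unit, so check it before relying on the conversion. `gap_ev` is a hypothetical helper.

```rust
/// CODATA 2018 Hartree-to-electronvolt conversion factor.
const HARTREE_TO_EV: f64 = 27.211_386_245_988;

/// HOMO-LUMO gap in eV, assuming both inputs are given in Hartree.
fn gap_ev(homo_ha: f64, lumo_ha: f64) -> f64 {
    (lumo_ha - homo_ha) * HARTREE_TO_EV
}

fn main() {
    // Placeholder values for illustration, not results from a CP2K run.
    let (homo, lumo) = (-0.25_f64, 0.05_f64);
    println!("gap: {:.3} eV", gap_ev(homo, lumo));
}
```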

## Using in Python

The Python interface runs CP2K in a separate worker process (started automatically via `mpirun`). No `mpi4py` is needed in the calling Python script.

```python
import cp2k_rs

# Start worker with 4 MPI ranks
cp2k_rs.init_cp2k(nproc=4)

fe = cp2k_rs.PyForceEnv("/path/to/input.inp", "/path/to/output.out")
fe.calc_energy_force()

energy = fe.get_potential_energy()
forces = fe.get_forces()          # numpy array, shape (3*natom,)
positions = fe.get_positions()    # numpy array, shape (3*natom,)
cell = fe.get_cell()              # numpy array, shape (3, 3)

# Extended interface (requires 'extended' feature)
homo, lumo, homo_idx, lumo_idx = fe.get_homo_lumo(spin=1)
nsteps, converged, delta_e = fe.get_scf_info()
mulliken = fe.get_mulliken_charges()

cp2k_rs.finalize_cp2k()
```

Run the provided example:

```bash
# Install first: maturin develop --release --features python,mpi,extended,build-cp2k
python examples/h2o_example.py --nproc 4

# On SLURM:
python examples/h2o_example.py --launcher srun -n 4
```

## CP2K data directory

After a successful build, `CP2K_DATA_DIR` is set automatically when using the Python interface. For the Rust binary / MPI examples, set it manually:

```bash
export CP2K_DATA_DIR="$(find target/release/build -name "data" -type d | head -n 1)"
```
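
The same lookup can be done from Rust, for example in a small launcher. This is a minimal sketch using only the standard library. `find_data_dir` is a hypothetical helper, it assumes the default `target/release/build` location, and its traversal order may differ from `find`, so the two can return different directories when more than one `data` directory exists.

```rust
use std::ffi::OsStr;
use std::fs;
use std::path::{Path, PathBuf};

/// Search for the first directory named `data` below `root`.
/// Checks all entries of one level before descending into subdirectories.
fn find_data_dir(root: &Path) -> Option<PathBuf> {
    let entries = fs::read_dir(root).ok()?;
    let mut subdirs = Vec::new();
    for entry in entries.flatten() {
        let path = entry.path();
        if path.is_dir() {
            if path.file_name() == Some(OsStr::new("data")) {
                return Some(path);
            }
            subdirs.push(path);
        }
    }
    subdirs.iter().find_map(|dir| find_data_dir(dir))
}

fn main() {
    match find_data_dir(Path::new("target/release/build")) {
        Some(dir) => println!("export CP2K_DATA_DIR={}", dir.display()),
        None => eprintln!("no `data` directory found; has the build finished?"),
    }
}
```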

## Troubleshooting

### MPI module not loaded

```
error: could not find `mpicc` / `mpifort`
```

Load the MPI environment module before building:

```bash
module load mpi/openmpi-x86_64
```

### Build takes too long / runs out of memory

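If the build exhausts memory, reduce the number of parallel jobs. This lowers peak memory use at the cost of a longer build; the default (one job per CPU) is already the fastest setting when memory allows: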

```bash
NUM_JOBS=4 cargo build --release --features mpi,extended,build-cp2k
```

### Out of disk space

CP2K + dependencies need ~8 GB. Redirect the build directory:

```bash
export CARGO_TARGET_DIR=/path/to/large/disk/target
cargo build --release --features mpi,extended,build-cp2k
```

### ScaLAPACK build fails with gfortran rank-mismatch errors

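gfortran 10 and later treat argument type and rank mismatches as errors by default, which breaks the reference ScaLAPACK sources. `build.rs` handles this automatically by passing `-fallow-argument-mismatch` to the ScaLAPACK CMake build. If the error still appears, make sure the `build-cp2k` feature is enabled so that the bundled ScaLAPACK is built instead of a manually installed one.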

## License

GPL-2.0-or-later (same as CP2K).