upf 0.1.1

Rust library for reading UPF text into typed structs and writing validated UpfData back to UPF
# Project Guide

## Purpose

`upf` is a Rust library for working with Unified Pseudopotential Format (UPF)
documents as typed Rust data. The current codebase supports both directions:

- read UPF text into a validated [`UpfData`](../src/model/core.rs) structure
- write a validated `UpfData` value back to UPF text

The project is aimed at semantic round-tripping. A document can be parsed,
serialized, and parsed again into the same Rust data model, even if the exact
whitespace or original layout is not preserved.
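As a rough illustration of what semantic round-tripping means, the sketch below uses a toy `Mesh` type and two hypothetical helpers, `parse_mesh` and `write_mesh`. None of these names come from the crate; they only show that two parses can compare equal even though the text layout changed in between.

```rust
/// Hypothetical stand-in for a typed section; this is not the crate's `UpfData`.
#[derive(Debug, PartialEq)]
struct Mesh {
    r: Vec<f64>,
}

/// Parse whitespace-delimited numbers, ignoring the original layout.
fn parse_mesh(text: &str) -> Result<Mesh, std::num::ParseFloatError> {
    let r = text
        .split_whitespace()
        .map(str::parse::<f64>)
        .collect::<Result<Vec<_>, _>>()?;
    Ok(Mesh { r })
}

/// Write one value per line in exponent notation (an arbitrary canonical layout).
fn write_mesh(mesh: &Mesh) -> String {
    mesh.r
        .iter()
        .map(|v| format!("{:e}", v))
        .collect::<Vec<_>>()
        .join("\n")
}
```

Parsing `"  0.0   0.01\n0.02 "`, writing it, and parsing the result again yields an equal `Mesh`, while the intermediate text differs from the input.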

## Public API

The crate exposes six primary entry points:

- `from_str`: parse a UPF document from a UTF-8 string
- `from_reader`: parse a UPF document from a buffered reader
- `from_file`: parse a UPF document from a file path
- `to_string`: serialize a validated `UpfData` into UPF text
- `to_writer`: serialize a validated `UpfData` into any writer
- `to_file`: serialize a validated `UpfData` to a file path

Parse and write operations use the shared public model type `UpfData` and
return `Result<_, UpfError>`.
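One common way to offer string, reader, and file entry points is to funnel them into a single code path. The sketch below shows that layering with toy names (`ToyDoc`, `toy_from_str`, `toy_from_reader`, `toy_from_file`) and `std::io::Result`. These are assumptions for illustration only; the crate's real signatures and error type may differ.

```rust
use std::fs::File;
use std::io::{self, BufRead, BufReader, Read};
use std::path::Path;

/// Toy document type; a stand-in for the crate's `UpfData`.
#[derive(Debug, PartialEq)]
struct ToyDoc {
    non_empty_lines: usize,
}

/// Core path: the other two entry points delegate to this function.
fn toy_from_str(text: &str) -> io::Result<ToyDoc> {
    let non_empty_lines = text.lines().filter(|l| !l.trim().is_empty()).count();
    Ok(ToyDoc { non_empty_lines })
}

/// Read the whole buffered reader into memory, then reuse the string path.
fn toy_from_reader<R: BufRead>(mut reader: R) -> io::Result<ToyDoc> {
    let mut text = String::new();
    reader.read_to_string(&mut text)?;
    toy_from_str(&text)
}

/// Open the file, wrap it in a buffered reader, then reuse the reader path.
fn toy_from_file<P: AsRef<Path>>(path: P) -> io::Result<ToyDoc> {
    toy_from_reader(BufReader::new(File::open(path)?))
}
```

With this shape, all three entry points return the same value for the same input, so tests written against the string path also describe the reader and file paths.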

## Current architecture

The implementation is organized around serde-based XML mapping rather than a
custom parser pipeline.

### Entry points

- `src/de.rs`
  Read-side APIs. These use `quick_xml::de` to deserialize a full document into
  `UpfData`, then run semantic validation.
- `src/ser.rs`
  Write-side APIs. These validate `UpfData` first, then use `quick_xml::se` to
  serialize it back into UPF text.

### Public model

- `src/model/core.rs`
  Defines the root `UpfData` type, `PP_HEADER`, `PP_MESH`, shared numeric
  arrays, and the central validation logic.
- `src/model/nonlocal.rs`
  Defines `PP_INFO`, `PP_NONLOCAL`, `PP_SEMILOCAL`, `PP_PSWFC`, and related
  nested nodes.
- `src/model/paw.rs`
  Defines PAW-specific sections such as `PP_FULL_WFC`, `PP_PAW`, and
  `PP_AUGMENTATION`.
- `src/model/gipaw.rs`
  Defines GIPAW-specific sections.

### Support code

- `src/error.rs`
  Defines `UpfError` for XML decode/encode, I/O, value parsing, and validation
  failures.
- `src/text.rs`
  Provides helpers for whitespace-delimited numeric fields and UPF boolean
  flags.
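To make the role of the text helpers more concrete, here is a minimal sketch of parsing whitespace-delimited numbers and boolean flags. `TextError`, `parse_f64_list`, `parse_flag`, and `format_flag` are hypothetical names, not the contents of `src/text.rs` or `src/error.rs`. The accepted flag spellings (`T`/`F` and `.true.`/`.false.`, case-insensitive) are also an assumption.

```rust
use std::fmt;

/// Hypothetical error type; the crate's `UpfError` may be shaped differently.
#[derive(Debug, PartialEq)]
enum TextError {
    BadNumber(String),
    BadFlag(String),
}

impl fmt::Display for TextError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            TextError::BadNumber(tok) => write!(f, "invalid number: {:?}", tok),
            TextError::BadFlag(tok) => write!(f, "invalid boolean flag: {:?}", tok),
        }
    }
}

impl std::error::Error for TextError {}

/// Split on any whitespace (spaces, tabs, newlines) and parse each token.
fn parse_f64_list(text: &str) -> Result<Vec<f64>, TextError> {
    text.split_whitespace()
        .map(|tok| {
            tok.parse::<f64>()
                .map_err(|_| TextError::BadNumber(tok.to_string()))
        })
        .collect()
}

/// Accept Fortran-style logical spellings (assumed set, see lead-in).
fn parse_flag(text: &str) -> Result<bool, TextError> {
    let token = text.trim();
    if token.eq_ignore_ascii_case("T") || token.eq_ignore_ascii_case(".true.") {
        Ok(true)
    } else if token.eq_ignore_ascii_case("F") || token.eq_ignore_ascii_case(".false.") {
        Ok(false)
    } else {
        Err(TextError::BadFlag(token.to_string()))
    }
}

/// Write flags back in the short `T`/`F` form.
fn format_flag(value: bool) -> &'static str {
    if value { "T" } else { "F" }
}
```

Reporting the offending token in the error keeps failures in long numeric arrays easy to locate.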

## Validation rules

The crate currently enforces a small set of structural invariants in
`UpfData::validate()`:

- `PP_HEADER/@mesh_size` must match the lengths of `PP_R`, `PP_RAB`,
  `PP_LOCAL`, and `PP_RHOATOM`
- `PP_HEADER/@is_paw="T"` requires a `PP_PAW` section
- `PP_HEADER/@has_gipaw="T"` requires a `PP_GIPAW` section

These checks run after deserialization and before serialization, so both read
and write paths enforce the same structural contract.
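The three rules above can be sketched against a flattened toy struct. `ToyUpf`, its fields, the `String` error type, and `checked_write` are all hypothetical. The real `UpfData::validate()` operates on the crate's nested model and returns `UpfError`.

```rust
/// Hypothetical, flattened stand-in for the fields the checks look at.
struct ToyUpf {
    mesh_size: usize,
    is_paw: bool,
    has_gipaw: bool,
    r: Vec<f64>,
    rab: Vec<f64>,
    local: Vec<f64>,
    rhoatom: Vec<f64>,
    paw: Option<String>,
    gipaw: Option<String>,
}

impl ToyUpf {
    fn validate(&self) -> Result<(), String> {
        // Rule 1: every mesh-sized array must match the header's mesh_size.
        let arrays = [
            ("PP_R", self.r.len()),
            ("PP_RAB", self.rab.len()),
            ("PP_LOCAL", self.local.len()),
            ("PP_RHOATOM", self.rhoatom.len()),
        ];
        for (name, len) in arrays {
            if len != self.mesh_size {
                return Err(format!(
                    "{} has {} values, expected mesh_size = {}",
                    name, len, self.mesh_size
                ));
            }
        }
        // Rules 2 and 3: a header flag set to "T" requires its section.
        if self.is_paw && self.paw.is_none() {
            return Err("is_paw=\"T\" requires a PP_PAW section".to_string());
        }
        if self.has_gipaw && self.gipaw.is_none() {
            return Err("has_gipaw=\"T\" requires a PP_GIPAW section".to_string());
        }
        Ok(())
    }
}

/// Write path in miniature: validate first, emit text only on success.
fn checked_write(doc: &ToyUpf) -> Result<String, String> {
    doc.validate()?;
    Ok(format!("<UPF mesh_size=\"{}\"/>", doc.mesh_size))
}
```

Because the write helper calls `validate` before producing any output, an inconsistent value never reaches the serializer, which mirrors the ordering described for the write path.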

## Supported UPF sections

The current top-level model covers these sections:

- `PP_INFO`
- `PP_HEADER`
- `PP_MESH`
- `PP_NLCC`
- `PP_LOCAL`
- `PP_SEMILOCAL`
- `PP_NONLOCAL`
- `PP_PSWFC`
- `PP_FULL_WFC`
- `PP_RHOATOM`
- `PP_PAW`
- `PP_GIPAW`

Optional sections are represented as `Option<T>`. Repeated numbered tags such
as `PP_BETA.n`, `PP_CHI.n`, and PAW/GIPAW entry lists are represented with enums
and vectors that match the serialized UPF tags.
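As an illustration of how numbered tags relate to enum values, the sketch below maps tag names such as `PP_BETA.3` to a variant plus index and back. `NumberedTag`, `parse_numbered_tag`, and `tag_name` are hypothetical; the crate's own enums are driven by serde and may be structured quite differently.

```rust
/// Hypothetical enum covering two of the numbered tag families.
#[derive(Debug, PartialEq)]
enum NumberedTag {
    Beta(usize),
    Chi(usize),
}

/// Split `PREFIX.n` at the last dot and map the prefix to a variant.
/// Returns `None` for unknown prefixes or a missing/non-numeric index.
fn parse_numbered_tag(name: &str) -> Option<NumberedTag> {
    let (prefix, index) = name.rsplit_once('.')?;
    let index: usize = index.parse().ok()?;
    match prefix {
        "PP_BETA" => Some(NumberedTag::Beta(index)),
        "PP_CHI" => Some(NumberedTag::Chi(index)),
        _ => None,
    }
}

/// Rebuild the serialized tag name from the variant and its index.
fn tag_name(tag: &NumberedTag) -> String {
    match tag {
        NumberedTag::Beta(n) => format!("PP_BETA.{}", n),
        NumberedTag::Chi(n) => format!("PP_CHI.{}", n),
    }
}
```

Keeping the index inside the variant lets a `Vec<NumberedTag>` preserve both the order and the numbering of the serialized tags.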

## Current scope and limitations

- The code is built around the UPF `2.0.1` structure currently represented in
  `src/model`.
- Serialization aims to produce valid UPF for the supported model, not to
  preserve original comments, formatting, or unknown sections byte-for-byte.
- The crate does not currently preserve unsupported top-level sections.
- Input must be parseable by `quick-xml`. The custom normalization/tree
  pipeline described in earlier docs is no longer part of the implementation.

## Testing strategy

The repository uses focused inline fixtures in `tests/*.rs` to cover:

- basic parsing of core sections
- file/string/reader read APIs
- file/string/writer write APIs
- semantic round-tripping
- validation failures for inconsistent sections
- PAW, GIPAW, and nonlocal subtree coverage

## Abbreviation glossary

- `UPF`: Unified Pseudopotential Format
- `PP`: pseudopotential
- `NC`: norm-conserving
- `US`: ultrasoft
- `PAW`: projector augmented wave
- `GIPAW`: gauge-including projector augmented wave
- `AE`: all-electron
- `PS`: pseudo
- `WFC`: wavefunction
- `NLCC`: nonlinear core correction
- `RHOATOM`: atomic charge density
- `RAB`: radial integration measure
- `DIJ`: nonlocal projector coupling matrix

## Verification

The current repository verification commands are:

- `cargo fmt --check`
- `cargo clippy --all-targets -- -D warnings`
- `cargo test`
- `cargo doc --no-deps` when public API docs or rustdoc are touched