zipatch-rs 1.7.0

# Architecture

Internal design notes for contributors and maintainers. None of this is load-bearing for users of the crate — see [README.md](README.md) and [docs.rs](https://docs.rs/zipatch-rs) for the public API.

## The decision: parse + effects, plus capabilities built on top

The single architectural decision this crate is built around is the **strict separation between parsing and effects**.

Nothing under `src/chunk/` touches the filesystem. Nothing under `src/chunk/` performs I/O against the install tree. Nothing under `src/chunk/` decompresses anything. The parse layer is pure data — bytes in, structured values out.

Filesystem effects and decompression live under `src/apply/`. `Chunk::apply(&mut ApplySession)` is the bridge method that dispatches each chunk variant to its apply-side logic. The apply layer is the only place that calls into `Vfs` (the I/O abstraction) and `flate2` (the DEFLATE decoder).

Everything else in this crate is either:

1. A **capability built on top**: `src/index/` (build a plan from one or more patches; apply by region rather than chunk-by-chunk) and `src/verify/` (hash-check the files of an install against an `ExpectedHash` set). These compose parse and apply into higher-level workflows.
2. A **supporting decision** that exists to make parse + apply ergonomic: `Vfs`, the `ApplyConfig` / `ApplySession` typestate split, the error-domain split, `#[non_exhaustive]` on plan-model types, the tracing stability contract. See "Supporting decisions" below.

This framing tells you where new code goes:

- Inspects patch bytes without touching the filesystem? → `src/chunk/`.
- Touches the filesystem or decompresses? → `src/apply/`.
- Composes the two into a higher-level workflow (planning, verification, repair, dump, …)? → its own top-level module, like `src/index/` and `src/verify/`.
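
The placement rule can be illustrated with a minimal sketch. Everything here is a simplified stand-in rather than the crate's real definitions: the two-variant `Chunk`, the tag bytes, the UTF-8 body encoding, and the set-backed `ApplySession` are assumptions made for illustration. The point is only that the parse function sees bytes and returns values, while the `apply` method is the one place that mutates state.

```rust
use std::collections::BTreeSet;

/// Parse side: plain data, no I/O. (Simplified stand-in for the real enum.)
#[derive(Debug, PartialEq)]
pub enum Chunk {
    AddDirectory { name: String },
    DeleteDirectory { name: String },
}

/// Pure function: bytes in, structured value out.
/// The tag values and the UTF-8 body layout are assumptions for this sketch.
pub fn parse_chunk(tag: [u8; 4], body: &[u8]) -> Option<Chunk> {
    let name = std::str::from_utf8(body).ok()?.to_owned();
    match &tag {
        b"ADIR" => Some(Chunk::AddDirectory { name }),
        b"DELD" => Some(Chunk::DeleteDirectory { name }),
        _ => None,
    }
}

/// Effects side: a set of directory names stands in for the install tree.
#[derive(Default)]
pub struct ApplySession {
    dirs: BTreeSet<String>,
}

impl ApplySession {
    pub fn has_dir(&self, name: &str) -> bool {
        self.dirs.contains(name)
    }
}

impl Chunk {
    /// Bridge method: dispatch each variant to its effect.
    pub fn apply(&self, session: &mut ApplySession) {
        match self {
            Chunk::AddDirectory { name } => {
                session.dirs.insert(name.clone());
            }
            Chunk::DeleteDirectory { name } => {
                session.dirs.remove(name);
            }
        }
    }
}
```

Because `parse_chunk` never touches the session, it can be tested against raw byte fixtures without any filesystem setup.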

### Parse — `src/chunk/`

Houses the `Chunk` enum and its sub-types (`SqpkCommand`, `SqpkFile`, etc.), plus `ZiPatchReader` — the streaming iterator that parses the binary wire format frame by frame. Most chunk variants are `#[derive(BinRead)]` structs from the `binrw` crate; a handful use inline `u32::from_be_bytes` / `from_le_bytes` for single-primitive reads or perf-critical hot paths (notably `parse_sqpk_add_data_fast`, which is the byte-volume hot path for game data).

The sole entry point from the outside is `ZiPatchReader::next_chunk()`, which returns `Option<ChunkRecord<Chunk>>`. `ChunkRecord` carries the parsed chunk plus byte-accounting fields.

Wire format per chunk:

```
[body_len: u32 BE][tag: 4 bytes][body: body_len bytes][crc32: u32 BE]
```

CRC32 is computed over `tag ++ body` (not over `body_len`). When `ZiPatchReader::verify_checksums` is enabled, the parser verifies the CRC32 of each chunk before returning it to the caller; when it is disabled, the stored checksum is not checked.
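
As a rough illustration of the framing above, the following sketch decodes one frame from a byte slice and optionally checks its checksum. It is not the crate's parser: `Frame`, `FrameError`, `read_frame`, and `encode_frame` are hypothetical names, and the checksum is assumed to be the common reflected CRC-32 (polynomial `0xEDB88320`), which the text above does not state explicitly.

```rust
/// One decoded frame, borrowing its body from the input buffer.
#[derive(Debug, PartialEq)]
pub struct Frame<'a> {
    pub tag: [u8; 4],
    pub body: &'a [u8],
}

#[derive(Debug, PartialEq)]
pub enum FrameError {
    Truncated,
    ChecksumMismatch { expected: u32, actual: u32 },
}

/// Bitwise CRC-32 (reflected, polynomial 0xEDB88320). Assumed variant.
pub fn crc32(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // All-ones mask when the low bit is set, zero otherwise.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

/// Decode `[body_len: u32 BE][tag][body][crc32: u32 BE]` from the front of
/// `input`, returning the frame and the remaining bytes.
pub fn read_frame(input: &[u8], verify: bool) -> Result<(Frame<'_>, &[u8]), FrameError> {
    if input.len() < 4 {
        return Err(FrameError::Truncated);
    }
    let body_len = u32::from_be_bytes([input[0], input[1], input[2], input[3]]) as usize;
    // `covered` is tag ++ body: the span the checksum is computed over.
    let covered_end = 8usize.checked_add(body_len).ok_or(FrameError::Truncated)?;
    let frame_end = covered_end.checked_add(4).ok_or(FrameError::Truncated)?;
    if input.len() < frame_end {
        return Err(FrameError::Truncated);
    }
    let covered = &input[4..covered_end];
    let c = &input[covered_end..frame_end];
    let expected = u32::from_be_bytes([c[0], c[1], c[2], c[3]]);
    if verify {
        let actual = crc32(covered);
        if actual != expected {
            return Err(FrameError::ChecksumMismatch { expected, actual });
        }
    }
    let tag = [covered[0], covered[1], covered[2], covered[3]];
    Ok((Frame { tag, body: &covered[4..] }, &input[frame_end..]))
}

/// Inverse of `read_frame`, handy for building test fixtures.
pub fn encode_frame(tag: [u8; 4], body: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(12 + body.len());
    out.extend_from_slice(&(body.len() as u32).to_be_bytes());
    out.extend_from_slice(&tag);
    out.extend_from_slice(body);
    let crc = crc32(&out[4..]); // tag ++ body, excluding body_len
    out.extend_from_slice(&crc.to_be_bytes());
    out
}
```

Passing `verify = false` mirrors the gating described above: the frame is still length-checked, but a corrupted body is returned as-is.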

### Effects — `src/apply/`

Adds the two capabilities the parser intentionally omits: filesystem I/O (via the `Vfs` trait) and DEFLATE decompression (via `flate2`). `Chunk::apply(&mut ApplySession)` dispatches each chunk variant to its apply-side logic.

The apply layer splits into two complementary types — see "Supporting decisions" below for why:

- `ApplyConfig` — frozen configuration (install root, platform, `Vfs` backing, observer, checkpoint sink). Performs no I/O. Constructed on the caller's thread and shipped to a worker thread for the actual apply.
- `ApplySession` — runtime state (open file-handle cache, path caches, reusable DEFLATE decompressor, per-chunk progress counters). Created by consuming an `ApplyConfig` via `into_session()`.
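
The config-to-session handoff might look roughly like the sketch below. Only the type names and `into_session()` come from the description above; the fields, `resolve`, and `run_on_worker` are hypothetical and far smaller than the real types.

```rust
use std::collections::HashMap;
use std::path::PathBuf;
use std::thread;

/// Frozen configuration: plain data, performs no I/O. (Fields are assumed.)
#[derive(Debug, Clone)]
pub struct ApplyConfig {
    install_root: PathBuf,
}

impl ApplyConfig {
    pub fn new(install_root: impl Into<PathBuf>) -> Self {
        Self { install_root: install_root.into() }
    }

    /// Consumes the config, so it cannot be reconfigured once work has begun.
    pub fn into_session(self) -> ApplySession {
        ApplySession {
            config: self,
            path_cache: HashMap::new(),
            chunks_applied: 0,
        }
    }
}

/// Mutable runtime state; only reachable by consuming an `ApplyConfig`.
pub struct ApplySession {
    config: ApplyConfig,
    path_cache: HashMap<String, PathBuf>,
    chunks_applied: u64,
}

impl ApplySession {
    /// Hypothetical helper: resolve a relative path through the path cache
    /// and bump the progress counter.
    pub fn resolve(&mut self, relative: &str) -> PathBuf {
        self.chunks_applied += 1;
        let root = &self.config.install_root;
        self.path_cache
            .entry(relative.to_owned())
            .or_insert_with(|| root.join(relative))
            .clone()
    }
}

/// Build the config on the caller's thread, then move it to a worker.
/// No `Arc<Mutex<_>>` is needed because ownership simply transfers.
pub fn run_on_worker(config: ApplyConfig, files: Vec<String>) -> u64 {
    thread::spawn(move || {
        let mut session = config.into_session();
        for file in &files {
            session.resolve(file);
        }
        session.chunks_applied
    })
    .join()
    .expect("worker thread panicked")
}
```

Since `into_session` takes `self` by value, the compiler rejects any attempt to touch the config after the session exists.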

### Capabilities built on top

**`src/index/`** — the indexed pipeline. Builds a `Plan` by streaming one or more patches without applying them, then executes the plan against a target install. Lets the launcher coalesce multiple patches into a single per-target write pass, parallelize across targets, and resume from a `Checkpoint`.

**`src/verify/`** — hash verification. Walks an install root, computes hashes, and produces a `VerifyOutcome` against an `ExpectedHash` set. Used post-apply and for offline integrity checks.

**Siblings, not a stack.** `index/` and `verify/` are peers: both sit on top of the parse/apply core (`index/` consumes both layers, `verify/` only the apply layer), and neither depends on the other. There is no `use crate::index` in `verify/` and no `use crate::verify` in `index/`; a grep confirms it, and the rule is load-bearing, not incidental. The formal positioning is:

```
parse (chunk/)
  └─ apply (apply/)
      ├─ index/   (consumes parse + apply)
      └─ verify/  (consumes apply)
```

This is the rule for adding new capabilities: they sit alongside `index/` and `verify/`, not nested inside one. A hypothetical `diff/` module comparing two installs would be `src/diff/`, not `src/verify/diff/` or `src/index/diff/`. The flat module list at the top of `src/` is meaningful precisely because each name is one capability — keep it that way and the architecture stays self-explaining; let things nest and "where does this go" gets answered ad-hoc until the structure is illegible.

Feature-gating these capabilities behind Cargo features is an open option, not a decision the architecture rests on. Today they are always compiled in. If a consumer ever shows up that only needs parse + apply (e.g. a future `gaveloc-patcher` variant that pulls plans from a server instead of building them locally), `index = []` and `verify = []` features can be added without disturbing the structural rule above — the modules stay siblings whether they are gated or not.

Note: `index/verify.rs` (plan-level integrity check internal to `index/`) is a different thing from the top-level `verify/` module (install-root hash verification). The shared name is unfortunate; the responsibilities are distinct — one checks that a plan was applied correctly given its expected hashes, the other walks an install and computes hashes against an external manifest. If the collision starts causing reader confusion, rename `index/verify.rs` (e.g. to `index/integrity.rs`) rather than renaming the top-level module.

## Supporting decisions

These are not co-equal architectural pillars with parse/apply. They exist because the parse/apply split, taken alone, would leave specific ergonomic or correctness problems unsolved.

**`Vfs` abstraction.** Routes all filesystem effects through a trait rather than calling `std::fs` directly. *Solves:* how to test the effects layer hermetically. `InMemoryFs` makes apply-layer tests fast (no temp dirs) and deterministic (no real filesystem state). The synchronous trait surface is a deliberate choice — the apply hot path is dominated by DEFLATE decompression and blocking syscalls, both fundamentally synchronous, and keeping the trait sync avoids pulling an async runtime into the dependency graph.
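
To show the shape of this idea, here is a minimal sketch of a synchronous filesystem trait with an in-memory backing. The method set (`write_at`, `read`, `remove_file`) and the `write_block` helper are assumptions for illustration; the crate's actual `Vfs` surface is likely larger and may differ.

```rust
use std::collections::BTreeMap;
use std::io;
use std::path::{Path, PathBuf};

/// Synchronous filesystem abstraction (hypothetical, reduced method set).
pub trait Vfs {
    fn write_at(&mut self, path: &Path, offset: u64, data: &[u8]) -> io::Result<()>;
    fn read(&self, path: &Path) -> io::Result<Vec<u8>>;
    fn remove_file(&mut self, path: &Path) -> io::Result<()>;
}

/// Test backing: files are byte vectors keyed by path.
#[derive(Default)]
pub struct InMemoryFs {
    files: BTreeMap<PathBuf, Vec<u8>>,
}

impl Vfs for InMemoryFs {
    fn write_at(&mut self, path: &Path, offset: u64, data: &[u8]) -> io::Result<()> {
        let start = usize::try_from(offset)
            .map_err(|_| io::Error::new(io::ErrorKind::InvalidInput, "offset too large"))?;
        let end = start
            .checked_add(data.len())
            .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "write range overflows"))?;
        let file = self.files.entry(path.to_path_buf()).or_default();
        // Zero-fill any gap, like a sparse write past the end of a file.
        if file.len() < end {
            file.resize(end, 0);
        }
        file[start..end].copy_from_slice(data);
        Ok(())
    }

    fn read(&self, path: &Path) -> io::Result<Vec<u8>> {
        self.files
            .get(path)
            .cloned()
            .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "no such file"))
    }

    fn remove_file(&mut self, path: &Path) -> io::Result<()> {
        self.files
            .remove(path)
            .map(|_| ())
            .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "no such file"))
    }
}

/// Apply-side code is written against the trait, never against `std::fs`.
pub fn write_block<V: Vfs>(vfs: &mut V, path: &Path, offset: u64, data: &[u8]) -> io::Result<()> {
    vfs.write_at(path, offset, data)
}
```

A test can then assert on the exact bytes of a "file" without creating a temp directory.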

**`ApplyConfig` / `ApplySession` typestate split.** *Solves:* how to construct apply state on a UI thread and ship it to a worker without `Arc<Mutex<_>>`. `ApplyConfig` is `Send` and immutable; `ApplySession` is the mutable runtime state created from it via `into_session()`. The typestate guarantees the immutable phase is fully done before any I/O begins.

**Error domain split.** `ParseError`, `ApplyError`, `IndexError`, and `VerifyError` each cover their own domain rather than collapsing into one enum. *Solves:* callers pattern-matching at the right granularity — a parser caller cares about `TruncatedPatch` and `ChecksumMismatch`; an apply caller cares about `Vfs` failures and CRC verification at write time; they are different concerns. The umbrella `Error` enum with `From` impls for each domain type provides a single `?`-friendly exit point for callers who don't need the distinction.
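
A reduced sketch of the pattern, covering two of the four domains. The variant names `TruncatedPatch` and `ChecksumMismatch` come from the paragraph above; their fields, the `ApplyError::Vfs(String)` payload, and the `parse_step` / `apply_step` helpers are assumptions for illustration.

```rust
use std::fmt;

#[derive(Debug, PartialEq)]
pub enum ParseError {
    TruncatedPatch,
    ChecksumMismatch { expected: u32, actual: u32 },
}

#[derive(Debug, PartialEq)]
pub enum ApplyError {
    /// Payload type is an assumption; a real error would likely wrap io::Error.
    Vfs(String),
}

/// Umbrella error for callers who do not need the per-domain distinction.
#[derive(Debug, PartialEq)]
pub enum Error {
    Parse(ParseError),
    Apply(ApplyError),
}

impl From<ParseError> for Error {
    fn from(e: ParseError) -> Self {
        Error::Parse(e)
    }
}

impl From<ApplyError> for Error {
    fn from(e: ApplyError) -> Self {
        Error::Apply(e)
    }
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::Parse(e) => write!(f, "parse error: {e:?}"),
            Error::Apply(e) => write!(f, "apply error: {e:?}"),
        }
    }
}

impl std::error::Error for Error {}

// Hypothetical domain-level steps, each returning its own error type.
fn parse_step(bytes: &[u8]) -> Result<u8, ParseError> {
    bytes.first().copied().ok_or(ParseError::TruncatedPatch)
}

fn apply_step(value: u8) -> Result<(), ApplyError> {
    if value == 0 {
        Err(ApplyError::Vfs("write rejected".to_owned()))
    } else {
        Ok(())
    }
}

/// `?` converts either domain error into the umbrella via the `From` impls.
pub fn parse_then_apply(bytes: &[u8]) -> Result<(), Error> {
    let value = parse_step(bytes)?;
    apply_step(value)?;
    Ok(())
}
```

A caller that only parses can match on `ParseError` directly and never sees apply-side variants.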

**`#[non_exhaustive]` on plan-model types.** All public plan-model types (`Plan`, `Target`, `Region`, `PartSource`, `PartExpected`, `FilesystemOp`, checkpoint types) are `#[non_exhaustive]`. *Solves:* adding fields (e.g. per-region provenance metadata) in future minor versions without a breaking change. The cost is requiring `::new()` constructors for external callers who want to construct them. Used only where the additive-fields case is plausible.
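
The trade-off can be seen in a small sketch. The `Region` fields and the `end` helper below are assumptions and not the crate's real definition; only the attribute and the need for a constructor come from the text above.

```rust
/// Hypothetical plan-model type. Downstream crates cannot build it with a
/// struct literal, and must use `..` when destructuring, so a field can be
/// added later without breaking them.
#[non_exhaustive]
#[derive(Debug, Clone, PartialEq)]
pub struct Region {
    pub offset: u64,
    pub length: u64,
}

impl Region {
    /// The constructor is the only way for external callers to make a `Region`.
    pub fn new(offset: u64, length: u64) -> Self {
        Self { offset, length }
    }

    /// End offset (exclusive), or `None` if it would overflow `u64`.
    pub fn end(&self) -> Option<u64> {
        self.offset.checked_add(self.length)
    }
}
```

Inside the defining crate the attribute has no effect, which is why struct literals still compile in the crate's own code and tests.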

**Tracing stability contract.** Span and field names at `info!` and `debug!` levels are treated as a public API. They live in a single `tracing_schema` module so a rename lands in one file. `trace!` names are best-effort. See the "Tracing" section of the crate-level rustdoc for the full catalog.

## Repository layout

The repo is a Cargo workspace with two members:

- **`zipatch-rs`** at the repo root — the library (parse + apply + index + verify). Published to crates.io as `zipatch-rs`. This is what `[package]` describes in the root `Cargo.toml`.
- **`zipatch-cli/`** — the `zipatch` binary (a thin clap wrapper over the library, for ad-hoc inspection of `.patch` files). Versioned independently so it can ship on its own cadence without dragging the library along; the library has no compile- or runtime dependency on the CLI.

## Module map

```
/                          (workspace root + zipatch-rs lib package)
├── Cargo.toml             — [workspace] + zipatch-rs [package]
├── src/
│   ├── lib.rs             — crate root, re-exports, Platform enum, apply_patch_file
│   ├── error.rs           — ParseError, ApplyError, IndexError, VerifyError, Error umbrella
│   ├── newtypes.rs        — PatchIndex, ChunkTag, SchemaVersion
│   ├── tracing_schema.rs  — stable span/event name constants (pub(crate))
│   ├── test_utils.rs      — test fixtures (test-utils feature, doc(hidden))
│   │
│   ├── chunk/             — parse: wire-format → Chunk enum, no I/O
│   │   ├── mod.rs         — Chunk enum, ChunkRecord, ZiPatchReader, open_patch
│   │   ├── adir.rs        — AddDirectory chunk
│   │   ├── afsp.rs        — ApplyFreeSpace chunk (no-op at apply time)
│   │   ├── aply.rs        — ApplyOption chunk (sets ignore flags)
│   │   ├── ddir.rs        — DeleteDirectory chunk
│   │   ├── fhdr.rs        — FileHeader chunk
│   │   ├── util.rs        — SqpackFileId, SqpkCompressedBlock helpers
│   │   └── sqpk/          — SQPK sub-commands
│   │       ├── mod.rs     — SqpkCommand enum
│   │       ├── add_data.rs       — SqpkAddData (A)
│   │       ├── delete_data.rs    — SqpkDeleteData (D)
│   │       ├── expand_data.rs    — SqpkExpandData (E)
│   │       ├── file.rs           — SqpkFile (F) — AddFile, DeleteFile, RemoveAll, MakeDirTree
│   │       ├── header.rs         — SqpkHeader (H)
│   │       ├── index.rs          — SqpkIndex (I, no-op)
│   │       └── target_info.rs    — SqpkTargetInfo (T)
│   │
│   ├── apply/             — effects: Vfs I/O + DEFLATE, ApplyConfig → ApplySession
│   │   ├── mod.rs         — ApplyConfig, ApplySession, Chunk::apply dispatch
│   │   ├── cancel.rs      — CancelToken
│   │   ├── checkpoint.rs  — Checkpoint, CheckpointPolicy, CheckpointSink, SequentialCheckpoint, IndexedCheckpoint
│   │   ├── driver.rs      — sequential apply loop (apply_patch, resume_apply_patch)
│   │   ├── observer.rs    — ApplyObserver, ChunkEvent, NoopObserver
│   │   ├── path.rs        — SqPack path resolution (pub(crate))
│   │   ├── sqpk.rs        — SQPK apply logic (pub(crate))
│   │   └── vfs.rs         — Vfs trait, StdFs, InMemoryFs
│   │
│   ├── index/             — capability: plan-and-apply pipeline over parse + apply
│   │   ├── mod.rs         — public re-exports
│   │   ├── plan.rs        — Plan, Target, Region, PartSource, PartExpected, FilesystemOp, TargetPath
│   │   ├── builder.rs     — PlanBuilder
│   │   ├── apply.rs       — IndexApplier
│   │   ├── source.rs      — PatchSource trait, FilePatchSource
│   │   ├── verify.rs      — PlanVerifier, RepairManifest
│   │   └── region_map.rs  — per-target region accumulator (pub(crate))
│   │
│   └── verify/            — capability: hash verification against ExpectedHash
│       └── mod.rs         — HashVerifier, ExpectedHash, FileVerifyOutcome, VerifyOutcome
│
└── zipatch-cli/           — workspace member: the `zipatch` binary
    ├── Cargo.toml         — depends on zipatch-rs as a path dep
    ├── src/
    │   ├── main.rs        — clap Cli/Commands + dispatch
    │   └── dump.rs        — `dump` subcommand implementation
    └── tests/
        └── dump.rs        — integration tests via assert_cmd
```

## Ecosystem context

`zipatch-rs` and `sqpack-rs` are standalone libraries that exist independently of the launcher product. They have no network dependency and no runtime requirements beyond Rust's standard library plus the dependencies listed in `Cargo.toml`.

The launcher product that consumes `zipatch-rs` is a separate workspace. It handles patch discovery, authentication, download, and UI; `zipatch-rs` handles only the binary format parsing and application once bytes are on disk.