# Architecture
Internal design notes for contributors and maintainers. None of this is load-bearing for users of the crate — see [README.md](README.md) and [docs.rs](https://docs.rs/zipatch-rs) for the public API.
## The decision: parse + effects, plus capabilities built on top
The single architectural decision this crate is built around is the **strict separation between parsing and effects**.
Nothing under `src/chunk/` touches the filesystem. Nothing under `src/chunk/` performs I/O against the install tree. Nothing under `src/chunk/` decompresses anything. The parse layer is pure data — bytes in, structured values out.
Filesystem effects and decompression live under `src/apply/`. `Chunk::apply(&mut ApplySession)` is the bridge method that dispatches each chunk variant to its apply-side logic. The apply layer is the only place that calls into `Vfs` (the I/O abstraction) and `flate2` (the DEFLATE decoder).
Everything else in this crate is either:
1. A **capability built on top**: `src/index/` (build a plan from one or more patches; apply by region rather than chunk-by-chunk) and `src/verify/` (hash-check the files of an install against an `ExpectedHash` set). These compose parse and apply into higher-level workflows.
2. A **supporting decision** that exists to make parse + apply ergonomic: `Vfs`, the `ApplyConfig` / `ApplySession` typestate split, the error-domain split, `#[non_exhaustive]` on plan-model types, the tracing stability contract. See "Supporting decisions" below.
This framing tells you where new code goes:
- Inspects patch bytes without touching the filesystem? → `src/chunk/`.
- Touches the filesystem or decompresses? → `src/apply/`.
- Composes the two into a higher-level workflow (planning, verification, repair, dump, …)? → its own top-level module, like `src/index/` and `src/verify/`.
### Parse — `src/chunk/`
Houses the `Chunk` enum and its sub-types (`SqpkCommand`, `SqpkFile`, etc.), plus `ZiPatchReader` — the streaming iterator that parses the binary wire format frame by frame. Most chunk variants are `#[derive(BinRead)]` structs from the `binrw` crate; a handful use inline `u32::from_be_bytes` / `from_le_bytes` for single-primitive reads or perf-critical hot paths (notably `parse_sqpk_add_data_fast`, which is the byte-volume hot path for game data).
The sole entry point from the outside is `ZiPatchReader::next_chunk()`, which returns `Option<ChunkRecord<Chunk>>`. `ChunkRecord` carries the parsed chunk plus byte-accounting fields.
Wire format per chunk:
```
[body_len: u32 BE][tag: 4 bytes][body: body_len bytes][crc32: u32 BE]
```
CRC32 is computed over `tag ++ body` (not over `body_len`). The parser verifies the CRC32 of each chunk before returning it to the caller; `ZiPatchReader::verify_checksums` gates this check, so with verification off a chunk is returned without its checksum being compared.
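The frame layout and checksum coverage above can be sketched in plain Rust. This is an illustration only, not the crate's parser: `Frame`, `FrameError`, `parse_frame`, and `encode_frame` are hypothetical names, the `verify` flag merely stands in for the `verify_checksums` gate, and the sketch assumes the checksum is the common reflected IEEE CRC-32.

```rust
/// Bitwise CRC-32 with the reflected IEEE polynomial 0xEDB88320.
/// Assumption: the wire format uses this common CRC-32 variant.
fn crc32(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFF_u32;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // All-ones mask when the low bit is set, zero otherwise.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

/// Reads a big-endian u32 from the first four bytes (caller checks the length).
fn read_u32_be(bytes: &[u8]) -> u32 {
    u32::from_be_bytes([bytes[0], bytes[1], bytes[2], bytes[3]])
}

/// Hypothetical borrowed view of one frame; not the crate's `ChunkRecord`.
#[derive(Debug, PartialEq)]
struct Frame<'a> {
    tag: [u8; 4],
    body: &'a [u8],
}

#[derive(Debug, PartialEq)]
enum FrameError {
    Truncated,
    ChecksumMismatch { expected: u32, actual: u32 },
}

/// Parses one frame from the front of `input`; returns it plus the remaining bytes.
fn parse_frame(input: &[u8], verify: bool) -> Result<(Frame<'_>, &[u8]), FrameError> {
    if input.len() < 4 {
        return Err(FrameError::Truncated);
    }
    let body_len = read_u32_be(input) as usize;
    // Layout: body_len (4) | tag (4) | body (body_len) | crc32 (4).
    let crc_start = body_len.checked_add(8).ok_or(FrameError::Truncated)?;
    let frame_end = crc_start.checked_add(4).ok_or(FrameError::Truncated)?;
    if input.len() < frame_end {
        return Err(FrameError::Truncated);
    }
    // The checksum covers tag ++ body, not the length prefix.
    let covered = &input[4..crc_start];
    if verify {
        let expected = read_u32_be(&input[crc_start..frame_end]);
        let actual = crc32(covered);
        if actual != expected {
            return Err(FrameError::ChecksumMismatch { expected, actual });
        }
    }
    let tag = [covered[0], covered[1], covered[2], covered[3]];
    Ok((Frame { tag, body: &covered[4..] }, &input[frame_end..]))
}

/// Builds a frame for demonstration; assumes `body` fits in a u32 length.
fn encode_frame(tag: [u8; 4], body: &[u8]) -> Vec<u8> {
    let mut covered = tag.to_vec();
    covered.extend_from_slice(body);
    let mut out = (body.len() as u32).to_be_bytes().to_vec();
    out.extend_from_slice(&covered);
    out.extend_from_slice(&crc32(&covered).to_be_bytes());
    out
}
```

Round-tripping `encode_frame(*b"ADIR", b"hi")` through `parse_frame(&bytes, true)` yields the same tag and body with no bytes left over. Flipping any byte of the tag or body produces `ChecksumMismatch` when `verify` is `true`.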
### Effects — `src/apply/`
Adds the two capabilities the parser intentionally omits: filesystem I/O (via the `Vfs` trait) and DEFLATE decompression (via `flate2`). `Chunk::apply(&mut ApplySession)` dispatches each chunk variant to its apply-side logic.
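As a rough illustration of the bridge, the sketch below keeps the data type in one module and attaches `apply` from another. The variants, the `ApplySession` fields, and `ApplyError::MissingDirectory` are all simplified assumptions made for this example. The real session talks to a `Vfs`, whereas this one only records directory names in a set.

```rust
pub mod chunk {
    /// Parse-side data only: no file handles, no decompressor state.
    /// Variants here are simplified stand-ins for the real enum.
    #[derive(Debug, Clone, PartialEq)]
    pub enum Chunk {
        AddDirectory { path: String },
        DeleteDirectory { path: String },
        /// Parsed, but a no-op at apply time.
        ApplyFreeSpace,
    }
}

pub mod apply {
    use super::chunk::Chunk;
    use std::collections::BTreeSet;

    /// Hypothetical stand-in for the runtime state.
    #[derive(Debug, Default)]
    pub struct ApplySession {
        pub dirs: BTreeSet<String>,
        pub chunks_applied: u64,
    }

    /// Hypothetical error used only by this sketch.
    #[derive(Debug, PartialEq)]
    pub enum ApplyError {
        MissingDirectory(String),
    }

    // The inherent impl lives on the apply side, so the parse module
    // never needs to know that effects exist.
    impl Chunk {
        pub fn apply(&self, session: &mut ApplySession) -> Result<(), ApplyError> {
            match self {
                Chunk::AddDirectory { path } => {
                    session.dirs.insert(path.clone());
                }
                Chunk::DeleteDirectory { path } => {
                    if !session.dirs.remove(path) {
                        return Err(ApplyError::MissingDirectory(path.clone()));
                    }
                }
                Chunk::ApplyFreeSpace => {}
            }
            // Only successfully applied chunks are counted.
            session.chunks_applied += 1;
            Ok(())
        }
    }
}
```

Because the `impl Chunk` block sits in the apply-side module, deleting that module would leave the parse-side type compiling unchanged. That is the property the parse/effects split is meant to preserve.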
The apply layer splits into two complementary types — see "Supporting decisions" below for why:
- `ApplyConfig` — frozen configuration (install root, platform, `Vfs` backing, observer, checkpoint sink). Performs no I/O. Constructed on the caller's thread and shipped to a worker thread for the actual apply.
- `ApplySession` — runtime state (open file-handle cache, path caches, reusable DEFLATE decompressor, per-chunk progress counters). Created by consuming an `ApplyConfig` via `into_session()`.
### Capabilities built on top
**`src/index/`** — the indexed pipeline. Builds a `Plan` by streaming one or more patches without applying them, then executes the plan against a target install. Lets the launcher coalesce multiple patches into a single per-target write pass, parallelize across targets, and resume from a `Checkpoint`.
**`src/verify/`** — hash verification. Walks an install root, computes hashes, and produces a `VerifyOutcome` against an `ExpectedHash` set. Used post-apply and for offline integrity checks.
**Siblings, not a stack.** `index/` and `verify/` are peers: both depend on `parse + apply`, and neither depends on the other. There is no `use crate::index` in `verify/` and no `use crate::verify` in `index/` — a grep confirms it, and the rule is load-bearing, not incidental. The formal positioning is:
```
parse (chunk/)
└─ apply (apply/)
   ├─ index/ (consumes parse + apply)
   └─ verify/ (consumes apply)
```
This is the rule for adding new capabilities: they sit alongside `index/` and `verify/`, not nested inside one. A hypothetical `diff/` module comparing two installs would be `src/diff/`, not `src/verify/diff/` or `src/index/diff/`. The flat module list at the top of `src/` is meaningful precisely because each name is one capability — keep it that way and the architecture stays self-explaining; let things nest and "where does this go" gets answered ad-hoc until the structure is illegible.
Feature-gating these capabilities behind Cargo features is an open option, not a decision the architecture rests on. Today they are always compiled in. If a consumer ever shows up that only needs parse + apply (e.g. a future `gaveloc-patcher` variant that pulls plans from a server instead of building them locally), `index = []` and `verify = []` features can be added without disturbing the structural rule above — the modules stay siblings whether they are gated or not.
Note: `index/verify.rs` (plan-level integrity check internal to `index/`) is a different thing from the top-level `verify/` module (install-root hash verification). The shared name is unfortunate; the responsibilities are distinct — one checks that a plan was applied correctly given its expected hashes, the other walks an install and computes hashes against an external manifest. If the collision starts causing reader confusion, rename `index/verify.rs` (e.g. to `index/integrity.rs`) rather than renaming the top-level module.
## Supporting decisions
These are not co-equal architectural pillars with parse/apply. They exist because the parse/apply split, taken alone, would leave specific ergonomic or correctness problems unsolved.
**`Vfs` abstraction.** Routes all filesystem effects through a trait rather than calling `std::fs` directly. *Solves:* how to test the effects layer hermetically. `InMemoryFs` makes apply-layer tests fast (no temp dirs) and deterministic (no real filesystem state). The synchronous trait surface is a deliberate choice — the apply hot path is dominated by DEFLATE decompression and blocking syscalls, both fundamentally synchronous, and keeping the trait sync avoids pulling an async runtime into the dependency graph.
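A minimal sketch of the idea follows, assuming a much smaller trait than the real one. The `write_at` and `read_all` methods, the `MemFs` type, and the `write_block` helper are hypothetical names chosen for illustration; the actual `Vfs` method set is not described in this document.

```rust
use std::collections::HashMap;
use std::convert::TryFrom;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical, reduced trait: synchronous, no async runtime involved.
pub trait Vfs {
    fn write_at(&mut self, path: &Path, offset: u64, data: &[u8]) -> io::Result<()>;
    fn read_all(&self, path: &Path) -> io::Result<Vec<u8>>;
}

/// In-memory backing for hermetic tests (stand-in for `InMemoryFs`).
#[derive(Debug, Default)]
pub struct MemFs {
    files: HashMap<PathBuf, Vec<u8>>,
}

impl Vfs for MemFs {
    fn write_at(&mut self, path: &Path, offset: u64, data: &[u8]) -> io::Result<()> {
        let start = usize::try_from(offset)
            .map_err(|_| io::Error::new(io::ErrorKind::InvalidInput, "offset too large"))?;
        let end = start
            .checked_add(data.len())
            .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "write overflows"))?;
        let file = self.files.entry(path.to_path_buf()).or_default();
        // Writing past the current end zero-fills the gap, like a sparse file.
        if file.len() < end {
            file.resize(end, 0);
        }
        file[start..end].copy_from_slice(data);
        Ok(())
    }

    fn read_all(&self, path: &Path) -> io::Result<Vec<u8>> {
        self.files
            .get(path)
            .cloned()
            .ok_or_else(|| io::Error::from(io::ErrorKind::NotFound))
    }
}

/// Apply-side code is written against the trait, never against `std::fs`.
pub fn write_block<V: Vfs>(vfs: &mut V, path: &Path, offset: u64, payload: &[u8]) -> io::Result<()> {
    vfs.write_at(path, offset, payload)
}
```

A test can call `write_block` with a `MemFs` and then assert on `read_all`, with no temp directory involved. Production code would pass a disk-backed implementation of the same trait.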
**`ApplyConfig` / `ApplySession` typestate split.** *Solves:* how to construct apply state on a UI thread and ship it to a worker without `Arc<Mutex<_>>`. `ApplyConfig` is `Send` and immutable; `ApplySession` is the mutable runtime state created from it via `into_session()`. The typestate guarantees the immutable phase is fully done before any I/O begins.
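The shape of that hand-off might look like the sketch below. The type names and `into_session()` come from this document, but the fields, `record_chunk`, `run_on_worker`, and `assert_send` are assumptions made for the example.

```rust
use std::path::PathBuf;
use std::thread;

/// Frozen configuration: plain data, no I/O, safe to move across threads.
#[derive(Debug, Clone)]
pub struct ApplyConfig {
    install_root: PathBuf,
}

impl ApplyConfig {
    pub fn new(install_root: impl Into<PathBuf>) -> Self {
        Self { install_root: install_root.into() }
    }

    /// Consumes the config, so it cannot be changed once a session exists.
    pub fn into_session(self) -> ApplySession {
        ApplySession { install_root: self.install_root, chunks_applied: 0 }
    }
}

/// Mutable runtime state; the only way to obtain one is `into_session()`.
#[derive(Debug)]
pub struct ApplySession {
    install_root: PathBuf,
    chunks_applied: u64,
}

impl ApplySession {
    pub fn record_chunk(&mut self) {
        self.chunks_applied += 1;
    }

    pub fn chunks_applied(&self) -> u64 {
        self.chunks_applied
    }

    pub fn install_root(&self) -> &PathBuf {
        &self.install_root
    }
}

/// Compile-time check that a type can be sent to another thread.
pub fn assert_send<T: Send>() {}

/// Builds the session on the worker, not on the calling thread.
pub fn run_on_worker(config: ApplyConfig, chunk_count: u64) -> u64 {
    let worker = thread::spawn(move || {
        let mut session = config.into_session();
        for _ in 0..chunk_count {
            session.record_chunk();
        }
        session.chunks_applied()
    });
    worker.join().expect("worker thread panicked")
}
```

Because `into_session` takes `self` by value, the compiler rejects any attempt to touch the config after the session has been created. No lock or reference count is needed to move it to the worker.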
**Error domain split.** `ParseError`, `ApplyError`, `IndexError`, and `VerifyError` each cover their own domain rather than collapsing into one enum. *Solves:* callers pattern-matching at the right granularity. A parser caller cares about `TruncatedPatch` and `ChecksumMismatch`; an apply caller cares about `Vfs` failures and CRC verification at write time. The umbrella `Error` enum with `From` impls for each domain type provides a single `?`-friendly exit point for callers who don't need the distinction.
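The pattern can be sketched with two of the four domains; `IndexError` and `VerifyError` would follow the same shape. The variant names `TruncatedPatch` and `ChecksumMismatch` come from the text above, while their payloads, `ApplyError::Vfs(String)`, and the helper functions are assumptions made for this example.

```rust
#[derive(Debug, PartialEq)]
pub enum ParseError {
    TruncatedPatch,
    ChecksumMismatch { expected: u32, actual: u32 },
}

#[derive(Debug, PartialEq)]
pub enum ApplyError {
    /// Hypothetical payload; the real variant likely wraps an I/O error.
    Vfs(String),
}

/// Umbrella type for callers that do not need the per-domain distinction.
#[derive(Debug, PartialEq)]
pub enum Error {
    Parse(ParseError),
    Apply(ApplyError),
}

impl From<ParseError> for Error {
    fn from(err: ParseError) -> Self {
        Error::Parse(err)
    }
}

impl From<ApplyError> for Error {
    fn from(err: ApplyError) -> Self {
        Error::Apply(err)
    }
}

/// Parse-side step: can only fail with a parse-domain error.
fn parse_step(input: &[u8]) -> Result<u8, ParseError> {
    input.first().copied().ok_or(ParseError::TruncatedPatch)
}

/// Apply-side step: can only fail with an apply-domain error.
fn apply_step(value: u8) -> Result<(), ApplyError> {
    if value == 0 {
        Err(ApplyError::Vfs("write failed".to_string()))
    } else {
        Ok(())
    }
}

/// `?` converts each domain error into the umbrella via the `From` impls.
pub fn parse_then_apply(input: &[u8]) -> Result<(), Error> {
    let value = parse_step(input)?;
    apply_step(value)?;
    Ok(())
}
```

A caller that only parses matches on `ParseError` directly and never sees apply variants. A caller that runs the whole pipeline can match on `Error::Parse(_)` versus `Error::Apply(_)`, or simply propagate the error.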
**`#[non_exhaustive]` on plan-model types.** All public plan-model types (`Plan`, `Target`, `Region`, `PartSource`, `PartExpected`, `FilesystemOp`, checkpoint types) are `#[non_exhaustive]`. *Solves:* adding fields (e.g. per-region provenance metadata) in future minor versions without a breaking change. The cost is requiring `::new()` constructors for external callers who want to construct them. Used only where the additive-fields case is plausible.
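For illustration, a plan-model type under this rule could look like the sketch below. The `offset` and `len` fields and the `end` method are hypothetical and not taken from the real `Region`.

```rust
/// Outside the defining crate, `Region { offset, len }` literals are rejected
/// and patterns must end in `..`, so new fields can be added later without
/// breaking downstream code.
#[non_exhaustive]
#[derive(Debug, Clone, PartialEq)]
pub struct Region {
    pub offset: u64,
    pub len: u64,
}

impl Region {
    /// The constructor external callers go through instead of a struct literal.
    pub fn new(offset: u64, len: u64) -> Self {
        Self { offset, len }
    }

    /// One past the last byte covered, or `None` if that would overflow.
    pub fn end(&self) -> Option<u64> {
        self.offset.checked_add(self.len)
    }
}
```

Within the defining crate the attribute has no effect, which is why the constructor itself can still use a struct literal.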
**Tracing stability contract.** Span and field names at `info!` and `debug!` levels are treated as a public API. They live in a single `tracing_schema` module so a rename lands in one file. `trace!` names are best-effort. See the "Tracing" section of the crate-level rustdoc for the full catalog.
## Repository layout
The repo is a Cargo workspace with two members:
- **`zipatch-rs`** at the repo root — the library (parse + apply + index + verify). Published to crates.io as `zipatch-rs`. This is what `[package]` describes in the root `Cargo.toml`.
- **`zipatch-cli/`** — the `zipatch` binary (a thin clap wrapper over the library, for ad-hoc inspection of `.patch` files). Versioned independently so it can ship on its own cadence without dragging the library along; the library has no compile- or runtime dependency on the CLI.
## Module map
```
/ (workspace root + zipatch-rs lib package)
├── Cargo.toml — [workspace] + zipatch-rs [package]
├── src/
│   ├── lib.rs — crate root, re-exports, Platform enum, apply_patch_file
│   ├── error.rs — ParseError, ApplyError, IndexError, VerifyError, Error umbrella
│   ├── newtypes.rs — PatchIndex, ChunkTag, SchemaVersion
│   ├── tracing_schema.rs — stable span/event name constants (pub(crate))
│   ├── test_utils.rs — test fixtures (test-utils feature, doc(hidden))
│   │
│   ├── chunk/ — parse: wire-format → Chunk enum, no I/O
│   │   ├── mod.rs — Chunk enum, ChunkRecord, ZiPatchReader, open_patch
│   │   ├── adir.rs — AddDirectory chunk
│   │   ├── afsp.rs — ApplyFreeSpace chunk (no-op at apply time)
│   │   ├── aply.rs — ApplyOption chunk (sets ignore flags)
│   │   ├── ddir.rs — DeleteDirectory chunk
│   │   ├── fhdr.rs — FileHeader chunk
│   │   ├── util.rs — SqpackFileId, SqpkCompressedBlock helpers
│   │   └── sqpk/ — SQPK sub-commands
│   │       ├── mod.rs — SqpkCommand enum
│   │       ├── add_data.rs — SqpkAddData (A)
│   │       ├── delete_data.rs — SqpkDeleteData (D)
│   │       ├── expand_data.rs — SqpkExpandData (E)
│   │       ├── file.rs — SqpkFile (F) — AddFile, DeleteFile, RemoveAll, MakeDirTree
│   │       ├── header.rs — SqpkHeader (H)
│   │       ├── index.rs — SqpkIndex (I, no-op)
│   │       └── target_info.rs — SqpkTargetInfo (T)
│   │
│   ├── apply/ — effects: Vfs I/O + DEFLATE, ApplyConfig → ApplySession
│   │   ├── mod.rs — ApplyConfig, ApplySession, Chunk::apply dispatch
│   │   ├── cancel.rs — CancelToken
│   │   ├── checkpoint.rs — Checkpoint, CheckpointPolicy, CheckpointSink, SequentialCheckpoint, IndexedCheckpoint
│   │   ├── driver.rs — sequential apply loop (apply_patch, resume_apply_patch)
│   │   ├── observer.rs — ApplyObserver, ChunkEvent, NoopObserver
│   │   ├── path.rs — SqPack path resolution (pub(crate))
│   │   ├── sqpk.rs — SQPK apply logic (pub(crate))
│   │   └── vfs.rs — Vfs trait, StdFs, InMemoryFs
│   │
│   ├── index/ — capability: plan-and-apply pipeline over parse + apply
│   │   ├── mod.rs — public re-exports
│   │   ├── plan.rs — Plan, Target, Region, PartSource, PartExpected, FilesystemOp, TargetPath
│   │   ├── builder.rs — PlanBuilder
│   │   ├── apply.rs — IndexApplier
│   │   ├── source.rs — PatchSource trait, FilePatchSource
│   │   ├── verify.rs — PlanVerifier, RepairManifest
│   │   └── region_map.rs — per-target region accumulator (pub(crate))
│   │
│   └── verify/ — capability: hash verification against ExpectedHash
│       └── mod.rs — HashVerifier, ExpectedHash, FileVerifyOutcome, VerifyOutcome
│
└── zipatch-cli/ — workspace member: the `zipatch` binary
    ├── Cargo.toml — depends on zipatch-rs as a path dep
    ├── src/
    │   ├── main.rs — clap Cli/Commands + dispatch
    │   └── dump.rs — `dump` subcommand implementation
    └── tests/
        └── dump.rs — integration tests via assert_cmd
```
## Ecosystem context
`zipatch-rs` and `sqpack-rs` are standalone libraries that exist independently of the launcher product. They have no network dependency and no runtime requirements beyond Rust's standard library plus the dependencies listed in `Cargo.toml`.
The launcher product that consumes `zipatch-rs` is a separate workspace. It handles patch discovery, authentication, download, and UI; `zipatch-rs` handles only the binary format parsing and application once bytes are on disk.