cargo-buckal 0.1.3

# Cache system (cargo-buckal)

This document describes the current cache behavior in `cargo-buckal` as implemented in the
Rust sources under `cargo-buckal/src/`.

## Overview

`cargo-buckal` keeps a small, persistent snapshot of the last Cargo dependency graph it saw.
On the next run it diffs the new graph against this snapshot, so it updates BUCK files only
for packages that actually changed.

The cache is intentionally simple:

- It records a fingerprint per `cargo_metadata::Node` (dependency graph node).
- It is stored as a TOML file in the Buck2 repo root.
- It is versioned, and incompatible versions are ignored (no automatic migration).

## Cache file location

The cache file is written to the Buck2 project root as:

```
<buck2-root>/buckal.snap
```

The Buck2 root is discovered at runtime via `buck2 root`.
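
As a rough illustration of the lookup described above, the sketch below shells out to `buck2 root` and joins the snapshot file name onto the result. The helper names `snapshot_path` and `discover_buck2_root` are hypothetical, and the error handling is an assumption rather than what `cargo-buckal` necessarily does.

```rust
use std::io;
use std::path::{Path, PathBuf};
use std::process::Command;

/// File name of the on-disk cache, as described above.
const SNAPSHOT_FILE: &str = "buckal.snap";

/// Joins the snapshot file name onto an already-known Buck2 root.
fn snapshot_path(buck2_root: &Path) -> PathBuf {
    buck2_root.join(SNAPSHOT_FILE)
}

/// Hypothetical helper: asks `buck2 root` for the project root and trims
/// the trailing newline from its output.
fn discover_buck2_root() -> io::Result<PathBuf> {
    let output = Command::new("buck2").arg("root").output()?;
    if !output.status.success() {
        return Err(io::Error::new(io::ErrorKind::Other, "`buck2 root` failed"));
    }
    let stdout = String::from_utf8_lossy(&output.stdout);
    Ok(PathBuf::from(stdout.trim()))
}
```

A caller would typically combine the two, for example `snapshot_path(&discover_buck2_root()?)`.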

## File format

The cache is serialized as pretty TOML with a generated header comment. The top-level
structure is:

- `version`: schema version (currently `2`).
- `fingerprints`: a map of `PackageId -> fingerprint`.

Each `fingerprint` is a 32-byte BLAKE3 digest, hex-encoded as a string (64 hex characters).

Example (shape only):

```toml
# @generated by `cargo buckal`
# Not intended for manual editing.
version = 2

[fingerprints]
"path+file://($WORKSPACE)/crates/foo#foo@0.1.0" = "...hex..."
"registry+https://github.com/rust-lang/crates.io-index#serde@1.0.196" = "...hex..."
```

## Fingerprints

For each `cargo_metadata::Node`, the cache stores:

- `fingerprint = BLAKE3(bincode(Node))`

This means a node's fingerprint changes whenever Cargo metadata for that node changes
(dependencies, features, targets, etc.).
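
The property above (any change in a node's metadata yields a different fingerprint) can be shown with a standard-library-only sketch. It is deliberately not the real scheme: `NodeSketch` is a hypothetical, trimmed-down stand-in for `cargo_metadata::Node`, `#[derive(Hash)]` stands in for bincode serialization, and std's 64-bit `DefaultHasher` stands in for BLAKE3, so the digest here is 16 hex characters rather than 64.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical, trimmed-down stand-in for `cargo_metadata::Node`.
#[derive(Hash)]
struct NodeSketch {
    id: String,
    dependencies: Vec<String>,
    features: Vec<String>,
}

/// Hashes every field of the node and hex-encodes the result.
/// NOTE: std's 64-bit hasher replaces BLAKE3 here, and `#[derive(Hash)]`
/// replaces bincode; only the shape of the idea carries over.
fn fingerprint(node: &NodeSketch) -> String {
    let mut hasher = DefaultHasher::new();
    node.hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}
```

Hashing the same node twice gives the same string, while enabling an extra feature on it gives a different one, which is what lets the diff detect a changed package.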

## Workspace canonicalization

To keep the cache portable across machines and directories, `PackageId` values that point
into the workspace are canonicalized:

- Stored in cache: `path+file://($WORKSPACE)/...`
- Resolved at runtime: `($WORKSPACE)` is replaced with the actual workspace root

This conversion happens when writing (`canonicalize`) and when diffing (`resolve`).
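
A minimal sketch of the round trip, assuming package IDs are handled as plain strings. The function names `canonicalize_id` and `resolve_id` and their signatures are hypothetical; the actual `canonicalize` and `resolve` in the sources may work on typed `PackageId` values and handle more cases.

```rust
/// Placeholder stored in the cache in place of the workspace root.
const WORKSPACE_PLACEHOLDER: &str = "($WORKSPACE)";
/// Prefix of path-based package IDs.
const PATH_PREFIX: &str = "path+file://";

/// Rewrites a workspace-local package ID into its portable form.
/// IDs outside the workspace (registry or other paths) pass through unchanged.
fn canonicalize_id(id: &str, workspace_root: &str) -> String {
    let tail = id
        .strip_prefix(PATH_PREFIX)
        .and_then(|rest| rest.strip_prefix(workspace_root))
        // Reject sibling directories that merely share a name prefix.
        .filter(|tail| tail.is_empty() || tail.starts_with('/') || tail.starts_with('#'));
    match tail {
        Some(tail) => format!("{}{}{}", PATH_PREFIX, WORKSPACE_PLACEHOLDER, tail),
        None => id.to_string(),
    }
}

/// Inverse operation: substitutes the actual workspace root for the placeholder.
fn resolve_id(id: &str, workspace_root: &str) -> String {
    let tail = id
        .strip_prefix(PATH_PREFIX)
        .and_then(|rest| rest.strip_prefix(WORKSPACE_PLACEHOLDER));
    match tail {
        Some(tail) => format!("{}{}{}", PATH_PREFIX, workspace_root, tail),
        None => id.to_string(),
    }
}
```

Resolving a stored ID against a different root (for example on a CI machine) yields a path under that root, which is what makes the snapshot portable.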

## Versioning and invalidation

The cache schema is versioned via `CACHE_VERSION` in `cache.rs`.

- Current version: `2` (introduced for multi-platform support).
- If the cache file is missing, or its `version` does not match `CACHE_VERSION`, any old
  contents are ignored and the cache is rebuilt.
- There is no migration step; correctness is preferred over reuse.
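
The invalidation rule can be sketched as a single filter over an optionally loaded snapshot. The `Snapshot` struct and `usable_snapshot` function are hypothetical names for illustration; only `CACHE_VERSION` and its value come from the description above.

```rust
use std::collections::BTreeMap;

/// Mirrors the schema version described above.
const CACHE_VERSION: u32 = 2;

/// Hypothetical in-memory shape of a parsed `buckal.snap`.
#[derive(Debug, Clone, PartialEq)]
struct Snapshot {
    version: u32,
    fingerprints: BTreeMap<String, String>,
}

/// Returns the loaded snapshot only if it exists and matches the current
/// schema version; `None` tells the caller to rebuild from scratch.
fn usable_snapshot(loaded: Option<Snapshot>) -> Option<Snapshot> {
    loaded.filter(|snapshot| snapshot.version == CACHE_VERSION)
}
```

Treating a mismatched version exactly like a missing file keeps the logic to one branch, at the cost of a full regeneration after a schema bump.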

## Lifecycle in commands

### Read path

- `get_last_cache()` attempts to load the cache.
- If loading fails (the file is missing or invalid), it falls back to a fresh snapshot built
  from `cargo metadata`.

### Write path

After applying updates, a new cache snapshot is always written.

### Commands that use the cache

- `cargo buckal migrate`:
  - Uses the cache by default.
  - `--no-cache` forces a clean run by starting from an empty cache.
- `cargo buckal add`, `cargo buckal update`, `cargo buckal remove`:
  - Load the last cache, run the Cargo command, compute a diff, apply it, and save.

### Diff behavior

The diff compares cached fingerprints to the new snapshot:

- Present in new, missing in old: `Added`
- Present in old, missing in new: `Removed`
- Present in both but fingerprint changed: `Changed`

These changes drive BUCK generation and vendor directory cleanup.
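
The three rules above can be sketched as a comparison of two `package id -> fingerprint` maps. The `Change` enum and `diff` function here are illustrative assumptions with string keys and values; the actual types in `cargo-buckal` may differ.

```rust
use std::collections::BTreeMap;

/// Kind of change detected for one package.
#[derive(Debug, PartialEq, Eq)]
enum Change {
    Added,
    Removed,
    Changed,
}

/// Compares two `package id -> fingerprint` maps.
/// Packages whose fingerprint is unchanged are omitted from the result.
fn diff(
    old: &BTreeMap<String, String>,
    new: &BTreeMap<String, String>,
) -> BTreeMap<String, Change> {
    let mut changes = BTreeMap::new();
    for (id, fingerprint) in new {
        match old.get(id) {
            None => {
                changes.insert(id.clone(), Change::Added);
            }
            Some(previous) if previous != fingerprint => {
                changes.insert(id.clone(), Change::Changed);
            }
            Some(_) => {} // same fingerprint: nothing to regenerate
        }
    }
    for id in old.keys() {
        if !new.contains_key(id) {
            changes.insert(id.clone(), Change::Removed);
        }
    }
    changes
}
```

Starting from an empty `old` map marks every package as `Added`, which matches the effect described for `--no-cache`.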

## In-process cfg cache (platform mapping)

Separately from the on-disk cache, platform mapping uses an in-memory cache of
`rustc --print=cfg --target <triple>` results. This cache is per-process and initialized
once (via `OnceLock`) to avoid repeated `rustc` invocations when evaluating platform
predicates.

This cache is not persisted and does not affect `buckal.snap`.
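
One way such a per-process cache could look is sketched below, assuming a map keyed by target triple behind a `OnceLock`. The names `CFG_CACHE` and `cfgs_for`, the `Mutex<HashMap<..>>` layout, and the injected `query` closure are all assumptions for illustration; the actual structure in the sources may differ.

```rust
use std::collections::HashMap;
use std::sync::{Mutex, OnceLock};

/// Process-wide cache: target triple -> lines printed by `rustc --print=cfg`.
static CFG_CACHE: OnceLock<Mutex<HashMap<String, Vec<String>>>> = OnceLock::new();

/// Returns the cached cfg lines for `triple`, calling `query` only on a miss.
/// `query` is a hypothetical hook; a real caller would run
/// `rustc --print=cfg --target <triple>` and split stdout into lines.
fn cfgs_for(triple: &str, query: impl FnOnce(&str) -> Vec<String>) -> Vec<String> {
    // The map itself is created once per process, on first use.
    let cache = CFG_CACHE.get_or_init(|| Mutex::new(HashMap::new()));
    let mut map = cache.lock().unwrap();
    map.entry(triple.to_string())
        .or_insert_with(|| query(triple))
        .clone()
}
```

Passing the query in as a closure keeps the sketch testable without invoking `rustc`. Note that it holds the lock while the query runs, which is simple but serializes concurrent lookups.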

## Troubleshooting notes

- If you see unexpected full regeneration, check whether `buckal.snap` is missing or has a
  `version` value that does not match the current schema version.
- If you move the workspace, the `($WORKSPACE)` placeholder allows the cache to remain valid.
- If you want to force a clean run, use `cargo buckal migrate --no-cache`.