# Changelog

All notable changes to `captrack` will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).

## Unreleased

## 0.1.0 — 2026-07-03

First release. Everything below shipped in this single cut — nothing has
previously been published to crates.io.

### Public API — macros (17 total)

13 base macros plus 4 added during development:

- `tvec!("name", cap)` — `Vec<T>` (unified, zero-overhead off-feature)
- `tvecdeque!("name", cap)` — `VecDeque<T>`
- `tbtreemap!("name", cap)` — `BTreeMap<K,V>` (cap hint accepted, ignored)
- `tbtreeset!("name", cap)` — `BTreeSet<T>` (cap hint accepted, ignored)
- `tbytesmut!("name", cap)` — `bytes::BytesMut` (requires `bytes` crate or `telemetry`)
- `tfxmap!("name", cap[; hasher])` — `std::HashMap<K,V,S>` with `;`-arm per-call hasher override
- `tfxset!("name", cap[; hasher])` — `std::HashSet<T,S>` with `;`-arm override
- `tmap!("name", cap[; hasher])` — `indexmap::IndexMap<K,V,S>` with `;`-arm override
- `tset!("name", cap[; hasher])` — `indexmap::IndexSet<T,S>` with `;`-arm override
- `tdashmap!("name", cap[; hasher])` — `dashmap::DashMap<K,V,S>` with `;`-arm override
- `tsccmap!("name", cap[; hasher])` — `scc::HashMap<K,V,S>` with `;`-arm override
- `tsccset!("name", cap[; hasher])` — `scc::HashSet<T,S>` with `;`-arm override
- `tscctree!("name", cap)` — `scc::TreeIndex<K,V>` (cap hint accepted, ignored)
- `tstring!("name", cap)` — `String`
- `tbinaryheap!("name", cap)` — `BinaryHeap<T>`
- `thashbrownmap!("name", cap[; hasher])` — `hashbrown::HashMap<K,V,S>` (requires `hashbrown` feature)
- `tsmallvec!("name", cap)` — `smallvec::SmallVec<[T; N]>` (requires `smallvec` feature)

`tvec!`/`tvecdeque!`/`tbtreemap!`/`tbtreeset!`/`tfxmap!`/`tfxset!`/`tstring!`/`tbinaryheap!`
are **unified** — a single `macro_rules!` arm delegates to a `#[cfg]`-branched free
ctor function in `src/ctor.rs`. The remaining, optional-dependency macros
(`tbytesmut!`, `tmap!`, `tset!`, `tdashmap!`, `tsccmap!`, `tsccset!`, `tscctree!`,
`tsmallvec!`, `thashbrownmap!`) retain dual `#[cfg]` arms directly in `lib.rs` so
their expansion resolves against the consumer's own dependency graph rather than
captrack's.
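
As a rough illustration of the unified shape, the sketch below uses a single-arm macro that forwards to a `#[cfg]`-branched free function. The names `sketch_tvec!` and `vec_with_capacity_named` are hypothetical, and the on-feature branch only logs instead of returning a tracked wrapper; the real constructors in `src/ctor.rs` may look quite different.

```rust
/// Off-feature branch: folds to the bare constructor, so nothing extra is compiled in.
#[cfg(not(feature = "telemetry"))]
#[inline(always)]
pub fn vec_with_capacity_named<T>(_name: &'static str, cap: usize) -> Vec<T> {
    Vec::with_capacity(cap)
}

/// On-feature branch. A real implementation would return a tracked wrapper;
/// this sketch only logs the request (assumption, for illustration).
#[cfg(feature = "telemetry")]
#[inline(always)]
pub fn vec_with_capacity_named<T>(name: &'static str, cap: usize) -> Vec<T> {
    eprintln!("site {}: requested capacity {}", name, cap);
    Vec::with_capacity(cap)
}

/// Single-arm macro: the `#[cfg]` decision lives in the function, not in the macro.
/// A library macro would spell the path as `$crate::...` instead of a bare name.
macro_rules! sketch_tvec {
    ($name:expr, $cap:expr) => {
        vec_with_capacity_named($name, $cap)
    };
}

/// Small usage helper so the macro is exercised inside the sketch itself.
pub fn rows_buffer() -> Vec<u32> {
    sketch_tvec!("rows", 16)
}
```

Because the macro has one arm, the consumer-visible expansion is identical in both modes; only the function body behind it changes.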

### `t*_owned!` macro family (initial-cap-only, 10 macros)

`tvec_owned!`, `tvecdeque_owned!`, `tbytesmut_owned!`, `tfxmap_owned!`,
`tfxset_owned!`, `tmap_owned!`, `tset_owned!`, `tdashmap_owned!`,
`tsccmap_owned!`, `tsccset_owned!`.

Each always returns the **bare** collection type (never `Tracked*`) and records
just the requested initial capacity as a single sample — no Drop-time peak
tracking, no wrapper overhead at runtime beyond that one recorded sample. No
`_owned` sibling exists for `tbtreemap!`/`tbtreeset!`/`tscctree!` since those
types have no `with_capacity` constructor.

### Public API — types and traits

- `trait IntoInner` — converts `TrackedX` (or its off-feature alias) to the inner bare type
  without requiring `S: Default` or `S: Clone`; uses `ptr::read` + `mem::forget` internally.
- `struct SampleStats` with fields `count`, `min`, `max`, `mean`, `median`, `p95`, `p99`,
  `stddev`; constructed via `SampleStats::from_samples(&[usize]) -> Option<SampleStats>`.
- `fn dump_capacity_stats(path: impl AsRef<Path>) -> std::io::Result<()>` — writes a sorted
  JSON report (`version`, `stats[]`) in telemetry mode; no-op stub in off-feature mode.
- `trait CapInspect` (`src/cap_inspect.rs`) — `cap_inspect_at(&self, name, file, line, column)`,
  one impl per tracked type. Records a consumption-point sample (not a creation) against the
  binding's construction-site location. Used by the `captrack-pgo-lint` Dylint plugin at
  by-value-escape positions (return, struct field init, function argument, type-ascribed `let`
  init) where a `wrap_from` rewrite would trip `E0308`. Calling `cap_inspect_at` for a call-site
  that was never registered (e.g. constructed in non-instrumented code) is a silent no-op in
  release, a `debug_assert` in debug — it never panics in production.
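
To make the `SampleStats` shape concrete, here is a minimal sketch of a `from_samples` constructor. The field types, the nearest-rank percentile rule, and the population (not sample) standard deviation are all assumptions; the changelog only names the fields and the signature.

```rust
/// Sketch of the summary struct; field types are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct SampleStats {
    pub count: usize,
    pub min: usize,
    pub max: usize,
    pub mean: f64,
    pub median: f64,
    pub p95: usize,
    pub p99: usize,
    pub stddev: f64,
}

/// Nearest-rank percentile on an already sorted, non-empty slice.
fn nearest_rank(sorted: &[usize], pct: usize) -> usize {
    // 1-based rank = ceil(pct / 100 * n), clamped to at least 1.
    let rank = (pct * sorted.len() + 99) / 100;
    sorted[rank.max(1) - 1]
}

impl SampleStats {
    /// Returns `None` for an empty slice, matching the `Option` in the signature.
    pub fn from_samples(samples: &[usize]) -> Option<SampleStats> {
        if samples.is_empty() {
            return None;
        }
        let mut sorted = samples.to_vec();
        sorted.sort_unstable();
        let n = sorted.len();

        let mean = sorted.iter().map(|&s| s as f64).sum::<f64>() / n as f64;
        let median = if n % 2 == 1 {
            sorted[n / 2] as f64
        } else {
            (sorted[n / 2 - 1] as f64 + sorted[n / 2] as f64) / 2.0
        };
        // Population variance: divide by n, not n - 1 (assumption).
        let variance = sorted
            .iter()
            .map(|&s| {
                let d = s as f64 - mean;
                d * d
            })
            .sum::<f64>()
            / n as f64;

        Some(SampleStats {
            count: n,
            min: sorted[0],
            max: sorted[n - 1],
            mean,
            median,
            p95: nearest_rank(&sorted, 95),
            p99: nearest_rank(&sorted, 99),
            stddev: variance.sqrt(),
        })
    }
}
```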

### Axis 1 — `telemetry` feature (on/off)

- Off (default): every macro expands to the bare constructor; compiler sees no extra code.
- On: macros return `Tracked*` wrapper structs; a global lock-free registry keyed by
  `(file, line, column)` call-site location collects samples.
- All 17 tracked types implement `Deref`/`DerefMut`, `Drop`, `IntoIterator`,
  `From<TrackedX> for BareX`, `IntoInner`, and (Phase K) `wrap_from(inner, name, file, line,
  column)` — wraps an already-constructed value without extra allocation, the universal
  instrumentation path used by the Dylint plugin.
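
The on-feature behaviour can be pictured with a small stand-in wrapper. This sketch is not captrack's implementation: it replaces the lock-free registry with a plain `Mutex<HashMap>` for brevity, and `samples_for` is a hypothetical helper added only so the result can be inspected.

```rust
use std::collections::HashMap;
use std::ops::{Deref, DerefMut};
use std::sync::{Mutex, OnceLock};

/// Call-site key: (file, line, column).
type Site = (&'static str, u32, u32);

/// Simplified global registry (the real one is described as lock-free).
fn registry() -> &'static Mutex<HashMap<Site, Vec<usize>>> {
    static REGISTRY: OnceLock<Mutex<HashMap<Site, Vec<usize>>>> = OnceLock::new();
    REGISTRY.get_or_init(|| Mutex::new(HashMap::new()))
}

pub struct TrackedVec<T> {
    inner: Vec<T>,
    name: &'static str,
    site: Site,
}

impl<T> TrackedVec<T> {
    /// Wraps an already-constructed value; no extra allocation happens here.
    pub fn wrap_from(
        inner: Vec<T>,
        name: &'static str,
        file: &'static str,
        line: u32,
        column: u32,
    ) -> Self {
        TrackedVec { inner, name, site: (file, line, column) }
    }

    /// Human-readable label carried alongside the location key.
    pub fn name(&self) -> &'static str {
        self.name
    }
}

impl<T> Deref for TrackedVec<T> {
    type Target = Vec<T>;
    fn deref(&self) -> &Vec<T> {
        &self.inner
    }
}

impl<T> DerefMut for TrackedVec<T> {
    fn deref_mut(&mut self) -> &mut Vec<T> {
        &mut self.inner
    }
}

impl<T> Drop for TrackedVec<T> {
    /// Records the final capacity once, when the wrapper goes out of scope.
    fn drop(&mut self) {
        let mut map = registry().lock().unwrap_or_else(|e| e.into_inner());
        map.entry(self.site).or_default().push(self.inner.capacity());
    }
}

/// Hypothetical inspection helper: copies out the samples recorded for one site.
pub fn samples_for(site: Site) -> Vec<usize> {
    let map = registry().lock().unwrap_or_else(|e| e.into_inner());
    map.get(&site).cloned().unwrap_or_default()
}
```

Thanks to `Deref`/`DerefMut`, calls such as `v.push(x)` or `v.len()` work on the wrapper unchanged, which is what lets instrumentation wrap a constructor without touching the surrounding code.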

**Axis 1B — off-feature mirror features (alias-only, no telemetry overhead)**

- `bytes`, `indexmap`, `dashmap`, `scc`, `smallvec`, `hashbrown` feature flags expose `TrackedX`
  as a type alias to the underlying bare type so consumer code compiles without `#[cfg]` guards.

### Axis 2 — hasher choice

**2A — global default (`CapHasher`, `src/hasher.rs`):**

- Default: `std::collections::hash_map::RandomState`
- `fxhash` → `fxhash::FxBuildHasher`
- `ahash` → `ahash::RandomState`
- `foldhash` → `foldhash::fast::RandomState`
- `rustc-hash` → `rustc_hash::FxBuildHasher`
- Selecting two hasher features simultaneously triggers a `compile_error!`.
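
A feature-selected alias of this kind is commonly written as below. This is a sketch of the pattern under the feature names listed above, not the contents of `src/hasher.rs`; `new_map` is a hypothetical helper, and only the default branch compiles without the optional crates.

```rust
use std::collections::hash_map::RandomState;
use std::collections::HashMap;
use std::hash::Hash;

// Reject any pair of hasher features at compile time.
#[cfg(any(
    all(feature = "fxhash", any(feature = "ahash", feature = "foldhash", feature = "rustc-hash")),
    all(feature = "ahash", any(feature = "foldhash", feature = "rustc-hash")),
    all(feature = "foldhash", feature = "rustc-hash")
))]
compile_error!("enable at most one hasher feature");

// Default: the standard library's SipHash-based RandomState.
#[cfg(not(any(
    feature = "fxhash",
    feature = "ahash",
    feature = "foldhash",
    feature = "rustc-hash"
)))]
pub type CapHasher = RandomState;

#[cfg(feature = "fxhash")]
pub type CapHasher = ::fxhash::FxBuildHasher;

#[cfg(feature = "ahash")]
pub type CapHasher = ::ahash::RandomState;

#[cfg(feature = "foldhash")]
pub type CapHasher = ::foldhash::fast::RandomState;

#[cfg(feature = "rustc-hash")]
pub type CapHasher = ::rustc_hash::FxBuildHasher;

/// Hypothetical helper: builds a map with whichever hasher the alias resolved to.
pub fn new_map<K: Eq + Hash, V>(cap: usize) -> HashMap<K, V, CapHasher> {
    HashMap::with_capacity_and_hasher(cap, CapHasher::default())
}
```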

**2B — per-call override:** the 7 hash-keyed macros (`tfxmap!`, `tfxset!`, `tmap!`, `tset!`,
`tdashmap!`, `tsccmap!`, `tsccset!`) accept an optional `; hasher_expr` arm to inject a different
hasher at a single call-site without changing the global default.
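
The optional `;` arm can be expressed in `macro_rules!` because `;` is a legal follower of an `expr` fragment. The sketch below shows the two-arm shape on a plain `std` `HashMap`; `sketch_map!` and both helper functions are hypothetical names, and no telemetry is recorded.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::BuildHasherDefault;

macro_rules! sketch_map {
    // No override: forward to the second arm with the default hasher.
    ($name:expr, $cap:expr) => {
        sketch_map!($name, $cap; ::std::collections::hash_map::RandomState::default())
    };
    // `; hasher` override: the caller supplies the BuildHasher value.
    ($name:expr, $cap:expr; $hasher:expr) => {{
        let _label: &'static str = $name; // label unused in this sketch
        ::std::collections::HashMap::with_capacity_and_hasher($cap, $hasher)
    }};
}

/// Uses the default-hasher arm.
pub fn default_map() -> HashMap<&'static str, u32> {
    sketch_map!("by_name", 16)
}

/// Uses the `;` arm to pin a fixed-seed hasher for this one call site.
pub fn deterministic_map() -> HashMap<&'static str, u32, BuildHasherDefault<DefaultHasher>> {
    sketch_map!("by_name_fixed", 16; BuildHasherDefault::<DefaultHasher>::default())
}
```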

**2C — `declare_collections!` proc-macro** (companion crate `captrack-macros`):

- `captrack::declare_collections! { hasher = MyHasher, prefix = my }` generates 13
  `macro_rules!` (`my_vec!`, `my_map!`, …) that delegate to `::captrack::t*!` with the
  named hasher injected via the `;`-arm. Per-call `; hasher` override in generated macros is
  preserved. Needed because stable `macro_rules!` can't emit `$`-metavariables (no `$$` on
  stable), so the expansion has to be generated by a real proc-macro.

### Axis 3 — clippy enforcement

- `clippy.toml.example` — full disallowed-methods ban list covering all bare constructors
  for every tracked type.
- All captrack macro expansions include `#[allow(clippy::disallowed_methods,
  clippy::disallowed_types)]` so consumer-level bans never fire on generated code.

### Registry internals

- Key is `(&'static str, u32, u32)` — `(file, line, column)` captured via `file!()`, `line!()`,
  `column!()` in each macro. Each distinct source location is one independent registry entry;
  the `name` string carried in the macro literal is a human label only, fixed at first insert.
- **Three distinct telemetry numbers, three meanings:**
  - `creation_count` — `+1` per CONSTRUCTION of an instance (fires from `with_capacity_named` /
    `wrap_from`). A binding constructed once has `creation_count == 1` regardless of how many
    times its consumption points are inspected.
  - `samples` — the reservoir-bounded snapshot of capacity/length observations
    (`samples.len() ≤ CAPTRACK_SAMPLE_CAP`).
  - `total_observed` — the true count of every sample ever recorded for the site, including
    those evicted by the reservoir.
- **Bounded reservoir sampling** — Vitter's Algorithm R bounds memory to `CAPTRACK_SAMPLE_CAP`
  (default 4096) statistically-representative samples per site, backed by an `AtomicU64
  seen_count` tracking the true population size. All registry operations are lock-free except
  the reservoir's inner `Mutex<Vec<usize>>`, which only sees contention on Drop / `cap_inspect_at`
  / construction — never a hot inner loop.
- **Capacity- vs length-based sample sources.** Every `Drop` (and `IntoIterator::into_iter`,
  which pre-empts Drop) pushes one sample: capacity-based collections (`Vec`, `VecDeque`,
  `String`, `HashMap`, `HashSet`, `IndexMap`, `IndexSet`, `BytesMut`, `SmallVec`,
  `hashbrown::HashMap`, `BinaryHeap`) push `inner.capacity()`, which never decreases unless the
  caller explicitly shrinks the collection (e.g. `shrink_to_fit`), so the final value normally
  equals the peak. Length-based collections (`BTreeMap`, `BTreeSet`,
  `DashMap`, `scc::HashMap`, `scc::HashSet`, `scc::TreeIndex` — none of these expose a
  `capacity()` method) push `inner.len()` at Drop time — **not** the peak if the collection was
  drained or partially cleared first.
- `dump_capacity_stats` drains samples via `pop_all`, serialises, then pushes values back
  (registry survives repeated dump calls). Entries sorted by `max(samples)` descending.
- Autodump writes are crash-resilient: atomic write via `.tmp` + rename, PID- and
  process-start-time-qualified filenames and tmp paths so concurrent processes of the same
  compiled binary (e.g. `cargo-nextest`'s one-process-per-test model) never collide or corrupt
  each other's output, plus a `DUMP_LOCK` mutex serializing the periodic autodump thread against
  the atexit destructor within one process.
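
For readers unfamiliar with Algorithm R, the following single-threaded sketch shows how a fixed-size reservoir and a separate population counter relate to `samples` and `total_observed`. It is an illustration only: the real registry is described as using an `AtomicU64` and a `Mutex<Vec<usize>>`, and the SplitMix64 generator here is a stand-in chosen to keep the example dependency-free.

```rust
/// Fixed-capacity reservoir (Algorithm R), single-threaded for clarity.
pub struct Reservoir {
    cap: usize,
    samples: Vec<usize>,
    seen: u64,
    rng_state: u64,
}

impl Reservoir {
    pub fn new(cap: usize, seed: u64) -> Self {
        Reservoir { cap, samples: Vec::with_capacity(cap), seen: 0, rng_state: seed }
    }

    /// SplitMix64 step: a small stand-in PRNG, not what captrack uses.
    fn next_u64(&mut self) -> u64 {
        self.rng_state = self.rng_state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.rng_state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Records one observation; memory stays bounded by `cap`.
    pub fn record(&mut self, sample: usize) {
        self.seen += 1;
        if self.samples.len() < self.cap {
            // Fill phase: keep everything until the reservoir is full.
            self.samples.push(sample);
        } else {
            // Replace phase: pick j in [0, seen); keep the sample iff j < cap.
            // (The modulo introduces a negligible bias, ignored in this sketch.)
            let j = self.next_u64() % self.seen;
            if j < self.cap as u64 {
                self.samples[j as usize] = sample;
            }
        }
    }

    /// Bounded snapshot, at most `cap` entries.
    pub fn samples(&self) -> &[usize] {
        &self.samples
    }

    /// True number of observations, including evicted ones.
    pub fn total_observed(&self) -> u64 {
        self.seen
    }
}
```

After `n` observations each one has probability `cap / n` of being in the reservoir, which is why statistics computed on `samples` remain representative of the full population.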

### `captrack-pgo` — profile-guided capacity optimization CLI

Separate bin-crate in the workspace. Full pipeline, one subcommand per step:

`wire → instrument → cargo bench/test × N → merge → analyze (optional) → apply → undo`,
plus `uninstrument` / `unwire` to revert instrumentation, and a one-command `measure`
orchestration.

**Subcommands:**

- `wire` / `unwire` — patch every workspace member's `Cargo.toml` to add the captrack dep
  and `telemetry` feature; `unwire` reverts.
- `instrument` / `uninstrument` — run the `captrack-pgo-lint` Dylint plugin in instrumentation
  mode, wrapping bare constructors in `wrap_from(...)` (Phase K) and injecting `cap_inspect_at`
  calls at by-value-escape points the wrapper rewrite can't reach (Phase L). Covers `--all-targets`
  (lib, bins, tests, benches, examples) — constructors inside `#[cfg(test)]` modules and
  `tests/*.rs` integration tests are instrumented too.
- `merge` — group profile entries by `(file, line, column)`, sum `creation_count`, concatenate
  `samples`, then optionally reservoir-sample down to `--reservoir-cap` via a per-site seeded
  LCG (deterministic across runs). Accepts glob patterns for `--inputs`.
- `analyze` — `SiteShape` distribution classifier (`UnimodalTight` / `UnimodalSpread` / `Bimodal`
  / `HeavyTail` / `MostlyZero` / `InsufficientData`) with a per-shape `PolicyOverride`
  recommendation; `--write-policy` injects the recommendation back into the profile JSON as a
  per-site `policy` field.
- `apply` — runs `cargo dylint --fix` with the `CAPTRACK_PGO_CAPACITY` lint, rewriting matched
  constructors to `with_capacity(N)` (or the hasher-bearing form, see below). Snapshots files
  before/after and writes a manifest for `undo`.
- `undo` — restores `content_before` from the manifest after verifying the current file still
  matches `sha256_after` (refuses if the file was edited after `apply`).
- `measure` — one-command orchestration of `wire → instrument → bench×N → merge →
  uninstrument → unwire`, RAII-guarded via `CleanupGuard` (`disarm()` on success, reverts on
  panic), with `cargo metadata`-driven bench-crate auto-detection. Each bench run gets its own
  dump subdirectory (cleared before the run) rather than a predictable filename, since the
  autodump filename now embeds PID and process-start-time and can't be guessed up front.
- `string-reuse` — see the standalone lint section below.
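
The grouping step of `merge` can be sketched as follows. The `ProfileEntry` struct is a hypothetical in-memory shape, not the actual profile JSON schema, and the optional reservoir down-sampling to `--reservoir-cap` is left out.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory form of one profile entry.
#[derive(Debug, Clone, PartialEq)]
pub struct ProfileEntry {
    pub file: String,
    pub line: u32,
    pub column: u32,
    pub creation_count: u64,
    pub samples: Vec<usize>,
}

/// Groups entries by (file, line, column), summing counts and concatenating samples.
pub fn merge(inputs: Vec<Vec<ProfileEntry>>) -> Vec<ProfileEntry> {
    let mut by_site: BTreeMap<(String, u32, u32), (u64, Vec<usize>)> = BTreeMap::new();
    for profile in inputs {
        for e in profile {
            let slot = by_site
                .entry((e.file, e.line, e.column))
                .or_insert_with(|| (0, Vec::new()));
            slot.0 += e.creation_count;
            slot.1.extend(e.samples);
        }
    }
    // BTreeMap iteration gives a deterministic, location-sorted output order.
    by_site
        .into_iter()
        .map(|((file, line, column), (creation_count, samples))| ProfileEntry {
            file,
            line,
            column,
            creation_count,
            samples,
        })
        .collect()
}
```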

**Capacity policy knobs** (`--cap-from`, `--cap-mul`, `--cap-round` on `apply`):

| Flag | Env var | Values | Default |
|---|---|---|---|
| `--cap-from` | `CAPTRACK_PGO_CAP_FROM` | `max` \| `mean` \| `median` \| `p95` \| `p99` | `p95` |
| `--cap-mul` | `CAPTRACK_PGO_CAP_MUL` | float > 0 | `1.0` |
| `--cap-round` | `CAPTRACK_PGO_CAP_ROUND` | `pow2` \| `to8` \| `exact` | `pow2` |

Formula: `cap = round_mode(source_statistic × cap_mul)`. Defaults reproduce the original
`next_pow2(p95)` formula exactly; default-variant values are omitted from the forwarded
environment so the plugin's own defaults match. Per-site `policy` fields in the profile JSON
(as written by `analyze --write-policy`) override individual globals for that one site —
this now round-trips correctly through the profile loader (see Fixed, below). Invalid
`--cap-mul` (≤ 0 or NaN) is rejected at pre-flight with a clear error before touching files.
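
A worked version of the formula may help. The sketch below assumes that the scaled value is rounded up to an integer before the rounding mode is applied and that `to8` means "round up to the next multiple of 8"; neither detail is stated above, and `recommended_cap` is a hypothetical name.

```rust
/// Mirrors the three `--cap-round` values.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum CapRound {
    Pow2,
    To8,
    Exact,
}

/// cap = round_mode(source_statistic * cap_mul).
/// Returns `None` for an invalid multiplier or on arithmetic overflow.
pub fn recommended_cap(source_statistic: f64, cap_mul: f64, round: CapRound) -> Option<usize> {
    // Same rejection rule as the pre-flight check: <= 0 or NaN.
    if cap_mul.is_nan() || cap_mul <= 0.0 {
        return None;
    }
    // Assumption: fractional results are rounded up so the capacity never undershoots.
    let scaled = (source_statistic * cap_mul).ceil() as usize;
    match round {
        CapRound::Pow2 => scaled.checked_next_power_of_two(),
        // Assumption: `to8` rounds up to the next multiple of 8.
        CapRound::To8 => scaled.checked_add(7).map(|v| v / 8 * 8),
        CapRound::Exact => Some(scaled),
    }
}
```

With a p95 of 100, the defaults (`cap_mul = 1.0`, `pow2`) give 128; `--cap-mul 1.5 --cap-round to8` gives 152 under the assumptions above.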

**Hasher swap** (`--hasher <fx|ahash|foldhash|none>` on `apply`):

When set, matched `HashMap`/`HashSet`/`IndexMap`/`IndexSet`/`DashMap`/`scc::HashMap`/
`scc::HashSet`/`hashbrown::HashMap` constructors are additionally upgraded to
`with_capacity_and_hasher(N, <hasher_default_expr>)`:

| `--hasher` | Replacement expression |
|---|---|
| `fx` | `::fxhash::FxBuildHasher::default()` |
| `ahash` | `::ahash::RandomState::new()` |
| `foldhash` | `::foldhash::fast::RandomState::default()` |
| `none` | no hasher change (default) |

- `HashMap::new()` / `HashMap::with_capacity(K)` → `with_capacity_and_hasher(N, <expr>)`.
- `with_capacity_and_hasher(K, h)` where `h` is one of the three known defaulted expressions →
  replaced (idempotent); a custom `h` is preserved and only `K` is replaced.
- `Vec`, `VecDeque`, `BTreeMap`, `BTreeSet` — `--hasher` is silently ignored (not hash-keyed).
- **Phase N — multi-span suggestion for type-ascribed lets.** `let m: HashMap<K, V> = HashMap::new();`
  would fail with `E0308` if only the constructor were rewritten (the ascription pins `S = RandomState`).
  The lint emits a `multipart_suggestion` that atomically rewrites both the ascription's generic
  arg list and the constructor, so `cargo fix` applies both edits together.
- **Phase O — `HasherKind` classifier (already-fast detection).** When the ascription pins a
  hasher, the snippet is classified as `FastKnown` (already `fxhash::FxBuildHasher`,
  `ahash::RandomState`, `foldhash::fast::RandomState`, `rustc_hash::FxBuildHasher`, or
  `BuildHasherDefault<FxHasher>` — skip the swap, capacity-only rewrite, avoids churn in
  workspaces already on a fast hasher), `SlowDefault` (explicit `RandomState` — skip the swap,
  emit a nudge to remove the explicit hasher instead), or `Unknown` (user-defined hasher — skip
  the swap).

**`apply` recognizes empty `vec![]` sites.** `vec![]` expands to `Vec::new()`, so a profiled
`let mut v = vec![];` grown via `.push()` in a hot loop now gets the same
`Vec::with_capacity(N)` suggestion a hand-written `Vec::new()` would. Detection is structural
(the resolved callee must literally be `Vec::new`), so `vec![x; n]` (resolves to `from_elem`)
and `vec![a, b, c]` (resolves to `into_vec`) are correctly left untouched — their capacity is
already fixed by the macro's arguments.
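
The effect of the rewrite is easy to observe with plain `std`. In this sketch, `count_reallocations` is a hypothetical helper that counts how often a vector's capacity changes while it is filled; the empty `vec![]` start stands for the code before `apply`, the `Vec::with_capacity(N)` start for the code after.

```rust
/// Pushes `n` items and counts how many times the capacity changed,
/// which is a proxy for the number of reallocations.
pub fn count_reallocations(mut v: Vec<u32>, n: u32) -> usize {
    let mut reallocs = 0;
    let mut last_cap = v.capacity();
    for i in 0..n {
        v.push(i);
        if v.capacity() != last_cap {
            reallocs += 1;
            last_cap = v.capacity();
        }
    }
    reallocs
}

/// Before the rewrite: starts empty and grows on demand.
pub fn before_apply(n: u32) -> usize {
    count_reallocations(vec![], n)
}

/// After the rewrite: capacity is reserved up front.
pub fn after_apply(n: u32) -> usize {
    count_reallocations(Vec::with_capacity(n as usize), n)
}
```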

### `CAPTRACK_PGO_STRING_REUSE` lint

A standalone Dylint pass, independent of the capacity profile, that detects a `String` binding
fully reassigned inside a loop —

```rust
let mut s = String::new();
for item in items {
    s = format!("{}-{}", prefix, item);  // old buffer dropped, fresh one allocated every iteration
    consume(&s);
}
```

— and suggests reusing its heap buffer instead:

```rust
{ s.clear(); s.push_str(&(format!("{}-{}", prefix, item))); }
```

(`Applicability::MaybeIncorrect`, so `cargo fix` won't auto-apply it — review with `--dry-run`
first). Fires only when the RHS provably does not reference the old value of `s` (the critical
soundness guard: `clear()`-ing before evaluating a self-referential RHS would change semantics).
Nested reassignment (inside `if`/`match`), multiple reassignment sites, and non-`String` types
are accepted false negatives: the lint prefers missing these cases to ever reporting a false positive.
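
The reason for the self-reference guard can be shown in a few lines. Both functions below are hypothetical; the second applies the clear-then-push shape to a right-hand side that reads the old value of `s`, which is exactly the case the lint is described as refusing to rewrite.

```rust
/// Original shape: the RHS reads the old value of `s` before it is replaced.
pub fn reassign(mut s: String, item: &str) -> String {
    s = format!("{}-{}", s, item);
    s
}

/// Naive rewrite: `clear()` runs first, so the RHS now sees an empty `s`.
pub fn clear_then_push(mut s: String, item: &str) -> String {
    s.clear();
    let rhs = format!("{}-{}", s, item);
    s.push_str(&rhs);
    s
}
```

Starting from `"a"` with item `"b"`, the first function yields `"a-b"` while the second yields `"-b"`, so the rewrite is only sound when the right-hand side does not mention `s`.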

CLI: `captrack-pgo string-reuse [--workspace <dir>] [--lint-path <dir>] [--dry-run]
[--allow-dirty]`; writes a revertible manifest (`target/captrack-pgo/last-string-reuse.json`).

### Design history: Dylint-based `apply` supersedes an earlier syn-based prototype

An early prototype of `captrack-pgo apply` (plus `propose` / `auto`) used a syn-based
source-matching pipeline. It was replaced during development by the current Dylint-based
`apply`, which operates on rustc's HIR after type-checking and has no false negatives for
type aliases, `Default::default()` calls, or constructors inside macro expansions — gaps the
syn matcher could not close. Since nothing was ever published, this is a design-history note
rather than a breaking change: the syn-based commands never shipped and no external caller
depended on them.

### Fixed

- **Autodump filename collisions under `cargo-nextest`.** `default_dump_path()` originally
  derived the destination filename from `current_exe()`'s stem alone, which is identical across
  every OS process running the same compiled test/bench binary. Under `cargo-nextest`'s
  one-process-per-test model this meant N processes raced to write (and silently clobber) the
  same file — observed losing entire crates' worth of samples. Fixed in two steps: first by
  qualifying the filename with the writer's PID, then (after discovering Windows recycles PIDs
  within a single profiling run, causing ~16% of dumps to still collide) by adding the process
  start time as well: `profile-<stem>-<pid>-<start_ms>.json`.
- **Concurrent same-binary processes could corrupt the dump file.** The atomic-write `.tmp`
  intermediate path was unqualified (`<path>.tmp`), so two racing processes' writes to the same
  tmp file could interleave into a corrupt, concatenated JSON document before the rename. Fixed
  by qualifying the tmp path with the writer's PID and adding a `DUMP_LOCK` mutex serializing
  the periodic autodump thread against the atexit destructor within one process.
- **`TrackedString` missing `AsRef<str>` / `AsRef<[u8]>`.** Broke generic-bound call sites like
  `tokio::fs::write(path, tracked_string)`. Both impls added.
- **`captrack-pgo instrument` was missing `--all-targets`.** `lint_apply.rs` already passed it
  (so the capacity rewrite covered every compilation unit), but `lint_instrument.rs` did not —
  cargo defaulted to lib+bin targets only, so constructors inside `#[cfg(test)]` modules and
  `tests/*.rs` were never instrumented and could never appear in a profile regardless of test
  coverage. Fixed (also applied to `lint_string_reuse.rs` for the same reason).
- **`analyze --write-policy` output was silently discarded on load.** The profile loader
  hard-coded `policy: None` regardless of what was actually in the dump JSON, so per-site policy
  overrides written by `analyze` never reached `apply`'s rules engine. The loader now threads
  `SitePolicy` through the parsed profile entry.
- **`measure`'s per-bench dump discovery broke after the PID-qualified filename fix**, since it
  expected a predictable `profile-<bench>.json` path that no longer exists. Fixed by giving each
  bench run its own dump subdirectory instead of guessing a filename.
- **CI was fully red on GitHub Actions despite passing locally**, for three independent reasons,
  all fixed:
  - The MSRV job choked on the `Cargo.lock` v4 lockfile format and on edition-2024 transitive
    deps (`indexmap`, `getrandom`) incompatible with the pinned 1.74 toolchain. The lockfile was
    pinned back to v3 and the deps to edition-2021-safe versions.
  - The `test-tools` job's `run_lint_apply`/`run_lint_instrument` pre-flight checked
    `cargo-dylint` availability before cheaper non-fatal checks, so runners without
    `cargo-dylint` installed failed even on `--dry-run` code paths designed to work without it.
    The dylint check now runs last and is skipped entirely in `--dry-run` mode.
  - A flaky test in `autodump.rs` read/wrote `CAPTRACK_DUMP_DIR` without taking the module's
    `ENV_LOCK`, causing intermittent cross-test interference under parallel test execution. It
    now takes the lock, matching every other env-mutating test in the file.

  CI also gained a `test-tools` job: previously `cargo test` at the workspace root only exercised
  the root `captrack` package, so `captrack-pgo` and `captrack-macros` never ran in CI at all.
- **Third-party/std type recognition robustness in `captrack-pgo-lint`** (#354): in some
  nightlies the `#[rustc_diagnostic_item]` for `String` is not present on the struct itself (only
  on individual methods), so diagnostic-item matching alone missed it. Fixed by adding a
  path-string fallback (`"alloc::string::String"` / `"std::string::String"`) ahead of
  `match_third_party_path`. The same session also fixed a Windows-specific `compiletest_rs`
  pipe race that cascade-failed the `instrument` / `suggest_hasher` / `ui` UI-test trio (poisoned
  mutex recovery + `catch_unwind` isolation per test), applied to both `tests/ui_test.rs` and
  `tests/per_type.rs`.
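
The two autodump fixes at the top of this list combine a process-qualified filename with a write-to-tmp-then-rename step. The sketch below illustrates that pattern with `std` only; `dump_file_name` and `write_atomic` are hypothetical names, the start time is passed in by the caller, and the tmp suffix layout is an assumption.

```rust
use std::fs;
use std::io::{self, Write};
use std::path::{Path, PathBuf};
use std::process;

/// Builds `profile-<stem>-<pid>-<start_ms>.json`.
pub fn dump_file_name(stem: &str, pid: u32, start_ms: u128) -> String {
    format!("profile-{}-{}-{}.json", stem, pid, start_ms)
}

/// Writes `bytes` to a PID-qualified tmp file next to `dest`, then renames it
/// into place so readers never observe a half-written file.
pub fn write_atomic(dest: &Path, bytes: &[u8]) -> io::Result<()> {
    // Assumed tmp naming: `<dest>.<pid>.tmp`, unique per writing process.
    let mut tmp = dest.as_os_str().to_owned();
    tmp.push(format!(".{}.tmp", process::id()));
    let tmp = PathBuf::from(tmp);

    {
        let mut f = fs::File::create(&tmp)?;
        f.write_all(bytes)?;
        f.sync_all()?; // flush to disk before the rename publishes the file
    }
    fs::rename(&tmp, dest)
}
```

A rename within one directory replaces the destination in a single step on common platforms, which is what makes the final file either fully old or fully new.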

### Internal architecture

- `src/ctor.rs` — `#[cfg]`-branched `#[inline(always)]` free ctor functions backing the unified
  std macros; off-feature variant folds to a bare constructor with zero overhead.
- `src/aliases.rs` — off-feature `TrackedX = BareX` type aliases for source-level symmetry.
- `From<TrackedX> for BareX` impls use `unsafe { ptr::read(&self.inner) }` +
  `mem::forget(self)` to move the inner value without `S: Default` or `S: Clone` bounds.
- `captrack-pgo-lint` is a Dylint plugin (nightly-only, pinned toolchain in
  `captrack-pgo-lint/rust-toolchain.toml`); `captrack-pgo` itself remains stable-only and shells
  out to `cargo dylint`.
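
The `ptr::read` + `mem::forget` move is needed because Rust does not allow moving a field out of a type that implements `Drop`. The sketch below shows the pattern on a hypothetical `TrackedMap`; the `DROPS` counter exists only to make the skipped destructor observable, and whether the real conversion records a sample first is not covered here.

```rust
use std::collections::HashMap;
use std::mem;
use std::ptr;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts destructor runs so the sketch can show that `into_inner` skips `Drop`.
pub static DROPS: AtomicUsize = AtomicUsize::new(0);

pub struct TrackedMap<K, V, S> {
    inner: HashMap<K, V, S>,
}

impl<K, V, S> TrackedMap<K, V, S> {
    pub fn wrap(inner: HashMap<K, V, S>) -> Self {
        TrackedMap { inner }
    }
}

impl<K, V, S> Drop for TrackedMap<K, V, S> {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::Relaxed);
    }
}

pub trait IntoInner {
    type Inner;
    fn into_inner(self) -> Self::Inner;
}

// Note: no `S: Default` or `S: Clone` bound is required anywhere.
impl<K, V, S> IntoInner for TrackedMap<K, V, S> {
    type Inner = HashMap<K, V, S>;

    fn into_inner(self) -> HashMap<K, V, S> {
        // SAFETY: `self.inner` is initialized and valid for reads. `self` is
        // forgotten right after, so the map is never dropped twice.
        let inner = unsafe { ptr::read(&self.inner) };
        mem::forget(self);
        inner
    }
}
```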