alloc-chaos 0.1.0

# alloc-chaos

`alloc-chaos` provides deterministic allocation-failure testing for Rust.

It fails each observed heap allocation attempt in a test, one at a time, so
libraries can verify that allocation-failure paths return controlled errors
instead of panicking or taking unexpected control-flow paths.

```rust
#[global_allocator]
static GLOBAL: alloc_chaos::ChaosAllocator = alloc_chaos::ChaosAllocator::system();

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct OutOfMemory;

fn build_payload(size: usize) -> Result<Vec<u8>, OutOfMemory> {
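    let mut bytes = Vec::new();
    // Fallible reservation: an allocation failure becomes a domain error.
    bytes.try_reserve_exact(size).map_err(|_| OutOfMemory)?;
    // Capacity is already reserved, so this resize does not allocate again.
    bytes.resize(size, 0);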
    Ok(bytes)
}

#[test]
fn payload_builder_handles_oom() {
    alloc_chaos::check(|| {
        match build_payload(4096) {
            Ok(bytes) => assert_eq!(bytes.len(), 4096),
            Err(OutOfMemory) => {}
        }
    })
    .assert_success();
}
```

## What it does

`alloc-chaos` wraps the process global allocator. During a check it first probes
that the wrapper is actually installed, then runs the closure once to count heap
allocation attempts made by the checked thread. It then reruns the same closure
repeatedly, failing allocation attempt `0`, then `1`, then `2`, and so on.

The default checker is intentionally strict:

- the global allocator wrapper must be observed by an explicit probe;
- the baseline run must complete;
- every observed allocation attempt must be selected and executed;
- every selected run must reach and fail the selected allocation attempt;
- every injected run must complete without panic.

The checker itself remains small:

- one active check per process;
- no dependencies;
- no background threads;
- no allocator behavior changes outside active checks;
- allocation, zeroed allocation, and reallocation are all treated as allocation
  attempts;
- deallocation is never failed.

The closure is executed repeatedly. Build the state under test inside the
closure, or otherwise ensure that the closure behaves deterministically across
runs.
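
The mechanism described above can be sketched with nothing but the standard library. This is a minimal illustration, not the crate's implementation: `SketchAllocator`, `should_fail`, and `run_with_target` are hypothetical names, and the sketch assumes thread-local counters are enough to restrict counting to the checked thread.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::cell::Cell;
use std::collections::TryReserveError;
use std::ptr::null_mut;

thread_local! {
    // Const-initialized cells, so touching them inside the allocator never allocates.
    static ACTIVE: Cell<bool> = const { Cell::new(false) };
    static SEEN: Cell<usize> = const { Cell::new(0) };
    static TARGET: Cell<Option<usize>> = const { Cell::new(None) };
}

/// Hypothetical wrapper around the system allocator (not the crate's `ChaosAllocator`).
struct SketchAllocator;

/// Counts one allocation attempt on this thread and reports whether it is the target.
fn should_fail() -> bool {
    if !ACTIVE.try_with(|active| active.get()).unwrap_or(false) {
        return false; // no active check on this thread: behave like the system allocator
    }
    let index = SEEN.try_with(|seen| {
        let index = seen.get();
        seen.set(index + 1);
        index
    });
    match index {
        Ok(index) => TARGET.try_with(|target| target.get()).ok().flatten() == Some(index),
        Err(_) => false,
    }
}

unsafe impl GlobalAlloc for SketchAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        if should_fail() {
            return null_mut();
        }
        unsafe { System.alloc(layout) }
    }
    unsafe fn alloc_zeroed(&self, layout: Layout) -> *mut u8 {
        if should_fail() {
            return null_mut();
        }
        unsafe { System.alloc_zeroed(layout) }
    }
    unsafe fn realloc(&self, old: *mut u8, layout: Layout, new_size: usize) -> *mut u8 {
        if should_fail() {
            return null_mut(); // the original block stays valid on failure
        }
        unsafe { System.realloc(old, layout, new_size) }
    }
    unsafe fn dealloc(&self, old: *mut u8, layout: Layout) {
        // Deallocation is never failed.
        unsafe { System.dealloc(old, layout) }
    }
}

#[global_allocator]
static GLOBAL: SketchAllocator = SketchAllocator;

/// Runs `f` with counting enabled and, if `target` is set, fails that attempt.
/// Returns the closure result and the number of attempts observed.
/// Simplification: the active flag is not reset if `f` panics.
fn run_with_target<R>(target: Option<usize>, f: impl FnOnce() -> R) -> (R, usize) {
    SEEN.with(|seen| seen.set(0));
    TARGET.with(|slot| slot.set(target));
    ACTIVE.with(|active| active.set(true));
    let result = f();
    ACTIVE.with(|active| active.set(false));
    (result, SEEN.with(|seen| seen.get()))
}

/// Code under test: reserves fallibly, then fills without reallocating.
fn build_payload(size: usize) -> Result<Vec<u8>, TryReserveError> {
    let mut bytes = Vec::new();
    bytes.try_reserve_exact(size)?;
    bytes.resize(size, 0);
    Ok(bytes)
}
```

Calling `run_with_target(None, || build_payload(64))` gives a baseline count, and `run_with_target(Some(0), || build_payload(64))` makes the first allocation attempt fail so that `build_payload` returns an error.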

## Strict success

The main workflow is deliberately narrow:

```rust
let report = alloc_chaos::check(|| {
    // test body
});

report.assert_success();
```

`assert_success()` is exhaustive. It fails for panics, missing allocator setup,
unstable baseline runs, early stops, selected subsets, and `max_failures` caps
that leave observed allocation attempts untested.
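
To make the exhaustive rule concrete, here is a simplified model of such a driver loop. It is a sketch under stated assumptions: instead of hooking the global allocator it passes an explicit, hypothetical `Injector` to the closure, and `Verdict` and `check_all` are illustrative names rather than crate API.

```rust
use std::cell::Cell;
use std::panic::{catch_unwind, AssertUnwindSafe};

/// Marker error for an attempt that was deliberately failed.
#[derive(Debug)]
struct InjectedFailure;

/// Hypothetical stand-in for the allocator hook: fails the attempt whose index is `target`.
struct Injector {
    seen: Cell<usize>,
    target: Option<usize>,
}

impl Injector {
    fn new(target: Option<usize>) -> Self {
        Injector { seen: Cell::new(0), target }
    }

    /// Records one attempt and fails it if it is the selected one.
    fn attempt(&self) -> Result<(), InjectedFailure> {
        let index = self.seen.get();
        self.seen.set(index + 1);
        if self.target == Some(index) {
            Err(InjectedFailure)
        } else {
            Ok(())
        }
    }
}

#[derive(Debug, PartialEq, Eq)]
enum Verdict {
    Success,
    BaselinePanicked,
    Panicked { target: usize },
    NotReached { target: usize },
}

/// Baseline run to count attempts, then one rerun per attempt index.
fn check_all(body: impl Fn(&Injector)) -> Verdict {
    let baseline = Injector::new(None);
    if catch_unwind(AssertUnwindSafe(|| body(&baseline))).is_err() {
        return Verdict::BaselinePanicked;
    }
    for target in 0..baseline.seen.get() {
        let injector = Injector::new(Some(target));
        if catch_unwind(AssertUnwindSafe(|| body(&injector))).is_err() {
            return Verdict::Panicked { target };
        }
        if injector.seen.get() <= target {
            // The rerun never got as far as the selected attempt.
            return Verdict::NotReached { target };
        }
    }
    Verdict::Success
}
```

A body that returns early when `attempt()` fails yields `Verdict::Success`, while a body that calls `unwrap()` on the second attempt yields `Verdict::Panicked { target: 1 }`.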

For expensive cases, you can cap the number of selected failure targets, but the
report is partial unless the cap still covers all observed allocation attempts:

```rust
let report = alloc_chaos::Check::new()
    .max_failures(128)
    .stop_on_failure(true)
    .run(|| {
        // test body
    });

println!("{report}");
```

## Reproducing one allocation failure

After a full check identifies a specific failing allocation number, rerun only
that target:

```rust
let report = alloc_chaos::Check::new()
    .only_failure(37)
    .run(|| {
        // same test body
    });

println!("{report}");
```

A single-target report is intentionally marked truncated when the baseline has
other allocation attempts. Use this mode for diagnosis and reproduction, not as
the final exhaustive assertion.

To inspect a smaller window:

```rust
let report = alloc_chaos::Check::new()
    .failure_range(30..40)
    .run(|| {
        // same test body
    });
```

`failure_range` accepts a half-open range and rejects reversed ranges.
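
The half-open semantics can be illustrated with a small helper. `select_targets` and `covers_all` are hypothetical functions, and clamping the range to the observed attempt count is an assumption of this sketch, not documented crate behavior.

```rust
use std::ops::Range;

/// Hypothetical helper: the target indices that a half-open `range` selects
/// out of `observed` baseline attempts. Reversed ranges are rejected.
fn select_targets(observed: usize, range: Range<usize>) -> Result<Vec<usize>, String> {
    if range.start > range.end {
        return Err(format!("reversed range {}..{}", range.start, range.end));
    }
    // Assumption: targets beyond the observed count are dropped.
    let end = range.end.min(observed);
    Ok((range.start..end).collect())
}

/// A selection is exhaustive only if it covers every observed attempt.
fn covers_all(observed: usize, selected: &[usize]) -> bool {
    selected.len() == observed && selected.iter().copied().eq(0..observed)
}
```

Under these assumptions `30..40` selects targets 30 through 39, ten in total, which does not cover a baseline of 100 attempts, so a report built from it would be partial.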

## Allocation metadata

Each injected attempt records the allocator operation and layout that was
failed:

```rust
for attempt in report.attempts() {
    if let Some(allocation) = attempt.injected_allocation() {
        eprintln!(
            "#{}: {} size={} align={} new_size={:?}",
            allocation.index(),
            allocation.operation(),
            allocation.size(),
            allocation.align(),
            allocation.new_size(),
        );
    }
}
```

This metadata is intentionally limited to information already available inside
`GlobalAlloc`: operation, size, alignment, and `realloc` new size. The allocator
path does not capture backtraces.
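
As an illustration of how little is needed, the following sketch builds such a record from a `std::alloc::Layout`. The `Operation` and `Record` types are hypothetical and only mirror the fields named above; they are not the crate's types.

```rust
use std::alloc::Layout;
use std::fmt;

/// Hypothetical operation tag, one per `GlobalAlloc` entry point that can fail.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Operation {
    Alloc,
    AllocZeroed,
    Realloc,
}

impl fmt::Display for Operation {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(match self {
            Operation::Alloc => "alloc",
            Operation::AllocZeroed => "alloc_zeroed",
            Operation::Realloc => "realloc",
        })
    }
}

/// Hypothetical record holding only what the allocator entry point receives.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Record {
    index: usize,
    operation: Operation,
    size: usize,
    align: usize,
    new_size: Option<usize>, // only meaningful for `realloc`
}

impl Record {
    fn from_layout(
        index: usize,
        operation: Operation,
        layout: Layout,
        new_size: Option<usize>,
    ) -> Self {
        Record {
            index,
            operation,
            size: layout.size(),
            align: layout.align(),
            new_size,
        }
    }
}

impl fmt::Display for Record {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "#{}: {} size={} align={} new_size={:?}",
            self.index, self.operation, self.size, self.align, self.new_size
        )
    }
}
```

Everything in `Record` is copied from arguments the allocator already has, so recording it needs no allocation of its own.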

`tested_failures()` reports how many selected target runs were executed.
`injected_failures()` reports how many of those runs actually reached and failed
the selected allocation attempt. A target that is not reached is reported as a
failure and is not counted as injected.

## Baseline stability

If the closure has nondeterministic allocation behavior, allocation number `N`
may not mean the same thing across repeated runs. Ask the checker to validate
that the baseline allocation count is stable before injecting failures:

```rust
let report = alloc_chaos::Check::new()
    .stability_runs(3)
    .run(|| {
        // deterministic test body
    });

report.assert_success();
```

When stability checking is enabled, all baseline runs must complete with the
same allocation count. An unstable baseline is reported as an invalid check and
no failure injection is performed.
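
The stability rule amounts to comparing the allocation counts of repeated baseline runs. The helper below is a hypothetical sketch of that comparison; `count_once` stands in for "run the closure once and return its allocation count".

```rust
/// Hypothetical helper: runs `count_once` `runs` times and returns the shared
/// allocation count, or `None` if any run disagrees (or `runs` is zero).
fn stable_baseline(runs: usize, mut count_once: impl FnMut() -> usize) -> Option<usize> {
    let counts: Vec<usize> = (0..runs).map(|_| count_once()).collect();
    let (&first, rest) = counts.split_first()?;
    rest.iter().all(|&count| count == first).then_some(first)
}
```

A typical source of instability is lazy initialization: if the first run allocates a cache (say 3 attempts) and later runs reuse it (1 attempt), the counts differ and the helper returns `None`.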

## Important limits

This crate verifies a concrete execution path, not your entire library. It can
say that this test case survived each observed allocation failure. It cannot
prove that untested inputs or untested branches are OOM-safe.

The in-process checker is intended for code that uses fallible allocation APIs,
such as `try_reserve`. If the code under test uses infallible allocation APIs,
the standard library may abort the process on allocation failure. An abort
cannot be caught by `catch_unwind`.
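
The difference is easiest to see with the standard library alone. In this sketch `copy_fallible` and `reserve_huge` are illustrative helper names; the `Vec` methods are the real fallible APIs.

```rust
use std::collections::TryReserveError;

/// Copies `src` using only fallible allocation: a failed reservation is
/// returned as an error instead of aborting the process.
fn copy_fallible(src: &[u8]) -> Result<Vec<u8>, TryReserveError> {
    let mut out = Vec::new();
    out.try_reserve_exact(src.len())?;
    // Capacity is already reserved, so this does not reallocate.
    out.extend_from_slice(src);
    Ok(out)
}

/// Requests more than `isize::MAX` bytes, which `try_reserve` rejects with an
/// error. The infallible `Vec::with_capacity(usize::MAX)` would panic instead.
fn reserve_huge() -> Result<(), TryReserveError> {
    let mut bytes: Vec<u8> = Vec::new();
    bytes.try_reserve(usize::MAX)
}
```

Code written like `copy_fallible` gives the checker an error value to observe; code that calls `Vec::with_capacity` or `push` directly leaves the failure to the standard library's allocation error handler.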

Only one check can be active in a process. Allocation counting and failure
injection are deliberately limited to the thread executing the checked closure;
allocations from the Cargo test runner or other application threads are ignored.
If the checked code spawns worker threads, allocations performed by those worker
threads are not part of the checked sequence.
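
The effect of per-thread counting can be reproduced with a plain thread-local counter. This is a sketch, assuming the checker's bookkeeping is thread-local; `ATTEMPTS`, `record_attempt`, and `checked_body` are hypothetical names.

```rust
use std::cell::Cell;
use std::thread;

thread_local! {
    // Each thread gets its own counter, starting at zero.
    static ATTEMPTS: Cell<usize> = const { Cell::new(0) };
}

/// Stand-in for "the allocator saw one attempt on the current thread".
fn record_attempt() {
    ATTEMPTS.with(|count| count.set(count.get() + 1));
}

fn attempts_on_this_thread() -> usize {
    ATTEMPTS.with(|count| count.get())
}

/// Records one attempt on the calling thread and two on a worker thread.
/// Returns what the worker saw in its own counter.
fn checked_body() -> usize {
    record_attempt();
    let worker = thread::spawn(|| {
        record_attempt();
        record_attempt();
        attempts_on_this_thread()
    });
    worker.join().expect("worker thread panicked")
}
```

After `checked_body()` returns, the calling thread's counter has grown by one, not three: the worker's two attempts landed in a counter the checked thread never sees.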

The checker observes allocation attempts that go through Rust's process global
allocator on the checked thread. It does not observe direct calls to foreign
allocators such as `malloc`, custom allocation paths that bypass the global
allocator, or allocations from worker threads. It also does not exercise
untested control-flow branches, multi-failure OOM scenarios, or real system
memory pressure.

## Examples

A minimal complete example is available in `examples/try_reserve.rs`:

```sh
cargo run --example try_reserve
```

It demonstrates the intended shape: use `try_reserve_exact`, translate
allocation failure into a domain error, and assert that both the success path
and the injected-OOM path are acceptable.

A broader scenario walkthrough is available in `examples/scenarios.rs`:

```sh
cargo run --example scenarios
```

It demonstrates strict exhaustive success, bounded diagnostic runs, selected
failure ranges, single-target reproduction, allocation metadata, unstable
baseline detection, and a mishandled OOM path. Several reports in that example
are intentionally partial or failing; they are printed for diagnosis rather than
asserted as successful checks.

## Status

`alloc-chaos` is currently in an early development stage. At the moment, it is
primarily built for personal experimentation and validation of the core idea.
The public API, behavior, reporting format, and internal implementation details
may change without notice. No stability, correctness, compatibility, or
production-readiness guarantees are provided yet. Use it at your own risk,
review the results carefully, and avoid relying on it as the only verification
mechanism for critical allocation-failure handling.