# global-state-detector-rs
Rust bindings for [`global-state-detector`](https://github.com/AFLplusplus/global-state-detector),
a small C helper that reports persistent writable global state between fuzzer
iterations. This is useful when a fuzz target is supposed to be deterministic
and iteration-local but hidden `.data` / `.bss` state makes later
inputs depend on earlier ones. It surfaces the same kind of instability
that AFL++'s `afl-fuzz` and LibAFL flag, but at byte granularity and with
symbol attribution.
## What it detects
* Writable, non-executable `PT_LOAD` segments (`.data` / `.bss`) of the
main binary.
* Writable, non-executable `PT_LOAD` segments of every loaded shared
object discovered through `dl_iterate_phdr`.
* Page-level changes via a fast hash, followed by byte-range reporting
for changed pages with `dladdr`-resolved symbol attribution.
* Clang sanitizer coverage counters are ignored when the
`__sancov_cntrs` linker-provided range is present, so libFuzzer's
own coverage bitmap does not dominate reports.
## What it does NOT detect
* Heap or `mmap`-backed state (anonymous mappings).
* Thread-local storage (`thread_local!`, `__thread`, glibc TLS).
* External process state — files, sockets, pipes, IPC.
* Writable state in deliberately filtered noisy modules: `libc.so*`,
`ld-linux*`, `libpthread*`, `libstdc++*`, `linux-vdso.so*`.
## Platform support
Linux ELF processes only. Uses `dl_iterate_phdr`, `dladdr`, and ELF
program headers from `<elf.h>` / `<link.h>`. macOS and Windows are not
supported.
The crate's `build.rs` invokes `cc` with the system C compiler. Use a
clang-based fuzzer toolchain (`afl-clang-fast`, `clang`) when
instrumenting your target. The detector itself only needs a working
C compiler to build.
## Installation
Clone with submodules — the C source ships under `csrc/`:
```sh
git clone --recurse-submodules https://github.com/AFLplusplus/global-state-detector-rs
# or, if you already cloned without submodules:
git submodule update --init --recursive
```
Add the dependency to your fuzz harness's `Cargo.toml`:
```toml
[dependencies]
global-state-detector = { path = "../path/to/global-state-detector-rs" }
```
Or, once published:
```toml
[dependencies]
global-state-detector = "0.1"
```
## Required linker flags
Cargo does **not** propagate `rustc-link-arg` from rlib dependencies to
downstream binaries, so the consuming crate must arrange for the
linker to receive these flags itself. Add a `.cargo/config.toml`
alongside the harness (cargo-fuzz users, see the note after the table):
```toml
[target.'cfg(target_os = "linux")']
rustflags = ["-C", "link-arg=-rdynamic", "-C", "link-arg=-Wl,-z,now"]
```

| Flag | Why it is needed |
| --- | --- |
| `-rdynamic` | Exports all global symbols to the dynamic symbol table so `dladdr` can resolve them. Without it, reports show `?+0x...`. |
| `-Wl,-z,now` | Disables lazy PLT/GOT binding. Without it, the first iteration reports massive churn from bindings being resolved on demand. |

> **cargo-fuzz users:** `.cargo/config.toml` rustflags do **not**
> survive cargo-fuzz. cargo-fuzz sets its own `RUSTFLAGS` environment
> variable, and env-var rustflags *override* config-file rustflags
> rather than merging. Emit the same flags from a `fuzz/build.rs`
> instead — `cargo:rustc-link-arg-bins` goes through cargo's metadata
> channel and is not affected:
>
> ```rust
> // fuzz/build.rs
> fn main() {
>     println!("cargo:rustc-link-arg-bins=-rdynamic");
>     println!("cargo:rustc-link-arg-bins=-Wl,-z,now");
> }
> ```
>
> See [`fuzz/build.rs`](fuzz/build.rs) in this repo for the working version.
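
Because the detector is Linux-only, a build script may want to emit these flags only when targeting Linux. The following is a minimal sketch, assuming you gate on the `CARGO_CFG_TARGET_OS` variable that Cargo sets for build scripts. The helper name `link_args_for` is hypothetical and is not taken from this repo's `fuzz/build.rs`:

```rust
// fuzz/build.rs (sketch): emit the detector's linker flags only on Linux.
use std::env;

/// Hypothetical helper: linker arguments to emit for a given target OS.
fn link_args_for(target_os: &str) -> Vec<&'static str> {
    if target_os == "linux" {
        vec!["-rdynamic", "-Wl,-z,now"]
    } else {
        Vec::new()
    }
}

fn main() {
    // Cargo sets CARGO_CFG_TARGET_OS when it runs a build script.
    let target_os = env::var("CARGO_CFG_TARGET_OS").unwrap_or_default();
    for arg in link_args_for(&target_os) {
        println!("cargo:rustc-link-arg-bins={arg}");
    }
}
```

On non-Linux targets the script prints nothing, so the build is not handed flags its linker may not understand.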
## API
```rust
pub fn init();
pub fn check(rebaseline: bool) -> i32;
pub fn rebaseline();
```
* [`init`] — snapshots all writable `PT_LOAD` segments. Call once
after one-time target initialization is complete.
* [`check`] — diffs current memory against the last snapshot. Returns
the number of pages that changed. With `rebaseline = true`, updates
the snapshot so the next call only shows new deltas. Pass `false`
for cumulative drift across the entire run.
* [`rebaseline`] — re-snapshots without reporting. Use it to refresh
the baseline immediately before invoking the target.
[`init`]: https://docs.rs/global-state-detector/latest/global_state_detector/fn.init.html
[`check`]: https://docs.rs/global-state-detector/latest/global_state_detector/fn.check.html
[`rebaseline`]: https://docs.rs/global-state-detector/latest/global_state_detector/fn.rebaseline.html
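
The difference between `check(true)` and `check(false)` can be illustrated with a toy model. This is only a sketch of the semantics documented above, not the crate's implementation: `ToyDetector`, `drift_counts`, and the 4096-byte page size are assumptions made for the example, and a plain `Vec<u8>` stands in for the process's writable segments.

```rust
/// Assumed page size for the toy model.
const PAGE: usize = 4096;

/// Toy stand-in for the detector: it keeps a shadow copy of a buffer.
struct ToyDetector {
    shadow: Vec<u8>,
}

impl ToyDetector {
    /// Snapshot the buffer, like `init`.
    fn init(memory: &[u8]) -> Self {
        Self { shadow: memory.to_vec() }
    }

    /// Count pages that differ from the shadow; optionally refresh the shadow.
    fn check(&mut self, memory: &[u8], rebaseline: bool) -> i32 {
        let changed = memory
            .chunks(PAGE)
            .zip(self.shadow.chunks(PAGE))
            .filter(|(now, was)| now != was)
            .count() as i32;
        if rebaseline {
            self.shadow.copy_from_slice(memory);
        }
        changed
    }
}

/// Simulate three iterations: write page 0, write page 1, write nothing.
fn drift_counts(rebaseline: bool) -> Vec<i32> {
    let mut memory = vec![0u8; 2 * PAGE];
    let mut detector = ToyDetector::init(&memory);
    let mut counts = Vec::new();

    memory[0] = 1; // iteration 1 dirties page 0
    counts.push(detector.check(&memory, rebaseline));
    memory[PAGE] = 1; // iteration 2 dirties page 1
    counts.push(detector.check(&memory, rebaseline));
    counts.push(detector.check(&memory, rebaseline)); // iteration 3 is clean
    counts
}

fn main() {
    println!("rebaseline = true:  {:?}", drift_counts(true));
    println!("rebaseline = false: {:?}", drift_counts(false));
}
```

With `rebaseline = true` each call reports only what changed since the previous call. With `false` the counts never drop, because every call compares against the original snapshot.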
## Recommended harness pattern
Call `rebaseline` immediately before the target and `check(true)`
immediately after. That window attributes drift to the target rather
than to the fuzzer's own bookkeeping between callbacks.
### AFL++ persistent mode (afl.rs/cargo-afl)
```rust
use afl::fuzz;
use std::sync::Once;

static INIT: Once = Once::new();

fn main() {
    fuzz!(|data: &[u8]| {
        if !INIT.is_completed() {
            INIT.call_once(|| {
                global_state_detector::init();
            });
        } else {
            global_state_detector::rebaseline();
        }
        let _ = my_target::process(data);
        global_state_detector::check(true);
    });
}
```
### cargo-fuzz / libFuzzer
```rust
#![no_main]
use libfuzzer_sys::fuzz_target;
use std::sync::Once;

static INIT: Once = Once::new();

fuzz_target!(|data: &[u8]| {
    if !INIT.is_completed() {
        INIT.call_once(|| {
            // any one-time target init goes here:
            // my_target::init_global_resources();
            global_state_detector::init();
        });
    } else {
        global_state_detector::rebaseline();
    }
    let _ = my_target::process(data);
    global_state_detector::check(/* rebaseline = */ true);
});
```
## Running the bundled example
The repo ships a runnable demo split across two directories on purpose:
* **`example/example.rs`** is the user-shaped template — what you
would replicate in your own project. It contains the canonical
harness pattern plus a tiny inline stand-in for the target under
test (a `static` accumulator that mutates on every call, mirroring
`csrc/harness_example.c`). In a real harness, replace the inline
`target_process` with a call into your own crate.
* **`fuzz/`** is the cargo-fuzz scaffold that points at the example so
it actually runs from this repo. Its `[[bin]]` references
`../example/example.rs` directly — no copy, no duplication.
`fuzz/build.rs` supplies the linker flags (see the cargo-fuzz note
above).
Prerequisites: nightly Rust and cargo-fuzz.
```sh
rustup toolchain install nightly
cargo install cargo-fuzz rustfilt
git submodule update --init --recursive
```
Build and run:
```sh
cargo fuzz build example
cargo fuzz run example
```
You should see:
```text
[global-state-detector] init: N regions, M bytes, K modules skipped
[global-state-detector] CHANGE 0x... len=... ACCUMULATOR+0x0 ([main])
was: ...
now: ...
```
cargo-fuzz's default AddressSanitizer is fine — the upstream C
library is ASAN-aware (it reads through ASAN red zones safely). If
you previously saw `global-buffer-overflow` from `memcpy` inside
`global_state_detector_check`, update the `csrc/` submodule to pick up
the fix.
### libFuzzer self-state in reports
You will see some changes attributed to `_ZN6fuzzer3TPCE+...` —
libFuzzer's own coverage/program-counter tables. The detector skips
`__sancov_cntrs` but not the rest of libFuzzer's writable globals.
Treat those as fuzzer bookkeeping, not target drift.
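
If the fuzzer's own records crowd out the interesting ones, they can be dropped in a post-processing step. The sketch below is one possible filter, not something shipped with this crate. It assumes the three-line record layout shown in this README and that libFuzzer symbols carry the mangled `_ZN6fuzzer` prefix seen above; the function name `drop_fuzzer_records` is hypothetical.

```rust
/// Drop CHANGE records attributed to libFuzzer symbols, together with the
/// `was:` / `now:` lines that belong to them. All other lines pass through.
fn drop_fuzzer_records(report: &str) -> String {
    let mut kept = Vec::new();
    let mut skipping = false;
    for line in report.lines() {
        let trimmed = line.trim_start();
        if trimmed.starts_with("[global-state-detector] CHANGE ") {
            // A new record starts: skip it only if it names a fuzzer symbol.
            skipping = trimmed.contains("_ZN6fuzzer");
        } else if !(trimmed.starts_with("was:") || trimmed.starts_with("now:")) {
            // Anything that is not a detail line ends the skipped record.
            skipping = false;
        }
        if !skipping {
            kept.push(line);
        }
    }
    kept.join("\n")
}

fn main() {
    // Made-up sample input in the documented format.
    let report = "\
[global-state-detector] CHANGE 0x1000 len=8 _ZN6fuzzer3TPCE+0x40 ([main])
    was: 00 00 00 00 00 00 00 00
    now: 01 00 00 00 00 00 00 00
[global-state-detector] CHANGE 0x2000 len=8 ACCUMULATOR+0x0 ([main])
    was: 00 00 00 00 00 00 00 00
    now: 7f 00 00 00 00 00 00 00";
    println!("{}", drop_fuzzer_records(report));
}
```

The filter matches on the mangled name, so it has to run before any demangling step.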
## Sample report
```text
[global-state-detector] init: 14 regions, 921600 bytes, 5 modules skipped
[global-state-detector] CHANGE 0x55c1a04b3020 len=8 target_accumulator+0x0 ([main])
was: 00 00 00 00 00 00 00 00
now: 7f 00 00 00 00 00 00 00
```
Format:
```
[global-state-detector] CHANGE <addr> len=<bytes> <symbol>+<offset> (<module>)
was: <up to 16 bytes hex>
now: <up to 16 bytes hex>
```
Up to 32 byte-runs per `check` call are reported; further changes are
counted but not dumped to keep output bounded.
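
For scripted triage, the `CHANGE` header line can be parsed into a struct. This is a minimal sketch based only on the format shown above. It assumes the address and offset are always hex with a `0x` prefix and that `len` is decimal; the `Change` type and `parse_change` function are hypothetical names, not part of the crate.

```rust
/// One parsed `CHANGE` header line (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
struct Change {
    addr: u64,
    len: usize,
    symbol: String,
    offset: u64,
    module: String,
}

/// Parse `[global-state-detector] CHANGE <addr> len=<bytes> <symbol>+<offset> (<module>)`.
/// Returns `None` for any line that does not match that shape.
fn parse_change(line: &str) -> Option<Change> {
    let rest = line.trim().strip_prefix("[global-state-detector] CHANGE ")?;
    let mut parts = rest.splitn(3, ' ');
    let addr = u64::from_str_radix(parts.next()?.strip_prefix("0x")?, 16).ok()?;
    let len = parts.next()?.strip_prefix("len=")?.parse().ok()?;
    // Remainder looks like "<symbol>+0x<offset> (<module>)".
    let (symbol_and_offset, module) = parts.next()?.rsplit_once(" (")?;
    let module = module.strip_suffix(')')?;
    let (symbol, offset) = symbol_and_offset.rsplit_once("+0x")?;
    let offset = u64::from_str_radix(offset, 16).ok()?;
    Some(Change {
        addr,
        len,
        symbol: symbol.to_string(),
        offset,
        module: module.to_string(),
    })
}

fn main() {
    let line = "[global-state-detector] CHANGE 0x55c1a04b3020 len=8 target_accumulator+0x0 ([main])";
    println!("{:?}", parse_change(line));
}
```

Splitting the symbol from the right keeps the parser usable on demangled names that contain spaces.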
### Demangling
Rust symbols come out mangled (`_ZN8...` or `_R...`). Pipe stderr
through [`rustfilt`](https://crates.io/crates/rustfilt):
```sh
cargo install rustfilt
cargo fuzz run example 2>&1 | rustfilt
```
## Rust-specific caveats

| Construct | Visible to the detector? |
| --- | --- |
| `static` / `static mut` | Yes — lives in `.data` / `.bss`. |
| `AtomicU*`, `AtomicBool`, etc. | Yes. |
| `Mutex<T>` (lock word), `RwLock` (lock word) | Yes. |
| `OnceCell`, `OnceLock`, `LazyLock`, `lazy_static!` | Pointer/discriminant in `.bss` only — heap payload is invisible. You will see "this static was first-used in this iteration" but not what value it took. |
| `thread_local!` | No — TLS is not snapshotted. |
| `Box`, `Vec`, `String` held in a `static` | The header in `.bss` is tracked; heap contents are not. |

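
The header-versus-heap distinction in the last row can be checked directly. The sketch below uses only the standard library and does not call the detector. `BUFFER` and `locate` are names made up for the example. It shows that the bytes pushed into a `static` `Vec` live outside the memory occupied by the static itself, which is the only part a segment snapshot covers.

```rust
use std::mem::size_of;
use std::sync::Mutex;

/// A static whose header lives in a writable segment of the binary.
static BUFFER: Mutex<Vec<u8>> = Mutex::new(Vec::new());

/// Returns the address range of the static itself and the heap address
/// of its contents after one push.
fn locate() -> ((usize, usize), usize) {
    let mut guard = BUFFER.lock().unwrap();
    guard.push(42); // forces a heap allocation on first use
    let start = &BUFFER as *const _ as usize;
    let end = start + size_of::<Mutex<Vec<u8>>>();
    ((start, end), guard.as_ptr() as usize)
}

fn main() {
    let ((start, end), heap) = locate();
    println!("static occupies {start:#x}..{end:#x}, contents live at {heap:#x}");
    // The pushed byte is stored on the heap, outside the static's own bytes.
    assert!(heap < start || heap >= end);
}
```

A push therefore shows up in a report as a change to the pointer, length, or capacity fields, never as the pushed value.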
## Noise and limitations
* The detector skips a small denylist of noisy runtime modules
  (`libc.so*`, `ld-linux*`, `libpthread*`, `libstdc++*`,
  `linux-vdso.so*`). Other runtime libraries (libgcc, libssl, custom
  allocators, …) may still report expected state changes; filter them
  at your own discretion.
* The first use of any uninitialized lazy static (including the Rust
  standard library's allocator state, panic infrastructure, and
  thread-local fallbacks) looks like new writable state. Calling
  `rebaseline` immediately before the target masks initialization that
  happens outside the target; statics first touched inside the target
  are still reported once.
* Hard cap of `PROBE_MAX_REGIONS = 512` snapshotted segments.
Processes with extremely large module counts will hit this limit; a
warning is printed to stderr and further regions are skipped.
* Hard cap of `PROBE_MAX_REPORTS = 32` reported byte-runs per check.
The change *count* returned by `check` is exact; the dumped detail
is truncated.
* The page hash is FNV-1a, not collision-resistant. Adversarial
collisions are possible but irrelevant for fuzzing instability
detection.
## Thread safety
The underlying C implementation is **not** thread-safe. Internal state
(snapshot table, region list) is shared and unsynchronized. Use this
crate from a single-threaded harness, or add external synchronization
around `init` / `check` / `rebaseline`. Most fuzzer harnesses are
single-threaded by default; multi-threaded targets are fine as long as
the detector itself is only ever invoked from a single thread.
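
If the detector has to be reachable from more than one thread, one form of external synchronization is a single process-wide lock held across the whole `rebaseline` / target / `check` window. The sketch below is an illustration, not part of the crate. `detector_stub` is a stand-in with the same signatures as the API above so that the example compiles on its own, and `DETECTOR_LOCK` and `guarded_iteration` are hypothetical names.

```rust
use std::sync::Mutex;
use std::thread;

/// Stand-in for the crate's API; a real harness would call
/// `global_state_detector::rebaseline` and `global_state_detector::check`.
mod detector_stub {
    use std::sync::atomic::{AtomicI32, Ordering};

    /// Counts calls so the sketch has something observable.
    pub static CALLS: AtomicI32 = AtomicI32::new(0);

    pub fn rebaseline() {
        CALLS.fetch_add(1, Ordering::SeqCst);
    }

    pub fn check(_rebaseline: bool) -> i32 {
        CALLS.fetch_add(1, Ordering::SeqCst);
        0 // the stub never reports changes
    }
}

/// One lock serializes every detector call in the process.
static DETECTOR_LOCK: Mutex<()> = Mutex::new(());

/// Run one iteration with the detector window protected by the lock.
fn guarded_iteration(target: impl FnOnce()) -> i32 {
    let _guard = DETECTOR_LOCK.lock().unwrap();
    detector_stub::rebaseline();
    target();
    detector_stub::check(true)
}

fn main() {
    let workers: Vec<_> = (0..4)
        .map(|_| thread::spawn(|| guarded_iteration(|| {})))
        .collect();
    for worker in workers {
        assert_eq!(worker.join().unwrap(), 0);
    }
}
```

Holding the lock across the target call also serializes the target. That is the price of keeping another thread's writes out of the measurement window.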
## How it works
`init` walks every loaded ELF object via `dl_iterate_phdr`, records
every writable non-executable `PT_LOAD` segment, and copies it into a
shadow buffer with a per-page FNV-1a hash. `check` rehashes each page,
and for any mismatch walks the page byte-by-byte to find contiguous
runs of differing bytes, resolves the run's start address with
`dladdr` for symbol attribution, and prints a hex diff. `rebaseline`
just refreshes the shadow without reporting.
The full implementation is ~340 lines of C; see
[`csrc/global_state_detector.c`](csrc/global_state_detector.c).
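
The hash-then-diff flow can be sketched in Rust over plain byte slices. This is an illustration of the technique, not a port of the C code. It assumes the 64-bit FNV-1a variant and a 4096-byte page, and it recomputes the old page's hash instead of storing it. The names `fnv1a` and `changed_runs` are hypothetical.

```rust
/// Assumed page size for the sketch.
const PAGE_SIZE: usize = 4096;

/// 64-bit FNV-1a: XOR each byte in, then multiply by the FNV prime.
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV-1a 64-bit offset basis
    for &byte in bytes {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV 64-bit prime
    }
    hash
}

/// Contiguous runs of differing bytes as `(offset, len)` pairs.
/// Pages whose hashes match are skipped without a byte-level walk.
fn changed_runs(was: &[u8], now: &[u8]) -> Vec<(usize, usize)> {
    assert_eq!(was.len(), now.len(), "snapshots must cover the same range");
    let mut runs = Vec::new();
    let mut page_start = 0;
    for (old_page, new_page) in was.chunks(PAGE_SIZE).zip(now.chunks(PAGE_SIZE)) {
        if fnv1a(old_page) != fnv1a(new_page) {
            let mut i = 0;
            while i < new_page.len() {
                if old_page[i] != new_page[i] {
                    let run_start = i;
                    while i < new_page.len() && old_page[i] != new_page[i] {
                        i += 1;
                    }
                    // A run that crosses a page boundary is reported per page.
                    runs.push((page_start + run_start, i - run_start));
                } else {
                    i += 1;
                }
            }
        }
        page_start += new_page.len();
    }
    runs
}

fn main() {
    let was = vec![0u8; 2 * PAGE_SIZE];
    let mut now = was.clone();
    now[5] = 1;
    now[6] = 2;
    now[PAGE_SIZE + 100] = 9;
    println!("{:?}", changed_runs(&was, &now));
}
```

The hash keeps the common case cheap, since an unchanged page costs one pass and no diff. The byte walk only runs on pages already known to differ.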
## License
AGPL-3.0-or-later, matching the upstream C library. See
[`LICENSE`](LICENSE) for the full text.
## Acknowledgements
Upstream C library: [AFLplusplus/global-state-detector](https://github.com/AFLplusplus/global-state-detector)
by Marc "vanHauser" Heuse.