Structured output of a single bench run: named metrics, e.g. (“evaltime”, secs)
and (“pp512”, tok/s), plus the loop iteration count for the human report line.
The bench/llm-bench runners return this so that callers (the interactive
subcommand or the bench suite) consume data instead of parsing stdout.
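
As a rough illustration of the shape described above, a result container might look like the following Python sketch. The class name `BenchResult`, its fields, and both helper methods are hypothetical and not taken from the source; only the metric names "evaltime" and "pp512" come from the text, and the values are made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class BenchResult:
    """Hypothetical container for one bench run (all names are illustrative)."""

    # Ordered (name, value) pairs, e.g. ("evaltime", 1.25) or ("pp512", 812.4).
    metrics: list[tuple[str, float]] = field(default_factory=list)
    # Loop iteration count, used only for the human-readable report line.
    iterations: int = 0

    def as_dict(self) -> dict[str, float]:
        """Return the metrics keyed by name for programmatic consumers."""
        return dict(self.metrics)

    def report_line(self) -> str:
        """Format a one-line summary for interactive use."""
        parts = ", ".join(f"{name}={value:g}" for name, value in self.metrics)
        return f"{self.iterations} iterations: {parts}"


# Made-up values, purely to show how a caller would consume the result.
result = BenchResult(metrics=[("evaltime", 1.25), ("pp512", 812.4)], iterations=3)
```

A suite-style caller would read `result.as_dict()`, while an interactive caller would print `result.report_line()`; neither needs to parse stdout.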

Extract the load-time readings that the bench suite reports from a
readings-probe output file. For each tracked checkpoint, emit three metrics:
time_to_<stage> (elapsed seconds), rsz_at_<stage> (resident bytes) and
active_at_<stage> (allocated minus freed bytes). A missing file or an absent
checkpoint is skipped without raising an error; the orchestrator decides which
metrics are required.
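
The extraction described above could be sketched in Python as follows. The probe file format is not specified in the text, so this sketch assumes one whitespace-separated line per checkpoint (`<stage> <elapsed_secs> <resident_bytes> <alloc_bytes> <free_bytes>`). That format, the stage names, and the function name `extract_load_readings` are all assumptions made for illustration.

```python
import os

# Hypothetical checkpoint names; the real tracked stages are not given in the text.
TRACKED_STAGES = ("model_loaded", "first_token")


def extract_load_readings(path, stages=TRACKED_STAGES):
    """Return (name, value) metrics parsed from a probe output file.

    Assumed line format (illustrative only):
        <stage> <elapsed_secs> <resident_bytes> <alloc_bytes> <free_bytes>
    """
    metrics = []
    if not os.path.isfile(path):
        # Missing file: report nothing and let the caller decide if that is fatal.
        return metrics

    seen = {}
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            fields = line.split()
            if len(fields) == 5 and fields[0] in stages:
                seen[fields[0]] = fields[1:]

    for stage in stages:
        if stage not in seen:
            continue  # Absent checkpoint: skipped, not an error.
        elapsed, resident, alloc, free = seen[stage]
        metrics.append((f"time_to_{stage}", float(elapsed)))
        metrics.append((f"rsz_at_{stage}", int(resident)))
        # "Active" bytes are allocated bytes minus freed bytes.
        metrics.append((f"active_at_{stage}", int(alloc) - int(free)))
    return metrics
```

Returning an empty or partial list, rather than raising, keeps the policy about required metrics in the orchestrator, as the description states.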