ktstr 0.5.2

Test harness for Linux process schedulers

# BPF Verifier

The verifier sweep boots every declared scheduler in a KVM VM and
captures per-program verifier statistics from the real kernel
verifier.

## Design

The verifier sweep follows ktstr's two core principles.

**Fidelity without overhead.** Each scheduler binary runs inside a
VM on the same kernel the scheduler targets in production. The
verifier that runs is the real verifier in the real kernel — no
host-side BPF loading, no version skew between the host kernel's
verifier and the target kernel's verifier.

**Direct access over tooling layers.** No subprocess to `bpftool`
or `veristat`. The host reads per-program `verified_insns` directly
from guest memory via `bpf_prog_aux` introspection and applies
cycle collapse to verifier logs instead of truncating.

## Quick start

```sh
# Run every declared scheduler against the kernel discovered via
# KTSTR_KERNEL (or the cache).
cargo ktstr verifier

# Run against a specific kernel build.
cargo ktstr verifier --kernel ../linux

# Sweep across multiple kernels (each cell runs against its own).
cargo ktstr verifier --kernel 6.14.2 --kernel 6.15.0

# Print the raw verifier log without cycle collapse.
cargo ktstr verifier --raw
```

See [cargo-ktstr verifier](cargo-ktstr.md#verifier) for the
full flag list.

## How it works

`cargo ktstr verifier` is a thin dispatcher around
`cargo nextest run -E 'test(/^verifier/)'`. The matrix that nextest
runs is generated by the test binaries themselves.

1. **Cell emission** — every test binary that links the
   ktstr test support and contains at least one
   `declare_scheduler!` declaration emits a `verifier/<sched>/<kernel>/<preset>: test`
   listing for each (declared scheduler × kernel-list entry ×
   accepted gauntlet topology preset) triple. Cells whose
   topology preset would exceed the host's CPU / LLC / per-LLC
   capacity are filtered at emission time (mirroring the
   gauntlet variant filter).

2. **Per-cell dispatch** — nextest invokes the test binary once
   per cell with `--exact verifier/<sched>/<kernel>/<preset>`.
   The binary's `#[ctor]` intercepts the prefix, parses the
   cell name into its three components, resolves the scheduler
   binary, kernel directory, and topology, and boots a single
   VM dedicated to that cell.

3. **Verifier collection** — inside the VM, the scheduler
   loads its BPF programs via `scx_ops_load!`; the real kernel
   verifier runs against them. The host reads per-program
   `verified_insns` from `bpf_prog_aux` via guest physical
   memory introspection. On load failure, libbpf prints the
   verifier log to stderr; the VM forwards it to the host via
   the bulk SHM port between `===SCHED_OUTPUT_START===` /
   `===SCHED_OUTPUT_END===` markers.

4. **Rendering** — per-program summary lines, then the verifier
   log with cycle collapse applied (or raw, with `--raw`).

Eevdf + KernelBuiltin scheduler variants have no userspace
binary to load BPF programs from, so they are skipped at
cell-emission time. Direct invocation outside nextest
(`--exact verifier/<eevdf-sched>/...`) prints a `SKIP` banner
and exits 0.

## Matrix dimension + per-scheduler filter

The verifier sweep matrix is driven by the operator's `--kernel`
set, not by per-scheduler `declare_scheduler!` declarations. The
dispatcher always exports `KTSTR_KERNEL_LIST`
(`label1=path1;label2=path2;...`) to the nextest invocation —
even with no `--kernel` flag it synthesizes a single entry from
the auto-discovered kernel. The test binary's lister walks that
list as the matrix dimension and emits one cell per (declared
scheduler × kernel-list entry × accepted preset).
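The documented list format is simple enough to sketch. The
`parse_kernel_list` name and error handling below are
hypothetical, not ktstr's actual code:

```rust
use std::collections::BTreeMap;

// Hypothetical sketch: parse the documented KTSTR_KERNEL_LIST
// format ("label1=path1;label2=path2;...") into a label -> path map.
fn parse_kernel_list(raw: &str) -> Result<BTreeMap<String, String>, String> {
    let mut map = BTreeMap::new();
    for entry in raw.split(';').filter(|e| !e.is_empty()) {
        let (label, path) = entry
            .split_once('=')
            .ok_or_else(|| format!("malformed entry: {entry:?}"))?;
        map.insert(label.to_string(), path.to_string());
    }
    Ok(map)
}

fn main() {
    let list =
        parse_kernel_list("kernel_6_14_2=/b/6.14.2;kernel_6_15_0=/b/6.15.0").unwrap();
    for (label, path) in &list {
        println!("{label} -> {path}");
    }
}
```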

Each scheduler's `declare_scheduler!` `kernels = [...]`
declaration acts as a **per-scheduler filter** on the matrix:

- Empty (`kernels = []`) accepts every kernel-list entry — the
  scheduler verifies against everything the operator passes.
- `Version` specs (`"6.14.2"`) match entries whose raw label
  equals the version (or whose sanitized label equals the
  sanitized form of the version).
- `Range` specs (`"6.14..6.16"`, `"6.14..=6.16"`) match
  entries whose raw version falls in the inclusive range,
  parsed via the same `decompose_version_for_compare` helper
  the operator-side range expansion uses.
- `Path` / `CacheKey` / `Git` specs match by sanitized-label
  equality.

```sh
# Scheduler declares kernels = ["6.14..6.16"]
# Operator runs --kernel 6.14.2 --kernel 6.15.0 --kernel 6.17.0
# Dispatcher's KTSTR_KERNEL_LIST: kernel_6_14_2, kernel_6_15_0,
#                                 kernel_6_17_0
# Scheduler filter: 6.14.2 ∈ [6.14, 6.16] ✓
#                   6.15.0 ∈ [6.14, 6.16] ✓
#                   6.17.0 ∈ [6.14, 6.16] ✗ — rejected
# Cells emitted: verifier/<sched>/kernel_6_14_2/<preset>
#                verifier/<sched>/kernel_6_15_0/<preset>
cargo ktstr verifier --kernel 6.14.2 --kernel 6.15.0 --kernel 6.17.0

# No --kernel: dispatcher auto-discovers one kernel via the
# cache + filesystem chain and synthesizes a single-entry
# KTSTR_KERNEL_LIST. The auto-discovered entry's label is
# derived from the resolved path (`kernel_path_<basename>_<hash6>`).
# Schedulers with non-empty `kernels = [...]` may filter the
# entry out — operators wanting deterministic coverage should
# always pass --kernel.
cargo ktstr verifier
```
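The `Range` filter check can be sketched as follows, assuming a
simple dotted `major.minor[.patch]` version form. The real
`decompose_version_for_compare` helper is not reproduced here,
and in this sketch an omitted patch component defaults to 0 (so
`6.16.5` would fall outside `6.14..6.16`):

```rust
// Hypothetical sketch of the Range-spec filter described above.
// Decompose "major.minor[.patch]" into a comparable tuple; a missing
// patch defaults to 0.
fn decompose(v: &str) -> Option<(u64, u64, u64)> {
    let mut it = v.split('.');
    let major = it.next()?.parse().ok()?;
    let minor = it.next()?.parse().ok()?;
    let patch = it.next().map_or(Some(0), |p| p.parse().ok())?;
    Some((major, minor, patch))
}

/// Does `version` fall in the range spec "lo..hi" / "lo..=hi"?
/// The docs describe both spellings as inclusive; this sketch
/// follows that.
fn in_range(version: &str, spec: &str) -> Option<bool> {
    let (lo, hi) = spec.split_once("..=").or_else(|| spec.split_once(".."))?;
    let (v, lo, hi) = (decompose(version)?, decompose(lo)?, decompose(hi)?);
    Some(lo <= v && v <= hi)
}

fn main() {
    // Mirrors the worked example above.
    for v in ["6.14.2", "6.15.0", "6.17.0"] {
        println!("{v} in 6.14..6.16: {:?}", in_range(v, "6.14..6.16"));
    }
}
```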

The verifier cell handler resolves the per-cell kernel
directory by looking up the cell's sanitized label in
`KTSTR_KERNEL_LIST` — there is no single-kernel fallback that
would silently run a cell against an unrelated kernel. A label
that doesn't appear in the list errors out with an actionable
diagnostic naming the present labels and pointing at both fix
paths (add `--kernel <SPEC>` or drop the matching entry from
`declare_scheduler!`).
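A minimal sketch of that strict lookup, with a hypothetical
`resolve_kernel` name and illustrative diagnostic text:

```rust
use std::collections::BTreeMap;

// Hypothetical sketch of the strict lookup described above: a cell's
// sanitized kernel label must appear in the parsed KTSTR_KERNEL_LIST;
// otherwise fail loudly instead of falling back to another kernel.
fn resolve_kernel<'a>(
    list: &'a BTreeMap<String, String>,
    label: &str,
) -> Result<&'a str, String> {
    list.get(label).map(String::as_str).ok_or_else(|| {
        let present: Vec<&str> = list.keys().map(String::as_str).collect();
        format!(
            "kernel label {label:?} not in KTSTR_KERNEL_LIST (present: {}); \
             pass --kernel <SPEC> or drop the entry from declare_scheduler!",
            present.join(", ")
        )
    })
}

fn main() {
    let mut list = BTreeMap::new();
    list.insert("kernel_6_14_2".to_string(), "/b/6.14.2".to_string());
    // A label missing from the list yields the diagnostic, not a fallback.
    println!("{:?}", resolve_kernel(&list, "kernel_6_17_0"));
}
```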

## Output

### Brief (default)

Per-program summary line:

```text
  ktstr_enqueue                              verified_insns=500
```

`verified_insns` is the number of instructions the kernel
verifier processed, read from `bpf_prog_aux` via host-side
memory introspection.

On load failure, the scheduler-log section shows libbpf's
verifier output with **cycle collapse** applied — repeating
loop iterations are reduced to the first iteration, an
omission marker, and the last iteration:

```text
--- 8x of the following 10 lines ---
100: (bf) r0 = r1 ; frame1: R0_w=scalar(id=0,umin=0)
101: (bf) r1 = r2 ; frame1: R1_w=scalar(id=1,umin=1)
...
--- 6 identical iterations omitted ---
100: (bf) r0 = r1 ; frame1: R0_w=scalar(id=70,umin=700)
101: (bf) r1 = r2 ; frame1: R1_w=scalar(id=71,umin=701)
...
--- end repeat ---
```

### Raw (`--raw`)

Prints the full verifier log without cycle collapse. Use it
when debugging verification failures where the exact register
state at each iteration matters. The flag exports
`KTSTR_VERIFIER_RAW=1` for the nextest invocation; the
in-binary cell handler reads it via `env::var_os` and switches
the `format_verifier_output` rendering branch.

## Cycle collapse algorithm

The kernel verifier unrolls loops by re-verifying each
instruction with updated register states. A bounded loop of 8
instructions verified 100 times produces 800 near-identical
lines — differing only in register-state annotations. Naive
truncation loses context. Cycle collapse preserves structure:
the first iteration shows what the loop does, the last shows
the final state, and a count tells you how many iterations were
elided.

The algorithm normalizes lines by stripping variable
annotations, then detects repeating blocks:

1. **Normalize** — strip `; frame1: R0_w=...` annotations,
   standalone register dumps (`3041: R0=scalar()`), and
   inline branch-target state after `goto pc+N`. Source
   comments (`; for (int j = 0; ...)`) are preserved as
   cycle anchors.

2. **Detect** — find the most frequent normalized line (the
   "anchor"), compute gaps between anchor occurrences to
   determine the cycle period, then verify consecutive blocks
   match after normalization. Minimum period: 5 lines.
   Minimum repetitions: 3.

3. **Collapse** — replace the cycle with the first iteration,
   an omission count, and the last iteration. Run iteratively
   (up to 5 passes) to handle nested loops.
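Step 3, given a period already found by step 2, can be sketched
as follows. The `normalize` here only strips `; ...` annotations
(a simplification of step 1), and the marker text is illustrative
rather than ktstr's exact output:

```rust
// Hypothetical sketch of the collapse step: replace runs of >= min_reps
// consecutive `period`-line blocks that are identical after
// normalization with first iteration + omission marker + last iteration.
fn normalize(line: &str) -> &str {
    // Simplified step 1: strip "; frame1: R0_w=..." style annotations.
    line.split(" ; ").next().unwrap_or(line)
}

fn collapse(lines: &[String], period: usize, min_reps: usize) -> Vec<String> {
    assert!(period > 0);
    let mut out = Vec::new();
    let mut i = 0;
    while i < lines.len() {
        // Count consecutive blocks matching the block starting at `i`.
        let mut reps = 1;
        while i + (reps + 1) * period <= lines.len()
            && (0..period).all(|k| {
                normalize(&lines[i + k]) == normalize(&lines[i + reps * period + k])
            })
        {
            reps += 1;
        }
        if reps >= min_reps {
            out.extend_from_slice(&lines[i..i + period]); // first iteration
            out.push(format!("--- {} identical iterations omitted ---", reps - 2));
            let last = i + (reps - 1) * period;
            out.extend_from_slice(&lines[last..last + period]); // last iteration
            i += reps * period;
        } else {
            out.push(lines[i].clone());
            i += 1;
        }
    }
    out
}

fn main() {
    // Five iterations of a two-line loop body, differing only in the
    // register-state annotation, collapse to first + marker + last.
    let lines: Vec<String> = (1..=5)
        .flat_map(|n| {
            vec![
                format!("100: (bf) r0 = r1 ; frame1: R0_w=scalar(id={n})"),
                format!("101: (bf) r1 = r2 ; frame1: R1_w=scalar(id={n})"),
            ]
        })
        .collect();
    for line in collapse(&lines, 2, 3) {
        println!("{line}");
    }
}
```

Running the real algorithm iteratively over the already-collapsed
output is what handles nested loops.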

## scx-ktstr test flags

scx-ktstr supports these flags to exercise the verifier
pipeline:

**`--fail-verify`** — sets a `.rodata` variable before
`scx_ops_load!`, enabling a code path the BPF verifier
rejects. On failure, libbpf prints the verifier log to stderr.

**`--verify-loop`** — sets a `.rodata` variable that enables
an unrolled loop followed by `while(1)` in `ktstr_dispatch`.
The verifier rejects the infinite loop and libbpf prints the
full instruction trace to stderr, exercising cycle collapse.