ktstr 0.5.2

Test harness for Linux process schedulers
# Troubleshooting

## Build errors

### clang not found

```text
error: failed to run custom build command for `ktstr`
  ...
  clang: No such file or directory
```

The BPF skeleton build (`libbpf-cargo`) invokes clang to compile
`.bpf.c` sources. Install clang:

- Debian/Ubuntu: `sudo apt install clang`
- Fedora: `sudo dnf install clang`

### pkg-config not found

```text
error: failed to run custom build command for `libbpf-sys`
  ...
  pkg-config: command not found
```

libbpf-sys uses pkg-config during its vendored build. Install it:

- Debian/Ubuntu: `sudo apt install pkg-config`
- Fedora: `sudo dnf install pkgconf`

### autotools errors (autoconf, autopoint, aclocal)

```text
autoreconf: command not found
aclocal: command not found
autopoint: command not found
```

The vendored libbpf-sys build compiles bundled libelf and zlib from
source using autotools. These libraries are not system dependencies
-- they ship with libbpf-sys -- but the autotools toolchain is
needed to build them. Install:

- Debian/Ubuntu: `sudo apt install autoconf autopoint flex bison gawk`
- Fedora: `sudo dnf install autoconf gettext-devel flex bison gawk`

### make or gcc not found

```text
busybox build requires 'make' — install build-essential (Debian/Ubuntu) or base-devel (Fedora/Arch)
busybox build requires 'gcc' — install build-essential (Debian/Ubuntu) or base-devel (Fedora/Arch)
```

The build script compiles busybox from source for guest shell mode.
This requires make and gcc.

- Debian/Ubuntu: `sudo apt install make gcc`
- Fedora: `sudo dnf install make gcc`

### BTF errors

```text
no BTF source found. Set KTSTR_KERNEL to a kernel build directory,
or ensure /sys/kernel/btf/vmlinux exists.
```

build.rs generates `vmlinux.h` from kernel BTF data. It searches
the kernel discovery chain (`KTSTR_KERNEL`, `./linux`, `../linux`,
installed kernel) for a `vmlinux` file, falling back to
`/sys/kernel/btf/vmlinux`. Most distros ship
`/sys/kernel/btf/vmlinux` with CONFIG_DEBUG_INFO_BTF enabled.

**Fixes:**

- Verify BTF is available: `ls /sys/kernel/btf/vmlinux`
- If missing, set `KTSTR_KERNEL` to a kernel build directory that
  contains a `vmlinux` with BTF:
  `export KTSTR_KERNEL=/path/to/linux`
- Build a kernel with `CONFIG_DEBUG_INFO_BTF=y`.
- Some minimal/cloud kernels strip BTF. Use a distro kernel or
  build your own.
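
The lookup order above can be approximated from the shell to see which BTF source a build would likely pick. This is a sketch, not ktstr's `build.rs`: the function name and the `BTF_FALLBACK` override are hypothetical, and the "installed kernel" step is omitted because its location is not specified here.

```sh
# Sketch of the BTF lookup order described above (not ktstr's build.rs).
# Prints the first existing candidate; returns 1 if none is found.
# BTF_FALLBACK is a hypothetical override so the function can be exercised
# on hosts without /sys/kernel/btf/vmlinux.
find_btf_source() (
    fallback="${BTF_FALLBACK:-/sys/kernel/btf/vmlinux}"
    for dir in "${KTSTR_KERNEL:-}" ./linux ../linux; do
        [ -n "$dir" ] || continue
        if [ -f "$dir/vmlinux" ]; then
            printf '%s\n' "$dir/vmlinux"
            return 0
        fi
    done
    if [ -f "$fallback" ]; then
        printf '%s\n' "$fallback"
        return 0
    fi
    return 1
)
```

Running `find_btf_source || echo "no BTF source"` from the project directory shows whether any of these candidates exists before starting a full build.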

### busybox download failure

```text
failed to obtain busybox source.
  tarball (https://github.com/mirror/busybox/archive/refs/tags/1_36_1.tar.gz): download: ...
  git clone (https://github.com/mirror/busybox.git): ...
  Check network connectivity. First build requires internet access.
```

build.rs downloads busybox source on first build (tarball first,
git clone fallback). Subsequent builds use the cached binary in
`$OUT_DIR`.

**Fixes:**

- Verify network connectivity to github.com.
- If behind a proxy, set `HTTP_PROXY` / `HTTPS_PROXY`.
- After a successful first build, no network access is needed
  unless `cargo clean` removes the cached binary.

## /dev/kvm not accessible

The host-side pre-flight emits one of the following, depending on
whether the device node is missing or merely unreadable:

```text
/dev/kvm not found. KVM requires:
  - Linux kernel with KVM support (CONFIG_KVM)
  - Access to /dev/kvm (check permissions or add user to 'kvm' group)
  - Hardware virtualization enabled in BIOS (VT-x/AMD-V)
```

```text
/dev/kvm: permission denied. Add your user to the 'kvm' group:
  sudo usermod -aG kvm $USER
  then log out and back in.
```

ktstr boots Linux kernels in KVM virtual machines. The host must have
KVM enabled and the user must have read+write access to `/dev/kvm`.

**Diagnose:**

- Check the device exists and inspect its permissions and owning group:
  `ls -l /dev/kvm`. Typical output: `crw-rw---- 1 root kvm 10, 232 ...`.
- Confirm the `kvm` group exists and see its members:
  `getent group kvm`.

**Fixes:**

- Load the KVM module: `sudo modprobe kvm_intel` or `sudo modprobe kvm_amd`.
- Follow the group-membership hint in the error text above (log out
  and back in afterward for the group change to take effect).
- On cloud VMs (GCP, AWS, Azure) or nested hypervisors, nested
  virtualization is typically off by default. Enable it per the
  provider's instructions (e.g. GCP `--enable-nested-virtualization`,
  AWS `.metal` instance types, Azure Dv3/Ev3 or later sizes with
  nested virtualization).
- In CI, ensure the runner has KVM access (e.g. `runs-on: [self-hosted, kvm]`).
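
The two failure modes above (device node missing versus not readable and writable) can be told apart with a small pre-flight helper. This is a hypothetical sketch for CI scripts, not part of ktstr; the function name and messages are illustrative.

```sh
# Hypothetical pre-flight helper, not part of ktstr: reports why a KVM
# device node is unusable. The path argument defaults to /dev/kvm.
check_kvm() (
    dev="${1:-/dev/kvm}"
    if [ ! -e "$dev" ]; then
        echo "missing: $dev (load kvm_intel/kvm_amd or enable virtualization)"
        return 1
    fi
    if [ ! -r "$dev" ] || [ ! -w "$dev" ]; then
        echo "no access: $dev (check 'kvm' group membership with: id -nG)"
        return 2
    fi
    echo "ok: $dev"
)
```

Calling `check_kvm` with no argument inspects `/dev/kvm`; a return status of 1 corresponds to the "not found" message and 2 to the "permission denied" message.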

## No kernel found

```text
no kernel found
  hint: set KTSTR_KERNEL to a kernel source directory, a version (e.g. `6.14.2`), or a cache key (see `cargo ktstr kernel list`), or run `cargo ktstr kernel build` to populate the cache
  hint: or set KTSTR_TEST_KERNEL=/path/to/bzImage to point at a pre-built bootable image directly (bypasses KTSTR_KERNEL resolution)
```

On aarch64 the second hint says `Image` instead of `bzImage`.

`ktstr shell` and `cargo ktstr shell` auto-download the latest
stable kernel when no `--kernel` is specified and no kernel is found
via the discovery chain. See
[Kernel auto-download failures](#kernel-auto-download-failures) for
download-specific errors.

ktstr needs a bootable Linux kernel image (`bzImage` on x86_64,
`Image` on aarch64). See
[Kernel discovery](getting-started.md#kernel-discovery) for the
search order.

**Fixes:**

- Download and cache a kernel: `cargo ktstr kernel build`
- Build from a local tree: `cargo ktstr kernel build --source ../linux`
- Set `KTSTR_TEST_KERNEL` to an explicit image path.
- The host's installed kernel works for basic testing.

## Scheduler not found

```text
scheduler 'scx_mitosis' not found. Set KTSTR_SCHEDULER or
place it next to the test binary or in target/{debug,release}/
```

When using `SchedulerSpec::Discover`, ktstr searches for the scheduler
binary in:

1. `KTSTR_SCHEDULER` environment variable.
2. Sibling of the current executable (and, when the test binary
   lives under `target/{debug,release}/deps/`, the parent of
   `deps/` one level up — this covers the nextest / integration-
   test layout where the scheduler binary sits next to the test
   binary's parent).
3. `target/debug/`.
4. `target/release/`.
5. On-demand build via `cargo build` against the scheduler's
   package name — ktstr invokes the build itself when the
   preceding four locations have no match, so a fresh checkout
   with an unbuilt scheduler still produces a usable binary
   without the caller pre-running `cargo build`.

**Fixes:**

- Build the scheduler first: `cargo build -p scx_mitosis` (skipped
  automatically if step 5 above can build it on demand, but
  pre-building makes the first test run faster).
- Set `KTSTR_SCHEDULER=/path/to/binary`.
- Use `SchedulerSpec::Path` for an explicit path in `#[ktstr_test]`.
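
To see which location would match first, the search order above can be mimicked from the shell. This is only a sketch under stated assumptions: the function name is hypothetical, `KTSTR_SCHEDULER` is returned without checking that it exists, any directory ending in `/deps` is treated as the `deps/` case, and step 5 (the on-demand `cargo build`) is left out.

```sh
# Sketch of the lookup order above, not ktstr's resolver. Arguments:
#   $1 scheduler binary name, $2 directory containing the test binary.
# Step 5 (on-demand cargo build) is intentionally left out.
find_scheduler() (
    name="$1"
    exe_dir="${2%/}"
    if [ -n "${KTSTR_SCHEDULER:-}" ]; then
        printf '%s\n' "$KTSTR_SCHEDULER"
        return 0
    fi
    # Sibling of the test binary, plus the parent of deps/ when applicable.
    set -- "$exe_dir"
    case "$exe_dir" in
        */deps) set -- "$@" "${exe_dir%/deps}" ;;
    esac
    for dir in "$@" target/debug target/release; do
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    return 1
)
```

For example, `find_scheduler scx_mitosis target/debug/deps` run from the workspace root prints the first candidate path, or returns 1 when only an on-demand build could satisfy the lookup.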

## Scheduler died

```text
scheduler process died unexpectedly after completing step 2 of 5 (12.3s into test)
```

The scheduler process died while the scenario was running. This
is usually a crash. The exact message varies by when the crash was
detected (between steps, during workload, after completion).

The failure output contains diagnostic sections (each present only
when relevant):

- `--- scheduler log ---`: the scheduler's stdout and stderr,
  cycle-collapsed for readability.
- `--- diagnostics ---`: init stage classification, VM exit code,
  and the last 20 lines of kernel console output.
- `--- sched_ext dump ---`: `sched_ext_dump` trace lines from the
  guest kernel (present when a SysRq-D dump fired).

Set `RUST_BACKTRACE=1` to force `--- diagnostics ---` on all
failures, not just scheduler deaths.

**Next steps:**

- Check the `--- scheduler log ---` for the crash reason.
- Check `--- diagnostics ---` for BPF errors or kernel oops in
  the kernel console.
- Enable `auto_repro` in the test to capture the crash path with
  BPF probes. See [Auto-Repro](running-tests/auto-repro.md).
- Run with a longer duration and specific flags to narrow the
  reproducer.

See [Investigate a Crash](recipes/investigate-crash.md) for the
complete failure output format and auto-repro walkthrough.

## Insufficient hugepages

```text
performance_mode: WARNING: no 2MB hugepages available, guest memory will use regular pages
```

```text
performance_mode: WARNING: need N 2MB hugepages, only K free — falling back to regular pages
```

[Performance mode](concepts/performance-mode.md) requests 2MB
hugepages for guest memory. The first form fires when no 2MB hugepages
are reserved on the host (`free == 0`); the second fires when some are
reserved but fewer than the run needs. In both cases the VM falls back
to regular pages and continues to boot.

**Fix:**

Allocate hugepages before the run:

```sh
echo 2048 | sudo tee /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
```
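
How many pages to reserve depends on guest memory. The sketch below assumes one 2 MiB page per 2 MiB of guest memory, rounded up; ktstr's own accounting for `N` may differ, and both function names are hypothetical. It reads the kernel's `free_hugepages` counter to estimate the shortfall.

```sh
# Assumption: the guest needs one 2 MiB page per 2 MiB of guest memory
# (rounded up); ktstr's exact accounting may differ.
# $1: guest memory in MiB
hugepages_needed() {
    echo $(( ($1 + 1) / 2 ))
}

# Prints how many more free 2 MiB pages are required, or 0 if enough are free.
# $1: guest memory in MiB, $2: free-page counter file (defaults to sysfs).
hugepages_shortfall() (
    free_file="${2:-/sys/kernel/mm/hugepages/hugepages-2048kB/free_hugepages}"
    need=$(hugepages_needed "$1")
    # A missing counter file is treated as zero free pages.
    free=$(cat "$free_file" 2>/dev/null || echo 0)
    if [ "$free" -ge "$need" ]; then
        echo 0
    else
        echo $(( need - free ))
    fi
)
```

`hugepages_shortfall 4096` reports the number of additional free pages a 4 GiB guest would need under this assumption; a result of 0 means the current reservation should suffice.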

## Worker assertion failures

```text
stuck 4500ms on cpu2 at +3200ms (threshold 3000ms)
unfair cgroup: spread=42% (8-50%) 4 workers on 4 cpus (threshold 35%)
```

The Assert checks (`max_gap_ms`, `max_spread_pct`, etc.) detected a
worker metric outside the configured thresholds.

**Fixes:**

- Check whether the topology has enough CPUs for the scenario. Small
  topologies produce higher contention, larger gaps, and more spread.
- Use `execute_steps_with()` with a custom `Assert` to override
  thresholds for scenarios that need relaxed limits.
- Check the scheduler's behavior under the specific flag profile that
  triggered the failure.

## Cgroup name typos

```text
No such file or directory: /sys/fs/cgroup/.../nonexistent/cgroup.procs
```

A cgroup name passed to `Op::SetCpuset`, `Op::Spawn`, or
`CgroupManager::move_tasks` does not match a previously created
cgroup. Cgroup names are case-sensitive strings.

**Fixes:**

- Verify the cgroup name matches the `name` in `Op::AddCgroup` or
  `CgroupDef::named()`.
- When using dynamic cgroup names (e.g. `format!("cg_{i}")`), ensure
  the same formatting is used in all ops referencing that cgroup.

## CpusetSpec errors

```text
cgroup 'cg_0': CpusetSpec validation failed: not enough usable CPUs (4) for 8 partitions
cgroup 'cg_1': CpusetSpec validation failed: index 3 >= partition count 3
cgroup 'cg_2': CpusetSpec validation failed: Range fracs must lie in [0.0, 1.0]: start_frac=-1, end_frac=0.5
```

A `CpusetSpec` cannot produce a valid cpuset for the test topology.
`execute_steps` treats this as a hard error and aborts the step so the
downstream slicing/arithmetic in `CpusetSpec::resolve` is never reached
with inputs that would panic.

**Fixes:**

- Guard with a topology check before creating the step:
  `if ctx.topo.usable_cpus().len() < needed { return Ok(AssertResult::skip(...)); }`
- Call `CpusetSpec::validate(&ctx)` in your scenario builder so failures
  surface before `execute_steps` runs.
- Reduce the partition count or use `CpusetSpec::Llc` instead of
  `Disjoint` on topologies with fewer CPUs than partitions.
- For `Range`/`Overlap`, keep fractions finite and inside `[0.0, 1.0]`;
  `Range` additionally requires `start_frac < end_frac`.

## Worker count mismatches

```text
PipeIo requires num_workers divisible by 2, got 3
```

Grouped work types (`PipeIo`, `FutexPingPong`, `CachePipe`,
`FutexFanOut`, `FanOutCompute`) require `num_workers` divisible by their
group size. `WorkType::worker_group_size()` returns the divisor.

**Fixes:**

- Set `CgroupDef::workers(n)` to a value divisible by the work
  type's group size (2 for pipe/futex pairs, `fan_out + 1` for
  FutexFanOut and FanOutCompute).
- Use an ungrouped work type (`SpinWait`, `Mixed`, `Bursty`,
  `IoSyncWrite`, `IoRandRead`, `IoConvoy`, `YieldHeavy`) if worker
  count flexibility is needed.
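
The divisibility rule above amounts to rounding the worker count up to the next multiple of the group size. A minimal arithmetic sketch (the helper name is hypothetical; the group size is whatever `WorkType::worker_group_size()` reports for the work type):

```sh
# Rounds a worker count up to the next multiple of the group size.
# $1: desired workers
# $2: group size (2 for pipe/futex pairs, fan_out + 1 for fan-out types)
round_up_workers() {
    echo $(( ($1 + $2 - 1) / $2 * $2 ))
}
```

For the `PipeIo` error shown above, `round_up_workers 3 2` yields 4; with a fan-out of 4 (group size 5), `round_up_workers 11 5` yields 15.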

## Cache corruption

```text
  6.14.2-tarball-x86_64-kc...                 (corrupt: metadata.json malformed: ...)
warning: entries marked (corrupt) cannot be used — cached metadata is missing, malformed, or references a missing image. Inspect the entry directory under ~/.cache/ktstr/kernels to remove it manually, or run `kernel clean --corrupt-only --force` which removes ONLY corrupt entries and leaves valid ones intact. ...
```

A cached kernel entry has missing, unparseable, or
schema-drifted `metadata.json`, or metadata that references an
image file that is no longer present. This can happen after a
partial write (e.g. disk full, killed process), or after a ktstr
release that evolved the metadata schema in a
non-backward-compatible way. `cargo ktstr kernel list` surfaces
these as `(corrupt: ...)` rows; the trailing footer on stderr
summarizes the remediation options. `CacheDir::lookup` returns
`None` for corrupt entries so test runs at a specific cache key
fall through to the normal re-build path.

The JSON form (`cargo ktstr kernel list --json`) emits an
`error_kind` field on every corrupt entry — one of `"missing"`,
`"unreadable"`, `"schema_drift"`, `"malformed"`, `"truncated"`,
`"parse_error"`, `"image_missing"`, or `"unknown"` — so CI
scripts can dispatch on a stable token without parsing the
free-form `error` string.

**Fixes:**

- Remove ONLY corrupt entries (keeps valid ones intact):
  `cargo ktstr kernel clean --corrupt-only --force`
- Remove the corrupt entry along with everything else:
  `cargo ktstr kernel clean --force`
- Rebuild a specific version after cleanup: `cargo ktstr kernel build --force 6.14.2`
- Override the cache directory via `KTSTR_CACHE_DIR` if the default
  location is on a problematic filesystem.
- See [`cargo ktstr kernel clean`](running-tests/cargo-ktstr.md#kernel-clean)
  for all cleanup options, including `--keep N --force` to preserve
  the N newest entries.
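
A CI script that dispatches on `error_kind` might look like the sketch below. The token list comes from the description above; the function name and the action chosen for each token are purely illustrative policy, not ktstr behavior, and extracting the token from the JSON output is left to the caller.

```sh
# Hypothetical CI policy keyed on error_kind. The tokens are the documented
# ones; the action names are illustrative only.
corrupt_entry_action() {
    case "$1" in
        schema_drift)
            # Metadata schema changed between ktstr releases.
            echo "clean-and-rebuild" ;;
        missing|unreadable|malformed|truncated|parse_error|image_missing)
            echo "clean" ;;
        *)
            # Covers "unknown" and any token added in a later release.
            echo "inspect" ;;
    esac
}
```

Keeping a catch-all branch means a token introduced by a future release falls through to manual inspection instead of being silently mishandled.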

## Stale `vmlinux.btf` or `default.profraw` in kernel source tree

After upgrading from an older ktstr version, you may notice extra
files in your kernel source directory:

- `<source>/vmlinux.btf` — a sidecar of the kernel's `.BTF`
  section bytes. Older ktstr versions wrote it next to whichever
  `vmlinux` they parsed, including source-tree builds. Current
  ktstr only writes the sidecar when the vmlinux path is inside
  the cache root (`~/.cache/ktstr/kernels/` or whatever
  `KTSTR_CACHE_DIR` points at) so source trees stay pristine.
- `<source>/default.profraw` — an LLVM coverage runtime artifact.
  Older ktstr versions could leave it in cwd when a
  coverage-instrumented `cargo ktstr test` was launched from
  inside the kernel tree. Current ktstr injects
  `LLVM_PROFILE_FILE=<cargo-ktstr-binary-parent>/llvm-cov-target/default-{pid}-{binary_hash}.profraw`
  for the bare `nextest` path so the profraw lands next to the
  cargo-ktstr binary regardless of cwd. See
  [profraw layout](running-tests/cargo-ktstr.md#profraw-layout)
  for the per-population directory map.

Both files are leftover state from prior runs and are safe to
remove:

```sh
rm -f /path/to/linux/vmlinux.btf
rm -f /path/to/linux/default.profraw
```

If you also see them turn up under a different ktstr-driven
source tree, check that you are running a current ktstr build
(re-run `cargo build` or `cargo install ktstr` to pick up the
fix) before deleting again — the guards live in the resolver,
not on disk, so an old binary will keep regenerating these
files.

## Cache directory not found

```text
HOME is unset; cannot resolve cache directory. The container init or login shell did not assign HOME — set it to an absolute path, or set KTSTR_CACHE_DIR to an absolute path (e.g. /tmp/ktstr-cache) or XDG_CACHE_HOME to specify a cache location explicitly.
```

```text
HOME is set to the empty string; cannot resolve cache directory. An empty HOME usually means a Dockerfile or shell rc has `export HOME=` or `ENV HOME=` with no value. Either set HOME to a real absolute path, or set KTSTR_CACHE_DIR to an absolute path (e.g. /tmp/ktstr-cache) or XDG_CACHE_HOME to specify a cache location explicitly.
```

The kernel image cache requires a writable directory. ktstr resolves
it as: `KTSTR_CACHE_DIR` > `$XDG_CACHE_HOME/ktstr/kernels/` >
`$HOME/.cache/ktstr/kernels/`. The first form fires when `HOME` is
absent from the environment (typical of bare container inits or
systemd units with no `Environment=HOME=...`); the second fires when
`HOME` is present but assigned to the empty string.

**Fix:** Set `KTSTR_CACHE_DIR` to an explicit path, or ensure `HOME`
is set to a real absolute path.
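
The precedence above can be reproduced in a container entrypoint to confirm where the cache would land. This is a sketch, not ktstr's resolver: the function name is hypothetical, and treating an empty `KTSTR_CACHE_DIR` or `XDG_CACHE_HOME` the same as an unset one is an assumption.

```sh
# Sketch of the documented precedence:
#   KTSTR_CACHE_DIR > $XDG_CACHE_HOME/ktstr/kernels > $HOME/.cache/ktstr/kernels
# Assumption: empty values are treated like unset ones.
resolve_cache_dir() {
    if [ -n "${KTSTR_CACHE_DIR:-}" ]; then
        printf '%s\n' "$KTSTR_CACHE_DIR"
    elif [ -n "${XDG_CACHE_HOME:-}" ]; then
        printf '%s\n' "$XDG_CACHE_HOME/ktstr/kernels"
    elif [ -n "${HOME:-}" ]; then
        printf '%s\n' "$HOME/.cache/ktstr/kernels"
    else
        echo "HOME is unset or empty; set KTSTR_CACHE_DIR" >&2
        return 1
    fi
}
```

A non-zero return corresponds to the two error messages above; exporting `KTSTR_CACHE_DIR=/tmp/ktstr-cache` before the run takes the first branch.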

## Stale kconfig

```text
warning: entries marked (stale kconfig) were built against a different ktstr.kconfig.
Rebuild with: kernel build --force <entry version>
```

`cargo ktstr kernel list` marks entries whose stored `ktstr_kconfig_hash`
differs from the current embedded `ktstr.kconfig` fragment. This
happens after updating ktstr (which may change the kconfig fragment).

**Fix:**

Rebuilds happen automatically on the next `cargo ktstr kernel build`
for stale entries. Use `--force` to override the cache for other
reasons. See [`cargo ktstr kernel list`](running-tests/cargo-ktstr.md#kernel-list)
for the full listing output.

## Kernel auto-download failures

```text
ktstr: no kernel found, downloading latest stable
fetch https://www.kernel.org/releases.json: <error>
```

ktstr auto-downloads a kernel when no `--kernel` is specified and no
kernel is found via the discovery chain (see
[Kernel discovery](getting-started.md#kernel-discovery)). The same
download path runs when `--kernel` specifies a version (e.g.
`--kernel 6.14.2`) that is not in the cache. The CLI label varies:
`ktstr:` for the standalone binary, `cargo ktstr:` for the cargo
subcommand.

The `<error>` above is the underlying reqwest error (DNS resolution,
connection refused, timeout, TLS handshake failure).

```text
fetch https://www.kernel.org/releases.json: HTTP 503
```

kernel.org returned a non-success status code.

```text
no stable kernel with patch >= 8 found in releases.json
```

ktstr requires a stable or longterm release with patch version >= 8
to avoid brand-new major versions that may have build issues. This
error means releases.json contained no qualifying version.
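
The patch-level rule can be checked against a candidate version string as follows. This sketch covers only the `patch >= 8` comparison, not the stable/longterm filtering; the function name is hypothetical, and treating a version without a patch component (e.g. `6.15`) as patch 0 is an assumption.

```sh
# Returns success when a version string such as 6.14.10 has patch >= 8.
# Assumption: a missing patch component (e.g. "6.15") counts as patch 0.
has_min_patch() (
    version="$1"
    patch=0
    case "$version" in
        *.*.*) patch="${version##*.}" ;;
    esac
    case "$patch" in
        # Reject empty or non-numeric patch components.
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$patch" -ge 8 ]
)
```

Under this rule `6.14.10` qualifies, while `6.15.2` and `6.15-rc3` do not.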

```text
download https://cdn.kernel.org/.../linux-6.14.10.tar.xz: <error>
```

Network failure during tarball download (same causes as above).

```text
extract tarball: <error>
```

Tarball extraction failed. Common causes: disk full, insufficient
permissions on the temp directory, or a truncated download.

```text
kernel built but cache store failed — cannot return image from temporary directory
```

The kernel built successfully but could not be stored in the cache.
Check disk space and permissions on the cache directory.

For version-specific download errors (HTTP 404, HTML responses), see
[Kernel download failures](#kernel-download-failures).

**Fixes:**

- Verify network connectivity: `curl -sI https://www.kernel.org/releases.json`
- Check DNS resolution for kernel.org and cdn.kernel.org.
- Check disk space — the download, extraction, and build require
  significant disk space.
- If behind a proxy, set `HTTP_PROXY`, `HTTPS_PROXY`, and `NO_PROXY`
  (reqwest respects these environment variables).
- Override the cache directory via `KTSTR_CACHE_DIR` if the default
  location has insufficient space or permissions.
- Pre-download a kernel explicitly: `cargo ktstr kernel build 6.14.10`
  to isolate whether the failure is in version resolution or download.

## Kernel download failures

These errors occur when `cargo ktstr kernel build` or `--kernel`
specifies an explicit version. For network and extraction errors
during auto-download, see
[Kernel auto-download failures](#kernel-auto-download-failures).

```text
version 6.14.22 not found. latest 6.14.x: 6.14.10
```

The requested version does not exist on kernel.org. When a version in
the same major.minor series is available in releases.json, the error
suggests it.

```text
version 5.4.99 not found
```

When the series is EOL or not in releases.json, only the "not found"
message appears (no suggestion).

```text
RC tarball not found: https://git.kernel.org/torvalds/t/linux-6.15-rc3.tar.gz
  RC releases are removed from git.kernel.org after the stable version ships.
```

RC tarballs are removed from git.kernel.org after the stable version
ships. Use `--git` with a git.kernel.org URL to clone the tag instead.

```text
download ...: server returned HTML instead of tarball (URL may be invalid)
```

Some CDN error pages return HTTP 200 with `text/html` content type.
The download rejects these responses.

**Fixes:**

- Check the suggested version in the error message.
- Verify the version exists: check
  `https://www.kernel.org/releases.json` for available versions.
- For RC releases, use `--git` with a git.kernel.org URL instead of
  a tarball download.
- Run `cargo ktstr kernel build` without a version to automatically
  fetch the latest stable.

## Shell mode issues

### stdin must be a terminal

```text
stdin must be a terminal for interactive shell mode
```

`cargo ktstr shell` requires a terminal for bidirectional I/O
forwarding. Piped or redirected stdin is rejected.

**Fix:** Run from an interactive terminal session.

### include file not found

```text
-i strace: not found in filesystem or PATH
```

Bare names (without `/`, `.`, or `..`) are searched in `PATH`. If the
binary is not in `PATH`, use an explicit path.

```text
--include-files path not found: ./missing-file
```

Explicit paths (containing `/` or starting with `.`) must exist on
disk.

**Fix:** Verify the file exists and use the correct path.
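
The two cases above can be checked before launching the shell. This is a rough sketch, not ktstr's argument handling: the function name is hypothetical, it assumes an argument is an explicit path when it contains `/` or starts with `.`, and it uses `command -v` for the `PATH` lookup (which also reports shell builtins by name).

```sh
# Assumption: an argument is an explicit path when it contains "/" or
# starts with "."; anything else is a bare name looked up in PATH.
resolve_include() (
    arg="$1"
    case "$arg" in
        */*|.*)
            [ -e "$arg" ] || { echo "path not found: $arg" >&2; return 1; }
            printf '%s\n' "$arg"
            ;;
        *)
            command -v "$arg" || { echo "$arg: not found in PATH" >&2; return 1; }
            ;;
    esac
)
```

`resolve_include strace` prints the resolved binary when it is installed, and `resolve_include ./missing-file` fails the same way the second error above does.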

### include directory contains no files

```text
warning: -i ./empty-dir: directory contains no regular files
```

The directory passed to `--include-files` was walked recursively but
contained no regular files. FIFOs, device nodes, and sockets are
skipped during the walk.

**Fix:** Verify the directory contains the files you expect.

## Model load failed

```text
GGUF model load failed at /home/.../models/Qwen3-4B-Q4_K_M.gguf. The
file may be corrupt or incompatible with the linked llama.cpp version
— delete the file and re-run `cargo ktstr model fetch` to download
a fresh copy. Check stderr for the upstream llama.cpp rejection reason.
```

The host-side LLM extraction backend (`OutputFormat::LlmExtract`)
could not load the cached GGUF weights. The cached file is either
corrupt (partial download, disk error) or incompatible with the
linked llama.cpp version.

**Diagnose:**

- Re-run with `RUST_LOG=llama-cpp-2=info` (or `=debug` for more
  detail) to surface llama.cpp's own rejection reason on stderr.
  The first call to the inference engine routes
  `llama_cpp_2::send_logs_to_tracing` events through the tracing
  subscriber under target `"llama-cpp-2"` (literal hyphens — see
  [Environment Variables](reference/environment-variables.md) for
  the EnvFilter shape).
- `cargo ktstr model status` reports the cache path and verdict
  (`Matches`, `Mismatches`, `CheckFailed`, `NotCached`).

**Fix:**

- Delete the cached file and re-fetch:
  `cargo ktstr model clean && cargo ktstr model fetch`. `clean`
  removes both the GGUF artifact and its `.mtime-size` warm-cache
  sidecar; `fetch` re-downloads from the pinned URL and SHA-checks
  the result.
- If `model status` reports `Mismatches`, the local file's hash
  diverged from the pinned digest — `cargo ktstr model fetch` will
  refuse to overwrite a corrupt cache and the explicit `clean` is
  required first.
- If you set `KTSTR_MODEL_OFFLINE=1`, unset it for the re-fetch.
  See [`cargo ktstr model`](running-tests/cargo-ktstr.md#model).

## Flock timeout / NFS rejection

```text
flock LOCK_EX on run-dir target/ktstr/6.14-abc1234 timed out after
30s (lockfile target/ktstr/.locks/6.14-abc1234.lock, holders:
  pid=12345 cmd=cargo-ktstr test --kernel 6.14). A peer cargo
ktstr test process is writing sidecars to the same
{kernel}-{project_commit} directory; wait for it to finish or kill
it, then retry.
```

A peer process is holding the per-run-key advisory `flock(2)`
that serializes sidecar writes; the helper polled for 30 s and
gave up. Run-dir locks live at
`{runs_root}/.locks/{kernel}-{project_commit}.lock` and serialize
the (pre-clear + write) cycle so two concurrent ktstr runs
sharing the same key can't tear partially-written sidecars.

```text
target/ktstr/.locks/6.14-abc1234.lock: filesystem NFS is not
supported for ktstr lockfiles (NFSv3 is advisory-only without
an NLM peer; NFSv4 byte-range locking does not cover flock(2)).
Move the lockfile path to a local filesystem (tmpfs, ext4, xfs,
btrfs, f2fs, bcachefs).
```

`try_flock` rejects NFS, CIFS, SMB2, CephFS, AFS, and FUSE mounts
because `flock(2)` semantics on those filesystems are unreliable
(see [Resource Budget — Filesystem requirement](concepts/resource-budget.md#filesystem-requirement)
for the per-filesystem rationale).

**Diagnose:**

- `cargo ktstr locks` (or `ktstr locks --watch 1s`) prints every
  ktstr flock currently held on the host with PID + cmdline,
  including per-run-key sidecar locks under the "Run-dir locks"
  section (see
  [`cargo ktstr locks`](running-tests/cargo-ktstr.md#locks)).
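- `/proc/locks` falls back to the kernel's own flock enumeration
  when the holder is outside ktstr. It identifies files by device
  and inode rather than by path, so look up the lockfile's inode
  with `stat -c '%i' <lockfile-path-from-error>` and then run
  `grep ':<inode> ' /proc/locks`.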
- `stat -f -c '%T' <runs-root>` reports the filesystem type when
  the rejection error names NFS/CIFS/SMB/CephFS/AFS/FUSE.
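
The filesystem check in the last bullet can be scripted. The sketch below is not ktstr's `try_flock`: the function names are hypothetical, and the glob patterns are an assumption about how GNU `stat -f -c '%T'` spells the rejected filesystem types.

```sh
# $1: filesystem type name as printed by `stat -f -c '%T'`.
# Returns success when the type is one ktstr is described as rejecting.
# The glob patterns are an assumption about how stat spells these types.
is_rejected_lock_fs() {
    case "$1" in
        nfs*|cifs|smb*|ceph|afs|fuse*) return 0 ;;
        *) return 1 ;;
    esac
}

# Checks the filesystem backing a directory (requires GNU coreutils stat).
check_lock_dir() (
    fstype=$(stat -f -c '%T' "$1") || return 2
    if is_rejected_lock_fs "$fstype"; then
        echo "$1: $fstype is not usable for ktstr lockfiles"
        return 1
    fi
    echo "$1: $fstype ok"
)
```

Running `check_lock_dir target/ktstr` (or the cache root) before a CI job gives an early signal that the runs root needs to move to a local filesystem.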

**Fix:**

- For a peer-holder timeout: wait for the peer to finish or kill
  it (`kill <pid>`, using a PID from the holder list), then
  retry.
- For an NFS / remote-fs rejection: relocate the runs root to a
  local filesystem. Set `KTSTR_SIDECAR_DIR` to a local path
  (for example `/tmp/ktstr-sidecars` or a tmpfs mount). Note that
  this override path **also skips the cross-process flock**, so
  concurrent runs targeting the same `KTSTR_SIDECAR_DIR` have no
  serialization between them. Use the override only for a
  single-process run or with a distinct path per process.
- The kernel cache's lockfiles
  (`{cache_root}/.locks/*.lock`) face the same constraint —
  override `KTSTR_CACHE_DIR` to a local filesystem if the default
  resolves to NFS. See
  [Cache directory not found](#cache-directory-not-found).

## Tests pass locally but fail in CI

Common causes:

- **No KVM**: CI runners need hardware virtualization. Check for
  `/dev/kvm` access.
- **Fewer CPUs**: gauntlet topology presets go up to 252 CPUs and
  may exceed the runner's capacity. Use smaller topologies.
- **No kernel**: set `KTSTR_TEST_KERNEL` in the CI environment.
- **No CAP_SYS_NICE or rtprio**: performance-mode tests require
  `CAP_SYS_NICE` or an rtprio limit for RT scheduling, and enough
  host CPUs for exclusive LLC reservation. Pass `--no-perf-mode`
  (or set `KTSTR_NO_PERF_MODE=1`) to disable all performance mode
  features. Tests with `performance_mode=true` are skipped entirely
  under `--no-perf-mode`.
- **Debug thresholds**: CI often runs debug builds. Debug builds use
  relaxed thresholds (3000ms gap, 35% spread) but may still hit
  limits on slow runners. See
  [default thresholds](concepts/checking.md#default-thresholds).