//! Guest-side BPF map fd pinning. See
//! [`crate::scenario::ops::Op::PinBpfMap`] for the full motivation.
//!
//! In short: the same-binary `Op::ReplaceScheduler` swap window has a
//! multi-bss case, where two `<obj>.bss` copies coexist while the
//! dying scheduler's BPF object is being torn down. That case only
//! fires when both copies are still alive at freeze time, and the
//! kernel frees the dying instance's maps as soon as libbpf drops
//! their fds. Holding an extra reference via this helper keeps the
//! dying scheduler's map alive long enough for at least one
//! post-swap freeze to observe both copies, which is the situation
//! the framework's [`crate::scenario::snapshot::Snapshot::active`]
//! plus walker disambiguation chain exists to handle.
use ;
use libbpf_sys;
use libbpf_rs::query::MapInfoIter;
use std::io;
use ;
/// Walk the kernel's BPF map ID space, find the first map whose
/// `bpf_map_info.name` matches `name`, return its [`OwnedFd`]. The
/// caller holds the returned fd to keep the map alive (the kernel
/// refcount only drops to zero once every fd holder releases).
///
/// `name` is matched against the kernel-side map name by full-string
/// equality. BPF map names are NUL-terminated and capped at
/// `BPF_OBJ_NAME_LEN = 16` bytes including the trailing NUL (so 15
/// usable chars max) per `kernel/bpf/syscall.c`'s `bpf_obj_name_cpy`.
/// Pass the kernel-visible name (typically `<obj>.bss` / `<obj>.data`
/// / `<obj>.rodata`); libbpf truncates long object prefixes to fit
/// the 15-char cap, so for a scheduler whose libbpf-source obj name
/// exceeds the limit, the kernel-visible name is the FIRST-15-chars
/// form. Reading a previous [`crate::monitor::dump::FailureDumpReport`]'s
/// `maps[].name` or running `bpftool map list` outside the test is
/// the safe way to discover the exact string the kernel sees.
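///
/// As a rough illustration of the full-string match described above,
/// the sketch below compares a NUL-padded 16-byte name buffer against
/// a target string. `kernel_name_matches` is a hypothetical helper
/// written for this example only; it is not part of this module or of
/// libbpf-rs, and the real lookup may read the name differently.
///
/// ```rust
/// // Kernel cap on BPF object names, trailing NUL included.
/// const BPF_OBJ_NAME_LEN: usize = 16;
///
/// // Hypothetical helper: true when the NUL-terminated name stored in
/// // `raw` equals `target` exactly (no prefix or substring matching).
/// fn kernel_name_matches(raw: &[u8; BPF_OBJ_NAME_LEN], target: &str) -> bool {
///     // The name ends at the first NUL; a buffer without one uses all bytes.
///     let len = raw.iter().position(|&b| b == 0).unwrap_or(raw.len());
///     &raw[..len] == target.as_bytes()
/// }
/// ```
///
/// Under this sketch, a buffer holding `foo.bss` matches only the
/// target `foo.bss`; the shorter `foo` and the longer `foo.bss2` are
/// both rejected.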
///
/// **Order matters at the test layer**: this helper must run AFTER
/// the target scheduler's BPF object is loaded. The companion
/// [`crate::scenario::ops::Op::PinBpfMap`] doc documents the "place
/// after a hold long enough for the scheduler to be ready" pattern;
/// this helper itself does not block or retry.
///
/// **ID-order tiebreaker**: the underlying
/// [`libbpf_rs::query::MapInfoIter`] walks in monotonically-
/// increasing map-id order, so when multiple maps share the same
/// name (the same-binary swap window's multi-bss case), the lowest-
/// id (oldest) map is returned. For the swap-window scenario this
/// means: call BEFORE `Op::ReplaceScheduler` so the captured fd is
/// on the OUTGOING scheduler's map; the new scheduler's load will
/// then create a SECOND copy, and both coexist because the held fd
/// keeps the kernel from freeing the old map.
///
/// **Error on miss**: returns `Err` naming every map name the walk
/// observed, so the caller can tell the likely causes apart: a
/// mistyped name, a scheduler that is not attached yet, or a map that
/// was already freed.
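///
/// The first-match and error-on-miss behavior can be sketched over
/// plain `(id, name)` pairs. This is a simplified model, not the
/// helper's actual body: `first_id_by_name` is a hypothetical
/// function, the pairs stand in for what a map-info walk yields (and
/// are assumed to arrive in ascending id order), and the choice of
/// `io::ErrorKind::NotFound` is an assumption made for the example.
///
/// ```rust
/// use std::io;
///
/// // Hypothetical model of the lookup: return the id of the first map
/// // whose name equals `name`, or an error listing every name seen.
/// fn first_id_by_name<'a, I>(maps: I, name: &str) -> io::Result<u32>
/// where
///     I: IntoIterator<Item = (u32, &'a str)>,
/// {
///     let mut seen = Vec::new();
///     for (id, map_name) in maps {
///         if map_name == name {
///             // Ascending-id input: the first hit is the lowest (oldest) id.
///             return Ok(id);
///         }
///         seen.push(map_name);
///     }
///     Err(io::Error::new(
///         io::ErrorKind::NotFound,
///         format!("no BPF map named {name:?}; observed: {seen:?}"),
///     ))
/// }
/// ```
///
/// With two maps named `foo.bss` at ids 3 and 9, this model returns
/// id 3; a lookup for a name that is absent produces an error whose
/// message lists the names the walk did see.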
///
/// **Privilege**: requires `CAP_SYS_ADMIN`. The kernel gates
/// `BPF_*_GET_NEXT_ID` and `BPF_MAP_GET_FD_BY_ID` on `CAP_SYS_ADMIN`
/// unconditionally (`kernel/bpf/syscall.c:4741` and `:4849`),
/// independent of `CAP_BPF` (which only governs prog/map creation).
/// ktstr always runs as root inside the guest VM so this is satisfied.