//! BTF offsets and stable enum indices for per-node NUMA-event
//! counter capture.
//!
//! The walker resolves the kernel symbol `node_data` (an array of
//! `pglist_data *` indexed by node id), dereferences each entry to
//! reach the `pglist_data` for that node, walks
//! `pglist_data.node_zones[]` (an inline array of `struct zone`), and
//! reads `zone.vm_numa_event[NR_VM_NUMA_EVENT_ITEMS]` (an array of
//! `atomic_long_t`) to produce per-node counters.
//!
//! Items are `#[allow(dead_code)]` because the live walker that
//! consumes the offsets/indices is pending: the wire shape and
//! BTF resolver have landed, but no caller resolves the offsets yet.
use Result;
use Btf;
use ;
/// Stable indices into `zone.vm_numa_event[NR_VM_NUMA_EVENT_ITEMS]`
/// from `enum numa_stat_item` (`include/linux/mmzone.h`). The kernel
/// pins this order; external readers (`/sys/devices/system/node/nodeN/numastat`,
/// `/proc/zoneinfo`) depend on it. Hard-coded here for the same
/// reason the [`super::CPUTIME_USER`] family is hard-coded: BTF only
/// encodes the array length, not the enum-to-position mapping, so a
/// BTF-driven read would require resolving the enum separately.
/// A change to the enum order would be a UAPI break, not a layout
/// drift this code can adapt to.
///
/// `NUMA_HIT` itself counts allocations that were intended for this
/// node and were satisfied from it.
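///
/// A minimal sketch of how the indices are meant to be used, assuming
/// the six counters of one zone were already copied out of guest
/// memory into a `[u64; 6]`. `hits_and_misses` is a hypothetical
/// helper, and the constants are redeclared locally only so the
/// block compiles on its own:
///
/// ```rust
/// // Local copies of the slot indices (same values as in this module).
/// const NUMA_HIT: usize = 0;
/// const NUMA_MISS: usize = 1;
/// const NR_VM_NUMA_EVENT_ITEMS: usize = 6;
///
/// // Pick two counters out of a captured `vm_numa_event[]` snapshot
/// // by their pinned positions instead of by magic numbers.
/// fn hits_and_misses(events: &[u64; NR_VM_NUMA_EVENT_ITEMS]) -> (u64, u64) {
///     (events[NUMA_HIT], events[NUMA_MISS])
/// }
/// ```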
pub const NUMA_HIT: usize = 0;
/// Pages allocated on this node although the allocation preferred
/// a different node. See [`NUMA_HIT`].
pub const NUMA_MISS: usize = 1;
/// Pages intended for this node that were allocated on a different
/// node instead. See [`NUMA_HIT`].
pub const NUMA_FOREIGN: usize = 2;
/// Allocations from an interleave policy that hit this node.
/// See [`NUMA_HIT`].
pub const NUMA_INTERLEAVE_HIT: usize = 3;
/// Allocations on this node by a process running on this node.
/// See [`NUMA_HIT`].
pub const NUMA_LOCAL: usize = 4;
/// Allocations on this node by a process running on a different
/// node. See [`NUMA_HIT`].
pub const NUMA_OTHER: usize = 5;
/// Number of `numa_stat_item` slots per `zone.vm_numa_event[]`.
/// Mirrors `NR_VM_NUMA_EVENT_ITEMS` in `include/linux/mmzone.h`
/// (= 6 in current mainline). The pin-via-constant is intentional:
/// adding a slot to the kernel enum is a UAPI break that warrants a
/// host-side update of this constant, not a BTF-driven autodiscover.
pub const NR_VM_NUMA_EVENT_ITEMS: usize = 6;
/// Names of every NUMA event slot, indexed by [`NUMA_HIT`] etc.
/// Surfaced in failure-dump JSON so a downstream consumer reading
/// `vm_numa_event[i]` knows which counter each slot represents
/// without chasing the kernel header. Mirrors [`super::SOFTIRQ_NAMES`]'s
/// rationale.
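///
/// A rough sketch of the pairing a consumer might do, assuming the
/// counters are available as `u64` values. The labels in `NAMES` are
/// placeholders chosen for this example and are not necessarily the
/// strings stored in `NUMA_EVENT_NAMES`; `label_events` is likewise
/// hypothetical:
///
/// ```rust
/// const NR_VM_NUMA_EVENT_ITEMS: usize = 6;
///
/// // Placeholder labels, one per slot, in slot order.
/// const NAMES: [&str; NR_VM_NUMA_EVENT_ITEMS] = [
///     "numa_hit",
///     "numa_miss",
///     "numa_foreign",
///     "numa_interleave_hit",
///     "numa_local",
///     "numa_other",
/// ];
///
/// // Attach a label to every slot so a dump is self-describing.
/// fn label_events(events: &[u64; NR_VM_NUMA_EVENT_ITEMS]) -> Vec<(&'static str, u64)> {
///     NAMES.iter().copied().zip(events.iter().copied()).collect()
/// }
/// ```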
pub const NUMA_EVENT_NAMES: = ;
/// Byte offsets used to read per-node NUMA-event counters from
/// guest memory.
///
/// The walk path is:
/// 1. Resolve kernel symbol `node_data` — an array of
/// `pglist_data *` indexed by node id (declared in
/// `arch/x86/mm/numa.c::node_data[]` on x86 / `arch/arm64/mm/numa.c`
/// on arm64).
/// 2. For each node, dereference `node_data[node]` to reach the
/// `pglist_data` for that node.
/// 3. Walk `pglist_data.node_zones[MAX_NR_ZONES]` (an inline array
/// of `struct zone`).
/// 4. For each zone, read `zone.vm_numa_event[]` (an array of
/// `atomic_long_t`) and sum across zones to produce per-node
/// counters.
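///
/// The last step reduces to a slot-wise sum. A minimal sketch,
/// assuming the per-zone arrays were already read from guest memory
/// and converted to `u64`; `sum_zones` is a hypothetical helper, not
/// part of this module, and wrapping addition is a choice made for
/// the example:
///
/// ```rust
/// const NR_VM_NUMA_EVENT_ITEMS: usize = 6;
///
/// // Add up the counters of every zone of one node, slot by slot.
/// fn sum_zones(zones: &[[u64; NR_VM_NUMA_EVENT_ITEMS]]) -> [u64; NR_VM_NUMA_EVENT_ITEMS] {
///     let mut total = [0u64; NR_VM_NUMA_EVENT_ITEMS];
///     for zone in zones {
///         for (slot, value) in total.iter_mut().zip(zone.iter()) {
///             // Wrapping add keeps the sketch panic-free on overflow.
///             *slot = slot.wrapping_add(*value);
///         }
///     }
///     total
/// }
/// ```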
///
/// `pglist_data_node_zones` and `zone_vm_numa_event` are the two
/// offsets the walker needs after the `node_data` symbol is
/// resolved; `zone_size` lets the walker stride to
/// `node_zones[zone_idx]`. `MAX_NR_ZONES` is hard-coded to 5
/// (matching mainline x86_64 and arm64: ZONE_DMA, ZONE_DMA32,
/// ZONE_NORMAL, ZONE_MOVABLE, ZONE_DEVICE) — a kernel without
/// CONFIG_ZONE_DEVICE drops the trailing slot but still reports
/// the others, so iterating up to 5 is safe (indices past the
/// kernel's actual count read all-zero).
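///
/// The stride arithmetic can be sketched as follows. `ZoneOffsets`
/// is a hypothetical stand-in that reuses the three field names
/// above, and `vm_numa_event_addr` is an illustrative helper; neither
/// is claimed to match the real resolver output type:
///
/// ```rust
/// const MAX_NR_ZONES: usize = 5;
///
/// // Hypothetical container for the three offsets described above.
/// struct ZoneOffsets {
///     pglist_data_node_zones: usize,
///     zone_size: usize,
///     zone_vm_numa_event: usize,
/// }
///
/// // Address of `node_zones[zone_idx].vm_numa_event[0]` for the
/// // `pglist_data` located at `pgdat`. Returns `None` for an
/// // out-of-range index or on arithmetic overflow.
/// fn vm_numa_event_addr(pgdat: u64, zone_idx: usize, off: &ZoneOffsets) -> Option<u64> {
///     if zone_idx >= MAX_NR_ZONES {
///         return None;
///     }
///     let rel = zone_idx
///         .checked_mul(off.zone_size)?
///         .checked_add(off.pglist_data_node_zones)?
///         .checked_add(off.zone_vm_numa_event)?;
///     pgdat.checked_add(u64::try_from(rel).ok()?)
/// }
/// ```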
///
/// Resolution returns `Err` when `pglist_data` or `zone` is
/// missing from BTF; both are universal types whose absence
/// indicates a stripped vmlinux. `vm_numa_event` is gated on
/// `CONFIG_NUMA` and `CONFIG_VM_EVENT_COUNTERS` (the latter defaults
/// to y on every modern kernel); when the field is missing, the
/// resolver returns `Err` so the caller skips the capture.