//! Types shared between the eBPF program (`linprov-ebpf`) and the userspace
//! daemon (`linprov`). Everything here is `repr(C)` and Pod-friendly so it
//! survives a round-trip through a ring buffer and a kernel xattr.
//!
//! The crate compiles `no_std` by default (for the BPF target). Enable the
//! `user` feature in userspace to pull in `bytemuck::Pod` / `Zeroable`
//! derives on the wire types.
//!
//! Wire shapes at a glance:
//!
//! - [`OriginRecord`] is what the daemon stores in the xattr and in the BPF
//!   `INODE_MARKS` map. BPF writes most of it in `file_open`; userspace
//!   fills in `creator_path_hash` from `/proc/$pid/exe`.
//! - [`Event`] is the ringbuf record streamed from BPF to userspace.
//! - [`AllowRule`] is one allowlist rule, packed into the BPF
//!   `ALLOW_RULES` array. String dims are stored as [`fnv_hash`] values
//!   so the BPF side can compare without carrying full byte arrays.
//!
//! ```
//! use linprov_common::{fnv_hash, dim};
//!
//! // Both sides hash strings the same way; same input → same u64.
//! assert_eq!(fnv_hash("/usr/bin/curl"), fnv_hash("/usr/bin/curl"));
//! assert_ne!(fnv_hash("/usr/bin/curl"), fnv_hash("/usr/bin/wget"));
//!
//! // Dimension bits are independent flags on AllowRule::flags.
//! let two_dim = dim::CREATOR_UID | dim::CREATOR_COMM;
//! assert_eq!(two_dim.count_ones(), 2);
//! ```

/// Length of the fixed `comm` (task name) buffers; equal to the kernel's
/// `TASK_COMM_LEN`.
pub const COMM_LEN: usize = 16;
/// Live exec/target path buffer size: the ringbuf [`Event`] filename
/// and the per-CPU scratch the target-dim walks scan. Sized to Linux
/// `PATH_MAX` so `target_filename` / `target_folder` match the full
/// execution path at any depth, up to `PATH_MAX` bytes. These buffers
/// are transient (per-CPU scratch + ringbuf) and never persisted, so
/// the xattr block-size limit that caps stored data does not apply.
pub const EXEC_PATH_LEN: usize = 4096;
/// Max bytes the BPF path walks inspect. Equal to [`EXEC_PATH_LEN`]:
/// the walk body is a `bpf_loop` callback the verifier inspects once,
/// so this is bounded only by the buffer, not the instruction budget.
pub const PATH_HASH_SCAN_LEN: usize = EXEC_PATH_LEN;
/// Number of landing-folder ancestor hashes stored per record, for
/// nested `landing_folder` matching. The walk records the hash of each
/// `/`-terminated prefix of the landing path (shallow → deep) into a
/// `[u64; MAX_FOLDER_ANCESTORS]`; a rule matches if its folder hash
/// equals any of them, so `landing_folder=/home/user/` matches a file
/// that landed in `/home/user/Downloads/sub/`. Bounds nesting *depth*
/// (path length is still unbounded — these are hashes). Must be a power
/// of two: the in-kernel walk masks the index (`& (N-1)`) to keep the
/// array write provably in-bounds without a panic branch. Real landing
/// paths sit well under this, so the mask never actually wraps.
///
/// Capped at 32 by the BPF 512-byte stack limit: the `file_open` walk
/// holds this array by value in its `bpf_loop` context (`32 × 8 = 256`
/// bytes, plus the other context fields). At 64 the array alone would
/// be `64 × 8 = 512` bytes and overflow the stack frame; it would have
/// to live in a per-CPU map instead, which is not worth it when 32
/// ancestor levels already exceed any real landing path.
pub const MAX_FOLDER_ANCESTORS: usize = 32;
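// Sketch only, not the in-kernel code: a plain-Rust illustration of the
// ancestor walk described above. The module name, the function names and
// the ordinary `for` loop are assumptions made for illustration (the real
// walk runs as a `bpf_loop` callback). The constants are repeated locally
// so the sketch stands alone. Whether the real walk records the bare `/`
// root prefix is not stated; this sketch does.
pub mod ancestor_walk_sketch {
    pub const N: usize = 32; // mirrors MAX_FOLDER_ANCESTORS (power of two)
    pub const OFFSET: u64 = 0xcbf2_9ce4_8422_2325; // mirrors FNV_OFFSET
    pub const PRIME: u64 = 0x100_0000_01b3; // mirrors FNV_PRIME

    /// Hash every `/`-terminated prefix of `path`, shallow to deep.
    /// Returns the hash array and the number of valid slots.
    pub fn ancestor_hashes(path: &[u8]) -> ([u64; N], usize) {
        let mut out = [0u64; N];
        let mut count = 0usize;
        let mut h = OFFSET;
        for &b in path {
            // FNV-1a is a running hash, so the state at each `/` is
            // exactly the hash of the prefix that ends there.
            h ^= b as u64;
            h = h.wrapping_mul(PRIME);
            if b == b'/' {
                // The mask keeps the index provably below N.
                out[count & (N - 1)] = h;
                count += 1;
            }
        }
        (out, count.min(N))
    }

    /// A folder rule matches if its hash equals any recorded ancestor.
    pub fn folder_matches(rule_folder_hash: u64, path: &[u8]) -> bool {
        let (hashes, n) = ancestor_hashes(path);
        hashes[..n].iter().any(|&h| h == rule_folder_hash)
    }
}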
// FNV-1a-64 constants. Used by both sides to hash strings.
pub const FNV_OFFSET: u64 = 0xcbf2_9ce4_8422_2325;
pub const FNV_PRIME: u64 = 0x100_0000_01b3;
/// Hash a string with FNV-1a-64. Byte-by-byte, no trailing NUL, no
/// padding — identical on the BPF and userspace sides.
///
/// Both sides MUST compute the same hash for the same input; the FNV
/// constants ([`FNV_OFFSET`], [`FNV_PRIME`]) are fixed for that reason.
///
/// ```
/// use linprov_common::fnv_hash;
/// // FNV-1a of the empty string is the offset basis.
/// assert_eq!(fnv_hash(""), 0xcbf2_9ce4_8422_2325);
/// // Distinct inputs hash distinctly.
/// assert_ne!(fnv_hash("/tmp/"), fnv_hash("/etc/"));
/// ```
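// Reconstruction sketch: the function body is not shown in this listing,
// so this is an assumed implementation that follows the doc comment above
// (FNV-1a-64, byte-by-byte, no trailing NUL, no padding). The signature is
// inferred from the doctests. The two literals mirror FNV_OFFSET and
// FNV_PRIME and are repeated locally so the sketch stands alone.
pub fn fnv_hash(s: &str) -> u64 {
    const OFFSET: u64 = 0xcbf2_9ce4_8422_2325;
    const PRIME: u64 = 0x100_0000_01b3;
    let mut h = OFFSET;
    for &b in s.as_bytes() {
        h ^= b as u64; // FNV-1a: XOR the byte in first,
        h = h.wrapping_mul(PRIME); // then multiply by the prime (mod 2^64).
    }
    h
}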
/// Same as [`fnv_hash`], but takes a byte slice. Useful when the source
/// isn't UTF-8 (e.g., a `[u8; EXEC_PATH_LEN]` filename buffer read out
/// of a ringbuf event).
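// Reconstruction sketch: the item this doc comment belongs to is not shown
// in this listing. The name `fnv_hash_bytes` is hypothetical, and the body
// is an assumed FNV-1a-64 loop. Callers holding a NUL-padded buffer would
// need to pass only the used prefix (for example `&buf[..len]`), since the
// hash covers no trailing NUL and no padding.
pub fn fnv_hash_bytes(bytes: &[u8]) -> u64 {
    const OFFSET: u64 = 0xcbf2_9ce4_8422_2325; // mirrors FNV_OFFSET
    const PRIME: u64 = 0x100_0000_01b3; // mirrors FNV_PRIME
    let mut h = OFFSET;
    for &b in bytes {
        h ^= b as u64;
        h = h.wrapping_mul(PRIME);
    }
    h
}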
// ----- Allowlist rule. One per line in the allowlist file; each rule
// is a conjunction of (dim, value) conditions. Rules OR together.
/// `flags` bits on [`AllowRule`]. Bits 0–7 are the dims a rule requires.
/// Bits 8+ are *modifiers* on a dim already set.
/// Maximum number of allowlist rules carried by the BPF Array map.
/// Each rule check is ~30 ops + 2 folder lookups; the verifier walks
/// the full bounded loop, so this caps the per-execve cost.
pub const MAX_RULES: usize = 32;
/// One allowlist rule. Set bits in `flags` mark required dims; the
/// corresponding fields below are then compared against the record /
/// execve context at enforce time. Cleared bits → field ignored.
///
/// Strings are stored as FNV-1a-64 hashes (computed identically in
/// userspace and BPF). Collision probability for distinct strings under
/// FNV-64 is negligible at any realistic allowlist size.
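// Sketch of the matching semantics described above: within one rule every
// dim whose bit is set must match (conjunction), cleared bits are ignored,
// and the rule list as a whole ORs together. Everything in this module is
// hypothetical (bit positions, `SketchRule`, `Ctx`, function names); only
// the dim names CREATOR_UID and CREATOR_COMM appear in the crate docs.
pub mod allow_rule_sketch {
    pub const CREATOR_UID: u32 = 1 << 0; // assumed bit position
    pub const CREATOR_COMM: u32 = 1 << 1; // assumed bit position

    /// Two-dim stand-in for a rule; the real rule carries more dims.
    #[derive(Clone, Copy)]
    pub struct SketchRule {
        pub flags: u32,
        pub creator_uid: u32,
        pub creator_comm_hash: u64,
    }

    /// Values taken from the record / execve context at check time.
    #[derive(Clone, Copy)]
    pub struct Ctx {
        pub creator_uid: u32,
        pub creator_comm_hash: u64,
    }

    /// True when every dim required by `r.flags` matches `c`.
    /// Note: a rule with `flags == 0` matches everything in this sketch;
    /// the real program may instead treat it as an unused slot.
    pub fn rule_matches(r: &SketchRule, c: &Ctx) -> bool {
        if (r.flags & CREATOR_UID) != 0 && r.creator_uid != c.creator_uid {
            return false;
        }
        if (r.flags & CREATOR_COMM) != 0 && r.creator_comm_hash != c.creator_comm_hash {
            return false;
        }
        true
    }

    /// Rules OR together: one matching rule is enough.
    pub fn allowed(rules: &[SketchRule], c: &Ctx) -> bool {
        rules.iter().any(|r| rule_matches(r, c))
    }
}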
pub const XATTR_NAME: &str = "security.bpf.linprov.origin";
pub const EVENT_KIND_NETWORK_FILE_OPEN: u32 = 1;
pub const EVENT_KIND_EXECVE: u32 = 2;
/// A file written by a process that had **read** a marked file (taint
/// propagation — e.g. `tar`/`unzip` extracting a marked archive, or `cp`
/// of a marked file). The carried `origin` is inherited from the source
/// file's record; userspace persists the xattr without re-resolving the
/// creator (the inherited creator identity is kept verbatim).
pub const EVENT_KIND_DERIVED_FILE_OPEN: u32 = 3;
/// A marked file opened for **read** by a known script interpreter
/// (bash/python/…) — i.e. `python foo.py` / `bash foo.sh` / `. foo.sh`,
/// where the kernel execve's the unmarked interpreter and the script
/// itself never reaches `bprm_check_security`. The eBPF `file_open` read
/// branch runs the same allowlist check used at execve against the
/// script's path; in enforce mode `status` is the LSM verdict (`-1`
/// blocked). The `filename` carries the script's path and `comm` the
/// script's basename, so the script — not the interpreter — is the unit
/// surfaced in logs and soak. Matches `EVENT_KIND_EXECVE` handling.
pub const EVENT_KIND_SCRIPT_EXEC: u32 = 4;
/// Runtime mode communicated to the eBPF program via the CONFIG map.
pub const MODE_OBSERVE: u32 = 0;
pub const MODE_SOAK: u32 = 1; // eBPF behaves like OBSERVE; userspace records paths
pub const MODE_ENFORCE: u32 = 2;
/// Current schema version of [`OriginRecord`]. Records carrying a different
/// version are treated as unmarked.
///
/// v4 made the record fully hash-based: the variable-length path fields
/// (`creator_path`, the landing folder, the landing basename) became
/// `u64` FNV hashes instead of fixed buffers. This lifts the path-length
/// ceiling (a hash is the same 8 bytes whether the path is 12 or 4096
/// bytes) and shrinks the record to 64 bytes, well under the xattr
/// block limit. Human-readable resolution of those hashes lives in the
/// plaintext audit db (see the `hashdb` userspace module), not in the
/// record. v3 records (which embedded path strings) are treated as
/// unmarked and get re-marked on next open.
pub const ORIGIN_VERSION: u32 = 4;
/// Provenance record. Carried in the `security.bpf.linprov.origin` xattr
/// and in the INODE_MARKS storage map. Fixed 64 bytes — every
/// variable-length field is an FNV-1a-64 hash, so the record never
/// grows with path length and always fits a single xattr block.
///
/// Filled in stages:
/// * BPF `file_open` sets `version`, `pid`, `ts_boot_ns`, `comm`,
/// `creator_uid`, and the two landing hashes (`landing_folder_hash`,
/// `landing_basename_hash`), computed in one pass over the landing
/// path. `creator_path_hash` is left 0 — BPF can't cheaply resolve
/// the creator's exe path here.
/// * Userspace, on the corresponding ringbuf event, reads
/// `/proc/$pid/exe`, fills `creator_path_hash`, and overwrites the
/// xattr with the augmented record. It also records each hash →
/// path mapping in the plaintext audit db so logs, soak, and the
/// user's own `grep` can resolve hashes back to paths.
///
/// `creator_path_hash == 0` is the "not yet augmented" sentinel:
/// `bprm_check_security` reads the storage record first and falls
/// through to the xattr when it sees a zero creator hash. Rules keyed
/// on `creator_process` won't match an unaugmented record, but other
/// dims still do.
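// Reconstruction sketch: the struct definition is not shown in this
// listing. Field names come from the doc comment above; the field order,
// the integer widths and the explicit `_pad` field are assumptions, chosen
// so the `repr(C)` layout has no implicit padding and is exactly 64 bytes.
// The `16` mirrors COMM_LEN and is written out so the sketch stands alone.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct OriginRecord {
    pub version: u32,               // ORIGIN_VERSION
    pub pid: u32,                   // creator pid
    pub ts_boot_ns: u64,            // timestamp, ns since boot
    pub comm: [u8; 16],             // creator comm
    pub creator_uid: u32,
    pub _pad: u32,                  // assumed explicit padding to 8-byte alignment
    pub creator_path_hash: u64,     // 0 = not yet augmented by userspace
    pub landing_folder_hash: u64,
    pub landing_basename_hash: u64,
}

// Compile-time check of the 64-byte size stated in the doc comment.
const _: () = assert!(core::mem::size_of::<OriginRecord>() == 64);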
/// Ring-buffer record. The two base kinds are described here; the
/// derived-file and script-exec kinds are documented on their
/// `EVENT_KIND_*` constants.
///
/// * `NetworkFileOpen`: informational; eBPF just wrote (or tried to
///   write) the xattr. `status` is the kfunc return code.
/// * `Execve`: `bprm_check` fired AND the file already carried the mark.
///   `origin` is the record we read back; `status` is unused.