//! Attribute type-codes and runtime values.
//!
//! The spec (§3.2, §4.2) mandates these value categories: integer, float,
//! enum-coded string, raw-interned string, and arrays of each. The encoding
//! choices here are optimized for the scoring hot path:
//!
//! - Numeric values go into fixed-width slots (i64, f64) for SIMD friendliness.
//! - Categorical strings become `u32` enum codes, resolved via the Schema's
//!   string interner at `StateUpdate` apply time. Scoring sees `u32`s only.
//! - Array types store a length + pointer pair; arrays live in a per-location
//!   side buffer (see `location::state`). Keeping arrays out of the main
//!   fixed-layout buffer avoids variable-size slots in the hot path.
//!
//! Reference: this tiered encoding (fixed-width hot path, dictionary-coded
//! strings, side buffer for variable-length) mirrors the physical layout of
//! Apache Arrow record batches [Arrow specification, arrow.apache.org].
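// Illustrative sketch only, not the engine's actual types: one way the three
// tiers described in the module docs could fit together. `InternerSketch`,
// `ArrRefSketch`, and `LocationBufSketch` are hypothetical names, and an
// offset into the side buffer stands in for the pointer half of the
// length + pointer pair.

/// Hypothetical string interner: each distinct string gets a stable `u32` code.
#[derive(Default)]
pub struct InternerSketch {
    codes: std::collections::HashMap<String, u32>,
}

impl InternerSketch {
    /// Returns the existing code for `s`, or assigns the next free one.
    pub fn intern(&mut self, s: &str) -> u32 {
        if let Some(&code) = self.codes.get(s) {
            return code;
        }
        let code = self.codes.len() as u32;
        self.codes.insert(s.to_owned(), code);
        code
    }
}

/// Hypothetical fixed-size handle to an array stored in the side buffer.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ArrRefSketch {
    pub offset: u32,
    pub len: u32,
}

/// Hypothetical per-location storage: fixed-width slots plus a side buffer.
#[derive(Default)]
pub struct LocationBufSketch {
    pub ints: Vec<i64>,
    pub floats: Vec<f64>,
    pub enum_codes: Vec<u32>,
    pub side: Vec<f32>,
}

impl LocationBufSketch {
    /// Appends `values` to the side buffer and returns a fixed-size handle.
    pub fn push_f32_array(&mut self, values: &[f32]) -> ArrRefSketch {
        let offset = self.side.len() as u32;
        self.side.extend_from_slice(values);
        ArrRefSketch { offset, len: values.len() as u32 }
    }

    /// Resolves a handle back to the slice it refers to.
    pub fn f32_array(&self, r: ArrRefSketch) -> &[f32] {
        let start = r.offset as usize;
        &self.side[start..start + r.len as usize]
    }
}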
/// Runtime cap on the dimension of an `F32Arr` embedding written into a
/// location buffer. Writes beyond this cap are rejected by `apply_update`.
///
/// Why 512? Most production embedding pipelines in 2025–2026 store vectors of
/// 128–768 dimensions, either natively or after truncation (`OpenAI`
/// text-embedding-3-small: 1536, truncatable to 512; Cohere embed-v3-english:
/// 1024, truncatable to 256/384; `SentenceTransformers` models: typically 384
/// or 768). 512 covers the 95th percentile and bounds worst-case per-location
/// memory at 512 × 4 bytes = 2 KiB per array. Callers that need larger
/// dimensions can open a PR that raises this constant; it is a deliberate
/// guardrail, not an architectural limit.
pub const MAX_EMBEDDING_DIM: usize = 512;
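// Illustrative sketch only: how a write path might enforce the cap above.
// The real signature of `apply_update` is not shown in this file, so
// `check_embedding_dim`, `max_array_bytes`, and `DimErrorSketch` are
// hypothetical stand-ins. The cap is passed as a parameter to keep the sketch
// self-contained; a caller would presumably pass `MAX_EMBEDDING_DIM`.

/// Hypothetical error returned when an embedding exceeds the cap.
#[derive(Debug, PartialEq, Eq)]
pub struct DimErrorSketch {
    pub got: usize,
    pub max: usize,
}

/// Accepts `values` only if its length does not exceed `max_dim`.
pub fn check_embedding_dim(values: &[f32], max_dim: usize) -> Result<(), DimErrorSketch> {
    if values.len() > max_dim {
        return Err(DimErrorSketch { got: values.len(), max: max_dim });
    }
    Ok(())
}

/// Worst-case bytes for one `f32` array of `max_dim` elements.
pub const fn max_array_bytes(max_dim: usize) -> usize {
    max_dim * std::mem::size_of::<f32>()
}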
/// The compile-time-like type tag assigned to each (`KindId`, `AttrId`) pair.
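// Hypothetical stand-in: the original definition is not shown in this
// excerpt. The variant names below are assumptions derived from the value
// categories listed in the module docs; only `F32Arr` is named in the source.
// Plain `u32`s stand in for `KindId` and `AttrId`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum AttrTypeSketch {
    Int,
    Float,
    EnumStr,
    RawStr,
    IntArr,
    F32Arr,
    EnumStrArr,
    RawStrArr,
}

impl AttrTypeSketch {
    /// True for tags whose values would live in the per-location side buffer.
    pub fn is_array(self) -> bool {
        matches!(
            self,
            Self::IntArr | Self::F32Arr | Self::EnumStrArr | Self::RawStrArr
        )
    }
}

/// Hypothetical lookup table from `(kind, attr)` to its type tag.
pub type TypeTableSketch = std::collections::BTreeMap<(u32, u32), AttrTypeSketch>;

/// Returns the tag registered for `(kind, attr)`, if any.
pub fn lookup_type_sketch(table: &TypeTableSketch, kind: u32, attr: u32) -> Option<AttrTypeSketch> {
    table.get(&(kind, attr)).copied()
}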
/// Borrowed value carried by an event.
///
/// `AttrSet` is a sequence of `(AttrId, Value<'a>)` pairs. The engine applies
/// each pair to the location's buffer according to `AttrType`.
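// Hypothetical stand-in: the original definition is not shown in this
// excerpt. `ValueSketch` and `apply_numeric_sketch` are assumed names, a plain
// `u32` stands in for `AttrId`, and the attribute id is used directly as the
// slot index. The sketch dispatches on the value variant rather than on a
// schema type tag, and it handles only the numeric cases.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum ValueSketch<'a> {
    Int(i64),
    Float(f64),
    Str(&'a str),
    F32Arr(&'a [f32]),
}

/// Writes numeric pairs into fixed-width slots and returns how many were written.
pub fn apply_numeric_sketch(
    pairs: &[(u32, ValueSketch<'_>)],
    ints: &mut [i64],
    floats: &mut [f64],
) -> usize {
    let mut written = 0;
    for &(attr, value) in pairs {
        let slot = attr as usize;
        match value {
            ValueSketch::Int(v) if slot < ints.len() => {
                ints[slot] = v;
                written += 1;
            }
            ValueSketch::Float(v) if slot < floats.len() => {
                floats[slot] = v;
                written += 1;
            }
            // Strings and arrays would need an interner and a side buffer;
            // they are skipped here, as are out-of-range slots.
            _ => {}
        }
    }
    written
}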
/// Owned variant for `LocationDef` reference-attribute bundles that outlive
/// the request they came from.
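// Hypothetical stand-in: the original definition is not shown in this
// excerpt. `OwnedValueSketch` is an assumed name. The sketch shows the general
// idea of copying borrowed request data into owned storage so that it can
// outlive the request buffer it was read from.
#[derive(Clone, Debug, PartialEq)]
pub enum OwnedValueSketch {
    Int(i64),
    Float(f64),
    Str(String),
    F32Arr(Vec<f32>),
}

impl From<&str> for OwnedValueSketch {
    /// Copies the borrowed string into an owned `String`.
    fn from(s: &str) -> Self {
        Self::Str(s.to_owned())
    }
}

impl From<&[f32]> for OwnedValueSketch {
    /// Copies the borrowed slice into an owned `Vec<f32>`.
    fn from(v: &[f32]) -> Self {
        Self::F32Arr(v.to_vec())
    }
}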