pub struct Embedding {
    pub id: Uuid,
    pub chunk_id: Uuid,
    pub vector: Vec<i16>,
    pub model_hash: [u8; 32],
    pub dim: u16,
    pub l2_norm: f32,
    pub embedding_version: u32,
}
An embedding vector derived from a chunk.
Stores the vector in i16 format (quantized) for determinism and storage efficiency.
The embedding ID is derived deterministically via BLAKE3-16 from chunk_id + model_hash.
Fields
id: Uuid
Unique identifier for this embedding (BLAKE3-16 of chunk_id || model_hash)
chunk_id: Uuid
Parent chunk ID
vector: Vec<i16>
The embedding vector (i16 quantized, scale = 32767)
model_hash: [u8; 32]
Hash of the model weights used to generate this embedding
dim: u16
Dimensionality of the vector
l2_norm: f32
Precomputed L2 norm of the quantized vector (for similarity computation). Per CP-001: stored for efficient cosine similarity without recomputation.
embedding_version: u32
Version of the embedding generation process (default 0)
Implementations
impl Embedding
pub fn new(
    chunk_id: Uuid,
    vector_f32: &[f32],
    model_hash: [u8; 32],
    embedding_version: u32,
) -> Self
Create a new embedding from an f32 vector.
Per CP-010 §3.4-3.5:
- Normalize f32 vector to unit length
- Quantize to i16 with round_ties_even(scale = 32767)
The embedding ID is deterministic: BLAKE3-16(chunk_id || model_hash || embedding_version).
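The two steps above (normalize, then quantize) can be sketched as follows. This is a minimal illustration, not the crate's actual implementation: the helper name `quantize_unit`, the f64 accumulation of the norm, and the handling of a zero vector are all assumptions. It relies only on the standard library's `f32::round_ties_even` (stable since Rust 1.77).

```rust
/// Hypothetical helper (not part of the documented API): normalize an f32
/// vector to unit length, then quantize each component to i16 with
/// scale = 32767, rounding ties to even.
fn quantize_unit(vector_f32: &[f32]) -> Vec<i16> {
    const SCALE: f32 = 32767.0;

    // Assumption: accumulate the squared norm in f64 to limit rounding error.
    let norm = vector_f32
        .iter()
        .map(|&x| f64::from(x) * f64::from(x))
        .sum::<f64>()
        .sqrt() as f32;

    if norm == 0.0 {
        // Assumption: a zero vector cannot be normalized, so it maps to zeros.
        return vec![0; vector_f32.len()];
    }

    vector_f32
        .iter()
        // After normalization each component lies in [-1, 1], so the scaled
        // value fits in i16; the `as` cast saturates if rounding overshoots.
        .map(|&x| ((x / norm) * SCALE).round_ties_even() as i16)
        .collect()
}
```

Rounding ties to even rather than away from zero avoids a systematic bias in the quantized components, which is presumably why the specification names it.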
pub fn from_quantized(
    chunk_id: Uuid,
    vector: Vec<i16>,
    model_hash: [u8; 32],
    embedding_version: u32,
) -> Self
Create an embedding directly from pre-quantized i16 values.
Used when loading from storage where quantization already occurred.
pub fn from_quantized_with_norm(
    chunk_id: Uuid,
    vector: Vec<i16>,
    model_hash: [u8; 32],
    l2_norm: f32,
    embedding_version: u32,
) -> Self
Create an embedding from pre-quantized values with a precomputed L2 norm.
Used when loading from storage where the norm was already stored.
pub fn integer_dot_product(&self, other: &[i16]) -> i64
Compute integer dot product between this embedding and another i16 vector.
Per CP-003 §4.5: all similarity computations use integer math. Returns i64 to avoid overflow (1536 dims * 32767^2 fits in i64).
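A minimal sketch of such an integer dot product, written as a free function over two slices rather than as the method itself. The length check is an assumption; the documentation does not say how mismatched dimensions are handled. For the overflow bound: 1536 × 32767² is roughly 1.65 × 10^12, far below the i64 maximum of roughly 9.22 × 10^18.

```rust
/// Hypothetical free-function version of an integer dot product.
/// Each i16 is widened to i64 before multiplying, so neither the
/// products nor the running sum can overflow at these dimensions.
fn integer_dot_product(a: &[i16], b: &[i16]) -> i64 {
    // Assumption: mismatched lengths are a caller bug. Without this check,
    // `zip` would silently truncate to the shorter slice.
    assert_eq!(a.len(), b.len(), "dimension mismatch");

    a.iter()
        .zip(b)
        .map(|(&x, &y)| i64::from(x) * i64::from(y))
        .sum()
}
```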
pub fn norm_squared(&self) -> i64
Compute the squared L2 norm of the quantized vector (integer).
This avoids sqrt and floating-point entirely.
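As a sketch, the squared norm is the sum of squared components accumulated in i64. The free-function form below is an illustration under that assumption, not the method's actual body.

```rust
/// Hypothetical free-function version: squared L2 norm of an i16 vector,
/// computed entirely in integer arithmetic (no sqrt, no floats).
fn norm_squared(v: &[i16]) -> i64 {
    v.iter().map(|&x| i64::from(x) * i64::from(x)).sum()
}
```

Because the result is an exact integer, two squared norms can be compared for ordering or equality without any floating-point rounding.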
pub fn norm_f32(&self) -> f32
Compute L2 norm as f32 (for display/diagnostics only, not canonical).
pub fn cosine_similarity(&self, other: &Embedding) -> f32
Compute cosine similarity using integer math.
Returns f32 for convenience, but the dot product and norms are computed entirely in integer arithmetic first.
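One way this could look, as a sketch over raw i16 slices: the dot product and both squared norms are accumulated as i64, and floating point enters only in the final square roots and division. The function below is hypothetical. In particular, it recomputes the norms instead of using the stored l2_norm field, and returning 0.0 for a zero vector is an assumption.

```rust
/// Hypothetical sketch: cosine similarity of two i16 vectors.
/// All accumulation is done in i64; only the last step uses floats.
fn cosine_similarity(a: &[i16], b: &[i16]) -> f32 {
    // Assumption: mismatched dimensions are treated as a caller bug.
    assert_eq!(a.len(), b.len(), "dimension mismatch");

    let (mut dot, mut norm_a_sq, mut norm_b_sq) = (0i64, 0i64, 0i64);
    for (&x, &y) in a.iter().zip(b) {
        let (x, y) = (i64::from(x), i64::from(y));
        dot += x * y;
        norm_a_sq += x * x;
        norm_b_sq += y * y;
    }

    if norm_a_sq == 0 || norm_b_sq == 0 {
        // Assumption: similarity with a zero vector is defined as 0.0.
        return 0.0;
    }

    // Convert to floating point only for the final sqrt and division.
    let denom = (norm_a_sq as f64).sqrt() * (norm_b_sq as f64).sqrt();
    (dot as f64 / denom) as f32
}
```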