pub struct EntryRecord {
    pub word_offset: u32,
    pub code_offset: u32,
    pub log_prior: i16,
    pub match_type: u8,
    pub flags: u8,
    pub raw_freq: u32,
    pub embedding_offset: u32,
}
Per-entry record (16 bytes packed). word_offset and code_offset
are stored as u24 (3 bytes each); they point into the string pool.
log_prior is signed Q4 fixed-point (16 integer steps per log unit, per
inputx_scoring::Q4). raw_freq is the original pre-quantization
corpus frequency (added in v1.4.7 sub-phase A4 step 1). It lets
the cement side rebuild a lossless tiebreaker when two entries
land in the same Q4 log_prior bucket (e.g. 乎/护 for code hu:
both quantize to Q4=170, and raw_freq distinguishes them).
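The Q4 bucketing and the raw_freq tiebreaker can be sketched as follows. This is a minimal illustration, not the crate's code: Scored, to_q4, from_q4 and rank are hypothetical names, round-to-nearest with saturation is an assumed quantization rule, and "higher log_prior, then higher raw_freq, ranks first" is an assumed ordering.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for the two EntryRecord fields used in ranking.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Scored {
    log_prior: i16, // Q4: 16 integer steps per log unit
    raw_freq: u32,  // pre-quantization corpus frequency
}

/// Quantize a log-domain value to Q4.
/// Assumption: round to nearest and saturate at the i16 range.
fn to_q4(x: f64) -> i16 {
    (x * 16.0).round().clamp(i16::MIN as f64, i16::MAX as f64) as i16
}

/// Convert a Q4 value back to log units.
fn from_q4(q: i16) -> f64 {
    f64::from(q) / 16.0
}

/// Best-first comparator: higher log_prior wins; when two entries share
/// a Q4 bucket, the higher raw_freq wins (assumed direction).
fn rank(a: &Scored, b: &Scored) -> Ordering {
    b.log_prior
        .cmp(&a.log_prior)
        .then(b.raw_freq.cmp(&a.raw_freq))
}
```

Under these assumptions, log values of 10.6 and 10.64 both quantize to 170, so sorting a slice with `sort_by(rank)` falls back to raw_freq to order them.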
Layout: u24 word_offset + u24 code_offset + i16 log_prior + u8 match_type + u8 flags + u32 raw_freq + 2 bytes reserved = 16.
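A minimal sketch of packing and unpacking this layout follows. The field order is taken from the layout line; little-endian byte order, zeroed reserved bytes, and the names PackedEntry, RECORD_LEN, encode and decode are assumptions made for illustration. embedding_offset is left out because the layout line does not place it.

```rust
/// Hypothetical mirror of the fields named in the layout line.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct PackedEntry {
    word_offset: u32, // only the low 24 bits are stored
    code_offset: u32, // only the low 24 bits are stored
    log_prior: i16,
    match_type: u8,
    flags: u8,
    raw_freq: u32,
}

const RECORD_LEN: usize = 16;

/// Encode into 16 bytes (little-endian is an assumption).
/// Returns None if either offset does not fit in 24 bits.
fn encode(e: &PackedEntry) -> Option<[u8; RECORD_LEN]> {
    if e.word_offset > 0x00FF_FFFF || e.code_offset > 0x00FF_FFFF {
        return None;
    }
    let mut out = [0u8; RECORD_LEN];
    out[0..3].copy_from_slice(&e.word_offset.to_le_bytes()[..3]);
    out[3..6].copy_from_slice(&e.code_offset.to_le_bytes()[..3]);
    out[6..8].copy_from_slice(&e.log_prior.to_le_bytes());
    out[8] = e.match_type;
    out[9] = e.flags;
    out[10..14].copy_from_slice(&e.raw_freq.to_le_bytes());
    // out[14..16] stays zero: the 2 reserved bytes.
    Some(out)
}

/// Decode 16 bytes, widening each u24 offset to u32.
fn decode(b: &[u8; RECORD_LEN]) -> PackedEntry {
    PackedEntry {
        word_offset: u32::from_le_bytes([b[0], b[1], b[2], 0]),
        code_offset: u32::from_le_bytes([b[3], b[4], b[5], 0]),
        log_prior: i16::from_le_bytes([b[6], b[7]]),
        match_type: b[8],
        flags: b[9],
        raw_freq: u32::from_le_bytes([b[10], b[11], b[12], b[13]]),
    }
}
```

In this sketch a record whose bytes 10..14 are all zero decodes with raw_freq = 0, and an offset of 0x0100_0000 or larger is rejected rather than silently truncated.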
raw_freq=0 on disk is the v1.4.6-era backward-compatible default
(those bytes were bigram_offset, never written non-zero and never
read), so old .idf blobs decode as raw_freq=0 — the only fallout
is loss of the tiebreaker for legacy snapshots.
Fields§
word_offset: u32
code_offset: u32
log_prior: i16
match_type: u8
flags: u8
raw_freq: u32
embedding_offset: u32
Implementations§
Trait Implementations§
impl Clone for EntryRecord
fn clone(&self) -> EntryRecord
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for EntryRecord
impl PartialEq for EntryRecord
fn eq(&self, other: &EntryRecord) -> bool
Tests for self and other values to be equal, and is used by ==.