Struct VectorStoragePolicy 

pub struct VectorStoragePolicy {
    pub column_name: String,
    pub dim: u32,
    pub metric: VectorMetric,
    pub precision: VectorPrecision,
    pub pq: Option<PQConfig>,
    pub keep_raw_for_reranking: bool,
    pub pre_normalize: bool,
    pub hnsw_m: Option<u32>,
    pub hnsw_ef_construction: Option<u32>,
    pub ivf_residual: bool,
    pub embedding_model: Option<EmbeddingModelInfo>,
    pub modality: Option<VectorModality>,
    pub partition_by: Option<String>,
    pub partition_value: Option<String>,
    pub partition_column_type: Option<String>,
    pub partition_fields: Vec<PartitionDef>,
}

Vector storage configuration applied at table creation time. Stored in Iceberg metadata.json properties.

Fields§

§column_name: String

§dim: u32

§metric: VectorMetric

§precision: VectorPrecision

§pq: Option<PQConfig>

§keep_raw_for_reranking: bool

§pre_normalize: bool

Normalize each input vector to unit L2 length before indexing. This enables the NormalizedCosine fast path in HNSW, where distance = 1 - dot(a, b): no sqrt, and ~2× faster distance computation. Semantics are unchanged: cosine similarity does not depend on vector length, so the top-k results are the same as with Cosine. Most embedding models (OpenAI, Cohere, etc.) produce nearly-unit vectors; enabling this adds negligible write overhead.
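The equivalence between the fast path and full cosine distance can be checked with a minimal sketch. The function names below (l2_normalize, cosine_distance, normalized_cosine_distance) are illustrative assumptions and not part of this crate's API; the real HNSW distance kernels are not shown on this page.

```rust
/// Scale `v` to unit L2 length in place. Zero vectors are left unchanged.
fn l2_normalize(v: &mut [f32]) {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
}

/// Plain dot product of two equal-length slices.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Full cosine distance: needs both norms, hence a sqrt per comparison.
fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    let denom = (dot(a, a) * dot(b, b)).sqrt();
    1.0 - dot(a, b) / denom
}

/// Fast path for inputs that are already unit length: 1 - dot(a, b).
fn normalized_cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    1.0 - dot(a, b)
}
```

Normalizing once at write time moves the sqrt out of the per-comparison hot loop, which is the trade the field description refers to.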

§hnsw_m: Option<u32>

HNSW M parameter — connections per node. None = default (16). Higher M → better recall, more memory, slower build. Recommended values: 8 (low-memory), 16 (default), 32 (high-recall), 64 (max).

§hnsw_ef_construction: Option<u32>

HNSW ef_construction — candidate pool size during build. None = default (150). Higher ef_construction → better graph quality, slower build. Recommended values: 100 (fast), 150 (default), 200 (quality), 400 (max quality).
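The documented defaults for hnsw_m and hnsw_ef_construction can be illustrated with a small stand-in. HnswOptions below is a hypothetical mirror of the two fields, not the crate's type, and resolving None with unwrap_or is only an assumption about how the defaults might be applied.

```rust
/// Hypothetical stand-in for the two HNSW fields of `VectorStoragePolicy`.
#[derive(Debug, Clone, Default)]
struct HnswOptions {
    hnsw_m: Option<u32>,
    hnsw_ef_construction: Option<u32>,
}

// Defaults as stated in the field documentation.
const DEFAULT_M: u32 = 16;
const DEFAULT_EF_CONSTRUCTION: u32 = 150;

impl HnswOptions {
    /// Resolve `None` to the documented defaults; returns (M, ef_construction).
    fn effective(&self) -> (u32, u32) {
        (
            self.hnsw_m.unwrap_or(DEFAULT_M),
            self.hnsw_ef_construction.unwrap_or(DEFAULT_EF_CONSTRUCTION),
        )
    }
}
```

Leaving both fields as None yields (16, 150); a high-recall setup from the recommended values would set Some(32) and Some(200).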

§ivf_residual: bool

IVF-PQ residual encoding: train PQ on per-cluster residuals (vec - coarse_centroid). Same bytes per vector, roughly 2-4 percentage points better recall@10. Applies only when an IVF-PQ index is used.
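As a rough sketch of what a residual is, the code below assigns a vector to its nearest coarse centroid and subtracts that centroid. The helper names are illustrative assumptions; the crate's actual IVF-PQ training code is not shown on this page.

```rust
/// Squared Euclidean distance between two equal-length slices.
fn squared_l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Index of the closest coarse centroid, or None if `centroids` is empty.
fn nearest_centroid(vec: &[f32], centroids: &[Vec<f32>]) -> Option<usize> {
    centroids
        .iter()
        .enumerate()
        .min_by(|(_, a), (_, b)| squared_l2(vec, a).total_cmp(&squared_l2(vec, b)))
        .map(|(i, _)| i)
}

/// Residual = vec - coarse_centroid, computed element-wise.
fn residual(vec: &[f32], centroid: &[f32]) -> Vec<f32> {
    vec.iter().zip(centroid).map(|(v, c)| v - c).collect()
}
```

Residuals within a cluster are centered near zero and have a smaller spread than the raw vectors, so the same PQ code budget can represent them more precisely.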

§embedding_model: Option<EmbeddingModelInfo>

Optional embedding model metadata. When set:

  • Stored as ailake.embedding-model in Iceberg table properties.
  • Validated on every write_batch: dim mismatch → hard error; name mismatch → warning.
  • Required for migrate_embeddings to track the model transition.
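The write-time validation rule above (dim mismatch is a hard error, name mismatch is a warning) might look like the following sketch. ModelInfo, Check, and check_model are hypothetical names: the fields of EmbeddingModelInfo and the real error types are not shown on this page.

```rust
/// Hypothetical stand-in for `EmbeddingModelInfo`; the real fields may differ.
#[derive(Debug, Clone, PartialEq)]
struct ModelInfo {
    name: String,
    dim: u32,
}

/// Outcome of a check that did not fail hard.
#[derive(Debug, PartialEq)]
enum Check {
    Ok,
    Warning(String),
}

/// Compare the model recorded on the table with the model of an incoming batch.
/// A dimension mismatch is an error; a name mismatch only produces a warning.
fn check_model(table: &ModelInfo, batch: &ModelInfo) -> Result<Check, String> {
    if table.dim != batch.dim {
        return Err(format!(
            "dim mismatch: table expects {}, batch has {}",
            table.dim, batch.dim
        ));
    }
    if table.name != batch.name {
        return Ok(Check::Warning(format!(
            "model name mismatch: table uses {:?}, batch uses {:?}",
            table.name, batch.name
        )));
    }
    Ok(Check::Ok)
}
```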

§modality: Option<VectorModality>

Modality tag for this vector column (text / image / audio / video). Stored as ailake.modality-<col> in Iceberg properties and Parquet KV metadata. Allows readers to select the correct HNSW by modality without reading data.

§partition_by: Option<String>

Column to partition by (e.g. “agent_id”). Stored in Iceberg metadata as an identity partition spec, enabling file-level pruning for per-agent search without post-scan filtering. Set this at table creation time; all files written to this table carry the partition column.

§partition_value: Option<String>

Runtime partition value for this writer instance (not stored in table metadata). When set, each file written by this TableWriter is tagged with this value, enabling the search path to prune files from other partitions (e.g., other agents). Typical usage: set to agent_id in Agent.init.

§partition_column_type: Option<String>

Iceberg type of the partition column (“string”, “uuid”, “int”, “long”). Used when writing the Iceberg schema and partition spec at table creation. Defaults to “string” when None. Only relevant when partition_by is set.

§partition_fields: Vec<PartitionDef>

Multi-column / non-identity partition spec (Phase K).

When non-empty, takes precedence over partition_by + partition_column_type for table creation. Supports identity and truncate[W] transforms. At write time, values are provided via partition_value, encoded as a \x1f-separated compound string (“val1\x1fval2”) whose values follow the field order.

Example (two-column identity):

partition_fields: vec![
    PartitionDef::identity("agent_id", "string"),
    PartitionDef::identity("session_id", "string"),
]
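For the two-column spec above, the compound partition_value could be built and split as in this sketch. The helper functions are assumptions for illustration and not part of the crate; rejecting values that contain the separator is a defensive choice made here, not documented behavior.

```rust
/// ASCII unit separator used between values in a compound partition value.
const SEP: char = '\x1f';

/// Join values in partition-field order. Fails if a value contains the separator,
/// since that would make the compound string ambiguous.
fn encode_partition_value(values: &[&str]) -> Result<String, String> {
    if let Some(bad) = values.iter().find(|v| v.contains(SEP)) {
        return Err(format!("value {:?} contains the \\x1f separator", bad));
    }
    Ok(values.join("\x1f"))
}

/// Split a compound value back into its per-field parts.
fn decode_partition_value(encoded: &str) -> Vec<&str> {
    encoded.split(SEP).collect()
}
```

With the fields agent_id and session_id, encoding ["agent-7", "session-42"] gives "agent-7\x1fsession-42", which can be assigned to partition_value.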

Implementations§

impl VectorStoragePolicy

pub fn default_f16(column: &str, dim: u32, metric: VectorMetric) -> Self

Trait Implementations§

impl Clone for VectorStoragePolicy

fn clone(&self) -> VectorStoragePolicy

Returns a duplicate of the value.

1.0.0 (const: unstable)

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for VectorStoragePolicy

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl<'de> Deserialize<'de> for VectorStoragePolicy

fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where __D: Deserializer<'de>,

Deserialize this value from the given Serde deserializer.

impl Serialize for VectorStoragePolicy

fn serialize<__S>(&self, __serializer: __S) -> Result<__S::Ok, __S::Error>
where __S: Serializer,

Serialize this value into the given Serde serializer.

Auto Trait Implementations§

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.

impl<T> DeserializeOwned for T
where T: for<'de> Deserialize<'de>,

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.