pub struct VectorStoragePolicy {
pub column_name: String,
pub dim: u32,
pub metric: VectorMetric,
pub precision: VectorPrecision,
pub pq: Option<PQConfig>,
pub keep_raw_for_reranking: bool,
pub pre_normalize: bool,
pub hnsw_m: Option<u32>,
pub hnsw_ef_construction: Option<u32>,
pub ivf_residual: bool,
pub embedding_model: Option<EmbeddingModelInfo>,
pub modality: Option<VectorModality>,
pub partition_by: Option<String>,
pub partition_value: Option<String>,
pub partition_column_type: Option<String>,
pub partition_fields: Vec<PartitionDef>,
}
Vector storage configuration applied at table creation time. Stored in Iceberg metadata.json properties.
Fields
column_name: String
dim: u32
metric: VectorMetric
precision: VectorPrecision
pq: Option<PQConfig>
keep_raw_for_reranking: bool
pre_normalize: bool
Normalize each input vector to unit L2 length before indexing. Enables the NormalizedCosine fast path in HNSW: distance = 1 - dot(a, b), no sqrt, ~2× faster distance computation. Semantics are unchanged: same top-k results as Cosine. Most embedding models (OpenAI, Cohere, etc.) produce nearly unit-length vectors; enabling this adds negligible write overhead.
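The equivalence behind the fast path can be sketched as follows. This is a minimal illustration, not the crate's implementation: the function names (`l2_normalize`, `cosine_distance`, `normalized_cosine_distance`) are hypothetical and only mirror the formula distance = 1 - dot(a, b) quoted above.

```rust
/// Scale `v` to unit L2 length in place; zero vectors are left unchanged.
fn l2_normalize(v: &mut [f32]) {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
}

/// Plain dot product of two equal-length slices.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Full cosine distance: needs both norms (two sqrt calls) per comparison.
fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    let na = dot(a, a).sqrt();
    let nb = dot(b, b).sqrt();
    1.0 - dot(a, b) / (na * nb)
}

/// Fast path for inputs that were normalized at write time: 1 - dot(a, b).
fn normalized_cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    1.0 - dot(a, b)
}
```

Because both inputs have unit length after normalization, the denominator in the full formula is 1, so the two functions agree up to floating-point rounding and rank neighbors identically.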
hnsw_m: Option<u32>
HNSW M parameter: connections per node. None = default (16).
Higher M → better recall, more memory, slower build.
Recommended values: 8 (low-memory), 16 (default), 32 (high-recall), 64 (max).
hnsw_ef_construction: Option<u32>
HNSW ef_construction: candidate pool size during build. None = default (150).
Higher ef_construction → better graph quality, slower build.
Recommended values: 100 (fast), 150 (default), 200 (quality), 400 (max quality).
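How the two optional HNSW fields might resolve to the documented defaults can be sketched as below. The helper `resolve_hnsw_params` and the constant names are hypothetical; only the default values 16 and 150 come from the field docs.

```rust
/// Defaults quoted in the field documentation.
const DEFAULT_HNSW_M: u32 = 16;
const DEFAULT_HNSW_EF_CONSTRUCTION: u32 = 150;

/// Hypothetical helper: returns (m, ef_construction) with `None` replaced
/// by the documented defaults.
fn resolve_hnsw_params(
    hnsw_m: Option<u32>,
    hnsw_ef_construction: Option<u32>,
) -> (u32, u32) {
    (
        hnsw_m.unwrap_or(DEFAULT_HNSW_M),
        hnsw_ef_construction.unwrap_or(DEFAULT_HNSW_EF_CONSTRUCTION),
    )
}
```

For example, a high-recall configuration would pass `Some(32)` and `Some(200)`, while leaving both fields as `None` yields `(16, 150)`.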
ivf_residual: bool
IVF-PQ residual encoding: train PQ on per-cluster residuals (vec - coarse_centroid). Same bytes/vector, roughly 2-4 percentage points better recall@10. Only applies when an IVF-PQ index is used.
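The residual that PQ is trained on can be sketched as follows. This is an illustration under stated assumptions: `nearest_centroid` and `residual` are hypothetical names, and squared L2 distance is assumed for the coarse assignment; the crate's actual training code is not shown in this page.

```rust
/// Index of the coarse centroid closest to `v` (squared L2 distance).
/// Returns 0 if `centroids` is empty.
fn nearest_centroid(v: &[f32], centroids: &[Vec<f32>]) -> usize {
    let mut best = 0;
    let mut best_dist = f32::INFINITY;
    for (i, c) in centroids.iter().enumerate() {
        let d: f32 = v.iter().zip(c).map(|(x, y)| (x - y) * (x - y)).sum();
        if d < best_dist {
            best = i;
            best_dist = d;
        }
    }
    best
}

/// Residual fed to the product quantizer: vec - coarse_centroid.
fn residual(v: &[f32], centroid: &[f32]) -> Vec<f32> {
    v.iter().zip(centroid).map(|(x, y)| x - y).collect()
}
```

Residuals are centered around zero within each cluster, so the PQ codebooks spend their capacity on local variation instead of the cluster offset, which is the usual explanation for the recall gain at the same code size.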
embedding_model: Option<EmbeddingModelInfo>
Optional embedding model metadata. When set:
- Stored as ailake.embedding-model in Iceberg table properties.
- Validated on every write_batch: dim mismatch → hard error; name mismatch → warning.
- Required for migrate_embeddings to track the model transition.
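The validation rule in the list above (dim mismatch is a hard error, name mismatch is only a warning) can be sketched like this. `ModelInfo`, `Validation`, and `validate_batch` are hypothetical stand-ins; the real EmbeddingModelInfo and write_batch signatures are not shown on this page and may differ.

```rust
/// Hypothetical stand-in for the model metadata; the real
/// EmbeddingModelInfo may carry different or additional fields.
#[derive(Debug, Clone, PartialEq)]
struct ModelInfo {
    name: String,
    dim: u32,
}

/// Outcome of a successful check: clean, or accepted with a warning.
#[derive(Debug, PartialEq)]
enum Validation {
    Ok,
    Warning(String),
}

/// Dim mismatch rejects the batch; name mismatch only warns.
fn validate_batch(expected: &ModelInfo, batch: &ModelInfo) -> Result<Validation, String> {
    if expected.dim != batch.dim {
        return Err(format!(
            "dim mismatch: table expects {}, batch has {}",
            expected.dim, batch.dim
        ));
    }
    if expected.name != batch.name {
        return Ok(Validation::Warning(format!(
            "model name mismatch: table has '{}', batch has '{}'",
            expected.name, batch.name
        )));
    }
    Ok(Validation::Ok)
}
```

The asymmetry reflects what each mismatch breaks: a wrong dimension cannot be indexed at all, whereas a different model name with the same dimension can still be written, even if the vectors may not be comparable.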
modality: Option<VectorModality>
Modality tag for this vector column (text / image / audio / video).
Stored as ailake.modality-<col> in Iceberg properties and Parquet KV metadata.
Allows readers to select the correct HNSW by modality without reading data.
partition_by: Option<String>
Column to partition by (e.g. “agent_id”). Stored in Iceberg metadata as an identity partition spec, enabling file-level pruning for per-agent search without post-scan filtering. Set this at table creation time; all files written to this table carry the partition column.
partition_value: Option<String>
Runtime partition value for this writer instance (not stored in table metadata).
When set, each file written by this TableWriter is tagged with this value, enabling
the search path to prune files from other partitions (e.g., other agents).
Typical usage: set to agent_id in Agent.init.
partition_column_type: Option<String>
Iceberg type of the partition column (“string”, “uuid”, “int”, “long”).
Used when writing the Iceberg schema and partition spec at table creation.
Defaults to “string” when None. Only relevant when partition_by is set.
partition_fields: Vec<PartitionDef>
Multi-column / non-identity partition spec (Phase K).
When non-empty, takes precedence over partition_by + partition_column_type
for table creation. Supports identity and truncate[W] transforms.
Values at write time are provided via partition_value, encoded as a
\x1f-separated compound string (“val1\x1fval2”) whose parts match the field order.
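The compound encoding described above can be sketched as follows. The helpers `encode_partition_value` and `decode_partition_value` are hypothetical, not part of the documented API, and the sketch assumes that individual values never contain the \x1f (ASCII unit separator) character.

```rust
/// ASCII unit separator used between the per-field values.
const PARTITION_SEP: &str = "\x1f";

/// Join per-field values, in partition_fields order, into one compound string.
fn encode_partition_value(values: &[&str]) -> String {
    values.join(PARTITION_SEP)
}

/// Split a compound string back into per-field values.
/// Returns None when the part count does not match the number of fields.
fn decode_partition_value(compound: &str, field_count: usize) -> Option<Vec<&str>> {
    let parts: Vec<&str> = compound.split(PARTITION_SEP).collect();
    if parts.len() == field_count {
        Some(parts)
    } else {
        None
    }
}
```

With the two-column spec shown next, a writer for one agent session would pass something like `Some(encode_partition_value(&["agent-7", "sess-42"]))` as partition_value.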
Example (two-column identity):
partition_fields: vec![
PartitionDef::identity("agent_id", "string"),
PartitionDef::identity("session_id", "string"),
]
Implementations
impl VectorStoragePolicy
pub fn default_f16(column: &str, dim: u32, metric: VectorMetric) -> Self
Trait Implementations
impl Clone for VectorStoragePolicy
fn clone(&self) -> VectorStoragePolicy
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.