pub struct VectorStoragePolicy {
pub column_name: String,
pub dim: u32,
pub metric: VectorMetric,
pub precision: VectorPrecision,
pub pq: Option<PQConfig>,
pub keep_raw_for_reranking: bool,
pub pre_normalize: bool,
pub hnsw_m: Option<u32>,
pub hnsw_ef_construction: Option<u32>,
pub ivf_residual: bool,
pub embedding_model: Option<EmbeddingModelInfo>,
pub modality: Option<VectorModality>,
}
Vector storage configuration applied at table creation time. Stored in Iceberg metadata.json properties.
Fields
column_name: String
dim: u32
metric: VectorMetric
precision: VectorPrecision
pq: Option<PQConfig>
keep_raw_for_reranking: bool
pre_normalize: bool
Normalize each input vector to unit L2 length before indexing. Enables the NormalizedCosine fast path in HNSW: distance = 1 - dot(a, b), no sqrt, ~2× faster distance computation. Semantics are unchanged: top-k results match Cosine. Most embedding models (OpenAI, Cohere, etc.) produce nearly unit-length vectors; enabling this adds negligible write overhead.
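The equivalence behind pre_normalize can be sketched as follows. This is a minimal illustration, not the crate's implementation: the function names below are hypothetical, and the sketch only shows that 1 - dot(a, b) on unit-length vectors matches the full cosine distance on the raw vectors.

```rust
/// Scale `v` to unit L2 length in place. A zero vector is left unchanged.
/// (Hypothetical helper; the crate's own normalization routine is not shown in this page.)
pub fn l2_normalize(v: &mut [f32]) {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
}

/// Dot product of two equal-length slices.
pub fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Full cosine distance: needs both norms, hence a sqrt on every call.
pub fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    let denom = (dot(a, a) * dot(b, b)).sqrt();
    1.0 - dot(a, b) / denom
}

/// Fast path for vectors that are already unit length: 1 - dot(a, b), no sqrt.
pub fn normalized_cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    1.0 - dot(a, b)
}
```

Because the sqrt is paid once per vector at write time rather than once per distance evaluation, the ordering of neighbors is the same while each comparison during search does less work.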
hnsw_m: Option<u32>
HNSW M parameter: connections per node. None = default (16).
Higher M → better recall, more memory, slower build.
Recommended values: 8 (low-memory), 16 (default), 32 (high-recall), 64 (max).
hnsw_ef_construction: Option<u32>
HNSW ef_construction: candidate pool size during build. None = default (150).
Higher ef_construction → better graph quality, slower build.
Recommended values: 100 (fast), 150 (default), 200 (quality), 400 (max quality).
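How the two optional HNSW fields fall back to their documented defaults might look like the sketch below. The constants mirror the defaults stated above (16 and 150); the ResolvedHnswParams type and resolve_hnsw_params function are hypothetical names introduced for illustration and are not part of the documented API.

```rust
/// Defaults as documented for hnsw_m and hnsw_ef_construction.
pub const DEFAULT_HNSW_M: u32 = 16;
pub const DEFAULT_HNSW_EF_CONSTRUCTION: u32 = 150;

/// Hypothetical container for the resolved build parameters
/// (an assumption for this sketch, not a type from the crate).
#[derive(Debug, PartialEq, Eq)]
pub struct ResolvedHnswParams {
    pub m: u32,
    pub ef_construction: u32,
}

/// Replace each `None` with its documented default.
pub fn resolve_hnsw_params(
    hnsw_m: Option<u32>,
    hnsw_ef_construction: Option<u32>,
) -> ResolvedHnswParams {
    ResolvedHnswParams {
        m: hnsw_m.unwrap_or(DEFAULT_HNSW_M),
        ef_construction: hnsw_ef_construction.unwrap_or(DEFAULT_HNSW_EF_CONSTRUCTION),
    }
}
```

Under these assumptions, a high-recall configuration would pass Some(32) and Some(200), while leaving both fields as None yields the 16 / 150 defaults.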
ivf_residual: bool
IVF-PQ residual encoding: train PQ on per-cluster residuals (vec - coarse_centroid). Same bytes per vector, ~2-4 pp better recall@10. Only applies when an IVF-PQ index is used.
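The residual that PQ is trained on can be sketched as below. This is an illustrative sketch under assumptions: the function names are hypothetical, centroids are held as plain Vec<f32> values, and the nearest centroid is chosen by squared Euclidean distance, which the source does not specify.

```rust
/// Squared Euclidean distance between two equal-length slices.
fn squared_l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Index of the coarse centroid closest to `vec`, or `None` if there are no centroids.
/// (Assumption: nearest by squared L2 distance.)
pub fn nearest_centroid(vec: &[f32], centroids: &[Vec<f32>]) -> Option<usize> {
    let mut best: Option<(usize, f32)> = None;
    for (i, c) in centroids.iter().enumerate() {
        let d = squared_l2(vec, c);
        if best.map_or(true, |(_, best_d)| d < best_d) {
            best = Some((i, d));
        }
    }
    best.map(|(i, _)| i)
}

/// Residual as described above: vec - coarse_centroid, element by element.
pub fn residual(vec: &[f32], centroid: &[f32]) -> Vec<f32> {
    vec.iter().zip(centroid).map(|(v, c)| v - c).collect()
}
```

Residuals within a cluster are centered near the origin and span a smaller range than the raw vectors, so the same PQ code budget quantizes them more finely.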
embedding_model: Option<EmbeddingModelInfo>
Optional embedding model metadata. When set:
- Stored as ailake.embedding-model in Iceberg table properties.
- Validated on every write_batch: dim mismatch → hard error; name mismatch → warning.
- Required for migrate_embeddings to track the model transition.
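The validation rule in the list above (dim mismatch is a hard error, name mismatch is only a warning) could be expressed roughly as follows. Everything here is a hypothetical stand-in: ModelInfo is not the crate's EmbeddingModelInfo (whose fields are not shown on this page), and check_batch is not the actual write_batch code path.

```rust
/// Hypothetical stand-in for EmbeddingModelInfo; the real fields are an assumption.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ModelInfo {
    pub name: String,
    pub dim: u32,
}

/// Outcome of a successful check: clean, or accepted with a warning.
#[derive(Debug, PartialEq, Eq)]
pub enum WriteCheck {
    Ok,
    Warning(String),
}

/// Apply the documented rule: dim mismatch → hard error; name mismatch → warning.
pub fn check_batch(
    expected: &ModelInfo,
    batch_model_name: &str,
    batch_dim: u32,
) -> Result<WriteCheck, String> {
    if batch_dim != expected.dim {
        return Err(format!(
            "dim mismatch: table expects {}, batch has {}",
            expected.dim, batch_dim
        ));
    }
    if batch_model_name != expected.name {
        return Ok(WriteCheck::Warning(format!(
            "model name mismatch: table records '{}', batch reports '{}'",
            expected.name, batch_model_name
        )));
    }
    Ok(WriteCheck::Ok)
}
```

The dimension check comes first because vectors of the wrong length cannot be indexed at all, whereas a differing model name may simply reflect a renamed or re-versioned model.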
modality: Option<VectorModality>
Modality tag for this vector column (text / image / audio / video).
Stored as ailake.modality-<col> in Iceberg properties and Parquet KV metadata.
Allows readers to select the correct HNSW by modality without reading data.
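A reader-side lookup of the modality tag might look like the sketch below. The key format ailake.modality-<col> comes from the description above; the Modality enum, the lowercase string values ("text", "image", "audio", "video"), and both function names are assumptions made for illustration, since the stored encoding is not shown on this page.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for VectorModality.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Modality {
    Text,
    Image,
    Audio,
    Video,
}

impl Modality {
    /// Parse the stored tag. Assumption: values are stored as lowercase names.
    pub fn parse(s: &str) -> Option<Self> {
        match s {
            "text" => Some(Modality::Text),
            "image" => Some(Modality::Image),
            "audio" => Some(Modality::Audio),
            "video" => Some(Modality::Video),
            _ => None,
        }
    }
}

/// Property key for a column, following the documented ailake.modality-<col> pattern.
pub fn modality_key(column: &str) -> String {
    format!("ailake.modality-{column}")
}

/// Look up a column's modality from table properties alone, without touching data files.
pub fn modality_of(props: &HashMap<String, String>, column: &str) -> Option<Modality> {
    props
        .get(&modality_key(column))
        .and_then(|v| Modality::parse(v))
}
```

Because the tag lives in metadata, a reader can decide which index to open for a column before reading any vector data.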
Implementations
impl VectorStoragePolicy
pub fn default_f16(column: &str, dim: u32, metric: VectorMetric) -> Self
Trait Implementations
impl Clone for VectorStoragePolicy
fn clone(&self) -> VectorStoragePolicy
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.