Struct HnswConfig 

pub struct HnswConfig {
    pub dimension: usize,
    pub m: usize,
    pub ef_construction: usize,
    pub ef_search: usize,
    pub ml: u8,
    pub distance_metric: DistanceMetric,
    pub enable_multilayer: bool,
    pub multilayer_level_distribution_base: Option<usize>,
    pub multilayer_deterministic_seed: Option<u64>,
}
HNSW algorithm configuration parameters

This struct defines all parameters that control HNSW index behavior. These parameters significantly impact search quality, construction time, and memory usage patterns.

§Field Descriptions

§dimension

Vector dimension count. Must match all vectors inserted into the index. Typical values: 128-4096 depending on embedding model used.
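
A minimal sketch of the dimension rule, assuming a hypothetical helper `check_dimension` that is not part of the crate. It only illustrates the kind of check an index has to perform before accepting a vector.

```rust
/// Hypothetical helper (not part of the crate): rejects a vector whose
/// length differs from the configured `dimension`.
fn check_dimension(expected: usize, vector: &[f32]) -> Result<(), String> {
    if vector.len() == expected {
        Ok(())
    } else {
        Err(format!(
            "dimension mismatch: expected {}, got {}",
            expected,
            vector.len()
        ))
    }
}
```

Calling `check_dimension(config.dimension, &vector)` before insertion would surface a mismatch as an error instead of corrupting distance calculations later.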

§m

Number of bi-directional links created for each node during construction. This is the primary parameter controlling index connectivity.

  • Lower values (5-12): Faster construction, less memory, lower recall
  • Medium values (16-24): Balanced performance (recommended)
  • Higher values (32-48): Better recall, more memory, slower construction

§ef_construction

Size of dynamic candidate list during index construction. Controls how thoroughly the algorithm explores the graph during insertion.

  • Lower values (100-200): Faster construction
  • Higher values (400-800): Better index quality, slower construction

§ef_search

Size of dynamic candidate list during search operations. Controls search accuracy vs speed trade-off.

  • Lower values (10-50): Faster search, potentially lower accuracy
  • Higher values (100-200): Better recall, slower search
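
The effect of the candidate list size can be sketched as a bounding step: however many candidates the traversal has seen, only the `ef` closest are kept. The function below is a hypothetical illustration, not the crate's search routine, which would maintain this list incrementally while walking the graph.

```rust
/// Hypothetical illustration: keep only the `ef` closest candidates,
/// ordered by ascending distance. Each candidate is (node id, distance).
fn keep_closest(mut candidates: Vec<(u64, f32)>, ef: usize) -> Vec<(u64, f32)> {
    // total_cmp gives a total order on f32, so sorting never panics on NaN.
    candidates.sort_by(|a, b| a.1.total_cmp(&b.1));
    // Anything beyond the `ef` best entries is discarded.
    candidates.truncate(ef);
    candidates
}
```

A larger `ef` keeps more candidates alive, which is why recall improves while each query does more work.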

§ml

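Maximum number of layers in the HNSW structure. The depth needed grows with the logarithm of the data size N, roughly floor(ln(N) * ml_scale); the formula as usually stated, floor(-ln(u) * ml_scale), applies to a uniform random draw u in (0, 1] and gives the level of a single node. Higher values allow deeper graphs, which helps search on large datasets.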
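
A minimal sketch of level assignment with a layer cap, assuming the standard HNSW rule in which the logarithm is taken of a uniform random draw `u` in (0, 1]. The function name `sample_level` and the way the cap is applied are illustrative assumptions, not the crate's implementation.

```rust
/// Hypothetical sketch of standard HNSW level assignment, capped by `ml`.
/// `u` is a uniform random draw in (0, 1]; `ml_scale` is the level multiplier.
fn sample_level(u: f64, ml_scale: f64, ml: u8) -> u8 {
    // floor(-ln(u) * ml_scale): most draws land on level 0,
    // and each higher level is exponentially less likely.
    let raw = (-u.ln() * ml_scale).floor();
    // Valid level indices are 0..ml, so the top level is ml - 1.
    let top = ml.saturating_sub(1) as f64;
    raw.min(top).max(0.0) as u8
}
```

With `ml = 8`, even an extremely small draw is clamped to level 7, so `ml` bounds the graph depth regardless of how the random draws fall.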

§distance_metric

Distance function used for vector similarity calculation. Choose based on your vector data characteristics and use case requirements.
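
To make the choice concrete, the sketch below computes two common distances. `Metric` is a local stand-in for `DistanceMetric`: only the `Cosine` variant appears on this page, and `Euclidean` is an assumption made for illustration.

```rust
/// Hypothetical stand-in for the crate's `DistanceMetric`; only the
/// `Cosine` variant appears on this page, `Euclidean` is assumed.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Metric {
    Cosine,
    Euclidean,
}

/// Distance between two equal-length vectors under the chosen metric.
/// Cosine distance is undefined (NaN) if either vector is all zeros.
fn distance(metric: Metric, a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must share one dimension");
    match metric {
        Metric::Cosine => {
            let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
            let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
            let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
            // Cosine distance = 1 - cosine similarity.
            1.0 - dot / (norm_a * norm_b)
        }
        Metric::Euclidean => a
            .iter()
            .zip(b)
            .map(|(x, y)| (x - y) * (x - y))
            .sum::<f32>()
            .sqrt(),
    }
}
```

Cosine distance ignores vector length and compares direction only, which tends to suit normalized text embeddings; Euclidean distance is sensitive to magnitude.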

§enable_multilayer

Controls whether multi-layer HNSW functionality is enabled. When false (default), all vectors are inserted into the base layer only, providing backward compatibility and avoiding node ID conflicts. When true, proper multi-layer HNSW with exponential distribution is used.

§multilayer_level_distribution_base

Base value for exponential level distribution in multi-layer mode. Higher values create flatter layer distributions (more vectors in higher layers). Default value equals m for optimal performance.

§multilayer_deterministic_seed

Seed for deterministic random number generation in multi-layer operations. When Some(seed), reproducible level assignments are ensured. When None, non-deterministic behavior is used (default for production).
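
The reproducibility that a fixed seed provides can be sketched with a small seeded generator. SplitMix64 is used here only as a stand-in; this page does not document which random source the crate uses, and `draws` is a hypothetical helper.

```rust
/// SplitMix64, a small well-known generator, used here only as a stand-in;
/// the crate's actual random source is not documented on this page.
struct SplitMix64 {
    state: u64,
}

impl SplitMix64 {
    fn new(seed: u64) -> Self {
        Self { state: seed }
    }

    fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Hypothetical helper: draw `count` values from a generator seeded with `seed`.
fn draws(seed: u64, count: usize) -> Vec<u64> {
    let mut rng = SplitMix64::new(seed);
    (0..count).map(|_| rng.next_u64()).collect()
}
```

Two runs with the same seed yield the same sequence, so level assignments derived from it would repeat exactly. That is useful in tests and benchmarks, while `None` leaves the sequence to vary between runs.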

§Default Configuration

The default configuration provides good performance for most use cases:

  • Balanced search quality vs speed
  • Reasonable memory usage (~2.5x vector size)
  • Fast construction time
  • Robust to various data distributions
  • Single-layer mode for backward compatibility
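
As a worked example of the memory figure, the sketch below applies the ~2.5x factor to the raw vector size. It assumes `f32` components (4 bytes each), which this page does not state, and `estimated_index_bytes` is a hypothetical helper.

```rust
/// Rough index memory estimate in bytes, assuming `f32` components
/// (4 bytes each) and the ~2.5x overhead factor quoted above.
fn estimated_index_bytes(vector_count: usize, dimension: usize) -> usize {
    let raw_vector_bytes = vector_count * dimension * std::mem::size_of::<f32>();
    // 2.5x expressed in integer arithmetic: multiply by 5, divide by 2.
    raw_vector_bytes * 5 / 2
}
```

For 100,000 vectors of dimension 768, the raw data is 307,200,000 bytes and the estimate comes to 768,000,000 bytes.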

§Multi-layer vs Single-layer Mode

§Single-layer mode (enable_multilayer = false)

  • All vectors inserted into base layer (L0)
  • No node ID conflicts
  • Faster insertion, simpler search
  • Recommended for small datasets (<10k vectors) or when compatibility is critical

§Multi-layer mode (enable_multilayer = true)

  • Exponential level distribution for optimal search performance
  • 3-10x faster search for large datasets (>10k vectors)
  • More complex insertion algorithm with bidirectional ID mapping
  • Recommended for large datasets where search performance is critical
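
The guidance above can be encoded as a simple rule of thumb. `suggest_multilayer` is a hypothetical helper; the 10k threshold comes from the recommendations in this section, not from any check performed by the crate.

```rust
/// Hypothetical helper encoding the guidance above: multi-layer mode is
/// suggested once the dataset exceeds roughly 10k vectors.
fn suggest_multilayer(expected_vectors: usize) -> bool {
    const MULTILAYER_THRESHOLD: usize = 10_000;
    expected_vectors > MULTILAYER_THRESHOLD
}
```

The result could feed the `enable_multilayer` field when a configuration is built from an expected dataset size.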

§Examples

use sqlitegraph::hnsw::{HnswConfig, DistanceMetric};

// High-precision configuration
let precise_config = HnswConfig {
    dimension: 768,
    m: 32,
    ef_construction: 400,
    ef_search: 100,
    ml: 24,
    distance_metric: DistanceMetric::Cosine,
    enable_multilayer: false,
    multilayer_level_distribution_base: None,
    multilayer_deterministic_seed: None,
};

// Multi-layer configuration for large datasets
let multilayer_config = HnswConfig {
    dimension: 768,
    m: 16,
    ef_construction: 200,
    ef_search: 50,
    ml: 16,
    distance_metric: DistanceMetric::Cosine,
    enable_multilayer: true,
    multilayer_level_distribution_base: Some(16),
    multilayer_deterministic_seed: Some(42),
};

Fields§

§dimension: usize

Vector dimension count. Must match all vectors inserted into the index. Range: 1-4096 (practical limits).

§m: usize

Number of connections per node (M parameter). Controls graph connectivity and memory usage. Range: 5-48 (typical); higher values require more memory.

§ef_construction: usize

Construction ef parameter. Dynamic candidate list size during index building. Range: 100-800 (typical).

§ef_search: usize

Search ef parameter. Dynamic candidate list size during search. Range: 10-200 (typical).

§ml: u8

Maximum number of layers. Controls maximum graph depth. Range: 8-32 (typical).

§distance_metric: DistanceMetric

Distance metric for similarity calculation

§enable_multilayer: bool

Enable multi-layer HNSW functionality. When false, uses single-layer mode for backward compatibility. When true, enables proper multi-layer HNSW with exponential distribution.

§multilayer_level_distribution_base: Option<usize>

Base value for exponential level distribution in multi-layer mode. When None, uses the m value as default. Higher values create flatter distributions (more vectors in higher layers).

§multilayer_deterministic_seed: Option<u64>

Seed for deterministic random number generation in multi-layer operations. When Some(seed), ensures reproducible level assignments. When None, uses non-deterministic behavior (default for production).

Implementations§

impl HnswConfig

pub fn new(dimension: usize, m: usize, ef_construction: usize, distance_metric: DistanceMetric) -> Self

Creates a new HnswConfig with the specified parameters.

§Arguments

  • dimension - Vector dimension count
  • m - Number of connections per node (M parameter)
  • ef_construction - Dynamic candidate list size during construction
  • distance_metric - Distance metric for similarity calculation

§Returns

A new HnswConfig instance with sensible defaults for the other parameters.

§Examples

use sqlitegraph::hnsw::{HnswConfig, DistanceMetric};

let config = HnswConfig::new(128, 16, 200, DistanceMetric::Cosine);
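
A constructor that fills the remaining fields with defaults can be sketched with struct update syntax. `SketchConfig` is a local stand-in that mirrors a few `HnswConfig` fields so the example compiles on its own. Its default values are placeholders, not the crate's actual defaults, except that `enable_multilayer` defaults to false as documented above.

```rust
/// Local stand-in mirroring a few `HnswConfig` fields so the sketch compiles
/// on its own. The default values are placeholders, not the crate's defaults.
#[derive(Clone, Debug, PartialEq)]
struct SketchConfig {
    dimension: usize,
    m: usize,
    ef_construction: usize,
    ef_search: usize,
    enable_multilayer: bool,
}

impl Default for SketchConfig {
    fn default() -> Self {
        Self {
            dimension: 128,       // placeholder
            m: 16,                // placeholder
            ef_construction: 200, // placeholder
            ef_search: 50,        // placeholder
            enable_multilayer: false,
        }
    }
}

impl SketchConfig {
    /// Takes the commonly tuned parameters and fills the rest from `Default`.
    fn new(dimension: usize, m: usize, ef_construction: usize) -> Self {
        Self {
            dimension,
            m,
            ef_construction,
            ..Self::default()
        }
    }
}
```

Because `HnswConfig` has public fields and implements `Default`, the same `..HnswConfig::default()` pattern should also work on the real type when only one or two fields need to differ from the defaults.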

Trait Implementations§

impl Clone for HnswConfig

fn clone(&self) -> HnswConfig

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for HnswConfig

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Default for HnswConfig

fn default() -> Self

Returns the “default value” for a type.

impl PartialEq for HnswConfig

fn eq(&self, other: &HnswConfig) -> bool

Tests for self and other values to be equal, and is used by ==.

fn ne(&self, other: &Rhs) -> bool

Tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason.

impl StructuralPartialEq for HnswConfig
