Installation
cargo add hnswx
Test
The test inserts 1 million 32-dimensional vectors into the HNSW index and performs query and delete operations.
cargo test test_hnsw_1_million_vectors -- --nocapture
result
=== HNSW 1 Million Vectors Performance Test ===
Dimension: 32D
Number of vectors: 1000000
Configuration: m=16, ef_construction=200, ef_search=100
test massive_scale_test::test_hnsw_1_million_vectors has been running for over 60 seconds
Inserted: 100000 (10.0%)
Inserted: 200000 (20.0%)
Current stats: nodes=200001, max_level=3, avg_connections=15.34
Inserted: 300000 (30.0%)
Inserted: 400000 (40.0%)
Current stats: nodes=400001, max_level=3, avg_connections=15.33
Inserted: 500000 (50.0%)
Inserted: 600000 (60.0%)
Current stats: nodes=600001, max_level=3, avg_connections=15.32
Inserted: 700000 (70.0%)
Inserted: 800000 (80.0%)
Current stats: nodes=800001, max_level=3, avg_connections=15.32
Inserted: 900000 (90.0%)
Insertion completed, total time: 950.1687562s
Average insertion time per vector: 950.168µs
Index statistics after insertion:
Total nodes: 1000000
Maximum level: 3
Average connections: 15.32
Maximum connections: 152
Entry point ID: Some(12931)
Storage size: 32000000 floats
Executing 200 query tests...
Completed queries: 40 (average 1141.37μs)
Completed queries: 80 (average 1113.62μs)
Completed queries: 120 (average 1077.25μs)
Completed queries: 160 (average 1074.22μs)
Query performance statistics (200 queries average):
Average query time: 1068.71μs
Minimum query time: 41μs
Maximum query time: 2506μs
Median (P50): 1089μs
P90: 1553μs
P95: 1976μs
QPS (queries per second): 935.71
k= 1: Average time 1348.50μs, results 1
k= 5: Average time 1151.60μs, results 5
k= 10: Average time 1126.40μs, results 10
k= 20: Average time 1433.50μs, results 20
k= 50: Average time 1331.70μs, results 50
k=100: Average time 1152.40μs, results 100
Batch delete 1: Delete first 100000 nodes
Deleted: 20000 nodes
Deleted: 40000 nodes
Deleted: 60000 nodes
Deleted: 80000 nodes
Delete completed, time: 368.2621ms
Successfully deleted: 100000 nodes
Average delete time: 3.682µs
Statistics after first delete:
Active nodes: 900000
Deleted nodes: 100000
Average connections: 15.07
Query after deletion: 10 nearest neighbors, time 1612μs
Batch delete 2: Delete nodes with ID 100000~150000
Deleted: 10000 nodes
Deleted: 20000 nodes
Deleted: 30000 nodes
Deleted: 40000 nodes
Delete completed, time: 213.4648ms
Successfully deleted: 50000 nodes
Average delete time: 4.269µs
Final statistics:
Active nodes: 850000
Deleted nodes: 150000
Total inserted nodes: 1000000
Remaining node percentage: 85.0%
Final query performance (20 queries average): 1313.65μs
=== Test completed ===
Total test time: 951.266665s
Memory usage estimate: ~122.1 MB
Performance summary:
Insertion speed: 1052.4 vectors/sec
Query speed: 935.7 queries/sec
Delete speed: 257853.0 deletes/sec
Quick Start
Basic Usage
use *;
Using Cosine Similarity
use *;
Custom Configuration
use *;
Configuration Parameters
| Parameter | Description | Default |
|---|---|---|
max_elements |
Maximum capacity of the index | 1000 |
m |
Number of established connections per layer | 16 |
m_max |
Maximum connections at layer 0 | 32 |
m_max_0 |
Maximum connections at highest layer | 64 |
ef_construction |
Size of dynamic candidate list during construction | 200 |
ef_search |
Size of dynamic candidate list during search | 10 |
level_multiplier |
Level multiplier affecting level distribution | 1/ln(16) |
allow_replace_deleted |
Whether to allow replacing deleted elements | true |
Distance Metrics
The library provides two distance metrics:
-
EuclideanDistance - Euclidean distance
- Suitable for ordinary vector spaces
- Distance formula: √Σ(xᵢ - yᵢ)²
-
CosineSimilarity - Cosine similarity (converted to distance)
- Suitable for directional data
- Distance formula: 1 - (a·b)/(||a||·||b||)
Performance Characteristics
-
Time Complexity:
- Insertion: O(log n)
- Search: O(log n)
- Deletion: O(k), where k is number of neighbors
-
Space Complexity: O(n × m), where n is number of nodes, m is average connections