sqlite-vector-rs
A Rust SQLite extension providing PGVector-style typed vector columns with HNSW approximate nearest-neighbor search, scalar SQL functions, and Arrow IPC bulk I/O.
Vectors are stored as raw typed byte blobs inside SQLite tables — no external service required. The HNSW index (powered by usearch) lives in a shadow table and is persisted across connections.
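To make the "raw typed byte blob" idea concrete, here is a minimal sketch of how a float4 (f32) vector could map to and from bytes. The function names are hypothetical, and little-endian byte order is an assumption of this sketch rather than a documented property of the extension.

```rust
/// Encode an f32 vector as a raw byte blob, 4 bytes per element.
/// Little-endian order is an assumption of this sketch.
pub fn f32_to_blob(values: &[f32]) -> Vec<u8> {
    values.iter().flat_map(|v| v.to_le_bytes()).collect()
}

/// Decode a blob back into f32 values.
/// Returns None if the length is not a multiple of 4 bytes.
pub fn blob_to_f32(blob: &[u8]) -> Option<Vec<f32>> {
    if blob.len() % 4 != 0 {
        return None;
    }
    Some(
        blob.chunks_exact(4)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}
```

With this layout a 384-dimensional float4 vector occupies exactly 1536 bytes, with no per-row header.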
Features
- 6 vector types: float2 (f16), float4 (f32), float8 (f64), int1 (i8), int2 (i16), int4 (i32)
- 3 distance metrics: L2 (squared Euclidean), cosine, inner product
- HNSW approximate nearest-neighbor search via usearch with configurable parameters
- Arrow IPC bulk import/export for efficient batch operations
- Full virtual table with INSERT, UPDATE, DELETE, and transaction rollback
- Optional metadata columns alongside vectors (TEXT, INTEGER, REAL, BLOB)
- Works three ways — loadable SQLite extension, Rust library, or standalone CLI
Quick Start
Build
A release build (typically cargo build --release) produces target/release/libsqlite_vector_rs.so (Linux), .dylib (macOS), or .dll (Windows).
Load into any SQLite client
.load target/release/libsqlite_vector_rs
CREATE VIRTUAL TABLE embeddings USING vector(
dim=384,
type=float4,
metric=cosine
);
INSERT INTO embeddings(vector)
VALUES (vector_from_json('[0.1, 0.2, 0.3, ...]', 'float4'));
SELECT rowid, distance
FROM embeddings
WHERE knn_match(distance, vector_from_json('[0.15, 0.25, 0.35, ...]', 'float4'))
LIMIT 10;
Use from Rust
Add to your Cargo.toml:
[dependencies]
sqlite-vector-rs = { path = "../sqlite-vector-rs", features = ["library"] }
rusqlite = { version = "0.39", features = ["bundled"] }
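use rusqlite::Connection;
// Open a database; the path here is a placeholder.
let conn = Connection::open("vectors.db")?;
// Register the vector module on this connection (call shape assumed; check the crate docs).
sqlite_vector_rs::register(&conn)?;
conn.execute_batch("CREATE VIRTUAL TABLE embeddings USING vector(dim=384, type=float4, metric=cosine);")?;
conn.execute("INSERT INTO embeddings(vector) VALUES (vector_from_json('[0.1, 0.2, 0.3, ...]', 'float4'))", ())?;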
Standalone CLI
sqlite3-vector v0.1.0 (SQLite 3.49.1)
Enter ".help" for usage hints.
sqlite3-vector> CREATE VIRTUAL TABLE docs USING vector(dim=3, type=float4, metric=cosine);
sqlite3-vector> INSERT INTO docs(vector) VALUES (vector_from_json('[1,0,0]', 'float4'));
sqlite3-vector> SELECT rowid, distance FROM docs
...> WHERE knn_match(distance, vector_from_json('[0.9,0.1,0]', 'float4'))
...> LIMIT 5;
rowid distance
----- --------
1 0.006116
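The distance in the session above can be checked by hand. The sketch below implements the three metrics in plain Rust; the function names are hypothetical. Cosine distance is taken as 1 minus cosine similarity, which reproduces the 0.006116 shown above. The inner-product form (1 minus the dot product) and the zero-vector handling are assumptions of this sketch, not confirmed behavior of the extension.

```rust
/// Dot product of two equal-length vectors.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Squared Euclidean distance (no square root taken).
pub fn l2_squared(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Cosine distance: 1 - cos(angle between a and b).
pub fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    let denom = dot(a, a).sqrt() * dot(b, b).sqrt();
    if denom == 0.0 {
        // Assumption: treat a zero vector as maximally dissimilar.
        return 1.0;
    }
    1.0 - dot(a, b) / denom
}

/// Inner-product distance, assumed here to be 1 - a.b so that
/// smaller values mean "more similar", like the other metrics.
pub fn inner_product_distance(a: &[f32], b: &[f32]) -> f32 {
    1.0 - dot(a, b)
}
```

For the stored vector [1,0,0] and the query [0.9,0.1,0], the dot product is 0.9 and the norms are 1 and sqrt(0.82), so the cosine distance is 1 - 0.9/sqrt(0.82), roughly 0.006116.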
SQL Reference
CREATE VIRTUAL TABLE
CREATE VIRTUAL TABLE <name> USING vector(
dim=<integer>, -- vector dimension (required)
type=<vector_type>, -- element type (required)
metric=<distance_metric>, -- distance metric (required)
m=<integer>, -- HNSW M parameter (default: 16)
ef_construction=<integer>, -- HNSW build quality (default: 128)
ef_search=<integer>, -- HNSW query quality (default: 64)
metadata='col1 TYPE, ...' -- optional metadata columns
);
Vector types: float2, float4, float8, int1, int2, int4
Distance metrics: l2, cosine, inner_product
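As an illustration of how these options fit together, the following sketch parses key=value arguments into a config struct and applies the documented defaults (m=16, ef_construction=128, ef_search=64). The struct, the function name, and the error messages are hypothetical; the extension's real parser may differ.

```rust
#[derive(Debug, PartialEq)]
pub struct VectorConfig {
    pub dim: usize,
    pub vector_type: String,
    pub metric: String,
    pub m: usize,
    pub ef_construction: usize,
    pub ef_search: usize,
    pub metadata: Option<String>,
}

const TYPES: [&str; 6] = ["float2", "float4", "float8", "int1", "int2", "int4"];
const METRICS: [&str; 3] = ["l2", "cosine", "inner_product"];

/// Parse arguments such as ["dim=384", "type=float4", "metric=cosine"].
pub fn parse_config(args: &[&str]) -> Result<VectorConfig, String> {
    let (mut dim, mut vector_type, mut metric) = (None, None, None);
    // Documented defaults for the optional HNSW parameters.
    let (mut m, mut ef_construction, mut ef_search) = (16usize, 128usize, 64usize);
    let mut metadata = None;

    for arg in args {
        let (key, value) = arg
            .split_once('=')
            .ok_or_else(|| format!("expected key=value, got `{arg}`"))?;
        let (key, value) = (key.trim(), value.trim());
        let int = |v: &str| v.parse::<usize>().map_err(|e| format!("{key}: {e}"));
        match key {
            "dim" => dim = Some(int(value)?),
            "type" if TYPES.contains(&value) => vector_type = Some(value.to_string()),
            "metric" if METRICS.contains(&value) => metric = Some(value.to_string()),
            "type" | "metric" => return Err(format!("unsupported {key} `{value}`")),
            "m" => m = int(value)?,
            "ef_construction" => ef_construction = int(value)?,
            "ef_search" => ef_search = int(value)?,
            // Keep the metadata spec as a raw string, minus its quotes.
            "metadata" => metadata = Some(value.trim_matches('\'').to_string()),
            other => return Err(format!("unknown option `{other}`")),
        }
    }

    Ok(VectorConfig {
        dim: dim.ok_or("dim is required")?,
        vector_type: vector_type.ok_or("type is required")?,
        metric: metric.ok_or("metric is required")?,
        m,
        ef_construction,
        ef_search,
        metadata,
    })
}
```

Omitting any of the three required options yields an error, while the HNSW parameters fall back to their defaults.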
KNN Search
SELECT rowid, distance [, metadata_cols...]
FROM <table>
WHERE knn_match(distance, <query_vector_blob>)
LIMIT <k>;
The distance column is a hidden virtual column that returns the distance
between each stored vector and the query. knn_match activates the HNSW index
for efficient approximate search.
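For intuition about what the query returns, the sketch below computes the exact top-k by brute force. This is deliberately not HNSW: it is the linear-scan baseline that an approximate index tries to match at lower cost. The function names and the in-memory row layout are hypothetical, and squared L2 is used as the metric for brevity.

```rust
/// Squared Euclidean distance between two equal-length vectors.
fn l2_squared(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Exact k-nearest-neighbor search by scanning every row.
/// Returns (rowid, distance) pairs ordered by ascending distance.
pub fn knn_exact(rows: &[(i64, Vec<f32>)], query: &[f32], k: usize) -> Vec<(i64, f32)> {
    let mut scored: Vec<(i64, f32)> = rows
        .iter()
        .map(|(rowid, v)| (*rowid, l2_squared(v, query)))
        .collect();
    // total_cmp gives a total order over f32, so NaN cannot panic the sort.
    scored.sort_by(|a, b| a.1.total_cmp(&b.1));
    scored.truncate(k);
    scored
}
```

The scan costs one distance computation per stored row. An HNSW index avoids that by walking a graph of neighbors, trading some recall for speed.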
Scalar Functions
| Function | Description |
|---|---|
| vector_from_json(json, type) | Parse a JSON array into a vector blob |
| vector_to_json(blob, type) | Convert a vector blob to a JSON array string |
| vector_distance(blob_a, blob_b, metric, type) | Compute distance between two vectors |
| vector_dims(blob, type) | Return the number of dimensions |
| vector_rebuild_index(table, type, metric) | Rebuild the HNSW index from shadow data |
| vector_export_arrow(table, type) | Export all vectors as an Arrow IPC blob |
| vector_insert_arrow(table, type, ipc_blob) | Import vectors from an Arrow IPC blob |
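The relationship between a blob, its type, and its dimension count can be sketched in a few lines. The helpers below are hypothetical stand-ins written against the standard library only; the extension itself lists serde_json for JSON parsing, and the little-endian layout here is an assumption.

```rust
/// Bytes per element for each vector type name.
pub fn element_size(vector_type: &str) -> Option<usize> {
    match vector_type {
        "int1" => Some(1),
        "float2" | "int2" => Some(2),
        "float4" | "int4" => Some(4),
        "float8" => Some(8),
        _ => None,
    }
}

/// Number of dimensions in a blob: byte length divided by element size.
/// Returns None for an unknown type or a length that does not divide evenly.
pub fn vector_dims(blob: &[u8], vector_type: &str) -> Option<usize> {
    let size = element_size(vector_type)?;
    (blob.len() % size == 0).then(|| blob.len() / size)
}

/// Parse a flat JSON array of numbers into a float4 blob.
/// A simplified parser for illustration: it splits on commas and does
/// not validate full JSON number syntax.
pub fn float4_from_json(json: &str) -> Option<Vec<u8>> {
    let inner = json.trim().strip_prefix('[')?.strip_suffix(']')?;
    if inner.trim().is_empty() {
        return Some(Vec::new());
    }
    let mut blob = Vec::new();
    for item in inner.split(',') {
        let value: f32 = item.trim().parse().ok()?;
        blob.extend_from_slice(&value.to_le_bytes());
    }
    Some(blob)
}
```

Because the blob carries no type tag of its own, the same 12 bytes read as 3 dimensions under float4 and 12 under int1, which is why the scalar functions take the type as an explicit argument.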
Metadata Columns
CREATE VIRTUAL TABLE docs USING vector(
dim=384,
type=float4,
metric=cosine,
metadata='title TEXT, source TEXT, page INTEGER'
);
INSERT INTO docs(vector, title, source, page)
VALUES (vector_from_json('[...]', 'float4'), 'Chapter 1', 'book.pdf', 42);
SELECT rowid, distance, title, page
FROM docs
WHERE knn_match(distance, vector_from_json('[...]', 'float4'))
LIMIT 5;
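The metadata option is a comma-separated list of name and type pairs. A minimal sketch of how such a spec could be split and validated against the four supported types follows; the function name is hypothetical, and accepting type names case-insensitively is an assumption of this sketch.

```rust
/// Parse a spec like "title TEXT, source TEXT, page INTEGER"
/// into (column name, column type) pairs.
pub fn parse_metadata(spec: &str) -> Result<Vec<(String, String)>, String> {
    const ALLOWED: [&str; 4] = ["TEXT", "INTEGER", "REAL", "BLOB"];
    let mut columns = Vec::new();
    for part in spec.split(',') {
        let mut words = part.split_whitespace();
        // Each entry must be exactly two words: a name and a type.
        match (words.next(), words.next(), words.next()) {
            (Some(name), Some(ty), None) => {
                let ty = ty.to_ascii_uppercase();
                if !ALLOWED.contains(&ty.as_str()) {
                    return Err(format!("unsupported column type `{ty}`"));
                }
                columns.push((name.to_string(), ty));
            }
            _ => return Err(format!("expected `name TYPE`, got `{}`", part.trim())),
        }
    }
    Ok(columns)
}
```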
Arrow IPC Bulk Operations
Export all vectors to an Arrow IPC stream, then re-import into another table:
-- Export
SELECT vector_export_arrow('source_table', 'float4');
-- Import (returns row count)
SELECT vector_insert_arrow('dest_table', 'float4', <ipc_blob>);
-- Rebuild the HNSW index after bulk import
SELECT vector_rebuild_index('dest_table', 'float4', 'cosine');
Architecture
┌─────────────────────────────────────────────────┐
│ SQLite │
│ ┌────────────────────────────────────────────┐ │
│ │ vector virtual table module │ │
│ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ │
│ │ │ Config │ │ Cursor │ │ Txn │ │ │
│ │ │ parsing │ │ scan/KNN │ │ rollback │ │ │
│ │ └──────────┘ └──────────┘ └──────────┘ │ │
│ └──────────────────┬─────────────────────────┘ │
│ │ │
│ ┌──────────────────▼─────────────────────────┐ │
│ │ Shadow tables │ │
│ │ {name}_data → rowid, vector, metadata │ │
│ │ {name}_index → serialized HNSW graph │ │
│ └─────────────────────────────────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────┐ │
│ │ Scalar functions │ │
│ │ vector_from_json, vector_distance, etc. │ │
│ └─────────────────────────────────────────────┘ │
└───────────────────────────────────────────────────┘
│
▼
┌────────────────┐ ┌────────────────┐
│ usearch HNSW │ │ Arrow IPC │
│ (in-memory) │ │ (bulk I/O) │
└────────────────┘ └────────────────┘
Dependencies
| Crate | Purpose |
|---|---|
| sqlite3_ext | SQLite extension + virtual table API |
| usearch | HNSW approximate nearest-neighbor index |
| arrow-* (v58) | Arrow IPC stream encoding for bulk I/O |
| half | IEEE 754 half-precision (f16) support |
| bytemuck | Zero-copy byte casting |
| serde_json | JSON vector parsing |
| rusqlite | Library-mode API (optional, library feature) |
Testing
Build the extension first (required for integration tests), then run all 271 tests.
The test suite includes unit tests for every module, integration tests for all SQL interfaces, a Shakespeare PDF ingestion pipeline, and GGUF embedding tests (which download a small model on first run).
License
Licensed under either of Apache License, Version 2.0 or MIT License, at your option.