Crate edgevec


§EdgeVec

High-performance embedded vector database for Browser, Node, and Edge.

§Current Status

PHASE 3: Implementation (Week 7 Complete)

Status: Week 7 Complete — Persistence Hardened

Core vector storage, HNSW graph indexing, and full durability (WAL + Snapshots) are implemented and verified.

§Implemented Features

  • HNSW Graph: Full insertion and search implementation with heuristic optimization.
  • Vector Storage: Contiguous memory layout for fast access.
  • Scalar Quantization (SQ8): 4x memory reduction (f32 -> u8) with high accuracy.
  • Durability: Write-Ahead Log (WAL) with CRC32 checksums, crash recovery, and atomic snapshots.
  • Metrics: L2 (Euclidean), Cosine, and Dot Product distance functions.
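The three distance functions listed above can be sketched in plain scalar Rust. This is an illustration only, not EdgeVec's implementation: the function names are hypothetical, and the crate's `Metric` type may expose a different interface.

```rust
/// Squared L2 (Euclidean) distance. Skipping the square root keeps the
/// same nearest-neighbor ordering while saving work.
fn l2_squared(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "dimension mismatch");
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Dot product; larger values mean more similar vectors.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "dimension mismatch");
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Cosine similarity in [-1, 1]. Returns 0.0 when either vector has
/// zero norm (a convention chosen for this sketch).
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let denom = dot(a, a).sqrt() * dot(b, b).sqrt();
    if denom == 0.0 {
        0.0
    } else {
        dot(a, b) / denom
    }
}
```

For unit-length vectors the three metrics produce the same ranking, which is why embeddings are often normalized once at insert time.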

§Development Protocol

EdgeVec follows a strict, gated development protocol:

  1. Architecture Phase: design docs must be approved before planning.
  2. Planning Phase: the roadmap must be approved before coding.
  3. Implementation Phase: weekly tasks must be approved before coding.

Every gate requires HOSTILE_REVIEWER approval.

§Example

use edgevec::{HnswConfig, HnswIndex, Metric, VectorStorage};

// 1. Create Config
let config = HnswConfig::new(128);

// 2. Initialize Storage and Index
let mut storage = VectorStorage::new(&config, None);
let mut index = HnswIndex::new(config, &storage).expect("failed to create index");

// 3. Insert Vectors
let vector = vec![0.5; 128];
let id = index.insert(&vector, &mut storage).expect("failed to insert");

// 4. Search
let query = vec![0.5; 128];
let results = index.search(&query, 10, &storage).expect("failed to search");

assert!(!results.is_empty());
assert_eq!(results[0].vector_id, id);

§Persistence Example

use edgevec::{HnswConfig, HnswIndex, VectorStorage};
use edgevec::persistence::{write_snapshot, read_snapshot, MemoryBackend};

// Create index and storage
let config = HnswConfig::new(128);
let mut storage = VectorStorage::new(&config, None);
let mut index = HnswIndex::new(config, &storage).expect("failed to create");

// Save snapshot using storage backend
let mut backend = MemoryBackend::new();
write_snapshot(&index, &storage, &mut backend).expect("failed to save");

// Load snapshot
let (loaded_index, loaded_storage) = read_snapshot(&backend).expect("failed to load");
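To illustrate how a CRC32-checksummed log record lets recovery detect torn or corrupted writes, here is a minimal sketch. The record layout (`[len][crc32][payload]`) and the function names are assumptions made for this example; EdgeVec's actual WAL format is not shown in this document.

```rust
use std::convert::TryInto;

/// Bitwise CRC-32 (IEEE 802.3, reflected polynomial 0xEDB88320).
fn crc32(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // All ones when the low bit is set, zero otherwise.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

/// Hypothetical record layout: [len: u32 LE][crc32: u32 LE][payload].
fn encode_record(payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(8 + payload.len());
    out.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    out.extend_from_slice(&crc32(payload).to_le_bytes());
    out.extend_from_slice(payload);
    out
}

/// Returns the payload only if the record is complete and its checksum
/// matches. A truncated or corrupted record yields `None`, which is the
/// point where log replay would stop.
fn decode_record(buf: &[u8]) -> Option<&[u8]> {
    let len = u32::from_le_bytes(buf.get(0..4)?.try_into().ok()?) as usize;
    let stored = u32::from_le_bytes(buf.get(4..8)?.try_into().ok()?);
    let payload = buf.get(8..8usize.checked_add(len)?)?;
    if crc32(payload) == stored {
        Some(payload)
    } else {
        None
    }
}
```

During crash recovery, a reader of this kind would replay records in order and discard everything from the first record that fails to decode.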

§Next Steps (Phase 5)

  1. Documentation: Finalize API docs.
  2. NPM Package: Release to npm registry.
  3. Performance: Final tuning and benchmarks.

§Documentation

§EdgeVec


The first WASM-native vector database. Binary quantization, metadata filtering, memory management — all in the browser.

EdgeVec is an embedded vector database built in Rust with first-class WebAssembly support. It brings server-grade vector database features to the browser: 32x memory reduction via binary quantization, metadata filtering, soft delete, persistence, and sub-millisecond search.


§Why EdgeVec?

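| Feature | EdgeVec | hnswlib-wasm | Pinecone |
| --- | --- | --- | --- |
| Vector Search | Yes | Yes | Yes |
| Binary Quantization | Yes (32x) | No | No |
| Metadata Filtering | Yes | No | Yes |
| SQL-like Queries | Yes | No | Yes |
| Memory Pressure API | Yes | No | No |
| Soft Delete | Yes | No | Yes |
| Persistence | Yes | No | Yes |
| Browser-native | Yes | Yes | No |
| No server required | Yes | Yes | No |
| Offline capable | Yes | Yes | No |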

EdgeVec is the only WASM vector database with binary quantization and filtered search.


§Try It Now

Build filters visually, see live results, copy-paste ready code:

Filter Playground - Interactive filter builder with live sandbox

  • Visual filter construction
  • 10 ready-to-use examples
  • Live WASM execution
  • Copy-paste code snippets (JS/TS/React)

§Quick Start

npm install edgevec

import init, { EdgeVec } from 'edgevec';

await init();

// Create index (768D for embeddings like OpenAI, Cohere)
const db = new EdgeVec({ dimensions: 768 });

// Insert vectors with metadata (v0.6.0)
const vector = new Float32Array(768).map(() => Math.random());
const id = db.insertWithMetadata(vector, {
    category: "books",
    price: 29.99,
    inStock: true
});

// Search with filter expression (v0.6.0)
const query = new Float32Array(768).map(() => Math.random());
const results = db.searchWithFilter(query, 'category = "books" AND price < 50', 10);

// Fast BQ search with rescoring: 32x less memory, ~95% recall (v0.6.0)
const fastResults = db.searchBQRescored(query, 10, 5);

// Monitor memory pressure (v0.6.0)
const pressure = db.getMemoryPressure();
if (pressure.level === 'warning') {
    db.compact();  // Free deleted vectors
}

§Interactive Demos

Try EdgeVec directly in your browser:

| Demo | Description |
| --- | --- |
| Filter Playground v0.7.0 | Visual filter builder with live sandbox (NEW!) |
| v0.6.0 Cyberpunk Demo | BQ vs F32 comparison, metadata filtering, memory pressure |
| Demo Hub | All demos in one place |

Run locally:

| Demo | Path |
| --- | --- |
| SIMD Benchmark | wasm/examples/simd_benchmark.html |
| Benchmark Dashboard | wasm/examples/benchmark-dashboard.html |
| Soft Delete Demo | wasm/examples/soft_delete.html |
| Main Demo | wasm/examples/index.html |

# Run demos locally
git clone https://github.com/matte1782/edgevec.git
cd edgevec
python -m http.server 8080
# Open http://localhost:8080/wasm/examples/index.html

§Performance

EdgeVec v0.7.0 uses SIMD instructions for 2x+ faster vector operations on modern browsers.

§Distance Calculation (Native Benchmark)

| Dimension | Dot Product | L2 Distance | Throughput |
| --- | --- | --- | --- |
| 128 | 55 ns | 66 ns | 2.3 Gelem/s |
| 384 | 188 ns | 184 ns | 2.1 Gelem/s |
| 768 | 374 ns | 358 ns | 2.1 Gelem/s |
| 1536 | 761 ns | 693 ns | 2.1 Gelem/s |

§Search Latency (768D vectors, k=10)

| Scale | EdgeVec | Target | Status |
| --- | --- | --- | --- |
| 1k vectors | 380 us | <1 ms | 2.6x under |
| 10k vectors | 938 us | <1 ms | PASS |

§Hamming Distance (Binary Quantization)

| Operation | Time | Throughput |
| --- | --- | --- |
| 768-bit pair | 4.5 ns | 40 GiB/s |
| Batch 10k | 79 us | 127 Melem/s |

§Browser Support

| Browser | SIMD | Performance |
| --- | --- | --- |
| Chrome 91+ | YES | Full speed |
| Firefox 89+ | YES | Full speed |
| Safari 16.4+ | YES | Full speed (macOS) |
| Edge 91+ | YES | Full speed |
| iOS Safari | NO | Scalar fallback |

Note: iOS Safari doesn’t support WASM SIMD. EdgeVec automatically uses scalar fallback, which is ~2x slower but still functional.
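The fallback behavior described in the note can be pictured as a backend selection step that prefers a SIMD kernel and otherwise drops to scalar code. The sketch below is an assumption about how such a selector might look; `DemoBackend` and `pick_backend` are hypothetical names and are not the crate's `SimdBackend` or `select_backend`.

```rust
/// Hypothetical backend tag for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DemoBackend {
    WasmSimd128,
    Avx2,
    Neon,
    Scalar,
}

fn pick_backend() -> DemoBackend {
    // For WASM targets, SIMD availability is typically fixed when the
    // module is compiled (target feature `simd128`), so this is a
    // compile-time check rather than a runtime probe.
    if cfg!(all(target_arch = "wasm32", target_feature = "simd128")) {
        return DemoBackend::WasmSimd128;
    }
    // On x86_64 the standard library can probe the CPU at runtime.
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            return DemoBackend::Avx2;
        }
    }
    // NEON is treated as always available on AArch64 in this sketch.
    if cfg!(target_arch = "aarch64") {
        return DemoBackend::Neon;
    }
    // Everything else uses the portable scalar kernels.
    DemoBackend::Scalar
}
```

A caller would match on the returned value once and store a function pointer to the chosen kernel, so the selection cost is not paid per distance computation.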

§Bundle Size

| Package | Size (gzip) | Notes |
| --- | --- | --- |
| edgevec | 217 KB | SIMD enabled (541 KB uncompressed) |

Full benchmark report ->


§Database Features

§Binary Quantization (v0.6.0)

32x memory reduction with minimal recall loss:

// BQ is auto-enabled for dimensions divisible by 8
const db = new EdgeVec({ dimensions: 768 });

// Raw BQ search (~85% recall, ~5x faster)
const bqResults = db.searchBQ(query, 10);

// BQ + rescore (~95% recall, ~3x faster)
const rescoredResults = db.searchBQRescored(query, 10, 5);

| Mode | Memory (100k × 768D) | Speed | Recall@10 |
| --- | --- | --- | --- |
| F32 (baseline) | ~300 MB | 1x | 100% |
| BQ raw | ~10 MB | 5x | ~85% |
| BQ + rescore(5) | ~10 MB | 3x | ~95% |
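The 32x figure follows from storing one sign bit per dimension instead of a 32-bit float: a 768D vector shrinks from 3072 bytes to 96. The sketch below shows one common way to do this, with a Hamming-distance shortlist that is then rescored using exact L2. It is a brute-force illustration, not EdgeVec's code; the function names are hypothetical, and reading the third argument of `searchBQRescored` as a shortlist multiplier is an assumption.

```rust
/// Pack sign bits: bit i is set when v[i] > 0.0.
fn binarize(v: &[f32]) -> Vec<u8> {
    let mut bits = vec![0u8; (v.len() + 7) / 8];
    for (i, &x) in v.iter().enumerate() {
        if x > 0.0 {
            bits[i / 8] |= 1u8 << (i % 8);
        }
    }
    bits
}

/// Number of differing bits between two packed codes.
fn hamming(a: &[u8], b: &[u8]) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}

fn l2_squared(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Shortlist `k * factor` candidates by Hamming distance on the binary
/// codes, then reorder that shortlist by exact L2 and keep the top `k`.
fn search_bq_rescored(
    vectors: &[Vec<f32>],
    codes: &[Vec<u8>],
    query: &[f32],
    k: usize,
    factor: usize,
) -> Vec<usize> {
    let q = binarize(query);
    let mut shortlist: Vec<usize> = (0..codes.len()).collect();
    shortlist.sort_by_key(|&i| hamming(&codes[i], &q));
    shortlist.truncate(k.saturating_mul(factor));
    shortlist.sort_by(|&i, &j| {
        l2_squared(&vectors[i], query).total_cmp(&l2_squared(&vectors[j], query))
    });
    shortlist.truncate(k);
    shortlist
}
```

Rescoring touches only the shortlist, so the full-precision vectors can stay in slower storage while the compact codes are scanned in memory.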

§Metadata Filtering (v0.6.0)

Insert vectors with metadata, search with SQL-like filter expressions:

// Insert with metadata
db.insertWithMetadata(vector, {
    category: "electronics",
    price: 299.99,
    tags: ["featured", "sale"]
});

// Search with filter
db.searchWithFilter(query, 'category = "electronics" AND price < 500', 10);
db.searchWithFilter(query, 'tags ANY ["featured"]', 10);  // Array membership

// Complex expressions
db.searchWithFilter(query,
    '(category = "electronics" OR category = "books") AND price < 100',
    10
);

Operators: =, !=, >, <, >=, <=, AND, OR, NOT, ANY
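Conceptually, a filter expression is parsed into a tree and evaluated against each candidate's metadata. The sketch below covers only the evaluation step for the operators listed above. All type and function names are hypothetical, parsing is omitted, and treating a missing field as a non-match is an assumption rather than documented EdgeVec behavior.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
enum Value {
    Str(String),
    Num(f64),
    Bool(bool),
    List(Vec<Value>),
}

#[derive(Debug, Clone, Copy)]
enum CmpOp { Eq, Ne, Gt, Lt, Ge, Le }

#[derive(Debug, Clone)]
enum Filter {
    Cmp(String, CmpOp, Value),
    Any(String, Vec<Value>),
    And(Box<Filter>, Box<Filter>),
    Or(Box<Filter>, Box<Filter>),
    Not(Box<Filter>),
}

/// Evaluate a filter tree against one record's metadata.
fn eval(filter: &Filter, meta: &HashMap<String, Value>) -> bool {
    match filter {
        Filter::Cmp(field, op, rhs) => match (meta.get(field), op) {
            (Some(lhs), CmpOp::Eq) => lhs == rhs,
            (Some(lhs), CmpOp::Ne) => lhs != rhs,
            // Ordering comparisons are defined for numbers only here.
            (Some(Value::Num(l)), op) => match rhs {
                Value::Num(r) => match op {
                    CmpOp::Gt => l > r,
                    CmpOp::Lt => l < r,
                    CmpOp::Ge => l >= r,
                    CmpOp::Le => l <= r,
                    CmpOp::Eq | CmpOp::Ne => unreachable!("handled above"),
                },
                _ => false,
            },
            // Missing field or mismatched types: no match (assumption).
            _ => false,
        },
        // ANY: true when the stored list shares an element with `wanted`.
        Filter::Any(field, wanted) => match meta.get(field) {
            Some(Value::List(items)) => items.iter().any(|v| wanted.contains(v)),
            _ => false,
        },
        Filter::And(a, b) => eval(a, meta) && eval(b, meta),
        Filter::Or(a, b) => eval(a, meta) || eval(b, meta),
        Filter::Not(inner) => !eval(inner, meta),
    }
}
```

In this representation, `category = "books" AND price < 50` becomes `Filter::And` wrapping two `Filter::Cmp` nodes.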

Filter syntax documentation ->

§Memory Pressure API (v0.6.0)

Monitor and control WASM heap usage:

const pressure = db.getMemoryPressure();
// { level: 'normal', usedBytes: 52428800, totalBytes: 268435456, usagePercent: 19.5 }

if (pressure.level === 'warning') {
    db.compact();  // Free deleted vectors
}

if (!db.canInsert()) {
    console.warn('Memory critical, inserts blocked');
}
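The pressure report above can be thought of as a usage ratio mapped onto levels. The sketch below reproduces the example numbers; the 80% and 95% thresholds used in the usage note, like the type and function names, are illustrative assumptions, since EdgeVec's real thresholds are not stated here.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PressureLevel {
    Normal,
    Warning,
    Critical,
}

#[derive(Debug, Clone, Copy)]
struct MemoryPressure {
    level: PressureLevel,
    used_bytes: u64,
    total_bytes: u64,
    usage_percent: f64,
}

/// Classify heap usage. `warning_pct` and `critical_pct` are supplied by
/// the caller because the real thresholds are not known to this sketch.
fn memory_pressure(
    used_bytes: u64,
    total_bytes: u64,
    warning_pct: f64,
    critical_pct: f64,
) -> MemoryPressure {
    let usage_percent = if total_bytes == 0 {
        100.0
    } else {
        used_bytes as f64 * 100.0 / total_bytes as f64
    };
    let level = if usage_percent >= critical_pct {
        PressureLevel::Critical
    } else if usage_percent >= warning_pct {
        PressureLevel::Warning
    } else {
        PressureLevel::Normal
    };
    MemoryPressure { level, used_bytes, total_bytes, usage_percent }
}

/// Inserts are refused only at the critical level in this sketch.
fn can_insert(p: &MemoryPressure) -> bool {
    p.level != PressureLevel::Critical
}
```

With the values from the example, `memory_pressure(52_428_800, 268_435_456, 80.0, 95.0)` gives about 19.5% usage and the normal level.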

§Soft Delete & Compaction

// O(1) soft delete
db.softDelete(id);

// Check status
console.log('Live:', db.liveCount());
console.log('Deleted:', db.deletedCount());

// Reclaim space when needed
if (db.needsCompaction()) {
    const result = db.compact();
    console.log(`Removed ${result.tombstones_removed} tombstones`);
}
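Soft delete is O(1) because it only flips a tombstone flag; the space is reclaimed later by compaction. The sketch below shows that idea on a flat vector store. `TombstoneStore` and its methods are hypothetical; unlike a real index, it does not repair graph links or keep IDs stable across compaction.

```rust
struct TombstoneStore {
    vectors: Vec<Vec<f32>>,
    deleted: Vec<bool>,
    tombstones: usize,
}

impl TombstoneStore {
    fn new() -> Self {
        TombstoneStore { vectors: Vec::new(), deleted: Vec::new(), tombstones: 0 }
    }

    /// Append a vector and return its slot index.
    fn insert(&mut self, v: Vec<f32>) -> usize {
        self.vectors.push(v);
        self.deleted.push(false);
        self.vectors.len() - 1
    }

    /// O(1): mark the slot as deleted. Returns false if the id is
    /// unknown or already deleted.
    fn soft_delete(&mut self, id: usize) -> bool {
        match self.deleted.get_mut(id) {
            Some(flag) if !*flag => {
                *flag = true;
                self.tombstones += 1;
                true
            }
            _ => false,
        }
    }

    fn live_count(&self) -> usize {
        self.vectors.len() - self.tombstones
    }

    fn deleted_count(&self) -> usize {
        self.tombstones
    }

    /// True when the tombstone ratio exceeds `threshold` (0.0 to 1.0).
    fn needs_compaction(&self, threshold: f64) -> bool {
        !self.vectors.is_empty()
            && self.tombstones as f64 / self.vectors.len() as f64 > threshold
    }

    /// Drop tombstoned vectors and return how many were removed.
    /// Slot indices shift afterwards in this simplified version.
    fn compact(&mut self) -> usize {
        let removed = self.tombstones;
        let mut keep = self.deleted.iter().map(|d| !d);
        self.vectors.retain(|_| keep.next().unwrap());
        self.deleted.clear();
        self.deleted.resize(self.vectors.len(), false);
        self.tombstones = 0;
        removed
    }
}
```

Deferring the expensive work to `compact` keeps deletes cheap, at the cost of carrying dead entries until the tombstone ratio justifies a rebuild.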

§Persistence

// Save to IndexedDB (browser) or filesystem
await db.save("my-vector-db");

// Load existing database
const restored = await EdgeVec.load("my-vector-db");

§Scalar Quantization

const config = new EdgeVecConfig(768);
config.quantized = true;  // Enable SQ8 quantization

// 3.6x memory reduction: 3.03 GB -> 832 MB at 1M vectors
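SQ8 stores each component as one byte instead of a four-byte float, which is where the roughly 4x payload reduction comes from (the 3.6x figure above presumably includes non-vector overhead). A minimal sketch, assuming a single global min/max range; the `Sq8` type is hypothetical, and EdgeVec's `ScalarQuantizer` may calibrate differently.

```rust
/// Linear 8-bit quantizer: code = round((x - min) / scale).
struct Sq8 {
    min: f32,
    scale: f32,
}

impl Sq8 {
    /// Derive the range from sample data (must be non-empty).
    fn train(data: &[f32]) -> Sq8 {
        assert!(!data.is_empty(), "need at least one sample");
        let min = data.iter().copied().fold(f32::INFINITY, f32::min);
        let max = data.iter().copied().fold(f32::NEG_INFINITY, f32::max);
        Sq8 { min, scale: (max - min) / 255.0 }
    }

    /// f32 -> u8. Values outside the trained range are clamped.
    fn encode(&self, v: &[f32]) -> Vec<u8> {
        v.iter()
            .map(|&x| {
                if self.scale == 0.0 {
                    0
                } else {
                    ((x - self.min) / self.scale).round().clamp(0.0, 255.0) as u8
                }
            })
            .collect()
    }

    /// u8 -> approximate f32. The error is at most half a step (scale / 2)
    /// for values inside the trained range.
    fn decode(&self, codes: &[u8]) -> Vec<f32> {
        codes.iter().map(|&c| self.min + f32::from(c) * self.scale).collect()
    }
}
```

The reconstruction error is bounded by half a quantization step, which is why recall stays high when the value range is narrow.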

§Rust Usage

use edgevec::{HnswConfig, HnswIndex, VectorStorage};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let config = HnswConfig::new(768);
    let mut storage = VectorStorage::new(&config, None);
    let mut index = HnswIndex::new(config, &storage)?;

    // Insert
    let vector = vec![0.1; 768];
    let id = index.insert(&vector, &mut storage)?;

    // Search
    let query = vec![0.1; 768];
    let results = index.search(&query, 10, &storage)?;

    // Soft delete
    index.soft_delete(id)?;

    Ok(())
}

§Documentation

| Document | Description |
| --- | --- |
| Tutorial | Getting started guide |
| Filter Syntax | Complete filter expression reference |
| Database Operations | CRUD operations guide |
| Performance Tuning | HNSW parameter optimization |
| Migration Guide | Migrating from hnswlib, FAISS, Pinecone |
| Comparison | When to use EdgeVec vs alternatives |

§Limitations

EdgeVec is designed for client-side vector search. It is NOT suitable for:

  • Billion-scale datasets — Browser memory limits apply (~1GB practical limit)
  • Multi-user concurrent access — Single-user, single-tab design
  • Distributed deployments — Runs locally only

For these use cases, consider Pinecone, Qdrant, or Weaviate.


§Version History

  • v0.7.0 — SIMD acceleration (2x+ speedup), First Community Contribution (@jsonMartin — 8.75x Hamming)
  • v0.6.0 — Binary quantization (32x memory), metadata storage, memory pressure API
  • v0.5.4 — iOS Safari compatibility fixes
  • v0.5.3 — crates.io publishing fix (package size reduction)
  • v0.5.2 — npm TypeScript compilation fix
  • v0.5.0 — Metadata filtering with SQL-like syntax, Filter Playground demo
  • v0.4.0 — Documentation sprint, benchmark dashboard, chaos testing
  • v0.3.0 — Soft delete API, compaction, persistence format v3
  • v0.2.0 — Scalar quantization (SQ8), SIMD optimization
  • v0.1.0 — Initial release with HNSW indexing

§Contributors

Thank you to everyone who has contributed to EdgeVec!

| Contributor | Contribution |
| --- | --- |
| @jsonMartin | SIMD Hamming distance (PR #4), 8.75x speedup |

§License

Licensed under either of:

at your option.


Built with Rust + WebAssembly


Re-exports§

pub use batch::BatchInsertable;
pub use error::BatchError;
pub use hnsw::BatchDeleteError;
pub use hnsw::BatchDeleteResult;
pub use hnsw::HnswConfig;
pub use hnsw::HnswIndex;
pub use hnsw::SearchResult;
pub use metric::Metric;
pub use persistence::ChunkedWriter;
pub use quantization::BinaryQuantizer;
pub use quantization::QuantizedVector;
pub use quantization::QuantizerConfig;
pub use quantization::ScalarQuantizer;
pub use simd::capabilities;
pub use simd::detect_neon;
pub use simd::select_backend;
pub use simd::warn_if_suboptimal;
pub use simd::SimdBackend;
pub use simd::SimdCapabilities;
pub use storage::VectorStorage;

Modules§

batch
Batch insertion API for HNSW indexes.
error
Unified error hierarchy for EdgeVec.
filter
Filter expression parsing and evaluation.
hnsw
HNSW graph implementation: graph logic, configuration, and search.
metadata
Metadata storage for vector annotations.
metric
Distance metrics for vector comparison.
persistence
Persistence and file format definitions.
quantization
Quantization logic for vector compression.
simd
SIMD capability detection and runtime optimization.
storage
Vector storage.
wasm
WASM bindings for EdgeVec.

Macros§

simd_dispatch
Unified SIMD dispatch macro for compile-time platform selection.

Constants§

VERSION
The crate version string.

Functions§

version
Returns the crate version string.