§EdgeVec
High-performance embedded vector database for Browser, Node, and Edge.
§Current Status
PHASE 3: Implementation (Week 7 Complete)
Status: Week 7 Complete, Persistence Hardened
Core vector storage, HNSW graph indexing, and full durability (WAL + Snapshots) are implemented and verified.
§Implemented Features
- HNSW Graph: Full insertion and search implementation with heuristic optimization.
- Vector Storage: Contiguous memory layout for fast access.
- Scalar Quantization (SQ8): 4x reduction in raw vector data size (f32 -> u8) with high accuracy.
- Durability: Write-Ahead Log (WAL) with CRC32 checksums, crash recovery, and atomic snapshots.
- Metrics: L2 (Euclidean), Cosine, and Dot Product distance functions.
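The three metrics in the list above have short reference definitions. What follows is a minimal sketch in plain Rust, not EdgeVec's implementation. The function names are illustrative assumptions. Cosine is written as a distance (1 - similarity), and returning 1.0 for a zero-length vector is a convention chosen for this sketch; the crate may report scores differently.

```rust
/// Euclidean (L2) distance between two equal-length vectors.
fn l2_distance(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "dimension mismatch");
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y) * (x - y))
        .sum::<f32>()
        .sqrt()
}

/// Dot product of two equal-length vectors.
fn dot_product(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "dimension mismatch");
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Cosine distance, defined here as 1 - cosine similarity.
/// Assumption of this sketch: a zero vector is treated as maximally distant.
fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    let norm_a = dot_product(a, a).sqrt();
    let norm_b = dot_product(b, b).sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 1.0;
    }
    1.0 - dot_product(a, b) / (norm_a * norm_b)
}
```

Calling `l2_distance(&[0.0, 0.0], &[3.0, 4.0])` gives the familiar 3-4-5 triangle result, and parallel vectors have a cosine distance of zero regardless of their lengths.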
§Development Protocol
EdgeVec follows a military-grade development protocol:
- Architecture Phase: Design docs must be approved before planning
- Planning Phase: Roadmap must be approved before coding
- Implementation Phase: Weekly tasks must be approved before coding
- All gates require HOSTILE_REVIEWER approval
§Example
use edgevec::{HnswConfig, HnswIndex, Metric, VectorStorage};
// 1. Create Config
let config = HnswConfig::new(128);
// 2. Initialize Storage and Index
let mut storage = VectorStorage::new(&config, None);
let mut index = HnswIndex::new(config, &storage).expect("failed to create index");
// 3. Insert Vectors
let vector = vec![0.5; 128];
let id = index.insert(&vector, &mut storage).expect("failed to insert");
// 4. Search
let query = vec![0.5; 128];
let results = index.search(&query, 10, &storage).expect("failed to search");
assert!(!results.is_empty());
assert_eq!(results[0].vector_id, id);
§Persistence Example
use edgevec::{HnswConfig, HnswIndex, VectorStorage};
use edgevec::persistence::{write_snapshot, read_snapshot, MemoryBackend};
// Create index and storage
let config = HnswConfig::new(128);
let mut storage = VectorStorage::new(&config, None);
let mut index = HnswIndex::new(config, &storage).expect("failed to create");
// Save snapshot using storage backend
let mut backend = MemoryBackend::new();
write_snapshot(&index, &storage, &mut backend).expect("failed to save");
// Load snapshot
let (loaded_index, loaded_storage) = read_snapshot(&backend).expect("failed to load");
§Next Steps (Phase 5)
- Documentation: Finalize API docs.
- NPM Package: Release to npm registry.
- Performance: Final tuning and benchmarks.
§Documentation
§EdgeVec
High-performance vector search for Browser, Node, and Edge
STATUS: Alpha Release Ready. All performance targets exceeded.
§What's New in v0.3.0
§Soft Delete API (RFC-001)
- soft_delete(id): O(1) tombstone-based deletion
- is_deleted(id): Check deletion status
- deleted_count() / live_count(): Vector statistics
- tombstone_ratio(): Monitor index health
§Compaction API
- compact(): Rebuild index removing all tombstones
- needs_compaction(): Check if compaction recommended
- compaction_warning(): Get actionable warning message
- Configurable threshold (default: 30% tombstones)
§WASM Bindings
- Full soft delete API exposed to JavaScript/TypeScript
- softDelete(), isDeleted(), deletedCount(), liveCount()
- compact(), needsCompaction(), compactionWarning()
- Interactive browser demo at /wasm/examples/soft_delete.html
§Persistence Format v0.3
- Automatic migration from v0.2 snapshots
- Tombstone state preserved across save/load cycles
§Previous (v0.2.1)
- Safety hardening with bytemuck for alignment-verified operations
- Batch insert API with progress callback
- 24x faster search than voy (fastest pure-WASM competitor)
§What is EdgeVec?
EdgeVec is an embedded vector database built in Rust with first-class WASM support. It's designed to run anywhere: browsers, Node.js, mobile apps, and edge devices.
§Key Features
- Sub-millisecond search: 0.23ms at 100k vectors (768d, quantized)
- HNSW Indexing: O(log n) approximate nearest neighbor search
- Scalar Quantization (SQ8): 3.6x memory compression
- WASM-First: Native browser support via WebAssembly
- Persistent Storage: IndexedDB in browser, file system elsewhere
- Minimal Dependencies: No C compiler required, WASM-ready
- Tiny Bundle: 213 KB gzipped (57% under 500KB target)
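To illustrate what scalar quantization from f32 to u8 involves, here is a minimal sketch that maps each component onto 256 levels between the vector's own minimum and maximum. This is one common scheme, shown under stated assumptions. The crate re-exports a ScalarQuantizer, but its API and calibration strategy are not shown on this page, so the type and function names below are hypothetical.

```rust
/// A vector stored as one byte per component plus two f32 parameters.
/// Hypothetical layout for illustration only.
struct QuantizedVec {
    min: f32,
    scale: f32,
    codes: Vec<u8>,
}

/// Maps each component onto 0..=255 using the vector's own min/max range.
fn quantize(v: &[f32]) -> QuantizedVec {
    let min = v.iter().copied().fold(f32::INFINITY, f32::min);
    let max = v.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let range = max - min;
    // Avoid division by zero when all components are equal (or v is empty).
    let scale = if range > 0.0 { range / 255.0 } else { 1.0 };
    let codes = v
        .iter()
        .map(|&x| ((x - min) / scale).round().clamp(0.0, 255.0) as u8)
        .collect();
    QuantizedVec { min, scale, codes }
}

/// Reconstructs approximate f32 values from the byte codes.
fn dequantize(q: &QuantizedVec) -> Vec<f32> {
    q.codes.iter().map(|&c| q.min + c as f32 * q.scale).collect()
}
```

With this scheme the reconstruction error per component is at most half a quantization step (`scale / 2`), which is the sense in which a 4x smaller encoding can still retain high accuracy.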
§Quick Start
§Installation
npm install edgevec
For Rust users: To achieve optimal performance, ensure your .cargo/config.toml includes:
[build]
rustflags = ["-C", "target-cpu=native"]
Without this configuration, performance will be 60-78% slower due to missing SIMD optimizations.
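Whether a given machine and build actually have AVX2 and FMA available can be checked with the standard library alone. The sketch below is an assumption-labelled illustration: the crate re-exports simd::capabilities and simd::warn_if_suboptimal, but their signatures are not shown on this page, so only std facilities are used and the function names here are hypothetical.

```rust
/// Runtime check: does the CPU executing this binary support AVX2 and FMA?
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn detect_avx2_fma() -> (bool, bool) {
    (
        std::is_x86_feature_detected!("avx2"),
        std::is_x86_feature_detected!("fma"),
    )
}

/// On non-x86 targets these features do not exist.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn detect_avx2_fma() -> (bool, bool) {
    (false, false)
}

/// Compile-time check: was this binary built with AVX2 enabled,
/// for example via target-cpu=native on an AVX2-capable machine?
fn compiled_with_avx2() -> bool {
    cfg!(target_feature = "avx2")
}
```

If `detect_avx2_fma()` reports support but `compiled_with_avx2()` returns false, the build was most likely made without the rustflags setting above.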
§Browser/Node.js Usage
import init, { EdgeVec, EdgeVecConfig } from 'edgevec';
async function main() {
  // 1. Initialize WASM (required once)
  await init();
  // 2. Create Config and Index
  const config = new EdgeVecConfig(128); // 128 dimensions
  config.metric = 'cosine'; // Optional: 'l2', 'cosine', or 'dot'
  const index = new EdgeVec(config);
  // 3. Insert Vectors
  const vector = new Float32Array(128).fill(0.1);
  const id = index.insert(vector);
  console.log(`Inserted vector with ID: ${id}`);
  // 4. Search
  const query = new Float32Array(128).fill(0.1);
  const results = index.search(query, 10);
  console.log("Results:", results);
  // Results: [{ id: 0, score: 0.0 }, ...]
  // 5. Save to IndexedDB (browser) or file system
  await index.save("my-vector-db");
}
main().catch(console.error);
§Load Existing Index
import init, { EdgeVec } from 'edgevec';
await init();
const index = await EdgeVec.load("my-vector-db");
const results = index.search(queryVector, 10);
§Rust Usage
use edgevec::{HnswConfig, HnswIndex, VectorStorage};
use edgevec::persistence::{write_snapshot, MemoryBackend};
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // 1. Create Config & Storage
    let config = HnswConfig::new(128);
    let mut storage = VectorStorage::new(&config, None);
    // 2. Create Index
    let mut index = HnswIndex::new(config, &storage)?;
    // 3. Insert Vectors
    let vec1 = vec![1.0; 128];
    let _id1 = index.insert(&vec1, &mut storage)?;
    // 4. Search
    let query = vec![1.0; 128];
    let results = index.search(&query, 10, &storage)?;
    println!("Found {} results", results.len());
    // 5. Save Snapshot
    let mut backend = MemoryBackend::new();
    write_snapshot(&index, &storage, &mut backend)?;
    Ok(())
}
§Batch Insert (Rust)
For inserting many vectors efficiently, use the batch insert API:
use edgevec::{HnswConfig, HnswIndex, VectorStorage};
use edgevec::batch::BatchInsertable;
use edgevec::error::BatchError;
fn main() -> Result<(), BatchError> {
    let config = HnswConfig::new(128);
    let mut storage = VectorStorage::new(&config, None);
    let mut index = HnswIndex::new(config, &storage).unwrap();
    // Prepare vectors as (id, data) tuples
    let vectors: Vec<(u64, Vec<f32>)> = (1..=1000)
        .map(|i| (i as u64, vec![i as f32; 128]))
        .collect();
    // Batch insert with progress tracking
    let ids = index.batch_insert(vectors, &mut storage, Some(|inserted, total| {
        println!("Progress: {}/{}", inserted, total);
    }))?;
    println!("Inserted {} vectors", ids.len());
    Ok(())
}
Features: Progress tracking, best-effort semantics, and unified error handling.
§Soft Delete (Rust)
Delete vectors without rebuilding the index (v0.3.0+):
use edgevec::{HnswConfig, HnswIndex, VectorStorage};
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let config = HnswConfig::new(128);
    let mut storage = VectorStorage::new(&config, None);
    let mut index = HnswIndex::new(config, &storage)?;
    // Insert a vector
    let vector = vec![1.0; 128];
    let id = index.insert(&vector, &mut storage)?;
    // Soft delete (O(1) operation)
    let was_deleted = index.soft_delete(id)?;
    println!("Deleted: {}", was_deleted);
    // Check deletion status
    println!("Is deleted: {}", index.is_deleted(id)?);
    // Get statistics
    println!("Live: {}, Deleted: {}", index.live_count(), index.deleted_count());
    println!("Tombstone ratio: {:.1}%", index.tombstone_ratio() * 100.0);
    // Compact when tombstones accumulate (rebuilds index)
    if index.needs_compaction() {
        let (new_index, new_storage, result) = index.compact(&mut storage)?;
        println!("Removed {} tombstones", result.tombstones_removed);
        // Use new_index and new_storage for future operations
    }
    Ok(())
}
§Soft Delete (JavaScript)
import init, { EdgeVec, EdgeVecConfig } from 'edgevec';
await init();
const config = new EdgeVecConfig(128);
const index = new EdgeVec(config);
// Insert vectors
const vector = new Float32Array(128).fill(0.5);
const id = index.insert(vector);
// Soft delete
const wasDeleted = index.softDelete(id);
console.log('Deleted:', wasDeleted);
// Statistics
console.log('Live:', index.liveCount());
console.log('Deleted:', index.deletedCount());
console.log('Tombstone ratio:', index.tombstoneRatio());
// Compact when needed
if (index.needsCompaction()) {
  const result = index.compact();
  console.log(`Removed ${result.tombstones_removed} tombstones`);
}
| Operation | Time Complexity | Notes |
|---|---|---|
| soft_delete() | O(1) | Set tombstone byte |
| is_deleted() | O(1) | Read tombstone byte |
| search() | O(log n) | Automatically excludes tombstones |
| compact() | O(n log n) | Full index rebuild |
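The constant-time entries in the table follow from keeping one tombstone byte per vector slot. The sketch below shows that idea in isolation; it is a hypothetical structure for illustration, not EdgeVec's internal layout. The threshold is passed in explicitly (the documented default is 30%), and whether the crate compares with `>` or `>=` is an assumption here.

```rust
/// Hypothetical tombstone table: one byte per vector slot, 1 = deleted.
struct Tombstones {
    flags: Vec<u8>,
    deleted: usize,
}

impl Tombstones {
    fn new(len: usize) -> Self {
        Self { flags: vec![0; len], deleted: 0 }
    }

    /// O(1): sets one byte. Returns false if the id is out of range
    /// or was already deleted.
    fn soft_delete(&mut self, id: usize) -> bool {
        match self.flags.get_mut(id) {
            Some(flag) if *flag == 0 => {
                *flag = 1;
                self.deleted += 1;
                true
            }
            _ => false,
        }
    }

    /// O(1): reads one byte.
    fn is_deleted(&self, id: usize) -> bool {
        self.flags.get(id).map_or(false, |&f| f != 0)
    }

    /// Fraction of slots that are tombstoned.
    fn tombstone_ratio(&self) -> f64 {
        if self.flags.is_empty() {
            0.0
        } else {
            self.deleted as f64 / self.flags.len() as f64
        }
    }

    /// Assumption: compaction is suggested once the ratio exceeds the threshold.
    fn needs_compaction(&self, threshold: f64) -> bool {
        self.tombstone_ratio() > threshold
    }

    /// Drops deleted ids from a candidate list, as a search would
    /// before returning results.
    fn filter_live(&self, candidates: &[usize]) -> Vec<usize> {
        candidates
            .iter()
            .copied()
            .filter(|&id| !self.is_deleted(id))
            .collect()
    }
}
```

Because deletion only flips a byte, the graph itself is untouched until compaction, which is why a full rebuild is eventually needed to reclaim space.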
§Development Status
EdgeVec follows a military-grade development protocol. No code is written without an approved plan.
§Alpha Release Ready (v0.1.0)
All Performance Targets Exceeded:
- Search Mean: 0.23ms (4.3x under 1ms target)
- Search P99 (estimated): <600µs (based on Mean + 2σ)
- Memory: 832 MB for 1M vectors (17% under 1GB target)
- Bundle Size: 213 KB (57% under 500KB target)
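The P99 figure above is described as an estimate from the mean plus two standard deviations. A minimal sketch of that calculation follows; the function names are hypothetical and the benchmark samples themselves are not reproduced here. Note that for a normal distribution, mean + 2σ corresponds to roughly the 97.7th percentile (the 99th is near mean + 2.33σ), and latency distributions are often right-skewed, so this is a rough proxy and not a measured percentile.

```rust
/// Returns (mean, sample standard deviation) of a non-empty slice.
fn mean_and_std(samples: &[f64]) -> (f64, f64) {
    assert!(!samples.is_empty(), "need at least one sample");
    let n = samples.len() as f64;
    let mean = samples.iter().sum::<f64>() / n;
    // Sample variance (n - 1 denominator); zero for a single sample.
    let var = if samples.len() > 1 {
        samples.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / (n - 1.0)
    } else {
        0.0
    };
    (mean, var.sqrt())
}

/// Tail estimate of the form mean + z * std. Use z = 2.0 for the
/// "Mean + 2σ" rule quoted above.
fn tail_estimate(samples: &[f64], z: f64) -> f64 {
    let (mean, std) = mean_and_std(samples);
    mean + z * std
}
```

Tracking true percentiles from recorded latencies, as planned under "P99 Tracking", would replace this approximation.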
What Works Now:
- HNSW Indexing: Sub-millisecond search at 100k scale
- Scalar Quantization (SQ8): 3.6x memory reduction
- SIMD Optimization: AVX2/FMA for 60-78% speedup
- Crash Recovery (WAL): Log-based replay
- Atomic Snapshots: Safe background saving
- Browser Integration: WASM Bindings + IndexedDB
- npm Package: edgevec@0.3.0 published
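Log-based replay with checksums can be sketched in a few functions. The record framing below (length, CRC32, payload, all little-endian) is a hypothetical format chosen for illustration; EdgeVec's actual WAL layout is not described on this page. The CRC routine is the standard bitwise CRC-32 (reflected polynomial 0xEDB88320).

```rust
/// Bitwise CRC-32 (IEEE 802.3), reflected polynomial 0xEDB88320.
fn crc32(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            crc = if crc & 1 != 0 { (crc >> 1) ^ 0xEDB8_8320 } else { crc >> 1 };
        }
    }
    !crc
}

/// Appends one record: [payload_len: u32 LE][crc32(payload): u32 LE][payload].
fn append_record(log: &mut Vec<u8>, payload: &[u8]) {
    log.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    log.extend_from_slice(&crc32(payload).to_le_bytes());
    log.extend_from_slice(payload);
}

/// Reads a little-endian u32 starting at `at`.
fn read_u32_le(buf: &[u8], at: usize) -> u32 {
    u32::from_le_bytes([buf[at], buf[at + 1], buf[at + 2], buf[at + 3]])
}

/// Replays records in order, stopping at the first truncated or corrupt one.
fn replay(log: &[u8]) -> Vec<Vec<u8>> {
    let mut records = Vec::new();
    let mut pos = 0usize;
    while log.len() - pos >= 8 {
        let len = read_u32_le(log, pos) as usize;
        let stored_crc = read_u32_le(log, pos + 4);
        let start = pos + 8;
        let end = match start.checked_add(len) {
            Some(e) if e <= log.len() => e,
            _ => break, // torn write: payload is incomplete
        };
        let payload = &log[start..end];
        if crc32(payload) != stored_crc {
            break; // corruption detected: stop replay here
        }
        records.push(payload.to_vec());
        pos = end;
    }
    records
}
```

Stopping at the first bad record means a crash during an append loses at most the record being written, while everything before it is recovered intact.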
Development Progress:
- Phase 0 (Environment Setup): COMPLETE
- Phase 1 (Architecture): COMPLETE
- Phase 2 (Planning): COMPLETE
- Phase 3 (Implementation): COMPLETE
- Phase 4 (WASM Integration): COMPLETE
- Phase 5 (Alpha Release): READY
§What's Next (v0.4.0)
- Multi-vector Delete: Batch delete API
- P99 Tracking: Latency distribution metrics in CI
- ARM/NEON Optimization: Cross-platform SIMD verification
- Mobile Support: iOS Safari and Android Chrome formalized
§Performance (Alpha Release)
§Search Latency (768-dimensional vectors, k=10)
| Scale | Float32 | Quantized (SQ8) | Target | Status |
|---|---|---|---|---|
| 10k vectors | 203 µs | 88 µs | <1 ms | 11x under |
| 50k vectors | 480 µs | 167 µs | <1 ms | 6x under |
| 100k vectors | 572 µs | 329 µs | <1 ms | 3x under |
Note: Mean latencies from Criterion benchmarks (10 samples). Max observed: 622µs (100k Float32). Outliers: 0-20% (mostly high mild/severe). P99 estimates are all <650µs. See docs/benchmarks/ for full analysis.
§Memory Efficiency (768-dimensional vectors)
| Mode | Memory per Vector | 1M Vectors | Compression |
|---|---|---|---|
| Float32 | 3,176 bytes | 3.03 GB | Baseline |
| Quantized (SQ8) | 872 bytes | 832 MB | 3.6x smaller |
Memory per vector includes: vector storage + HNSW graph overhead (node metadata + neighbor pool).
Measured using index.memory_usage() + storage.memory_usage() after building 100k index.
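The per-vector figures in the table can be reproduced with simple arithmetic. The sketch below assumes a fixed graph overhead of 104 bytes per vector; that number is inferred from the table (3,176 - 768 × 4 = 104 and 872 - 768 × 1 = 104) and is not a documented constant. Function names are illustrative.

```rust
/// Bytes per stored vector: raw components plus per-vector graph overhead.
fn bytes_per_vector(dims: u64, bytes_per_component: u64, overhead: u64) -> u64 {
    dims * bytes_per_component + overhead
}

/// Converts a byte count to mebibytes (1 MiB = 1,048,576 bytes).
fn mib(bytes: u64) -> f64 {
    bytes as f64 / (1024.0 * 1024.0)
}

/// Compression ratio of the quantized layout relative to the f32 layout.
fn compression_ratio(dims: u64, overhead: u64) -> f64 {
    bytes_per_vector(dims, 4, overhead) as f64 / bytes_per_vector(dims, 1, overhead) as f64
}
```

With 768 dimensions and the inferred overhead, this gives 3,176 and 872 bytes per vector and a ratio of about 3.64, which explains why the end-to-end compression is 3.6x and not the 4x that applies to raw vector data alone. One million quantized vectors come to about 831.6 MiB, matching the 832 MB figure if MB is read as MiB.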
§Bundle Size
| Package | Size (Gzipped) | Target | Status |
|---|---|---|---|
| edgevec@0.3.0 | 213 KB | <500 KB | 57% under |
§Competitive Comparison (10k vectors, 128 dimensions)
| Library | Search P50 | Insert P50 | Type | Notes |
|---|---|---|---|---|
| EdgeVec | 0.20ms | 0.83ms | WASM | Fastest WASM solution |
| hnswlib-node | 0.05ms | 1.56ms | Native C++ | Requires compilation |
| voy | 4.78ms | 0.03ms | WASM | KD-tree, batch-only |
EdgeVec is 24x faster than voy for search while both are pure WASM. Native bindings (hnswlib-node) are faster but require C++ compilation and don't work in browsers.
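The ratios quoted here come directly from the P50 columns of the comparison table. As a small worked check (the helper name is illustrative):

```rust
/// How many times faster `candidate_ms` is than `baseline_ms`.
fn speedup(baseline_ms: f64, candidate_ms: f64) -> f64 {
    baseline_ms / candidate_ms
}

/// Search speedup of EdgeVec over voy, from the table's P50 values.
fn edgevec_vs_voy_search() -> f64 {
    speedup(4.78, 0.20)
}

/// Search speedup of hnswlib-node over EdgeVec, from the same table.
fn hnswlib_vs_edgevec_search() -> f64 {
    speedup(0.20, 0.05)
}
```

The first ratio is about 23.9, rounded to 24x in the text; the second shows the native library is about 4x faster than EdgeVec on search at this scale.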
§Key Advantages
- Sub-millisecond search at 100k scale
- Fastest pure-WASM solution: 24x faster than voy
- Zero network latency: runs 100% locally (browser, Node, edge)
- Privacy-preserving: no data leaves the device
- Tiny bundle: 213 KB gzipped
- No compilation required, unlike native bindings
§Test Environment
- Hardware: AMD Ryzen 7 5700U, 16GB RAM
- OS: Windows 11
- Rust: 1.94.0-nightly (2025-12-05)
- Criterion: 0.5.x
- Compiler flags: -C target-cpu=native (AVX2 SIMD enabled)
§Development Protocol
§The Agents
| Agent | Role |
|---|---|
| META_ARCHITECT | System design, data layouts |
| PLANNER | Roadmaps, weekly task plans |
| RUST_ENGINEER | Core Rust implementation |
| WASM_SPECIALIST | WASM bindings, browser integration |
| BENCHMARK_SCIENTIST | Performance testing |
| HOSTILE_REVIEWER | Quality gate (has veto power) |
| DOCWRITER | Documentation, README |
§Origins
EdgeVec builds upon lessons learned from binary_semantic_cache, a high-performance semantic caching library. Specifically:
Salvaged (MIT Licensed):
- Hamming distance implementation (~10 lines)
- Binary quantization math (~100 lines)
Built Fresh:
- HNSW graph indexing
- WASM-native architecture
- IndexedDB persistence
- Everything else
§Acknowledgments
- Thanks to the Reddit community for identifying a potential alignment issue in the persistence layer, which led to improved safety via bytemuck in v0.2.1.
- Thanks to the Hacker News community for feedback on competitive positioning and benchmarking.
§License
MIT. See LICENSE
Built with Rust + WebAssembly
Correctness by Construction
Re-exports§
pub use batch::BatchInsertable;
pub use error::BatchError;
pub use hnsw::HnswConfig;
pub use hnsw::HnswIndex;
pub use hnsw::SearchResult;
pub use metric::Metric;
pub use persistence::ChunkedWriter;
pub use quantization::BinaryQuantizer;
pub use quantization::QuantizedVector;
pub use quantization::QuantizerConfig;
pub use quantization::ScalarQuantizer;
pub use simd::capabilities;
pub use simd::warn_if_suboptimal;
pub use simd::SimdCapabilities;
pub use storage::VectorStorage;
Modules§
- batch: Batch insertion API for HNSW indexes.
- error: Unified error hierarchy for EdgeVec.
- hnsw: HNSW module containing graph logic, configuration, and search.
- metric: Distance metrics for vector comparison.
- persistence: Persistence and file format definitions.
- quantization: Quantization logic for vector compression.
- simd: SIMD capability detection and runtime optimization.
- storage: Vector storage.
- wasm: WASM bindings for EdgeVec.
Constants§
- VERSION: The crate version string.
Functions§
- version: Returns the crate version string.