semstore 0.1.0 - Docs.rs

Semantic search in Rust applications is harder than it should be. You either wire up a cloud embedding API (latency, cost, data leaving your machine), run a separate vector database process, or write the plumbing yourself. None of those are reasonable for a library dependency.

semstore is a self-contained semantic index you embed directly in your Rust binary. One struct, four methods, zero infrastructure.

let mut idx = SemanticIndex::open("./index.db")?;

idx.insert("Rust ownership prevents data races at compile time", json!({ "lang": "rust" }))?;
idx.insert("Python uses reference counting for memory management", json!({ "lang": "python" }))?;

let results = idx.search("memory safety", 5)?;
// [0.87] Rust ownership prevents data races at compile time
// [0.74] Python uses reference counting for memory management

No API key. No server. No Python. The BGE-Small model (~23 MB) runs on CPU via ONNX and is cached locally after the first use.

Install

[dependencies]
semstore  = "0.1"
serde_json = "1"

The BGE-Small model (~23 MB) is downloaded from HuggingFace on first use and cached locally.

Feature flags

Feature	Default	Description
`default-embedder`	✓	Bundles BGE-Small-EN-v1.5 via fastembed
`bundled-sqlite`	✓	Statically links SQLite (no system library required)

Bring your own embedder by disabling default-embedder:

semstore = { version = "0.1", default-features = false, features = ["bundled-sqlite"] }

What's inside

BGE-Small-EN-v1.5 — 23 MB ONNX embedding model, runs on CPU
HNSW — approximate nearest-neighbour search via usearch
SQLite — persistent storage for entries and embeddings (survives restarts)

Use case	What it solves
RAG	Retrieve relevant context before calling an LLM
Semantic cache	Avoid redundant LLM calls for similar questions
Knowledge base	Search docs, notes, code by meaning
Deduplication	Find near-duplicate entries in a dataset

Quick start

use semstore::SemanticIndex;
use serde_json::json;

fn main() -> semstore::Result<()> {
    let mut idx = SemanticIndex::open("./index.db")?;

    idx.insert("Rust ownership prevents data races at compile time",
               json!({ "lang": "rust", "topic": "memory" }))?;
    idx.insert("Python uses reference counting for memory management",
               json!({ "lang": "python", "topic": "memory" }))?;

    for r in idx.search("memory safety", 5)? {
        println!("[{:.2}] {}", r.score, r.content);
    }
    Ok(())
}

[0.87] Rust ownership prevents data races at compile time
[0.74] Python uses reference counting for memory management

RAG pattern

fn build_prompt(question: &str, idx: &SemanticIndex) -> semstore::Result<String> {
    let context = idx.search(question, 3)?
        .iter()
        .map(|r| format!("- {}", r.content))
        .collect::<Vec<_>>()
        .join("\n");

    Ok(format!("Context:\n{context}\n\nQuestion: {question}\nAnswer:"))
}

Custom embedder

use semstore::{Embedder, Error, SemanticIndex};

struct OpenAiEmbedder { /* your HTTP client */ }

impl Embedder for OpenAiEmbedder {
    fn embed(&self, text: &str) -> Result<Vec<f32>, Error> {
        // POST to https://api.openai.com/v1/embeddings
        todo!()
    }
    fn dimensions(&self) -> usize { 1536 } // text-embedding-3-small
}

let mut idx = SemanticIndex::builder()
    .embedder(OpenAiEmbedder { /* … */ })
    .path("./index.db")
    .threshold(0.80)
    .build()?;

API

// Constructors
SemanticIndex::open("./index.db")?     // persistent
SemanticIndex::in_memory()?            // ephemeral (tests)
SemanticIndex::builder()               // full configuration
    .path("./index.db")
    .threshold(0.75)
    .embedder(my_embedder)
    .build()?

// Write
idx.insert("content", json!({ "key": "value" }))?        // → u64
idx.insert_batch([("a", json!({})), ("b", json!({}))])?  // → Vec<u64>
idx.remove(id)?                                          // → bool

// Read
idx.search("query", limit)?  // → Vec<SearchResult> sorted by score
idx.len()                    // → usize
idx.stats()?                 // → Stats { total: u64 }

// SearchResult
r.id        // u64
r.content   // String
r.metadata  // serde_json::Value
r.score     // f32 in [0.0, 1.0]

Performance

Operation	Typical (Apple M2 CPU)
First load (model init)	~1–2 s
`embed()` single text	~5 ms
`insert()`	~6 ms
`search()` 10k entries	~6 ms

Examples

cargo run --example basic            # in-memory index
cargo run --example rag              # RAG context building
cargo run --example custom_embedder  # plug in your own model

License

MIT — see LICENSE.