Crate frame_catalog


§Frame Catalog - Vector Similarity Search and RAG Infrastructure

High-performance vector search, embeddings, and retrieval-augmented generation (RAG) for AI systems.

§🔍 HNSW Vector Search

Fast approximate nearest neighbor search using Hierarchical Navigable Small World graphs:

  • Low-latency queries: ~0.5-2 ms on a 10K-document index
  • 384-dimensional embeddings: MiniLM-L6-v2 compatible
  • In-memory index: Optimized for speed
  • Thread-safe: Concurrent read access with RwLock
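For intuition about what the index computes, here is a minimal exact top-k search over L2-normalized vectors. This is a sketch, not the crate's API (`top_k` and its signature are illustrative): HNSW returns approximately the same ranking while visiting only a small fraction of the stored vectors, which is where the sub-2 ms latency comes from.

```rust
/// Exact top-k retrieval: score every vector by dot product and sort.
/// With L2-normalized vectors the dot product equals cosine similarity.
/// This O(n) linear scan is the baseline that HNSW approximates sublinearly.
fn top_k(query: &[f32], index: &[(String, Vec<f32>)], k: usize) -> Vec<(String, f32)> {
    let mut scored: Vec<(String, f32)> = index
        .iter()
        .map(|(id, v)| (id.clone(), query.iter().zip(v).map(|(a, b)| a * b).sum()))
        .collect();
    // Sort descending by similarity score (scores are finite, so unwrap is safe).
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    scored.truncate(k);
    scored
}
```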

§🧠 ONNX Embeddings

Text-to-vector conversion using ONNX Runtime:

  • MiniLM-L6-v2 model (87MB, 384-dim vectors)
  • Batch processing: Encode multiple texts efficiently
  • Normalization: L2-normalized embeddings
  • Fallback: Simple hash-based embeddings for testing
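The idea behind a hash-based fallback can be sketched in a few lines: hash each token into one of 384 buckets, then L2-normalize so that dot products behave like cosine similarity. The bucketing scheme below is an assumption for illustration; the crate's `SimpleEmbeddingGenerator` may differ in detail.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Deterministic hash-based embedding: hash each whitespace token into one
/// of `dim` buckets, count hits, then L2-normalize. Captures no semantics;
/// useful only as a cheap stand-in for a real model in tests.
fn hash_embed(text: &str, dim: usize) -> Vec<f32> {
    let mut v = vec![0.0_f32; dim];
    for token in text.split_whitespace() {
        let mut h = DefaultHasher::new();
        token.hash(&mut h);
        v[(h.finish() as usize) % dim] += 1.0;
    }
    // L2 normalization: after this, dot product == cosine similarity.
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
    v
}
```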

§💾 Persistent Storage

SQLite-backed vector store with optional compression:

  • Document references: Store file paths or spool offsets
  • BytePunch compression: 40-70% space savings
  • DataSpool integration: Bundle multiple documents
  • Lazy loading: Load embeddings on demand

§📚 RAG System

High-level retrieval interface:

  • Automatic chunking: Split documents into overlapping chunks
  • Index + search: One-step document indexing
  • Configurable: Chunk size, overlap, HNSW parameters
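Chunking with overlap can be sketched as a sliding window: each chunk shares its tail with the head of the next, so a sentence cut at a boundary still appears whole in one of the two chunks. The function below is illustrative only; the real chunk size and overlap come from `RetrievalConfig`, and the crate may split on tokens rather than characters.

```rust
/// Split `text` into chunks of up to `size` characters, with `overlap`
/// characters shared between consecutive chunks. Requires overlap < size
/// so the window always advances.
fn chunk_with_overlap(text: &str, size: usize, overlap: usize) -> Vec<String> {
    assert!(overlap < size, "overlap must be smaller than chunk size");
    let chars: Vec<char> = text.chars().collect();
    let step = size - overlap; // how far the window advances each iteration
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < chars.len() {
        let end = (start + size).min(chars.len());
        chunks.push(chars[start..end].iter().collect());
        if end == chars.len() {
            break;
        }
        start += step;
    }
    chunks
}
```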

§🗄️ Event Database

Conversation and event storage:

  • Conversation tracking: Session-based organization
  • Event history: Timestamped event log
  • Metadata storage: JSON metadata per event
  • Search support: Retrieve events by conversation ID
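Retrieval by conversation ID amounts to filtering the event log on a session key and ordering by timestamp. The struct and function below are hypothetical shapes for illustration; the fields of the crate's actual `StoredEvent` and `Conversation` types are not shown in this documentation.

```rust
/// Hypothetical event record: a session key, a timestamp, and JSON metadata.
#[derive(Debug, Clone)]
struct StoredEventSketch {
    conversation_id: String,
    timestamp_ms: u64,
    metadata_json: String,
}

/// Return all events belonging to one conversation, oldest first.
fn events_for(events: &[StoredEventSketch], conversation_id: &str) -> Vec<StoredEventSketch> {
    let mut out: Vec<StoredEventSketch> = events
        .iter()
        .filter(|e| e.conversation_id == conversation_id)
        .cloned()
        .collect();
    out.sort_by_key(|e| e.timestamp_ms);
    out
}
```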

§Usage

use frame_catalog::{VectorStore, VectorStoreConfig};
use frame_catalog::{OnnxEmbeddingGenerator, EmbeddingGenerator};
use frame_catalog::DocumentChunk;

// Create embedding generator
let embedder = OnnxEmbeddingGenerator::new().unwrap();

// Create vector store
let config = VectorStoreConfig::default();
let mut store = VectorStore::new(config).unwrap();

// Index documents
let chunk = DocumentChunk {
    id: "doc1".to_string(),
    content: "Rust is a systems programming language".to_string(),
    source: "rust-docs".to_string(),
    metadata: None,
};

let embedding = embedder.generate(&chunk.content).unwrap();
store.add_chunk(chunk, &embedding).unwrap();

// Search
let query_embedding = embedder.generate("programming languages").unwrap();
let results = store.search(&query_embedding, 5).unwrap();

for result in results {
    println!("{:.3}: {}", result.score, result.chunk.content);
}

Re-exports§

pub use vector_store::VectorStore;
pub use vector_store::VectorStoreConfig;
pub use vector_store::VectorStoreError;
pub use vector_store::DocumentChunk;
pub use vector_store::SearchResult;
pub use vector_store::EMBEDDING_DIM;
pub use embeddings::EmbeddingGenerator;
pub use embeddings::EmbeddingError;
pub use embeddings::SimpleEmbeddingGenerator;
pub use database::Database;
pub use database::DatabaseError;
pub use database::StoredEvent;
pub use database::Conversation;
pub use retrieval::RetrievalSystem;
pub use retrieval::RetrievalConfig;
pub use retrieval::RetrievalError;

Modules§

database
SQLite database for persistent storage
embeddings
Embedding generation for text chunks
retrieval
Document indexing and retrieval system
vector_store
Vector store for knowledge retrieval using HNSW