brainwires-storage 0.6.0

Backend-agnostic storage, tiered memory, and document management for the Brainwires Agent Framework

Overview

brainwires-storage is the persistent backend for the Brainwires Agent Framework's infinite context memory system. The crate provides conversation storage with semantic search, document ingestion with hybrid retrieval, a three-tier memory hierarchy, image analysis storage, entity extraction with contradiction detection, cross-process lock coordination, and reusable plan templates. Together these let agents maintain unbounded context, coordinate safely, and retrieve relevant knowledge across sessions.

Design principles:

  • Backend-agnostic — domain stores are generic over StorageBackend; swap databases by changing a feature flag, not your application code
  • One struct, one connection — each database backend is a single struct (e.g. LanceDatabase, PostgresDatabase) that implements one or both core traits and shares a single connection for all operations
  • Semantic-first retrieval — all stores embed content via all-MiniLM-L6-v2 (384 dimensions) and search by vector similarity, so queries match meaning rather than keywords
  • Hybrid search — document retrieval combines vector similarity with BM25 keyword scoring via Reciprocal Rank Fusion (RRF) for best-of-both-worlds accuracy
  • Three-tier memory — hot (full messages with TTL), warm (compressed summaries), cold (extracted facts) with automatic demotion/promotion based on importance and access patterns
  • Memory safety — contradiction detection flags conflicting facts for human review; canonical write tokens gate long-lived writes; session TTL auto-expires ephemeral data
  • Cross-process coordination — SQLite-backed locks with WAL mode, stale lock detection via PID/hostname, and automatic cleanup for multi-instance deployments
  • Feature-gated portability — pure types and logic compile everywhere; native-only modules (LanceDB, Arrow, SQLite) are behind the native feature for WASM compatibility
  +-----------------------------------------------------------------------+
  |                        brainwires-storage                              |
  |                                                                        |
  |  +--- Unified Database Layer (databases/) -------------------------+  |
  |  |                                                                  |  |
  |  |  Core Traits:                                                    |  |
  |  |    StorageBackend --- generic CRUD + vector search               |  |
  |  |    VectorDatabase --- RAG embedding storage + hybrid search      |  |
  |  |                                                                  |  |
  |  |  Backends:                                                       |  |
  |  |    LanceDatabase ---- StorageBackend + VectorDatabase (default)  |  |
  |  |    PostgresDatabase - StorageBackend + VectorDatabase            |  |
  |  |    MySqlDatabase ---- StorageBackend only                        |  |
  |  |    SurrealDatabase -- StorageBackend + VectorDatabase            |  |
  |  |    QdrantDatabase --- VectorDatabase only                        |  |
  |  |    PineconeDatabase - VectorDatabase only                        |  |
  |  |    MilvusDatabase --- VectorDatabase only                        |  |
  |  |    WeaviateDatabase - VectorDatabase only                        |  |
  |  |    NornicDatabase --- VectorDatabase only                        |  |
  |  |                                                                  |  |
  |  |  Supporting modules:                                             |  |
  |  |    types.rs --- Record, FieldDef, Filter, ScoredRecord           |  |
  |  |    capabilities.rs --- BackendCapabilities discovery             |  |
  |  |    sql/ --- shared SQL dialect layer                             |  |
  |  |    bm25_helpers.rs -- BM25 scoring for client-side keyword search|  |
  |  +------------------------------------------------------------------+  |
  |                                                                        |
  |  +--- Core Infrastructure -------------------------------------------+  |
  |  |  EmbeddingProvider --- all-MiniLM-L6-v2 with LRU cache (1000)    |  |
  |  +------------------------------------------------------------------+  |
  |                                                                        |
  |  +--- Domain Stores (stores/) --------------------------------------+  |
  |  |  MessageStore --- vector search, TTL expiry, batch ops            |  |
  |  |  ConversationStore --- metadata, listing by recency               |  |
  |  |  TaskStore / AgentStateStore --- task & agent persistence          |  |
  |  |  PlanStore --- execution plans with markdown export               |  |
  |  +------------------------------------------------------------------+  |
  |                                                                        |
  |  +--- Tiered Memory System -----------------------------------------+  |
  |  |  Hot --- full messages (MessageStore, session TTL)                |  |
  |  |  Warm --- compressed summaries (SummaryStore)                     |  |
  |  |  Cold --- extracted facts (FactStore)                             |  |
  |  |  TierMetadataStore --- access tracking, importance scoring        |  |
  |  |  TieredMemory --- adaptive search, demotion/promotion             |  |
  |  +------------------------------------------------------------------+  |
  |                                                                        |
  |  +--- Document Management ------------------------------------------+  |
  |  |  DocumentProcessor --- PDF, DOCX, Markdown, plain text            |  |
  |  |  DocumentChunker --- paragraph/sentence-aware segmentation        |  |
  |  |  DocumentStore --- hybrid search (vector + BM25 via RRF)         |  |
  |  |  DocumentMetadataStore --- hash-based deduplication               |  |
  |  +------------------------------------------------------------------+  |
  |                                                                        |
  |  +--- Images -------------------------------------------------------+  |
  |  |  ImageStore --- analyzed images with semantic search              |  |
  |  +------------------------------------------------------------------+  |
  |  Note: EntityStore and RelationshipGraph moved to brainwires-cognition |
  |                                                                        |
  |  +--- Coordination & Templates -------------------------------------+  |
  |  |  LockStore --- SQLite WAL locks, stale detection, cleanup         |  |
  |  |  TemplateStore --- reusable plans with {{variable}} substitution  |  |
  |  +------------------------------------------------------------------+  |
  +------------------------------------------------------------------------+

Quick Start

Add to your Cargo.toml:

[dependencies]
brainwires-storage = "0.6"

Store and search conversation messages:

use brainwires_storage::{LanceDatabase, EmbeddingProvider, MessageStore, MessageMetadata};
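// VectorDatabase provides `initialize`, so the trait must be in scope
use brainwires_storage::databases::VectorDatabase;
use std::sync::Arc;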

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Initialize storage — one struct, one connection
    let db = Arc::new(LanceDatabase::new("~/.brainwires/db").await?);
    let embeddings = Arc::new(EmbeddingProvider::new()?);
    db.initialize(embeddings.dimension()).await?;

    let store = MessageStore::new(db.clone(), embeddings.clone());

    // Store a message
    store.add(MessageMetadata {
        message_id: "msg-001".into(),
        conversation_id: "conv-001".into(),
        role: "assistant".into(),
        content: "The auth module uses JWT tokens with RS256 signing".into(),
        token_count: Some(42),
        model_id: Some("claude-opus-4-6".into()),
        images: None,
        created_at: chrono::Utc::now().timestamp(),
        expires_at: None,
    }).await?;

    // Semantic search across all conversations
    let results = store.search("how does authentication work?", 5, 0.7).await?;
    for (msg, score) in &results {
        println!("[{:.2}] {}: {}", score, msg.role, msg.content);
    }

    Ok(())
}

Database Backends

The databases/ module provides a unified abstraction layer. Each database is a single struct implementing one or both core traits:

  • StorageBackend — generic CRUD + vector search for domain stores (messages, conversations, tasks, etc.)
  • VectorDatabase — RAG-style embedding storage with hybrid search for codebase indexing

Trait implementation matrix

| Database | Struct | StorageBackend | VectorDatabase | Feature flag |
| --- | --- | --- | --- | --- |
| LanceDB | LanceDatabase | YES | YES | lance-backend (default via native) |
| PostgreSQL + pgvector | PostgresDatabase | YES | YES | postgres-backend |
| MySQL / MariaDB | MySqlDatabase | YES | NO | mysql-backend |
| SurrealDB | SurrealDatabase | YES | YES | surrealdb-backend |
| Qdrant | QdrantDatabase | NO | YES | qdrant-backend |
| Pinecone | PineconeDatabase | NO | YES | pinecone-backend |
| Milvus | MilvusDatabase | NO | YES | milvus-backend |
| Weaviate | WeaviateDatabase | NO | YES | weaviate-backend |
| NornicDB | NornicDatabase | NO | YES | nornicdb-backend |

Connection sharing

Backends that implement both traits share a single connection. This means domain stores and the RAG subsystem can use the same database instance without opening separate connections:

use brainwires_storage::{
    LanceDatabase, EmbeddingProvider, MessageStore, ConversationStore,
};
use brainwires_storage::databases::VectorDatabase;
use std::sync::Arc;

let db = Arc::new(LanceDatabase::new("/path/to/db").await?);
let embeddings = Arc::new(EmbeddingProvider::new()?);

// Domain stores use the StorageBackend trait
let messages = MessageStore::new(db.clone(), embeddings.clone());
let conversations = ConversationStore::new(db.clone());

// RAG system uses the VectorDatabase trait on the same connection.
// RagClient is not among the re-exports listed in this README; it comes from the RAG subsystem.
// let rag = RagClient::with_vector_db(db.clone());

Module structure

databases/
  mod.rs              -- top-level module, re-exports
  traits.rs           -- StorageBackend + VectorDatabase trait definitions
  types.rs            -- Record, FieldDef, FieldValue, Filter, ScoredRecord
  capabilities.rs     -- BackendCapabilities runtime discovery
  bm25_helpers.rs     -- shared BM25 scoring for client-side keyword search
  sql/                -- shared SQL dialect layer
    mod.rs            -- SqlDialect trait
    postgres.rs       -- PostgreSQL dialect
    mysql.rs          -- MySQL dialect
    surrealdb.rs      -- SurrealDB dialect
  lance/              -- LanceDB backend (default)
    mod.rs            -- LanceDatabase struct
    arrow_convert.rs  -- Arrow <-> Record conversion
  postgres/           -- PostgreSQL + pgvector backend
  mysql/              -- MySQL / MariaDB backend
  surrealdb/          -- SurrealDB backend (MTREE vector search)
  qdrant/             -- Qdrant backend
  pinecone/           -- Pinecone backend
  milvus/             -- Milvus backend
  weaviate/           -- Weaviate backend
  nornicdb/           -- NornicDB backend
    mod.rs            -- NornicDatabase struct
    transport.rs      -- REST/Bolt/gRPC transport layer

Features

| Feature | Default | Description |
| --- | --- | --- |
| native | Yes | Enables LanceDB backend, FastEmbed, SQLite locks, and all native-only stores |
| lance-backend | Yes (via native) | LanceDB embedded vector database |
| postgres-backend | No | PostgreSQL + pgvector (both traits) |
| mysql-backend | No | MySQL / MariaDB (StorageBackend only) |
| surrealdb-backend | No | SurrealDB with native MTREE vector search (both traits) |
| qdrant-backend | No | Qdrant vector search server |
| pinecone-backend | No | Pinecone managed cloud vectors |
| milvus-backend | No | Milvus open-source vectors |
| weaviate-backend | No | Weaviate vector search engine |
| nornicdb-backend | No | NornicDB graph + vector database |
| nornicdb-bolt | No | NornicDB with Neo4j Bolt transport |
| nornicdb-grpc | No | NornicDB with Qdrant-compatible gRPC |
| nornicdb-full | No | NornicDB with all transports |
| vector-db | No | Backward-compatibility alias for lance-backend |
| wasm | No | WASM-compatible compilation (pure types only) |

# Default (LanceDB + full native functionality)
brainwires-storage = "0.6"

# WASM-compatible (pure types and logic only)
brainwires-storage = { version = "0.6", default-features = false, features = ["wasm"] }

# With Qdrant backend (in addition to LanceDB)
brainwires-storage = { version = "0.6", features = ["qdrant-backend"] }

# PostgreSQL as primary backend
brainwires-storage = { version = "0.6", features = ["postgres-backend"] }

# MySQL / MariaDB backend
brainwires-storage = { version = "0.6", features = ["mysql-backend"] }

# SurrealDB backend (native vector search)
brainwires-storage = { version = "0.6", features = ["surrealdb-backend"] }

# NornicDB with all transports
brainwires-storage = { version = "0.6", features = ["nornicdb-full"] }

Module availability by feature:

| Module | Always available | Requires native | Backend feature |
| --- | --- | --- | --- |
| databases (traits, types, capabilities) | Yes | -- | -- |
| image_types | Yes | -- | -- |
| template_store | Yes | -- | -- |
| databases::lance | -- | Yes | lance-backend |
| databases::qdrant | -- | -- | qdrant-backend |
| databases::postgres | -- | -- | postgres-backend |
| databases::mysql | -- | -- | mysql-backend |
| databases::surrealdb | -- | -- | surrealdb-backend |
| databases::pinecone | -- | -- | pinecone-backend |
| databases::milvus | -- | -- | milvus-backend |
| databases::weaviate | -- | -- | weaviate-backend |
| databases::nornicdb | -- | -- | nornicdb-backend |
| databases::sql | -- | -- | any SQL backend |
| embeddings | -- | Yes | -- |
| message_store, conversation_store | -- | Yes | -- |
| task_store, plan_store, lock_store | -- | Yes | -- |
| document_store, document_processor | -- | Yes | -- |
| image_store | -- | Yes | -- |
| tiered_memory, summary_store, fact_store | -- | Yes | -- |
| tier_metadata_store, file_context | -- | Yes | -- |
| bm25_search, glob_utils, paths | -- | -- | lance-backend |

Architecture

StorageBackend trait

Backend-agnostic storage operations. Domain stores are generic over this trait.

| Method | Description |
| --- | --- |
| ensure_table(name, schema) | Ensure table exists (idempotent) |
| insert(table, records) | Insert one or more records |
| query(table, filter, limit) | Query with optional filter |
| delete(table, filter) | Delete matching records |
| count(table, filter) | Count matching records |
| vector_search(table, column, vector, limit, filter) | Vector similarity search |
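
To illustrate what "generic over this trait" means in practice, here is a minimal synchronous sketch. It is not the crate's actual API: MiniBackend, InMemoryBackend, and NoteStore are hypothetical names, and the real StorageBackend methods are async and take schemas and filters as listed in the table above.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for a storage backend trait.
trait MiniBackend {
    fn insert(&mut self, table: &str, record: String);
    fn count(&self, table: &str) -> usize;
}

/// In-memory backend used only for illustration.
#[derive(Default)]
struct InMemoryBackend {
    tables: HashMap<String, Vec<String>>,
}

impl MiniBackend for InMemoryBackend {
    fn insert(&mut self, table: &str, record: String) {
        self.tables.entry(table.to_string()).or_default().push(record);
    }

    fn count(&self, table: &str) -> usize {
        self.tables.get(table).map_or(0, |rows| rows.len())
    }
}

/// A domain store that depends only on the trait, never on a concrete database.
struct NoteStore<B: MiniBackend> {
    backend: B,
}

impl<B: MiniBackend> NoteStore<B> {
    fn new(backend: B) -> Self {
        Self { backend }
    }

    fn add(&mut self, text: &str) {
        self.backend.insert("notes", text.to_string());
    }

    fn len(&self) -> usize {
        self.backend.count("notes")
    }
}
```

Swapping the backend then only changes the type passed to NoteStore::new; the store's own code stays the same.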

VectorDatabase trait

RAG-style embedding storage used by the codebase indexing subsystem.

| Method | Description |
| --- | --- |
| initialize(dimension) | Initialize collections |
| store_embeddings(embeddings, metadata, contents, root_path) | Store embeddings with metadata |
| search(vector, text, limit, min_score, project, root, hybrid) | Vector/hybrid search |
| search_filtered(...) | Search with extension/language/path filters |
| search_with_embeddings(...) | Search returning raw embedding vectors |
| delete_by_file(path) | Delete embeddings for a file |
| clear() | Clear all embeddings |
| get_statistics() | Get storage statistics |
| flush() | Flush changes to disk |
| count_by_root_path(root) | Count embeddings per project |
| get_indexed_files(root) | List indexed file paths |
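
For the keyword half of hybrid search, the module list mentions client-side BM25 scoring in bm25_helpers.rs. The sketch below shows the standard BM25 formula on pre-tokenized text. It is an illustration only: the function name, the tokenization, and the usual k1 = 1.2 and b = 0.75 parameters are assumptions, not values taken from the crate.

```rust
/// Score one tokenized document against the query terms with BM25.
/// `corpus` is the full tokenized collection, used for document
/// frequencies and the average document length.
fn bm25_score(query: &[&str], doc: &[&str], corpus: &[Vec<&str>], k1: f32, b: f32) -> f32 {
    let n = corpus.len() as f32;
    if n == 0.0 || doc.is_empty() {
        return 0.0;
    }
    let avg_len = corpus.iter().map(|d| d.len()).sum::<usize>() as f32 / n;

    query
        .iter()
        .map(|term| {
            // Term frequency in this document.
            let tf = doc.iter().filter(|t| *t == term).count() as f32;
            if tf == 0.0 {
                return 0.0;
            }
            // Number of documents that contain the term at least once.
            let df = corpus
                .iter()
                .filter(|d| d.iter().any(|t| t == term))
                .count() as f32;
            // Smoothed inverse document frequency (always positive).
            let idf = ((n - df + 0.5) / (df + 0.5) + 1.0).ln();
            // Length normalization: long documents are penalized via `b`.
            let norm = 1.0 - b + b * doc.len() as f32 / avg_len;
            idf * tf * (k1 + 1.0) / (tf + k1 * norm)
        })
        .sum()
}
```

A document that never mentions a query term scores 0.0, and repeated mentions raise the score with diminishing returns.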

EmbeddingProvider

Text embedding with LRU caching, backed by FastEmbed (all-MiniLM-L6-v2, 384 dimensions).

| Method | Description |
| --- | --- |
| new() | Create provider with default model |
| embed(text) | Embed single text -> Vec<f32> |
| embed_cached(text) | Embed with LRU cache (1000 entries) -> Vec<f32> |
| embed_batch(texts) | Embed multiple texts -> Vec<Vec<f32>> |
| dimension() | Get embedding dimension (384) |
| cache_len() | Get current cache size |
| clear_cache() | Clear the LRU cache |
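
The caching behavior behind embed_cached can be pictured with a small least-recently-used cache. This is a sketch under stated assumptions rather than the crate's implementation: EmbeddingCache and get_or_insert_with are hypothetical names, and the embedding function is passed in as a closure so the example stays self-contained.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical LRU cache for embeddings, keyed by the input text.
struct EmbeddingCache {
    capacity: usize,
    entries: HashMap<String, Vec<f32>>,
    /// Keys ordered from least recently used (front) to most recent (back).
    order: VecDeque<String>,
}

impl EmbeddingCache {
    fn new(capacity: usize) -> Self {
        Self { capacity, entries: HashMap::new(), order: VecDeque::new() }
    }

    /// Move `key` to the most-recently-used position.
    fn touch(&mut self, key: &str) {
        self.order.retain(|k| k.as_str() != key);
        self.order.push_back(key.to_string());
    }

    /// Return the cached vector, or compute it with `embed` and cache it.
    fn get_or_insert_with<F>(&mut self, text: &str, embed: F) -> Vec<f32>
    where
        F: FnOnce(&str) -> Vec<f32>,
    {
        if let Some(hit) = self.entries.get(text).cloned() {
            self.touch(text);
            return hit;
        }
        let vector = embed(text);
        if self.capacity > 0 {
            // Evict the least recently used entry when the cache is full.
            if self.entries.len() >= self.capacity {
                if let Some(oldest) = self.order.pop_front() {
                    self.entries.remove(&oldest);
                }
            }
            self.entries.insert(text.to_string(), vector.clone());
            self.touch(text);
        }
        vector
    }

    fn len(&self) -> usize {
        self.entries.len()
    }
}
```

With a capacity of 1000, as the table states for embed_cached, repeated queries for the same text would skip the model call entirely. The linear scan in touch keeps the sketch short; a production cache would typically use a linked structure instead.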

MessageStore

Conversation messages with vector search and TTL expiry support.

| Method | Description |
| --- | --- |
| new(client, embeddings) | Create store |
| add(message) | Add a single message |
| add_batch(messages) | Add multiple messages |
| get(message_id) | Get message by ID |
| get_by_conversation(conversation_id) | Get all messages in a conversation |
| search(query, limit, min_score) | Semantic search across all messages |
| search_conversation(conversation_id, query, limit, min_score) | Search within a conversation |
| delete(message_id) | Delete a single message |
| delete_by_conversation(conversation_id) | Delete all messages in a conversation |
| delete_expired() | Delete TTL-expired messages -> count |

MessageMetadata:

| Field | Type | Description |
| --- | --- | --- |
| message_id | String | Unique message identifier |
| conversation_id | String | Parent conversation |
| role | String | Message role (user, assistant, system) |
| content | String | Message content |
| token_count | Option<i32> | Token count estimate |
| model_id | Option<String> | Model that generated the message |
| images | Option<String> | JSON-encoded image references |
| created_at | i64 | Unix timestamp |
| expires_at | Option<i64> | TTL expiry timestamp (session tier) |
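
The limit and min_score parameters of search can be understood through a small similarity-ranking sketch. It assumes cosine similarity as the metric, which this README does not state explicitly, and search_by_similarity is a hypothetical helper that works on in-memory vectors instead of a database.

```rust
/// Cosine similarity between two vectors; 0.0 if either has zero length.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Score every candidate against `query`, drop results below `min_score`,
/// and return at most `limit` results ordered best-first.
fn search_by_similarity<'a>(
    query: &[f32],
    candidates: &'a [(&'a str, Vec<f32>)],
    limit: usize,
    min_score: f32,
) -> Vec<(&'a str, f32)> {
    let mut scored: Vec<(&str, f32)> = candidates
        .iter()
        .map(|(id, vector)| (*id, cosine_similarity(query, vector)))
        .filter(|(_, score)| *score >= min_score)
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(limit);
    scored
}
```

Raising min_score trades recall for precision: a threshold of 0.7, as used in the Quick Start, keeps only candidates that point in nearly the same direction as the query embedding.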

ConversationStore

Conversation metadata with create-or-update semantics.

| Method | Description |
| --- | --- |
| new(client) | Create store |
| create(id, title, model_id, message_count) | Create or update conversation |
| get(conversation_id) | Get by ID |
| list(limit) | List conversations sorted by recency |
| update(conversation_id, title, message_count) | Update metadata |
| delete(conversation_id) | Delete conversation |

TieredMemory

Three-tier memory hierarchy with adaptive search and automatic demotion/promotion.

| Method | Description |
| --- | --- |
| new(hot_store, client, embeddings, config) | Create with custom configuration |
| with_defaults(hot_store, client, embeddings) | Create with default thresholds |
| add_message(message, importance) | Add to hot tier with Session authority |
| add_canonical_message(message, importance, token) | Add canonical message (no TTL) |
| evict_expired() | Delete expired session messages -> count |
| record_access(message_id) | Update access tracking for scoring |
| search_adaptive(query, conversation_id) | Similarity-based search across tiers |
| search_adaptive_multi_factor(query, conversation_id) | Blended scoring (similarity + recency + importance) |
| demote_to_warm(message_id, summary) | Compress message to summary |
| demote_to_cold(summary_id, fact) | Extract fact from summary |
| promote_to_hot(message_id) | Restore full message from warm tier |
| get_demotion_candidates(tier, count) | Get candidates for demotion |
| get_stats() | Get tier counts |

MemoryTier enum: Hot, Warm, Cold.

MemoryAuthority enum: Ephemeral, Session, Canonical.
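
One way to picture the blended scoring of search_adaptive_multi_factor is a weighted sum of the three factors. The sketch below is an assumption-laden illustration: the weights, the exponential recency decay, and the names ScoreWeights, recency_score, and blended_score are all hypothetical, since this README does not document the actual formula.

```rust
/// Hypothetical weights for the three factors; they should sum to 1.0
/// if the blended score is meant to stay in the 0.0..=1.0 range.
struct ScoreWeights {
    similarity: f32,
    recency: f32,
    importance: f32,
}

/// Recency in (0, 1]: 1.0 for an item accessed just now,
/// halving every `half_life_hours`.
fn recency_score(age_hours: f32, half_life_hours: f32) -> f32 {
    0.5_f32.powf(age_hours.max(0.0) / half_life_hours)
}

/// Weighted blend of similarity, recency, and importance.
fn blended_score(
    similarity: f32,
    age_hours: f32,
    importance: f32,
    half_life_hours: f32,
    weights: &ScoreWeights,
) -> f32 {
    let recency = recency_score(age_hours, half_life_hours);
    weights.similarity * similarity
        + weights.recency * recency
        + weights.importance * importance
}
```

With weights of 0.6, 0.2, and 0.2 and a 24-hour half-life, a message with similarity 0.9 and importance 0.5 scores 0.84 when fresh and 0.74 one day later, so older items gradually yield to newer ones of equal relevance.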

DocumentStore

Document ingestion with hybrid search (vector + BM25 via Reciprocal Rank Fusion).

| Method | Description |
| --- | --- |
| new(client, embeddings, bm25_base_path) | Create with default chunking |
| index_file(file_path, scope) | Index document from file |
| index_bytes(bytes, file_name, file_type, scope) | Index document from bytes |
| search(request) | Hybrid or vector-only search |
| delete_document(document_id) | Delete document and chunks |
| list_by_conversation(conversation_id) | List documents in conversation |
| list_by_project(project_id) | List documents in project |

DocumentScope enum: Conversation(String), Project(String), Global.

DocumentType enum: Pdf, Markdown, PlainText, Docx, Unknown.
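
Reciprocal Rank Fusion, which DocumentStore uses to merge vector and BM25 results, needs only the rank of each document in each list. The following is a minimal sketch of the general technique, not the crate's code: reciprocal_rank_fusion is a hypothetical function, and k = 60 is the value commonly used in the literature rather than a documented setting of this crate.

```rust
use std::collections::HashMap;

/// Fuse several best-first rankings of document IDs into one list.
/// Each document earns 1 / (k + rank) per list it appears in (rank is 1-based).
fn reciprocal_rank_fusion(rankings: &[Vec<&str>], k: f32) -> Vec<(String, f32)> {
    let mut scores: HashMap<String, f32> = HashMap::new();
    for ranking in rankings {
        for (idx, id) in ranking.iter().enumerate() {
            let rank = idx as f32 + 1.0;
            *scores.entry((*id).to_string()).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(String, f32)> = scores.into_iter().collect();
    // Highest fused score first; ties are broken by ID for stable output.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}
```

Because only ranks are combined, the raw vector and BM25 scores never need to be normalized against each other, which is the main reason RRF is a popular choice for hybrid retrieval.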

ImageStore

Analyzed image storage with semantic search over LLM-generated descriptions.

| Method | Description |
| --- | --- |
| new(client, embeddings) | Create store |
| store(metadata, storage) | Store image with metadata |
| store_from_bytes(bytes, analysis, conversation_id, format) | Store from raw bytes |
| get(image_id) | Get image metadata |
| search(request) | Semantic search on analysis text |
| delete(image_id) | Delete image |

ImageFormat enum: Png, Jpeg, Gif, Webp, Svg.

ImageStorage enum: Base64(String), FilePath(String), Url(String).

LockStore

SQLite-backed cross-process lock coordination with stale lock detection.

| Method | Description |
| --- | --- |
| new_default() | Use ~/.brainwires/locks.db |
| new_with_path(db_path) | Use custom database path |
| try_acquire(lock_type, resource_path, agent_id, timeout) | Acquire lock (idempotent per agent) |
| release(lock_type, resource_path, agent_id) | Release a lock |
| release_all_for_agent(agent_id) | Release all locks held by agent |
| is_locked(lock_type, resource_path) | Check lock status |
| cleanup_stale() | Remove expired and dead-process locks |

Lock types: file_read, file_write, build, test, build_test.
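
The stale-lock rule described above (expired locks plus locks held by dead processes, identified by PID and hostname) can be sketched as a pure function. LockInfo, its fields, and is_stale are hypothetical names for illustration; the process-liveness check is passed in as a closure because it is platform-specific and not described in this README.

```rust
/// Hypothetical view of a lock row; field names are assumptions.
struct LockInfo {
    pid: u32,
    hostname: String,
    /// Unix timestamp after which the lock no longer counts, if any.
    expires_at: Option<i64>,
}

/// A lock is stale if it has expired, or if it was taken on this host
/// by a process that is no longer running.
fn is_stale(
    lock: &LockInfo,
    now: i64,
    local_hostname: &str,
    pid_alive: impl Fn(u32) -> bool,
) -> bool {
    if let Some(expiry) = lock.expires_at {
        if now >= expiry {
            return true;
        }
    }
    // PIDs are only meaningful on the machine that created the lock,
    // so locks from other hosts are judged by expiry alone.
    lock.hostname.as_str() == local_hostname && !pid_alive(lock.pid)
}
```

Keeping the hostname check in front of the PID check avoids deleting a valid lock just because an unrelated process on another machine happens to lack that PID.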

TemplateStore

JSON file-based reusable plan template storage with {{variable}} substitution.

| Method | Description |
| --- | --- |
| new(data_dir) | Create store (creates templates.json) |
| save(template) | Save a template |
| get(template_id) | Get by ID |
| search(query) | Search name, description, tags |
| delete(template_id) | Delete template |
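
The {{variable}} substitution can be sketched with plain string scanning. This is an illustrative stand-in, not the crate's implementation: render_template is a hypothetical function, and leaving unknown placeholders untouched is a design choice made for this example only.

```rust
use std::collections::HashMap;

/// Replace each `{{name}}` in `template` with the matching value from `vars`.
/// Whitespace inside the braces is ignored; unknown names are left as-is.
fn render_template(template: &str, vars: &HashMap<&str, &str>) -> String {
    let mut out = String::with_capacity(template.len());
    let mut rest = template;
    while let Some(start) = rest.find("{{") {
        // Copy the literal text before the placeholder.
        out.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        match after.find("}}") {
            Some(end) => {
                let name = after[..end].trim();
                match vars.get(name) {
                    Some(value) => out.push_str(value),
                    None => {
                        // Keep unknown placeholders visible in the output.
                        out.push_str("{{");
                        out.push_str(&after[..end]);
                        out.push_str("}}");
                    }
                }
                rest = &after[end + 2..];
            }
            None => {
                // Unterminated placeholder: copy the remainder verbatim.
                out.push_str(&rest[start..]);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    out
}
```

For example, rendering "Run {{name}} on {{ env }}" with name = deploy and env = staging yields "Run deploy on staging".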

Usage Examples

Connection sharing across domain stores and RAG

use brainwires_storage::{
    LanceDatabase, EmbeddingProvider, MessageStore, ConversationStore,
};
use brainwires_storage::databases::VectorDatabase;
use std::sync::Arc;

let db = Arc::new(LanceDatabase::new("~/.brainwires/db").await?);
let embeddings = Arc::new(EmbeddingProvider::new()?);
db.initialize(embeddings.dimension()).await?;

// All stores share the same LanceDatabase connection
let messages = MessageStore::new(db.clone(), embeddings.clone());
let conversations = ConversationStore::new(db.clone());

// The same `db` can also be passed to the RAG subsystem as a VectorDatabase
// let rag = RagClient::with_vector_db(db.clone());

Store and search conversation messages

use brainwires_storage::{LanceDatabase, EmbeddingProvider, MessageStore, MessageMetadata};
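// VectorDatabase provides `initialize`, so the trait must be in scope
use brainwires_storage::databases::VectorDatabase;
use std::sync::Arc;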

let db = Arc::new(LanceDatabase::new("~/.brainwires/db").await?);
let embeddings = Arc::new(EmbeddingProvider::new()?);
db.initialize(embeddings.dimension()).await?;

let store = MessageStore::new(db.clone(), embeddings.clone());

// Add messages
store.add(MessageMetadata {
    message_id: "msg-001".into(),
    conversation_id: "conv-001".into(),
    role: "assistant".into(),
    content: "We should use B-tree indexes for the user lookup table".into(),
    token_count: Some(35),
    model_id: None,
    images: None,
    created_at: chrono::Utc::now().timestamp(),
    expires_at: None,
}).await?;

// Semantic search
let results = store.search("database indexing strategy", 5, 0.7).await?;
for (msg, score) in &results {
    println!("[{:.2}] {}", score, msg.content);
}

// Search within a conversation
let results = store.search_conversation("conv-001", "indexing", 3, 0.6).await?;

Use tiered memory for infinite context

use brainwires_storage::{
    TieredMemory, TieredMemoryConfig, MessageStore, MessageMetadata,
    MemoryTier, LanceDatabase, EmbeddingProvider,
};
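// VectorDatabase provides `initialize`, so the trait must be in scope
use brainwires_storage::databases::VectorDatabase;
use std::sync::Arc;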

let db = Arc::new(LanceDatabase::new("~/.brainwires/db").await?);
let embeddings = Arc::new(EmbeddingProvider::new()?);
db.initialize(embeddings.dimension()).await?;

let hot_store = Arc::new(MessageStore::new(db.clone(), embeddings.clone()));

let config = TieredMemoryConfig {
    hot_retention_hours: 12,
    warm_retention_hours: 168,
    max_hot_messages: 500,
    session_ttl_hours: 24,
    ..TieredMemoryConfig::default()
};

let mut memory = TieredMemory::new(hot_store, db.clone(), embeddings.clone(), config);

// Add message to hot tier
memory.add_message(MessageMetadata {
    message_id: "msg-042".into(),
    conversation_id: "conv-001".into(),
    role: "assistant".into(),
    content: "JWT tokens expire after 15 minutes".into(),
    token_count: Some(20),
    model_id: None,
    images: None,
    created_at: chrono::Utc::now().timestamp(),
    expires_at: None,
}, 0.8).await?;

// Search across all tiers with multi-factor scoring
let results = memory.search_adaptive_multi_factor("token expiration", Some("conv-001")).await?;
for result in &results {
    println!("[{:?} {:.2}] {}", result.tier, result.score, result.content);
}

Index and search documents with hybrid retrieval

use brainwires_storage::{
    DocumentStore, DocumentScope, DocumentSearchRequest, DocumentType,
    LanceDatabase, EmbeddingProvider,
};
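// VectorDatabase provides `initialize`, so the trait must be in scope
use brainwires_storage::databases::VectorDatabase;
use std::sync::Arc;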
use std::path::Path;

let db = Arc::new(LanceDatabase::new("~/.brainwires/db").await?);
let embeddings = Arc::new(EmbeddingProvider::new()?);
db.initialize(embeddings.dimension()).await?;

let store = DocumentStore::new(db.clone(), embeddings.clone(), "~/.brainwires/bm25");

// Index a file
let metadata = store.index_file(
    Path::new("docs/architecture.md"),
    DocumentScope::Project("my-project".into()),
).await?;
println!("Indexed: {} ({} chunks)", metadata.title.unwrap_or_default(), metadata.chunk_count);

// Hybrid search (vector + BM25)
let results = store.search(DocumentSearchRequest {
    query: "authentication flow".into(),
    limit: 10,
    min_score: 0.5,
    conversation_id: None,
    project_id: Some("my-project".into()),
    file_types: None,
    use_hybrid: true,
}).await?;

for result in &results {
    println!("[{:.2}] {} (chunk {})", result.score, result.document_id, result.chunk_index);
}

Coordinate multi-process access with locks

use brainwires_storage::LockStore;
use std::time::Duration;

let locks = LockStore::new_default().await?;

// Acquire a write lock with 30-second timeout
let acquired = locks.try_acquire(
    "file_write",
    "src/main.rs",
    "agent-001",
    Some(Duration::from_secs(30)),
).await?;

if acquired {
    // Do exclusive work on file...
    locks.release("file_write", "src/main.rs", "agent-001").await?;
}

// Cleanup stale locks from dead processes
let cleaned = locks.cleanup_stale().await?;
println!("Cleaned {} stale locks", cleaned);

Integration

Use via the brainwires facade crate with the storage feature, or depend on brainwires-storage directly:

# Via facade
[dependencies]
brainwires = { version = "0.6", features = ["storage"] }

# Direct
[dependencies]
brainwires-storage = "0.6"

The crate re-exports most components at the top level; the VectorDatabase trait is imported from the databases module:

use brainwires_storage::{
    // Always available
    StorageBackend, BackendCapabilities,
    FieldDef, FieldType, FieldValue, Filter, Record, ScoredRecord, record_get,
    ImageFormat, ImageMetadata, ImageSearchRequest, ImageSearchResult, ImageStorage,
    PlanTemplate, TemplateStore,
};

// Database backends (feature-gated)
#[cfg(feature = "lance-backend")]
use brainwires_storage::LanceDatabase;

// Native-only stores
#[cfg(feature = "native")]
use brainwires_storage::{
    EmbeddingProvider, CachedEmbeddingProvider, FastEmbedManager,
    ConversationMetadata, ConversationStore,
    MessageMetadata, MessageStore,
    TaskMetadata, TaskStore, AgentStateMetadata, AgentStateStore,
    PlanStore,
    LockStore, LockRecord, LockStats,
    ImageStore,
    SummaryStore, FactStore, TierMetadataStore,
    CanonicalWriteToken, MemoryAuthority, MemoryTier,
    TieredMemory, TieredMemoryConfig, TieredSearchResult,
    FileChunk, FileContent, FileContextManager,
};

// VectorDatabase trait (from databases module)
use brainwires_storage::databases::VectorDatabase;

A prelude module is also available for convenient imports:

use brainwires_storage::prelude::*;

License

Licensed under the MIT License. See LICENSE for details.