omega-agentdb 1.1.0

Advanced vector database with ruvector integration for agent storage and skill management
Documentation
# omega-agentdb

[![Crates.io](https://img.shields.io/crates/v/omega-agentdb)](https://crates.io/crates/omega-agentdb)
[![Documentation](https://docs.rs/omega-agentdb/badge.svg)](https://docs.rs/omega-agentdb)
[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)

SIMD-optimized vector database with HNSW indexing for agent storage, skill management, and semantic search.

**Part of the [ExoGenesis-Omega](https://github.com/prancer-io/ExoGenesis-Omega) cognitive architecture.**

## Overview

`omega-agentdb` is a high-performance, in-memory vector database optimized for AI agent systems. It provides four specialized subsystems: vector storage with HNSW approximate nearest neighbor search, reflexion episode tracking for agent learning, causal graph analysis for action-outcome relationships, and semantic skill management.

Built with SimSIMD for hardware-accelerated vector operations, AgentDB achieves **13-41x speedup** over standard implementations on modern CPUs with AVX-512 support.

## Features

- **SIMD-Accelerated Vector Search**: SimSIMD optimization for 13-41x faster cosine similarity
- **HNSW Index**: Hierarchical Navigable Small World graphs for sub-linear search
- **Reflexion System**: Store and analyze agent learning episodes
- **Causal Graphs**: Track cause-effect relationships with confidence scores
- **Skill Management**: Semantic search for learned skills with usage tracking
- **High Performance**: Optimized for 4096-dimensional embeddings (configurable)
- **Async-First**: Full Tokio integration for concurrent operations
- **Type-Safe API**: Strongly typed with comprehensive error handling

## Installation

Add this to your `Cargo.toml`:

```toml
[dependencies]
omega-agentdb = "0.1.0"
```

## Quick Start

```rust
use omega_agentdb::{AgentDB, AgentDBConfig};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create database with default config (4096 dimensions)
    let db = AgentDB::new(AgentDBConfig::default()).await?;

    // Store a vector with metadata
    let embedding: Vec<f32> = vec![0.1; 4096]; // 4096-dim vector
    let metadata = serde_json::json!({
        "type": "agent_state",
        "agent_id": "agent-001",
    });

    let vector_id = db.vector_store(embedding.clone(), metadata).await?;

    // Search for similar vectors
    let results = db.vector_search(&embedding, 5).await?;
    for result in results {
        println!("Found vector {} with similarity {:.3}",
            result.id, result.similarity);
    }

    Ok(())
}
```

## Core Concepts

### Vector Storage

AgentDB uses a two-layer approach:

1. **HNSW Index**: Approximate nearest neighbor search for fast retrieval
2. **SimSIMD**: Hardware-accelerated distance computations

The HNSW index provides O(log n) search complexity with high recall, while SimSIMD leverages AVX-512, AVX2, NEON, and other SIMD instruction sets for massive speedups.

### Reflexion Episodes

Reflexion captures agent learning experiences:

```rust
pub struct ReflexionEpisode {
    pub id: Option<String>,
    pub session_id: String,
    pub task: String,
    pub input: serde_json::Value,
    pub output: serde_json::Value,
    pub reward: f64,
    pub success: bool,
    pub critique: String,
    pub latency_ms: u64,
    pub tokens: u64,
    pub timestamp: DateTime<Utc>,
    pub embedding: Option<Vec<f32>>,
}
```

### Causal Edges

Model cause-effect relationships:

```rust
pub struct CausalEdge {
    pub cause: String,
    pub effect: String,
    pub uplift: f64,        // Effect size
    pub confidence: f64,    // Statistical confidence
    pub sample_size: u64,
    pub first_observed: DateTime<Utc>,
    pub last_observed: DateTime<Utc>,
}
```

### Skills

Semantic skill storage with embeddings:

```rust
pub struct Skill {
    pub id: Option<String>,
    pub name: String,
    pub description: String,
    pub embedding: Vec<f32>,
    pub usage_count: u64,
    pub success_rate: f64,
    pub created_at: DateTime<Utc>,
}
```

## Use Cases

### 1. Semantic Agent Memory

```rust
use omega_agentdb::{AgentDB, AgentDBConfig};

let db = AgentDB::new(AgentDBConfig {
    dimension: 1536, // OpenAI ada-002 dimensions
    hnsw_m: 32,
    hnsw_ef: 100,
    cache_size: 100_000,
}).await?;

// Store agent memories with embeddings
let memory_embedding = get_embedding("User prefers concise responses");
let id = db.vector_store(
    memory_embedding,
    serde_json::json!({
        "type": "preference",
        "user_id": "user-123",
        "content": "User prefers concise responses",
    })
).await?;

// Later, retrieve relevant memories
let query_embedding = get_embedding("How should I respond?");
let relevant = db.vector_search(&query_embedding, 3).await?;
```

### 2. Agent Learning with Reflexion

```rust
use omega_agentdb::{AgentDB, ReflexionEpisode};
use chrono::Utc;

let db = AgentDB::new(AgentDBConfig::default()).await?;

// Store a learning episode
let episode = ReflexionEpisode {
    id: None,
    session_id: "session-001".to_string(),
    task: "solve_math_problem".to_string(),
    input: serde_json::json!({"problem": "What is 2^10?"}),
    output: serde_json::json!({"answer": 1024}),
    reward: 1.0,
    success: true,
    critique: "Correctly computed power of 2".to_string(),
    latency_ms: 150,
    tokens: 25,
    timestamp: Utc::now(),
    embedding: None,
};

db.reflexion_store(episode).await?;

// Analyze learning progress
let stats = db.reflexion_analyze("solve_math").await?;
println!("Success rate: {:.1}%", stats.success_rate * 100.0);
println!("Avg reward: {:.2}", stats.avg_reward);
```

### 3. Causal Discovery

```rust
use omega_agentdb::{AgentDB, CausalEdge};
use chrono::Utc;

let db = AgentDB::new(AgentDBConfig::default()).await?;

// Record causal relationship
let edge = CausalEdge {
    cause: "use_caching".to_string(),
    effect: "reduced_latency".to_string(),
    uplift: 0.65, // 65% improvement
    confidence: 0.95,
    sample_size: 1000,
    first_observed: Utc::now(),
    last_observed: Utc::now(),
};

db.causal_add_edge(edge).await?;

// Query what causes improved latency
let causes = db.causal_query_causes("reduced_latency").await?;
for cause in causes {
    println!("Cause: {} (uplift: {:.1}%, confidence: {:.1}%)",
        cause.cause, cause.uplift * 100.0, cause.confidence * 100.0);
}

// Find causal paths
let paths = db.causal_find_path("use_caching", "user_satisfaction", 3).await?;
```

### 4. Skill Library

```rust
use omega_agentdb::{AgentDB, Skill};
use chrono::Utc;

let db = AgentDB::new(AgentDBConfig {
    dimension: 768, // bert-base dimensions
    ..Default::default()
}).await?;

// Add skills to library
let skill = Skill {
    id: None,
    name: "code_review".to_string(),
    description: "Review code for bugs, style, and best practices".to_string(),
    embedding: get_skill_embedding("code review"),
    usage_count: 0,
    success_rate: 0.0,
    created_at: Utc::now(),
};

let skill_id = db.skill_create(skill).await?;

// Find relevant skills by embedding
let query = get_skill_embedding("check code quality");
let skills = db.skill_search_by_embedding(&query, 5).await?;

for (skill, similarity) in skills {
    println!("{} (similarity: {:.2}, success: {:.1}%)",
        skill.name, similarity, skill.success_rate * 100.0);
}

// Update skill statistics after use
db.skill_update_stats(&skill_id, true).await?; // success = true
```

### 5. Multi-Session Agent Coordination

```rust
use omega_agentdb::{AgentDB, AgentDBConfig};

let db = AgentDB::new(AgentDBConfig::default()).await?;

// Multiple agents share knowledge through vector database
for agent_id in ["agent-1", "agent-2", "agent-3"] {
    let knowledge = get_agent_knowledge(agent_id);
    db.vector_store(
        knowledge,
        serde_json::json!({"agent_id": agent_id, "type": "knowledge"})
    ).await?;
}

// Any agent can query collective knowledge
let query = get_embedding("How to optimize neural networks?");
let collective_knowledge = db.vector_search(&query, 10).await?;

// Track which agent contributed most relevant knowledge
for result in collective_knowledge {
    println!("From {}: similarity {:.3}",
        result.metadata["agent_id"], result.similarity);
}
```

## Examples

### HNSW Performance Benchmarking

```rust
use omega_agentdb::{AgentDB, AgentDBConfig};
use std::time::Instant;

let db = AgentDB::new(AgentDBConfig {
    dimension: 128,
    hnsw_m: 16,      // Lower M = faster build, lower recall
    hnsw_ef: 100,    // Higher ef = better recall, slower search
    cache_size: 100_000,
}).await?;

// Insert 10,000 vectors
let start = Instant::now();
for i in 0..10_000 {
    let embedding: Vec<f32> = (0..128)
        .map(|j| ((i * j) as f32) / 1000.0)
        .collect();
    db.vector_store(embedding, serde_json::json!({"id": i})).await?;
}
println!("Inserted 10k vectors in {:?}", start.elapsed());

// Search benchmark
let query: Vec<f32> = vec![0.5; 128];
let start = Instant::now();
let results = db.vector_search(&query, 10).await?;
println!("Search completed in {:?}", start.elapsed());
println!("Top result similarity: {:.3}", results[0].similarity);
```

### Database Statistics

```rust
let stats = db.stats().await;
println!("AgentDB Statistics:");
println!("  Vectors: {}", stats.vector_count);
println!("  Reflexion episodes: {}", stats.episode_count);
println!("  Causal edges: {}", stats.causal_edge_count);
println!("  Skills: {}", stats.skill_count);
```

## Architecture

AgentDB integrates seamlessly with the Omega ecosystem:

```
┌────────────────────────────────────────┐
│      omega-runtime / Applications       │
└──────────────┬─────────────────────────┘
┌────────────────────────────────────────┐
│         omega-agentdb                   │
│  ┌──────────────┬──────────────────┐   │
│  │ Vector Store │  Reflexion       │   │
│  │  (HNSW)      │  Episodes        │   │
│  ├──────────────┼──────────────────┤   │
│  │ Causal Graph │  Skill Library   │   │
│  └──────────────┴──────────────────┘   │
└──────────────┬─────────────────────────┘
┌────────────────────────────────────────┐
│  SimSIMD (SIMD-optimized similarity)   │
│  instant-distance (HNSW index)         │
└────────────────────────────────────────┘
```

## Performance

AgentDB is optimized for modern hardware:

### SIMD Acceleration (SimSIMD)

- **AVX-512**: 41x speedup for 4096-dim vectors
- **AVX2**: 28x speedup for 4096-dim vectors
- **NEON** (ARM): 13x speedup for 4096-dim vectors
- **Fallback**: Optimized scalar implementation

### HNSW Index Performance

- **Build time**: O(n log n) for n vectors
- **Search time**: O(log n) with >95% recall
- **Memory**: ~32 bytes per vector for M=16

### Real-World Benchmarks

For 100K vectors (4096-dim, M=32, ef=100):

- **Insert**: ~2.5ms per vector
- **Search (k=10)**: ~0.8ms per query
- **Recall@10**: 97.5%

## Related Crates

- **[omega-core]../omega-core** - Core types and traits
- **[omega-memory]../omega-memory** - Uses AgentDB for semantic memory
- **[omega-persistence]../omega-persistence** - SQLite storage layer
- **[omega-loops]../omega-loops** - Temporal loop orchestration
- **[omega-meta-sona]../omega-meta-sona** - Architecture search
- **[omega-runtime]../omega-runtime** - Production runtime
- **[omega-hippocampus]../omega-hippocampus** - Hippocampal memory system
- **[omega-brain]../omega-brain** - Unified cognitive architecture

## License

Licensed under the MIT License. See [LICENSE](../../LICENSE) for details.