Codex Memory

Codex Memory is a high-performance memory storage service written in Rust, built for reliable content management. It provides deduplicated storage, automatic chunking, and an MCP (Model Context Protocol) interface for Claude Desktop integration, with a clean, extensible architecture that supports companion applications such as Codex-Dreams.

⚠️ Development Notice
This project was developed using AI-assisted iterative coding. While functional and tested, the codebase may contain unconventional patterns or edge cases. We recommend thorough testing in development environments before production use. Contributions and feedback are welcome!

🚀 Features

Core Capabilities

🗄️ Reliable Text Storage - PostgreSQL-backed storage with ACID compliance
🔒 Content Deduplication - SHA-256 hash-based automatic deduplication
📄 Smart File Chunking - Automatic chunking with configurable overlap for large files
🏷️ Tag-Based Organization - Flexible tagging system for categorization
🔗 Parent-Child Relationships - Maintains relationships between chunks and source documents
🤖 MCP Integration - Native Model Context Protocol support for Claude Desktop
🔍 Progressive Search - 3-stage intelligent search that automatically finds results without manual threshold tuning

Technical Features

Connection Pooling - Optimized connection management (5 connections)
Async/Await Architecture - Built on Tokio for high concurrency
Comprehensive Error Handling - Proper Result types throughout
UTF-8 Safe Chunking - Respects character boundaries in all operations
Full-Text Search - PostgreSQL-powered search capabilities
Process Reliability - Singleton process management with health monitoring
Graceful Shutdown - SIGTERM/SIGINT handling with resource cleanup
Auto-Recovery - Wrapper script with restart capabilities and rate limiting
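
To illustrate what UTF-8 safe chunking with overlap involves, here is a minimal sketch. It is not the code in chunking.rs: the function name chunk_utf8 and its parameters are hypothetical, and it only assumes that chunk size and overlap are measured in bytes.

```rust
/// Split `text` into chunks of at most `max_bytes` bytes, with roughly
/// `overlap` bytes shared between consecutive chunks. Cut points are moved
/// back to the nearest UTF-8 character boundary so no character is split.
/// Hypothetical helper for illustration only.
fn chunk_utf8(text: &str, max_bytes: usize, overlap: usize) -> Vec<&str> {
    assert!(max_bytes >= 4, "max_bytes must fit any UTF-8 character");
    assert!(overlap < max_bytes, "overlap must be smaller than the chunk size");

    let mut chunks = Vec::new();
    let mut start = 0;
    while start < text.len() {
        // Tentative end of this chunk, pulled back to a character boundary.
        let mut end = (start + max_bytes).min(text.len());
        while !text.is_char_boundary(end) {
            end -= 1;
        }
        chunks.push(&text[start..end]);
        if end == text.len() {
            break;
        }
        // Step back by `overlap`, again snapping to a character boundary.
        let mut next = end.saturating_sub(overlap);
        while !text.is_char_boundary(next) {
            next -= 1;
        }
        // Guarantee forward progress even when the overlap is large.
        start = if next > start { next } else { end };
    }
    chunks
}
```

Calling chunk_utf8(text, 8192, 256) would yield 8KB chunks whose edges share up to 256 bytes. With an overlap of 0, concatenating the chunks reproduces the input exactly.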

🤝 Companion Applications

Codex-Dreams: Semantic Memory Enhancement

Codex-Dreams is a companion application that adds biological memory modeling and semantic search capabilities to your Codex-stored memories.

What Codex-Dreams Adds:

Semantic embeddings for similarity search across your stored memories (768-dim vectors)
Cognitive modeling using Miller's 7±2, Hebbian learning, and memory consolidation
Advanced analytics including memory clustering and biological insights
pgvector-powered search with <250ms query performance
Pattern recognition across your entire memory corpus

How They Work Together:

Codex Memory (this project): Handles fast storage, retrieval, and MCP integration (your data layer)
Codex-Dreams: Reads your data and adds semantic processing (the intelligence layer)
Shared Database: Both use the same PostgreSQL memories table
Non-Destructive: Codex-Dreams extends but never modifies core Codex functionality
Independent Operation: Each system works alone or together

Integration Status

✅ Schema Compatible: Codex-Dreams adds optional columns, preserves all existing fields
✅ Performance Optimized: HNSW indexes don't interfere with Codex operations
✅ Production Ready: Successfully processing 8,600+ memories with 99.98% coverage
✅ Zero Configuration: Both applications automatically share the same database

📋 Architecture

Codex follows a modular architecture focused on simplicity, reliability, and extensibility:

graph TD
    A[Codex CLI] -->|Store/Retrieve| B[(PostgreSQL: memories table)]
    C[MCP Server] -->|JSON-RPC 2.0| B
    C -->|Claude Desktop| D[AI Assistant Integration]
    
    %% Optional semantic enhancement
    E[Codex-Dreams] -.->|Reads & Enhances| B
    E -.->|Adds embeddings| F[pgvector similarity search]
    
    B -->|Fast retrieval| G[2ms response time]
    B -->|Reliable storage| H[ACID compliance]
    
    style E fill:#f9f,stroke:#333,stroke-width:2px,stroke-dasharray: 5 5
    style F fill:#f9f,stroke:#333,stroke-width:2px,stroke-dasharray: 5 5

Core Architecture

┌─────────────────────────────────────────────────────────────────┐
│                      Codex Memory Architecture                   │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌─────────────────┐    ┌─────────────────┐    ┌─────────────┐ │
│  │   CLI Interface │    │   MCP Server    │    │  PostgreSQL │ │
│  │                 │    │                 │    │  Database   │ │
│  │  • store        │◄──►│  • JSON-RPC 2.0 │◄──►│             │ │
│  │  • get          │    │  • 6 MCP Tools  │    │  • Indexes  │ │
│  │  • stats        │    │  • stdio I/O    │    │  • ACID     │ │
│  │  • setup        │    │                 │    │             │ │
│  └─────────────────┘    └─────────────────┘    └─────────────┘ │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

🛠️ Installation

Prerequisites

Rust 1.70 or higher
PostgreSQL 14 or higher
Claude Desktop (optional, for MCP integration)

Quick Start

# Clone the repository
git clone https://github.com/Ladvien/codex-memory.git
cd codex-memory

# Set up environment
cp .env.example .env
# Edit .env with your database credentials

# Install
cargo install --path . --force

# Setup database (creates database, user, and tables)
codex-memory setup

# Run MCP server for Claude Desktop
codex-memory mcp

🚀 Advanced Setup with Companion Applications

Adding Semantic Search with Codex-Dreams

Install Codex Memory first (provides the storage foundation):

cargo install codex-memory
codex-memory setup --database-url postgresql://user:pass@host:5432/codex_db

Store some memories:

codex-memory store "Your content here" --context "Meeting notes" --summary "Team sync"

Install Codex-Dreams for semantic enhancement:

git clone https://github.com/Ladvien/codex-dreams.git
cd codex-dreams
pip install -e .
# Configure to use the same database
cp .env.example .env  # Edit DATABASE_URL to match your Codex database

Run semantic processing:

# Generate embeddings for existing memories
python -m codex_dreams.process_memories

Use enhanced features:
- Continue using Codex for fast storage/retrieval
- Use Codex-Dreams for semantic similarity search
- Both systems share the same data seamlessly

Verification Steps

# Verify Codex is working
codex-memory stats

# Check if Codex-Dreams extensions are present (optional)
psql "$DATABASE_URL" -c "\d memories"
# Look for embedding_vector column if Codex-Dreams is installed

Environment Configuration

Create a .env file with:

DATABASE_URL=postgresql://codex_user:codex_pass@localhost:5432/codex_db
RUST_LOG=info  # Optional: debug, info, warn, error
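
For readers embedding these settings in their own tooling, the sketch below shows one way to read the two variables with the standard library. The Settings struct and both functions are hypothetical, and the "info" default for RUST_LOG is an assumption. This is not the crate's Config::from_env, and it does not load the .env file itself.

```rust
use std::env;

/// Hypothetical container for the two variables described above.
#[derive(Debug, PartialEq)]
struct Settings {
    database_url: String,
    log_level: String,
}

/// Validate raw values: DATABASE_URL is required, RUST_LOG is optional.
/// Falling back to "info" is an assumption made for this sketch.
fn settings_from(
    database_url: Option<String>,
    rust_log: Option<String>,
) -> Result<Settings, String> {
    let database_url = database_url
        .filter(|value| !value.trim().is_empty())
        .ok_or_else(|| "DATABASE_URL must be set".to_string())?;
    let log_level = rust_log.unwrap_or_else(|| "info".to_string());
    Ok(Settings { database_url, log_level })
}

/// Read the values from the process environment.
fn settings_from_env() -> Result<Settings, String> {
    settings_from(env::var("DATABASE_URL").ok(), env::var("RUST_LOG").ok())
}
```

Keeping validation in settings_from, separate from the environment lookup, makes the rules testable without touching process state.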

📖 Usage

Command Line Interface

# Store content with metadata
codex-memory store "Your content here" \
  --context "Meeting notes" \
  --summary "Q4 planning discussion" \
  --tags "meeting,planning,q4"

# Retrieve content by ID
codex-memory get <UUID>

# View storage statistics
codex-memory stats

# Run MCP server for Claude Desktop
codex-memory mcp

🔄 Companion Workflow with Codex-Dreams

For the full cognitive memory experience, use both applications together:

# 1. Store memories with Codex Memory (fast, reliable)
codex-memory store "Research findings on neural networks" \
  --context "AI Research" \
  --summary "Key insights from latest papers" \
  --tags "ai,research,neural-networks"

# 2. Generate insights with Codex-Dreams (AI-powered analysis)
codex-dreams generate-insights --time-period week
codex-dreams show-insights --limit 5
codex-dreams search-insights "neural networks" --limit 10

💡 Workflow Tip: Use Codex Memory for day-to-day storage and retrieval, then run Codex-Dreams periodically to generate insights and discover patterns across your stored memories.

🔍 Progressive Search System

The search_memory tool provides intelligent 3-stage progressive search that automatically finds results without manual threshold tuning:

Progressive Search Stages:

Stage 1 (Original): Search with your specified parameters and threshold
Stage 2 (Relaxed): If no results, automatically lower threshold by 0.25 (minimum 0.1)
Stage 3 (Content-Only): If still no results, run a content-only similarity search with a 0.1 threshold
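
The three stages above can be sketched as a small fallback loop. This shows the control flow only and is not the server's implementation: Stage, stage_plan, and progressive_search are hypothetical names, and the search closure stands in for the real similarity query.

```rust
/// Which stage produced the results (hypothetical type for this sketch).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Stage {
    Original,
    Relaxed,
    ContentOnly,
}

/// Threshold and content-only flag used by each stage, in order.
fn stage_plan(threshold: f32) -> [(Stage, f32, bool); 3] {
    [
        (Stage::Original, threshold, false),
        // Lower the threshold by 0.25, but never below 0.1.
        (Stage::Relaxed, (threshold - 0.25).max(0.1), false),
        // Last resort: content-only search at a fixed 0.1 threshold.
        (Stage::ContentOnly, 0.1, true),
    ]
}

/// Run `search(threshold, content_only)` stage by stage and stop at the
/// first stage that returns at least one hit.
fn progressive_search<T, F>(threshold: f32, mut search: F) -> Option<(Stage, f32, Vec<T>)>
where
    F: FnMut(f32, bool) -> Vec<T>,
{
    for (stage, stage_threshold, content_only) in stage_plan(threshold) {
        let hits = search(stage_threshold, content_only);
        if !hits.is_empty() {
            return Some((stage, stage_threshold, hits));
        }
    }
    None
}
```

Returning the stage and the threshold actually used alongside the hits matches the metadata the tool reports. The sketch returns None when every stage comes back empty.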

Search Strategies:

TagsFirst: Prioritizes tag similarity, then filters by content
ContentFirst: Prioritizes content similarity, then enhances with tag scores
Hybrid: Balances both tag and content similarity

Configurable Parameters:

query: Search text (required)
tag_filter: Filter results by specific tags
similarity_threshold: Minimum similarity score (0.0-1.0, default: 0.7)
max_results: Maximum results to return (default: 10)
search_strategy: TagsFirst, ContentFirst, or Hybrid (default)
boost_recent: Apply recency boost to newer memories
tag_weight / content_weight: Customize scoring balance (default: 0.4/0.6)
use_tag_embedding / use_content_embedding: Enable/disable embedding types
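
As a rough illustration of how tag_weight, content_weight, similarity_threshold, and max_results could interact, the sketch below assumes the Hybrid strategy combines the two similarities as a weighted sum. That formula is an assumption, not something this README specifies, and Candidate, hybrid_score, and rank are hypothetical names.

```rust
/// A search candidate with precomputed similarities in the range 0.0-1.0.
/// Hypothetical type for illustration.
#[derive(Debug, Clone)]
struct Candidate {
    id: u32,
    tag_similarity: f32,
    content_similarity: f32,
}

/// Assumed Hybrid score: a weighted sum of tag and content similarity.
fn hybrid_score(c: &Candidate, tag_weight: f32, content_weight: f32) -> f32 {
    tag_weight * c.tag_similarity + content_weight * c.content_similarity
}

/// Score every candidate, drop those below the threshold, sort best-first,
/// and keep at most `max_results` entries as (id, score) pairs.
fn rank(
    candidates: &[Candidate],
    tag_weight: f32,
    content_weight: f32,
    similarity_threshold: f32,
    max_results: usize,
) -> Vec<(u32, f32)> {
    let mut scored: Vec<(u32, f32)> = candidates
        .iter()
        .map(|c| (c.id, hybrid_score(c, tag_weight, content_weight)))
        .filter(|&(_, score)| score >= similarity_threshold)
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(max_results);
    scored
}
```

With the default 0.4/0.6 weights, a memory with weak tag overlap but a strong content match can outrank one with strong tag overlap and a mediocre content match.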

Progressive Search Benefits:

Fewer Empty Results: Automatically retries with relaxed criteria before returning nothing
Quality First: Tries high-quality matches first, falls back gracefully
Metadata Included: Response includes stage used, actual threshold, and search description
Full Fallback: Uses PostgreSQL ILIKE pattern matching when embeddings unavailable

The search uses existing embeddings from Codex-Dreams when available and falls back to PostgreSQL text matching (ILIKE) otherwise.

MCP Tools (Claude Desktop)

Codex provides 6 MCP tools:

Tool	Description	Parameters
`store_memory`	Store text with metadata	content, context, summary, tags
`search_memory`	Progressive 3-stage search with automatic fallback	query, tag_filter, similarity_threshold, max_results, search_strategy, and more
`get_memory`	Retrieve by ID	id (UUID)
`delete_memory`	Remove by ID	id (UUID)
`get_statistics`	Get storage stats	none
`store_file`	Chunk and store files	file_path, chunk_size, overlap, tags
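
Claude Desktop normally builds these calls for you. For manual testing over stdio, the sketch below constructs a JSON-RPC 2.0 tools/call request for store_memory using only the standard library. The helper names are hypothetical, and passing tags as a JSON array of strings is an assumption; check the tool's reported input schema before relying on it.

```rust
/// Escape a string for embedding inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for ch in s.chars() {
        match ch {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build a single-line JSON-RPC 2.0 request that calls the store_memory tool.
/// The argument names follow the Parameters column of the table above.
fn store_memory_request(
    id: u64,
    content: &str,
    context: &str,
    summary: &str,
    tags: &[&str],
) -> String {
    let tags_json: Vec<String> = tags
        .iter()
        .map(|tag| format!("\"{}\"", json_escape(tag)))
        .collect();
    format!(
        r#"{{"jsonrpc":"2.0","id":{},"method":"tools/call","params":{{"name":"store_memory","arguments":{{"content":"{}","context":"{}","summary":"{}","tags":[{}]}}}}}}"#,
        id,
        json_escape(content),
        json_escape(context),
        json_escape(summary),
        tags_json.join(",")
    )
}
```

An MCP server expects an initialize handshake before tool calls, so a request like this is only meaningful after that exchange has completed.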

Claude Desktop Configuration

Add to your Claude Desktop config:

{
  "mcpServers": {
    "codex-memory": {
      "command": "/path/to/codex-memory",
      "args": ["mcp"],
      "env": {
        "DATABASE_URL": "postgresql://codex_user:codex_pass@localhost:5432/codex_db"
      }
    }
  }
}

🏗️ API

Rust API Example

use codex_memory::{Storage, Config, create_pool};
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create connection pool
    let config = Config::from_env()?;
    let pool = create_pool(&config.database_url).await?;
    let storage = Arc::new(Storage::new(pool));
    
    // Store content
    let id = storage.store(
        "Content to store",
        "Context information".to_string(),
        "Brief summary".to_string(),
        Some(vec!["tag1".to_string(), "tag2".to_string()])
    ).await?;
    
    // Retrieve content
    if let Some(memory) = storage.get(id).await? {
        println!("Retrieved: {}", memory.content);
    }
    
    Ok(())
}

Database Schema

Core Codex Schema

CREATE TABLE memories (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    content TEXT NOT NULL,
    content_hash VARCHAR(64) NOT NULL UNIQUE,
    context TEXT NOT NULL,
    summary TEXT NOT NULL,
    metadata JSONB DEFAULT '{}',
    tags TEXT[] DEFAULT '{}',
    chunk_index INTEGER DEFAULT NULL,
    total_chunks INTEGER DEFAULT NULL,
    parent_id UUID DEFAULT NULL,
    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW()
);

Schema Extensions Support

The memories table can be optionally extended by companion applications like Codex-Dreams:

-- Core Codex schema (always present - shown above)

-- Optional extensions added by Codex-Dreams (ignored by Codex)
ALTER TABLE memories ADD COLUMN IF NOT EXISTS embedding_vector vector(768);    -- For similarity search
ALTER TABLE memories ADD COLUMN IF NOT EXISTS semantic_cluster INTEGER;        -- For cognitive grouping
ALTER TABLE memories ADD COLUMN IF NOT EXISTS importance_score FLOAT;          -- For memory ranking
ALTER TABLE memories ADD COLUMN IF NOT EXISTS last_accessed TIMESTAMPTZ;       -- For access patterns

-- Extensions don't affect Codex operations - all original commands work unchanged

Important Notes:

Codex Memory only uses the core columns listed in the first schema
Companion applications may add columns but never modify existing ones
All Codex CLI commands and MCP operations continue working with extensions present
Extensions can be removed without affecting core Codex data (only the data stored in the extension columns is lost)

🧪 Testing

# Run all tests
cargo test

# Run with output
cargo test -- --nocapture

# Run specific test suite
cargo test integration
cargo test unit
cargo test edge_cases

# Run with coverage (requires cargo-tarpaulin)
cargo tarpaulin --out Html

🚀 Performance

Core Performance Benchmarks

Operation	Performance	Notes
Store (small)	~5ms	Including deduplication
Store (chunked)	~10ms/chunk	8KB chunks
Retrieve	~2ms	By UUID
Delete	~3ms	Single operation
Statistics	~15ms	Aggregate query

Optimization Features

Connection pooling (pool of 5 connections)
Prepared statements
Index optimization (B-tree and GIN indexes)
SHA-256 content deduplication
Async I/O throughout

Performance with Companion Applications

When using applications like Codex-Dreams that extend the schema:

Impact Analysis

Storage Performance: Unaffected (5ms store, 2ms retrieve maintained)
Additional Indexes: pgvector HNSW indexes don't impact Codex operations
Memory Usage: Additional columns increase row size but don't affect core operations
Query Performance: Core Codex queries use different indexes, no interference
Compatibility: All existing Codex CLI commands work at full speed

Storage Sizing

Extended schemas require additional storage:

Base Codex data: ~1KB per memory average
With embeddings (Codex-Dreams): ~4KB per memory (768-dim float vectors)
Indexes: Additional ~20-30% storage overhead for similarity search
Example: 10,000 memories = ~10MB base, ~40MB with embeddings
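
The arithmetic behind these figures can be checked with a short calculation. The sketch below assumes 4-byte floats for each embedding dimension and the ~1KB average base row quoted above. The function names are illustrative, and index overhead is left out.

```rust
/// Size of one single-precision float in bytes.
const BYTES_PER_F32: u64 = 4;

/// Raw size of one embedding vector: 768 dimensions * 4 bytes = 3072 bytes.
fn embedding_bytes(dimensions: u64) -> u64 {
    dimensions * BYTES_PER_F32
}

/// Estimated table size in bytes, excluding indexes.
/// `dimensions` is None for a plain Codex install without embeddings.
fn estimate_bytes(memories: u64, base_bytes_per_memory: u64, dimensions: Option<u64>) -> u64 {
    let per_memory = base_bytes_per_memory + dimensions.map_or(0, embedding_bytes);
    memories * per_memory
}
```

For 10,000 memories at 1,024 bytes each, this gives 10,240,000 bytes (about 10MB) without embeddings and 40,960,000 bytes (about 40MB) with 768-dimension vectors, in line with the example above.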

🔧 Development

Building from Source

# Development build
cargo build

# Release build (optimized)
cargo build --release

# Run clippy lints
cargo clippy -- -D warnings

# Format code
cargo fmt

# Security audit
cargo audit

Project Structure

codex-memory/
├── src/
│   ├── main.rs           # CLI entry point
│   ├── lib.rs            # Library exports
│   ├── storage.rs        # Core storage logic
│   ├── models.rs         # Data structures
│   ├── database/         # Database operations
│   ├── mcp_server/       # MCP protocol implementation
│   └── chunking.rs       # File chunking logic
├── tests/
│   ├── unit/             # Unit tests
│   ├── integration/      # Integration tests
│   └── edge_cases/       # Edge case tests
└── migrations/           # Database migrations

❓ FAQ

Can I use other applications with my Codex database?

Yes! Codex stores data in a standard PostgreSQL table that other applications can read from and extend. Companion applications include:

Codex-Dreams: Adds semantic search and biological memory modeling
Other applications can be built to extend the memories table with additional features

Will companion applications break my Codex installation?

No. Well-designed companion applications like Codex-Dreams only ADD columns to the memories table and never modify or remove existing data. Your Codex CLI and MCP server continue working exactly as before.

How do companion applications extend the schema?

Companion apps use PostgreSQL's ALTER TABLE ADD COLUMN to add their specific fields. For example:

-- Codex-Dreams adds optional columns for semantic search
ALTER TABLE memories ADD COLUMN IF NOT EXISTS embedding_vector vector(768);

These columns are ignored by Codex but enable advanced features in the companion app.

How do I remove extensions added by companion applications?

Extensions are typically just additional columns that can be safely removed:

-- Example: Remove Codex-Dreams extensions (optional)
ALTER TABLE memories DROP COLUMN IF EXISTS embedding_vector;
ALTER TABLE memories DROP COLUMN IF EXISTS semantic_cluster;
-- Your core Codex data remains intact

What if I want to use Codex-Dreams features but keep databases separate?

While the recommended setup is to share the database, you can configure Codex-Dreams to periodically sync from your Codex database if needed. See the Codex-Dreams documentation for sync configuration.

Do I need to install companion applications?

No. Codex Memory is fully functional on its own. Companion applications are optional enhancements that add specialized features when you need them.

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Fork the repository
Create your feature branch (git checkout -b feature/AmazingFeature)
Commit your changes (git commit -m 'Add some AmazingFeature')
Push to the branch (git push origin feature/AmazingFeature)
Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🔗 Related Projects

Codex-Dreams - Companion app for AI-powered memory insights and cognitive processing
Claude Desktop - Anthropic's Claude Desktop application
MCP Specification - Model Context Protocol specification

📞 Support

Issues: GitHub Issues
Discussions: GitHub Discussions
Documentation: See ARCHITECTURE.md for detailed system design

🙏 Acknowledgments

Built with Rust and PostgreSQL
MCP protocol for LLM integration
Tokio for async runtime
SQLx for database operations

💡 Pro Tip: Maximize your memory system by using both applications together. Codex Memory handles fast storage and retrieval, while Codex-Dreams adds intelligent analysis and insights generation.

codex-memory 3.0.13