codex-memory 3.0.15

# Codex Memory

[![Rust](https://img.shields.io/badge/rust-%23000000.svg?style=for-the-badge&logo=rust&logoColor=white)](https://www.rust-lang.org/)
[![PostgreSQL](https://img.shields.io/badge/PostgreSQL-316192?style=for-the-badge&logo=postgresql&logoColor=white)](https://www.postgresql.org/)
[![License: GPL-3.0](https://img.shields.io/badge/License-GPL--3.0-blue.svg)](https://opensource.org/licenses/GPL-3.0)

A high-performance memory storage service written in Rust, designed for reliable content management and extensibility through companion applications. Codex Memory provides deduplicated storage, automatic chunking, and an MCP (Model Context Protocol) interface for Claude Desktop integration, while keeping a clean, extensible architecture that supports companion applications such as Codex-Dreams.

> **⚠️ Development Notice**  
> This project was developed using AI-assisted iterative coding. While functional and tested, the codebase may contain unconventional patterns or edge cases. We recommend thorough testing in development environments before production use. Contributions and feedback are welcome!

## 🚀 Features

### Core Capabilities
- **🗄️ Reliable Text Storage** - PostgreSQL-backed storage with ACID compliance
- **🔒 Content Deduplication** - SHA-256 hash-based automatic deduplication
- **📄 Smart File Chunking** - Automatic chunking with configurable overlap for large files
- **🏷️ Tag-Based Organization** - Flexible tagging system for categorization
- **🔗 Parent-Child Relationships** - Maintains relationships between chunks and source documents
- **🤖 MCP Integration** - Native Model Context Protocol support for Claude Desktop
- **🔍 Progressive Search** - 3-stage intelligent search that automatically finds results without manual threshold tuning

### Technical Features
- **Connection Pooling** - Optimized connection management (5 connections)
- **Async/Await Architecture** - Built on Tokio for high concurrency
- **Comprehensive Error Handling** - Proper Result types throughout
- **UTF-8 Safe Chunking** - Respects character boundaries in all operations
- **Full-Text Search** - PostgreSQL-powered search capabilities
- **Process Reliability** - Singleton process management with health monitoring
- **Graceful Shutdown** - SIGTERM/SIGINT handling with resource cleanup
- **Auto-Recovery** - Wrapper script with restart capabilities and rate limiting

## 🤝 Companion Applications

### Codex-Dreams: Semantic Memory Enhancement

**[Codex-Dreams](https://github.com/Ladvien/codex-dreams)** is a companion application that adds biological memory modeling and semantic search capabilities to your Codex-stored memories.

**What Codex-Dreams Adds:**
- **Semantic embeddings** for similarity search across your stored memories (768-dim vectors)
- **Cognitive modeling** using Miller's 7±2, Hebbian learning, and memory consolidation
- **Advanced analytics** including memory clustering and biological insights
- **pgvector-powered search** with <250ms query performance
- **Pattern recognition** across your entire memory corpus

**How They Work Together:**
- **Codex Memory** (this project): Handles fast storage, retrieval, and MCP integration (your data layer)
- **Codex-Dreams**: Reads your data and adds semantic processing (the intelligence layer)
- **Shared Database**: Both use the same PostgreSQL `memories` table
- **Non-Destructive**: Codex-Dreams extends but never modifies core Codex functionality
- **Independent Operation**: Each system works alone or together

### Integration Status
- ✅ **Schema Compatible**: Codex-Dreams adds optional columns, preserves all existing fields
- ✅ **Performance Optimized**: HNSW indexes don't interfere with Codex operations
- ✅ **Production Ready**: Successfully processing 8,600+ memories with 99.98% coverage
- ✅ **Zero Configuration**: Both applications automatically share the same database

## 📋 Architecture

Codex follows a modular architecture focused on simplicity, reliability, and extensibility:

```mermaid
graph TD
    A[Codex CLI] -->|Store/Retrieve| B[(PostgreSQL: memories table)]
    C[MCP Server] -->|JSON-RPC 2.0| B
    C -->|Claude Desktop| D[AI Assistant Integration]
    
    %% Optional semantic enhancement
    E[Codex-Dreams] -.->|Reads & Enhances| B
    E -.->|Adds embeddings| F[pgvector similarity search]
    
    B -->|Fast retrieval| G[2ms response time]
    B -->|Reliable storage| H[ACID compliance]
    
    style E fill:#f9f,stroke:#333,stroke-width:2px,stroke-dasharray: 5 5
    style F fill:#f9f,stroke:#333,stroke-width:2px,stroke-dasharray: 5 5
```

### Core Architecture
```
┌─────────────────────────────────────────────────────────────────┐
│                      Codex Memory Architecture                   │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌─────────────────┐    ┌─────────────────┐    ┌─────────────┐ │
│  │   CLI Interface │    │   MCP Server    │    │  PostgreSQL │ │
│  │                 │    │                 │    │  Database   │ │
│  │  • store        │◄──►│  • JSON-RPC 2.0 │◄──►│             │ │
│  │  • get          │    │  • 6 MCP Tools  │    │  • Indexes  │ │
│  │  • stats        │    │  • stdio I/O    │    │  • ACID     │ │
│  │  • setup        │    │                 │    │             │ │
│  └─────────────────┘    └─────────────────┘    └─────────────┘ │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
```


## 🛠️ Installation

### Prerequisites
- Rust 1.70 or higher
- PostgreSQL 14 or higher
- Claude Desktop (optional, for MCP integration)

### Quick Start

```bash
# Clone the repository
git clone https://github.com/Ladvien/codex-memory.git
cd codex-memory

# Set up environment
cp .env.example .env
# Edit .env with your database credentials

# Install
cargo install --path . --force

# Setup database (creates database, user, and tables)
codex-memory setup

# Run MCP server for Claude Desktop
codex-memory mcp
```

### 🚀 Advanced Setup with Companion Applications

#### Adding Semantic Search with Codex-Dreams

1. **Install Codex Memory first** (provides the storage foundation):
   ```bash
   cargo install codex-memory
   codex-memory setup --database-url postgresql://user:pass@host:5432/codex_db
   ```

2. **Store some memories**:
   ```bash
   codex-memory store "Your content here" --context "Meeting notes" --summary "Team sync"
   ```

3. **Install Codex-Dreams for semantic enhancement**:
   ```bash
   git clone https://github.com/Ladvien/codex-dreams.git
   cd codex-dreams
   pip install -e .
   # Configure to use the same database
   cp .env.example .env  # Edit DATABASE_URL to match your Codex database
   ```

4. **Run semantic processing**:
   ```bash
   # Generate embeddings for existing memories
   python -m codex_dreams.process_memories
   ```

5. **Use enhanced features**:
   - Continue using Codex for fast storage/retrieval
   - Use Codex-Dreams for semantic similarity search
   - Both systems share the same data seamlessly

#### Verification Steps

```bash
# Verify Codex is working
codex-memory stats

# Check if Codex-Dreams extensions are present (optional)
psql "$DATABASE_URL" -c "\d memories"
# Look for embedding_vector column if Codex-Dreams is installed
```

### Environment Configuration

Create a `.env` file with:

```bash
DATABASE_URL=postgresql://codex_user:codex_pass@localhost:5432/codex_db
RUST_LOG=info  # Optional: debug, info, warn, error
```

## 📖 Usage

### Command Line Interface

```bash
# Store content with metadata
codex-memory store "Your content here" \
  --context "Meeting notes" \
  --summary "Q4 planning discussion" \
  --tags "meeting,planning,q4"

# Retrieve content by ID
codex-memory get <UUID>

# View storage statistics
codex-memory stats

# Run MCP server for Claude Desktop
codex-memory mcp
```

### 🔄 Companion Workflow with Codex-Dreams

For the full cognitive memory experience, use both applications together:

```bash
# 1. Store memories with Codex Memory (fast, reliable)
codex-memory store "Research findings on neural networks" \
  --context "AI Research" \
  --summary "Key insights from latest papers" \
  --tags "ai,research,neural-networks"

# 2. Generate insights with Codex-Dreams (AI-powered analysis)
codex-dreams generate-insights --time-period week
codex-dreams show-insights --limit 5
codex-dreams search-insights "neural networks" --limit 10
```

> **💡 Workflow Tip**: Use Codex Memory for day-to-day storage and retrieval, then run Codex-Dreams periodically to generate insights and discover patterns across your stored memories.

### 🔍 Progressive Search System

The `search_memory` tool provides intelligent 3-stage progressive search that automatically finds results without manual threshold tuning:

**Progressive Search Stages:**
1. **Stage 1 (Original)**: Search with your specified parameters and threshold
2. **Stage 2 (Relaxed)**: If no results, automatically lower threshold by 0.25 (minimum 0.1)
3. **Stage 3 (Content-Only)**: If still no results, do content-only similarity search at 0.1 threshold
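
The three stages above can be sketched as a small threshold schedule plus a retry loop. This is a minimal illustration under stated assumptions, not the project's implementation: `Stage`, `stage_thresholds`, and `progressive_search` are hypothetical names, and the search itself is passed in as a closure so the sketch needs no database.

```rust
/// The three stages described above, in the order they are tried.
/// (Hypothetical type, not taken from the codex-memory source.)
#[derive(Debug, Clone, Copy, PartialEq)]
enum Stage {
    Original,
    Relaxed,
    ContentOnly,
}

/// Threshold schedule: the caller's threshold, then 0.25 lower (floored at
/// 0.1), then a fixed 0.1 for the content-only pass.
fn stage_thresholds(initial: f32) -> [(Stage, f32); 3] {
    let relaxed = (initial - 0.25).max(0.1);
    [
        (Stage::Original, initial),
        (Stage::Relaxed, relaxed),
        (Stage::ContentOnly, 0.1),
    ]
}

/// Runs `search` once per stage and returns the first non-empty result set,
/// together with the stage and threshold that produced it.
fn progressive_search<T, F>(initial: f32, mut search: F) -> Option<(Stage, f32, Vec<T>)>
where
    F: FnMut(Stage, f32) -> Vec<T>,
{
    for (stage, threshold) in stage_thresholds(initial) {
        let hits = search(stage, threshold);
        if !hits.is_empty() {
            return Some((stage, threshold, hits));
        }
    }
    None
}

fn main() {
    // Hypothetical pre-scored candidates standing in for database rows.
    let scored = [("rust notes", 0.5_f32), ("grocery list", 0.2)];
    let result = progressive_search(0.7, |_stage, threshold| {
        scored
            .iter()
            .filter(|(_, score)| *score >= threshold)
            .map(|(name, _)| *name)
            .collect::<Vec<_>>()
    });
    println!("{:?}", result);
}
```

Returning the stage and threshold alongside the hits mirrors the metadata the response is described as including.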

**Search Strategies:**
- **TagsFirst**: Prioritizes tag similarity, then filters by content
- **ContentFirst**: Prioritizes content similarity, then enhances with tag scores  
- **Hybrid**: Balances both tag and content similarity

**Configurable Parameters:**
- `query`: Search text (required)
- `tag_filter`: Filter results by specific tags
- `similarity_threshold`: Minimum similarity score (0.0-1.0, default: 0.7)
- `max_results`: Maximum results to return (default: 10)
- `search_strategy`: TagsFirst, ContentFirst, or Hybrid (default)
- `boost_recent`: Apply recency boost to newer memories
- `tag_weight` / `content_weight`: Customize scoring balance (default: 0.4/0.6)
- `use_tag_embedding` / `use_content_embedding`: Enable/disable embedding types
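
To make the `tag_weight` / `content_weight` balance concrete, here is a rough sketch of how a hybrid score could be computed from two similarity values. The function name `hybrid_score` and the normalization by the sum of the weights are assumptions for illustration; the actual scoring formula used by `search_memory` may differ.

```rust
/// Weighted blend of tag and content similarity (hypothetical helper).
/// Dividing by the sum of the weights is an assumption made here so that
/// weights which do not add up to 1.0 still yield a score in 0.0..=1.0.
fn hybrid_score(tag_sim: f32, content_sim: f32, tag_weight: f32, content_weight: f32) -> f32 {
    let total = tag_weight + content_weight;
    if total <= 0.0 {
        // No usable weights: treat the candidate as not matching.
        return 0.0;
    }
    (tag_weight * tag_sim + content_weight * content_sim) / total
}

fn main() {
    // Documented defaults: tag_weight = 0.4, content_weight = 0.6.
    let score = hybrid_score(0.9, 0.5, 0.4, 0.6);
    let threshold = 0.7; // documented default for similarity_threshold
    println!("score = {:.2}, passes = {}", score, score >= threshold);
}
```

With the default weights, a memory whose tags match strongly but whose content matches weakly can still fall below the default threshold, which is the kind of case the relaxed stages are meant to catch.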

**Progressive Search Benefits:**
- **No Empty Results**: Automatically retries with relaxed criteria to find relevant content
- **Quality First**: Tries high-quality matches first, falls back gracefully
- **Metadata Included**: Response includes stage used, actual threshold, and search description
- **Full Fallback**: Uses PostgreSQL ILIKE pattern matching when embeddings unavailable

The search uses embeddings generated by Codex-Dreams when they are available and otherwise falls back to PostgreSQL text matching (`ILIKE`).

### MCP Tools (Claude Desktop)

Codex provides 6 MCP tools:

| Tool | Description | Parameters |
|------|-------------|------------|
| `store_memory` | Store text with metadata | content, context, summary, tags |
| `search_memory` | Progressive 3-stage search with automatic fallback | query, tag_filter, similarity_threshold, max_results, search_strategy, and more |
| `get_memory` | Retrieve by ID | id (UUID) |
| `delete_memory` | Remove by ID | id (UUID) |
| `get_statistics` | Get storage stats | none |
| `store_file` | Chunk and store files | file_path, chunk_size, overlap, tags |
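
The `chunk_size` and `overlap` parameters of `store_file` suggest a sliding-window split. The sketch below shows one way to do that while respecting UTF-8 character boundaries. It assumes both values are measured in bytes, and `chunk_with_overlap` is a hypothetical helper, not the function used in `chunking.rs`.

```rust
/// Splits `text` into chunks of at most `chunk_size` bytes. Each chunk after
/// the first starts up to `overlap` bytes before the previous chunk ended.
/// Every cut is moved onto a UTF-8 character boundary, so slicing never
/// panics on multi-byte characters.
fn chunk_with_overlap(text: &str, chunk_size: usize, overlap: usize) -> Vec<&str> {
    assert!(
        chunk_size > 0 && overlap < chunk_size,
        "overlap must be smaller than chunk_size"
    );
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < text.len() {
        // Tentative end, pulled back to the nearest character boundary.
        let mut end = (start + chunk_size).min(text.len());
        while !text.is_char_boundary(end) {
            end -= 1;
        }
        if end <= start {
            // chunk_size is smaller than one character: take that character whole.
            end = start + 1;
            while !text.is_char_boundary(end) {
                end += 1;
            }
        }
        chunks.push(&text[start..end]);
        if end == text.len() {
            break;
        }
        // Step back by `overlap`, but always advance past `start`, then move
        // forward to a character boundary (which may shrink the overlap).
        let mut next = end.saturating_sub(overlap).max(start + 1);
        while !text.is_char_boundary(next) {
            next += 1;
        }
        start = next;
    }
    chunks
}

fn main() {
    let text = "Grüße aus Köln: notes on chunking";
    for (index, chunk) in chunk_with_overlap(text, 12, 3).iter().enumerate() {
        println!("chunk {}: {:?}", index, chunk);
    }
}
```

Each returned slice would correspond to one row with its own `chunk_index`, sharing a `parent_id` with the source document.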

### Claude Desktop Configuration

Add to your Claude Desktop config:

```json
{
  "mcpServers": {
    "codex-memory": {
      "command": "/path/to/codex-memory",
      "args": ["mcp"],
      "env": {
        "DATABASE_URL": "postgresql://codex_user:codex_pass@localhost:5432/codex_db"
      }
    }
  }
}
```
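
Once configured, the client talks to `codex-memory mcp` over stdio using JSON-RPC 2.0. As a rough illustration, the sketch below builds the kind of `tools/call` request the Model Context Protocol defines for invoking a tool. The helper names are hypothetical, and sending `tags` as a JSON array is an assumption; check the tool's published input schema for the exact argument shapes.

```rust
/// Minimal JSON string escaper, enough for this sketch.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Builds a JSON-RPC 2.0 `tools/call` request for the `store_memory` tool.
/// (Hypothetical helper; argument names follow the tool table above.)
fn store_memory_request(id: u64, content: &str, context: &str, summary: &str, tags: &[&str]) -> String {
    let tags_json: Vec<String> = tags
        .iter()
        .map(|tag| format!("\"{}\"", json_escape(tag)))
        .collect();
    format!(
        "{{\"jsonrpc\":\"2.0\",\"id\":{},\"method\":\"tools/call\",\"params\":{{\"name\":\"store_memory\",\"arguments\":{{\"content\":\"{}\",\"context\":\"{}\",\"summary\":\"{}\",\"tags\":[{}]}}}}}}",
        id,
        json_escape(content),
        json_escape(context),
        json_escape(summary),
        tags_json.join(",")
    )
}

fn main() {
    let request = store_memory_request(
        1,
        "Your content here",
        "Meeting notes",
        "Q4 planning discussion",
        &["meeting", "planning", "q4"],
    );
    // With the stdio transport, each message is written as a single line.
    println!("{}", request);
}
```

In practice Claude Desktop generates these messages itself; the sketch is mainly useful for testing the server by hand or from a script.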

## 🏗️ API

### Rust API Example

```rust
use codex_memory::{Storage, Config, create_pool};
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create connection pool
    let config = Config::from_env()?;
    let pool = create_pool(&config.database_url).await?;
    let storage = Arc::new(Storage::new(pool));
    
    // Store content
    let id = storage.store(
        "Content to store",
        "Context information".to_string(),
        "Brief summary".to_string(),
        Some(vec!["tag1".to_string(), "tag2".to_string()])
    ).await?;
    
    // Retrieve content
    if let Some(memory) = storage.get(id).await? {
        println!("Retrieved: {}", memory.content);
    }
    
    Ok(())
}
```

### Database Schema

#### Core Codex Schema

```sql
CREATE TABLE memories (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    content TEXT NOT NULL,
    content_hash VARCHAR(64) NOT NULL UNIQUE,
    context TEXT NOT NULL,
    summary TEXT NOT NULL,
    metadata JSONB DEFAULT '{}',
    tags TEXT[] DEFAULT '{}',
    chunk_index INTEGER DEFAULT NULL,
    total_chunks INTEGER DEFAULT NULL,
    parent_id UUID DEFAULT NULL,
    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW()
);
```
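
The `UNIQUE` constraint on `content_hash` is what makes deduplication work: identical content maps to the same hash, so a second insert can be detected before a new row is written. The sketch below imitates that flow in memory. It makes several simplifying assumptions: `DedupStore` is a hypothetical stand-in for the table, IDs are plain integers rather than UUIDs, the standard library's `DefaultHasher` stands in for SHA-256 to avoid external crates, and returning the existing ID on a duplicate is an assumed behavior.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Stand-in for the SHA-256 hex digest stored in `content_hash`.
/// NOTE: DefaultHasher is NOT SHA-256; it is used only to keep the sketch
/// dependency-free and must not be used for real deduplication.
fn content_key(content: &str) -> String {
    let mut hasher = DefaultHasher::new();
    content.hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}

/// In-memory stand-in for the `memories` table: content hash -> row id.
struct DedupStore {
    by_hash: HashMap<String, u64>,
    next_id: u64,
}

impl DedupStore {
    fn new() -> Self {
        DedupStore { by_hash: HashMap::new(), next_id: 1 }
    }

    /// Returns `(id, true)` when a new row was created and `(id, false)`
    /// when identical content was already stored.
    fn store(&mut self, content: &str) -> (u64, bool) {
        let key = content_key(content);
        if let Some(&existing) = self.by_hash.get(&key) {
            return (existing, false);
        }
        let id = self.next_id;
        self.next_id += 1;
        self.by_hash.insert(key, id);
        (id, true)
    }
}

fn main() {
    let mut store = DedupStore::new();
    println!("{:?}", store.store("Q4 planning notes"));
    println!("{:?}", store.store("Q4 planning notes")); // same content, same id
    println!("{:?}", store.store("Different content"));
}
```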

#### Schema Extensions Support

The `memories` table can be optionally extended by companion applications like Codex-Dreams:

```sql
-- Core Codex schema (always present - shown above)

-- Optional extensions added by Codex-Dreams (ignored by Codex)
ALTER TABLE memories ADD COLUMN IF NOT EXISTS embedding_vector vector(768);    -- For similarity search
ALTER TABLE memories ADD COLUMN IF NOT EXISTS semantic_cluster INTEGER;        -- For cognitive grouping
ALTER TABLE memories ADD COLUMN IF NOT EXISTS importance_score FLOAT;          -- For memory ranking
ALTER TABLE memories ADD COLUMN IF NOT EXISTS last_accessed TIMESTAMPTZ;       -- For access patterns

-- Extensions don't affect Codex operations - all original commands work unchanged
```

**Important Notes:**
- Codex Memory only uses the core columns listed in the core schema above
- Companion applications may add columns but never modify existing ones
- All Codex CLI commands and MCP operations continue working with extensions present
- Extension columns can be dropped without affecting core Codex data (the data held in those columns is lost)

## 🧪 Testing

```bash
# Run all tests
cargo test

# Run with output
cargo test -- --nocapture

# Run specific test suite
cargo test integration
cargo test unit
cargo test edge_cases

# Run with coverage (requires cargo-tarpaulin)
cargo tarpaulin --out Html
```

## 🚀 Performance

### Core Performance Benchmarks
| Operation | Performance | Notes |
|-----------|-------------|-------|
| Store (small) | ~5ms | Including deduplication |
| Store (chunked) | ~10ms/chunk | 8KB chunks |
| Retrieve | ~2ms | By UUID |
| Delete | ~3ms | Single operation |
| Statistics | ~15ms | Aggregate query |

### Optimization Features
- Connection pooling (pool of 5 connections)
- Prepared statements
- Index optimization (B-tree and GIN indexes)
- SHA-256 content deduplication
- Async I/O throughout

### Performance with Companion Applications

When using applications like Codex-Dreams that extend the schema:

#### Impact Analysis
- **Storage Performance**: Unaffected (5ms store, 2ms retrieve maintained)
- **Additional Indexes**: pgvector HNSW indexes don't impact Codex operations
- **Memory Usage**: Additional columns increase row size but don't affect core operations
- **Query Performance**: Core Codex queries use different indexes, no interference
- **Compatibility**: All existing Codex CLI commands work at full speed

#### Storage Sizing

Extended schemas require additional storage:
- **Base Codex data**: ~1KB per memory average
- **With embeddings** (Codex-Dreams): ~4KB per memory (~1KB base plus ~3KB for a 768-dim vector of 4-byte floats)
- **Indexes**: Additional ~20-30% storage overhead for similarity search
- **Example**: 10,000 memories = ~10MB base, ~40MB with embeddings
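
The sizing figures above follow from simple arithmetic, sketched here so you can plug in your own numbers. The constants come from the estimates in this section. The function names are hypothetical, and 1KB is treated as 1,000 bytes to keep the numbers round.

```rust
/// Rough per-memory base size from the estimate above (~1KB).
const BASE_BYTES_PER_MEMORY: u64 = 1_000;
/// Embedding dimensions used by Codex-Dreams.
const EMBEDDING_DIMS: u64 = 768;
/// Assumes single-precision (4-byte) floats per dimension.
const BYTES_PER_FLOAT: u64 = 4;

/// Raw size of one embedding vector, excluding any per-value header.
fn embedding_bytes() -> u64 {
    EMBEDDING_DIMS * BYTES_PER_FLOAT
}

/// Estimated table size in bytes, before index overhead.
fn estimate_bytes(memories: u64, with_embeddings: bool) -> u64 {
    let per_memory = if with_embeddings {
        BASE_BYTES_PER_MEMORY + embedding_bytes()
    } else {
        BASE_BYTES_PER_MEMORY
    };
    memories * per_memory
}

/// Adds similarity-index overhead, given as a fraction (0.2 to 0.3 above).
fn with_index_overhead(bytes: u64, fraction: f64) -> u64 {
    (bytes as f64 * (1.0 + fraction)).round() as u64
}

fn main() {
    let memories = 10_000;
    let base = estimate_bytes(memories, false);
    let extended = estimate_bytes(memories, true);
    println!("base:            {:.1} MB", base as f64 / 1e6);
    println!("with embeddings: {:.1} MB", extended as f64 / 1e6);
    println!("plus 25% index:  {:.1} MB", with_index_overhead(extended, 0.25) as f64 / 1e6);
}
```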

## 🔧 Development

### Building from Source

```bash
# Development build
cargo build

# Release build (optimized)
cargo build --release

# Run clippy lints
cargo clippy -- -D warnings

# Format code
cargo fmt

# Security audit
cargo audit
```

### Project Structure

```
codex-memory/
├── src/
│   ├── main.rs           # CLI entry point
│   ├── lib.rs            # Library exports
│   ├── storage.rs        # Core storage logic
│   ├── models.rs         # Data structures
│   ├── database/         # Database operations
│   ├── mcp_server/       # MCP protocol implementation
│   └── chunking.rs       # File chunking logic
├── tests/
│   ├── unit/             # Unit tests
│   ├── integration/      # Integration tests
│   └── edge_cases/       # Edge case tests
└── migrations/           # Database migrations
```

## ❓ FAQ

### Can I use other applications with my Codex database?

Yes! Codex stores data in a standard PostgreSQL table that other applications can read from and extend. Popular companion applications include:

- **[Codex-Dreams](https://github.com/Ladvien/codex-dreams)**: Adds semantic search and biological memory modeling
- Other applications can be built to extend the memories table with additional features

### Will companion applications break my Codex installation?

No. Well-designed companion applications like Codex-Dreams only ADD columns to the memories table and never modify or remove existing data. Your Codex CLI and MCP server continue working exactly as before.

### How do companion applications extend the schema?

Companion apps use PostgreSQL's `ALTER TABLE ADD COLUMN` to add their specific fields. For example:
```sql
-- Codex-Dreams adds optional columns for semantic search
ALTER TABLE memories ADD COLUMN IF NOT EXISTS embedding_vector vector(768);
```
These columns are ignored by Codex but enable advanced features in the companion app.

### How do I remove extensions added by companion applications?

Extensions are typically just additional columns that can be safely removed:
```sql
-- Example: Remove Codex-Dreams extensions (optional)
ALTER TABLE memories DROP COLUMN IF EXISTS embedding_vector;
ALTER TABLE memories DROP COLUMN IF EXISTS semantic_cluster;
-- Your core Codex data remains intact
```

### What if I want to use Codex-Dreams features but keep databases separate?

While the recommended setup is to share the database, you can configure Codex-Dreams to periodically sync from your Codex database if needed. See the Codex-Dreams documentation for sync configuration.

### Do I need to install companion applications?

No. Codex Memory is fully functional on its own. Companion applications are optional enhancements that add specialized features when you need them.

## 🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request

## 📄 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## 🔗 Related Projects

- [**Codex-Dreams**](https://github.com/Ladvien/codex-dreams) - Companion app for AI-powered memory insights and cognitive processing
- [Claude Desktop](https://claude.ai/desktop) - Anthropic's Claude Desktop application
- [MCP Specification](https://modelcontextprotocol.io) - Model Context Protocol specification

## 📞 Support

- **Issues**: [GitHub Issues](https://github.com/Ladvien/codex-memory/issues)
- **Discussions**: [GitHub Discussions](https://github.com/Ladvien/codex-memory/discussions)
- **Documentation**: See [ARCHITECTURE.md](ARCHITECTURE.md) for detailed system design

## 🙏 Acknowledgments

- Built with Rust and PostgreSQL
- MCP protocol for LLM integration
- Tokio for async runtime
- SQLx for database operations

---

**💡 Pro Tip**: Maximize your memory system by using both applications together. Codex Memory handles fast storage and retrieval, while [Codex-Dreams](https://github.com/Ladvien/codex-dreams) adds intelligent analysis and insights generation.