# Codex Memory
A high-performance, Rust-based memory storage service for reliable content management, built to be extended by companion applications. Codex Memory provides deduplicated storage, automatic chunking, and a Model Context Protocol (MCP) interface for Claude Desktop integration, while keeping a clean architecture that supports companion applications such as Codex-Dreams.
> **⚠️ Development Notice**
> This project was developed using AI-assisted iterative coding. While functional and tested, the codebase may contain unconventional patterns or edge cases. We recommend thorough testing in development environments before production use. Contributions and feedback are welcome!
## Features
### Core Capabilities
- **Reliable Text Storage** - PostgreSQL-backed storage with ACID compliance
- **Content Deduplication** - SHA-256 hash-based automatic deduplication
- **Smart File Chunking** - Automatic chunking with configurable overlap for large files
- **Tag-Based Organization** - Flexible tagging system for categorization
- **Parent-Child Relationships** - Maintains relationships between chunks and source documents
- **MCP Integration** - Native Model Context Protocol support for Claude Desktop
- **Progressive Search** - 3-stage search that automatically finds results without manual threshold tuning
### Technical Features
- **Connection Pooling** - Optimized connection management (5 connections)
- **Async/Await Architecture** - Built on Tokio for high concurrency
- **Comprehensive Error Handling** - Proper Result types throughout
- **UTF-8 Safe Chunking** - Respects character boundaries in all operations
- **Full-Text Search** - PostgreSQL-powered search capabilities
- **Process Reliability** - Singleton process management with health monitoring
- **Graceful Shutdown** - SIGTERM/SIGINT handling with resource cleanup
- **Auto-Recovery** - Wrapper script with restart capabilities and rate limiting
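The chunking behaviour listed above (fixed-size chunks, configurable overlap, cuts only on character boundaries) can be sketched as follows. This is a minimal illustration rather than the project's actual `chunking.rs`: the function name `chunk_text` and its byte-based parameters are assumptions made for the example.

```rust
/// Split `text` into chunks of at most `chunk_size` bytes, repeating roughly
/// `overlap` bytes between neighbouring chunks. Cuts never fall inside a
/// multi-byte UTF-8 character. (Hypothetical helper, not the crate's API.)
pub fn chunk_text(text: &str, chunk_size: usize, overlap: usize) -> Vec<&str> {
    assert!(chunk_size > 0, "chunk_size must be positive");
    assert!(overlap < chunk_size, "overlap must be smaller than chunk_size");

    let mut chunks = Vec::new();
    let mut start = 0;
    while start < text.len() {
        // Tentative end, then back off to the nearest character boundary.
        let mut end = (start + chunk_size).min(text.len());
        while !text.is_char_boundary(end) {
            end -= 1;
        }
        if end == start {
            // chunk_size is smaller than the next character: take it whole.
            end = start + 1;
            while !text.is_char_boundary(end) {
                end += 1;
            }
        }
        chunks.push(&text[start..end]);
        if end == text.len() {
            break;
        }
        // Step back by `overlap`, snapping to a boundary, but always move
        // forward so the loop terminates.
        let mut next = end.saturating_sub(overlap);
        while !text.is_char_boundary(next) {
            next -= 1;
        }
        start = if next > start { next } else { end };
    }
    chunks
}
```

Because the function returns slices of the input, no chunk content is copied; a caller would typically store each slice together with its index and the total chunk count.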
## Companion Applications
### Codex-Dreams: Semantic Memory Enhancement
**[Codex-Dreams](https://github.com/Ladvien/codex-dreams)** is a companion application that adds biological memory modeling and semantic search capabilities to your Codex-stored memories.
**What Codex-Dreams Adds:**
- **Semantic embeddings** for similarity search across your stored memories (768-dim vectors)
- **Cognitive modeling** using Miller's 7±2, Hebbian learning, and memory consolidation
- **Advanced analytics** including memory clustering and biological insights
- **pgvector-powered search** with <250ms query performance
- **Pattern recognition** across your entire memory corpus
**How They Work Together:**
- **Codex Memory** (this project): Handles fast storage, retrieval, and MCP integration (your data layer)
- **Codex-Dreams**: Reads your data and adds semantic processing (the intelligence layer)
- **Shared Database**: Both use the same PostgreSQL `memories` table
- **Non-Destructive**: Codex-Dreams extends but never modifies core Codex functionality
- **Independent Operation**: Each system works alone or together
### Integration Status
- ✅ **Schema Compatible**: Codex-Dreams adds optional columns, preserves all existing fields
- ✅ **Performance Optimized**: HNSW indexes don't interfere with Codex operations
- ✅ **Production Ready**: Successfully processing 8,600+ memories with 99.98% coverage
- ✅ **Zero Configuration**: Both applications automatically share the same database
## Architecture
Codex follows a modular architecture focused on simplicity, reliability, and extensibility:
```mermaid
graph TD
A[Codex CLI] -->|Store/Retrieve| B[(PostgreSQL: memories table)]
C[MCP Server] -->|JSON-RPC 2.0| B
C -->|Claude Desktop| D[AI Assistant Integration]
%% Optional semantic enhancement
E[Codex-Dreams] -.->|Reads & Enhances| B
E -.->|Adds embeddings| F[pgvector similarity search]
B -->|Fast retrieval| G[2ms response time]
B -->|Reliable storage| H[ACID compliance]
style E fill:#f9f,stroke:#333,stroke-width:2px,stroke-dasharray: 5 5
style F fill:#f9f,stroke:#333,stroke-width:2px,stroke-dasharray: 5 5
```
### Core Architecture
```
┌─────────────────────────────────────────────────────────────────┐
│                    Codex Memory Architecture                    │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌─────────────────┐    ┌─────────────────┐    ┌─────────────┐  │
│  │  CLI Interface  │    │   MCP Server    │    │ PostgreSQL  │  │
│  │                 │    │                 │    │  Database   │  │
│  │  • store        │◄──►│  • JSON-RPC 2.0 │◄──►│             │  │
│  │  • get          │    │  • 6 MCP Tools  │    │  • Indexes  │  │
│  │  • stats        │    │  • stdio I/O    │    │  • ACID     │  │
│  │  • setup        │    │                 │    │             │  │
│  └─────────────────┘    └─────────────────┘    └─────────────┘  │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
```
## Installation
### Prerequisites
- Rust 1.70 or higher
- PostgreSQL 14 or higher
- Claude Desktop (optional, for MCP integration)
### Quick Start
```bash
# Clone the repository
git clone https://github.com/Ladvien/codex-memory.git
cd codex-memory
# Set up environment
cp .env.example .env
# Edit .env with your database credentials
# Install
cargo install --path . --force
# Setup database (creates database, user, and tables)
codex-memory setup
# Run MCP server for Claude Desktop
codex-memory mcp
```
### Advanced Setup with Companion Applications
#### Adding Semantic Search with Codex-Dreams
1. **Install Codex Memory first** (provides the storage foundation):
```bash
cargo install codex-memory
codex-memory setup --database-url postgresql://user:pass@host:5432/codex_db
```
2. **Store some memories**:
```bash
codex-memory store "Your content here" --context "Meeting notes" --summary "Team sync"
```
3. **Install Codex-Dreams for semantic enhancement**:
```bash
git clone https://github.com/Ladvien/codex-dreams.git
cd codex-dreams
pip install -e .
cp .env.example .env
```
4. **Run semantic processing**:
```bash
python -m codex_dreams.process_memories
```
5. **Use enhanced features**:
- Continue using Codex for fast storage/retrieval
- Use Codex-Dreams for semantic similarity search
- Both systems share the same data seamlessly
#### Verification Steps
```bash
# Verify Codex is working
codex-memory stats
# Check if Codex-Dreams extensions are present (optional)
psql "$DATABASE_URL" -c "\d memories"
# Look for embedding_vector column if Codex-Dreams is installed
```
### Environment Configuration
Create a `.env` file with:
```bash
DATABASE_URL=postgresql://codex_user:codex_pass@localhost:5432/codex_db
RUST_LOG=info # Optional: debug, info, warn, error
```
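To make the file format above concrete, here is a small sketch of how `KEY=value` lines with comments could be read. It is only an illustration of the format: `parse_env` is a hypothetical helper, the rule that ` #` starts an inline comment is an assumption, and the actual binary may load its configuration differently.

```rust
use std::collections::HashMap;

/// Parse `KEY=value` lines, skipping blank lines and `#` comment lines.
/// (Hypothetical helper for illustration, not part of the crate.)
pub fn parse_env(text: &str) -> HashMap<String, String> {
    let mut vars = HashMap::new();
    for line in text.lines() {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        if let Some((key, value)) = line.split_once('=') {
            // Assumption: " #" begins an inline comment, as in the RUST_LOG line.
            let value = value.split(" #").next().unwrap_or("").trim();
            vars.insert(key.trim().to_string(), value.to_string());
        }
    }
    vars
}
```

With the example file above, this yields two entries: the full `DATABASE_URL` and `RUST_LOG` set to `info`.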
## Usage
### Command Line Interface
```bash
# Store content with metadata
codex-memory store "Your content here" \
--context "Meeting notes" \
--summary "Q4 planning discussion" \
--tags "meeting,planning,q4"
# Retrieve content by ID
codex-memory get <UUID>
# View storage statistics
codex-memory stats
# Run MCP server for Claude Desktop
codex-memory mcp
```
### Companion Workflow with Codex-Dreams
For the full cognitive memory experience, use both applications together:
```bash
# 1. Store memories with Codex Memory (fast, reliable)
codex-memory store "Research findings on neural networks" \
--context "AI Research" \
--summary "Key insights from latest papers" \
--tags "ai,research,neural-networks"
# 2. Generate insights with Codex-Dreams (AI-powered analysis)
codex-dreams generate-insights --time-period week
codex-dreams show-insights --limit 5
codex-dreams search-insights "neural networks" --limit 10
```
> **💡 Workflow Tip**: Use Codex Memory for day-to-day storage and retrieval, then run Codex-Dreams periodically to generate insights and discover patterns across your stored memories.
### Progressive Search System
The `search_memory` tool provides intelligent 3-stage progressive search that automatically finds results without manual threshold tuning:
**Progressive Search Stages:**
1. **Stage 1 (Original)**: Search with your specified parameters and threshold
2. **Stage 2 (Relaxed)**: If no results, automatically lower threshold by 0.25 (minimum 0.1)
3. **Stage 3 (Content-Only)**: If still no results, do content-only similarity search at 0.1 threshold
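
The three stages above can be sketched as a small retry loop. This is an illustration under assumptions, not the server's implementation: `progressive_search`, `relaxed_threshold`, and the `search(threshold, content_only)` callback are hypothetical names standing in for whatever backend query is actually used.

```rust
/// Which stage of the progressive search produced the results.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Stage {
    Original,
    Relaxed,
    ContentOnly,
}

/// Stage 2 threshold: lower by 0.25, but never below 0.1.
pub fn relaxed_threshold(threshold: f32) -> f32 {
    (threshold - 0.25).max(0.1)
}

/// Run up to three attempts and return the first non-empty result set,
/// together with the stage and threshold that produced it.
/// `search(threshold, content_only)` is a hypothetical backend call.
pub fn progressive_search<T, F>(threshold: f32, mut search: F) -> (Stage, f32, Vec<T>)
where
    F: FnMut(f32, bool) -> Vec<T>,
{
    let attempts = [
        (Stage::Original, threshold, false),
        (Stage::Relaxed, relaxed_threshold(threshold), false),
        (Stage::ContentOnly, 0.1, true),
    ];
    for &(stage, min_score, content_only) in attempts.iter() {
        let hits = search(min_score, content_only);
        if !hits.is_empty() {
            return (stage, min_score, hits);
        }
    }
    // Every stage came back empty.
    (Stage::ContentOnly, 0.1, Vec::new())
}
```

Returning the stage and effective threshold alongside the hits mirrors the metadata the tool is described as reporting in its response.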
**Search Strategies:**
- **TagsFirst**: Prioritizes tag similarity, then filters by content
- **ContentFirst**: Prioritizes content similarity, then enhances with tag scores
- **Hybrid**: Balances both tag and content similarity
**Configurable Parameters:**
- `query`: Search text (required)
- `tag_filter`: Filter results by specific tags
- `similarity_threshold`: Minimum similarity score (0.0-1.0, default: 0.7)
- `max_results`: Maximum results to return (default: 10)
- `search_strategy`: TagsFirst, ContentFirst, or Hybrid (default)
- `boost_recent`: Apply recency boost to newer memories
- `tag_weight` / `content_weight`: Customize scoring balance (default: 0.4/0.6)
- `use_tag_embedding` / `use_content_embedding`: Enable/disable embedding types
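
As an illustration of how `tag_weight` and `content_weight` might combine, the sketch below blends the two similarity scores as a normalized weighted average. The README only gives the default weights (0.4/0.6); the formula itself and the name `combined_score` are assumptions made for this example.

```rust
/// Blend tag and content similarity into one score in the same 0.0-1.0 range.
/// Assumption: a weighted average, normalized so the weights need not sum to 1.
pub fn combined_score(tag_sim: f32, content_sim: f32, tag_weight: f32, content_weight: f32) -> f32 {
    let total = tag_weight + content_weight;
    if total <= 0.0 {
        return 0.0;
    }
    (tag_weight * tag_sim + content_weight * content_sim) / total
}
```

With the default weights, a memory whose tags match perfectly (1.0) but whose content similarity is 0.5 scores about 0.7, which sits right at the default `similarity_threshold`.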
**Progressive Search Benefits:**
- **No Empty Results**: Automatically retries with relaxed criteria to find relevant content
- **Quality First**: Tries high-quality matches first, falls back gracefully
- **Metadata Included**: Response includes stage used, actual threshold, and search description
- **Full Fallback**: Uses PostgreSQL ILIKE pattern matching when embeddings are unavailable
The search uses existing embeddings from Codex-Dreams when they are available and otherwise falls back to PostgreSQL text matching.
### MCP Tools (Claude Desktop)
Codex provides 6 MCP tools:
| Tool | Description | Parameters |
|------|-------------|------------|
| `store_memory` | Store text with metadata | content, context, summary, tags |
| `search_memory` | Progressive 3-stage search with automatic fallback | query, tag_filter, similarity_threshold, max_results, search_strategy, and more |
| `get_memory` | Retrieve by ID | id (UUID) |
| `delete_memory` | Remove by ID | id (UUID) |
| `get_statistics` | Get storage stats | none |
| `store_file` | Chunk and store files | file_path, chunk_size, overlap, tags |
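
For readers who want to see what a tool call looks like on the wire, the sketch below builds a JSON-RPC 2.0 `tools/call` request for `store_memory`, the shape MCP clients send over stdio. The argument names come from the table above; their exact JSON types (for example, `tags` as an array of strings) are an assumption, and `store_memory_request` is a hypothetical helper written with the standard library only.

```rust
/// Quote and escape a string for embedding in a JSON document.
fn json_string(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Build a `tools/call` request for the `store_memory` tool.
/// Assumption: `tags` is sent as a JSON array of strings.
pub fn store_memory_request(id: u64, content: &str, context: &str, summary: &str, tags: &[&str]) -> String {
    let tags_json: Vec<String> = tags.iter().map(|t| json_string(t)).collect();
    format!(
        r#"{{"jsonrpc":"2.0","id":{},"method":"tools/call","params":{{"name":"store_memory","arguments":{{"content":{},"context":{},"summary":{},"tags":[{}]}}}}}}"#,
        id,
        json_string(content),
        json_string(context),
        json_string(summary),
        tags_json.join(",")
    )
}
```

In normal use Claude Desktop produces these requests itself; building one by hand is mainly useful when testing the server from a script.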
### Claude Desktop Configuration
Add to your Claude Desktop config:
```json
{
"mcpServers": {
"codex-memory": {
"command": "/path/to/codex-memory",
"args": ["mcp"],
"env": {
"DATABASE_URL": "postgresql://codex_user:codex_pass@localhost:5432/codex_db"
}
}
}
}
```
## API
### Rust API Example
```rust
use codex_memory::{Storage, Config, create_pool};
use std::sync::Arc;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
// Create connection pool
let config = Config::from_env()?;
let pool = create_pool(&config.database_url).await?;
let storage = Arc::new(Storage::new(pool));
// Store content
let id = storage.store(
"Content to store",
"Context information".to_string(),
"Brief summary".to_string(),
Some(vec!["tag1".to_string(), "tag2".to_string()])
).await?;
// Retrieve content
if let Some(memory) = storage.get(id).await? {
println!("Retrieved: {}", memory.content);
}
Ok(())
}
```
### Database Schema
#### Core Codex Schema
```sql
CREATE TABLE memories (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
content TEXT NOT NULL,
content_hash VARCHAR(64) NOT NULL UNIQUE,
context TEXT NOT NULL,
summary TEXT NOT NULL,
metadata JSONB DEFAULT '{}',
tags TEXT[] DEFAULT '{}',
chunk_index INTEGER DEFAULT NULL,
total_chunks INTEGER DEFAULT NULL,
parent_id UUID DEFAULT NULL,
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW()
);
```
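
The `UNIQUE` constraint on `content_hash` is what makes deduplication work: storing the same content twice resolves to the same row. The in-memory sketch below illustrates that behaviour only. It uses the standard library's `DefaultHasher` as a stand-in for SHA-256 so that it runs without extra crates, and `DedupStore` is a hypothetical type, not part of the crate.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Stand-in for the SHA-256 content hash (NOT cryptographic).
fn content_key(content: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    content.hash(&mut hasher);
    hasher.finish()
}

/// In-memory illustration of hash-based deduplication.
pub struct DedupStore {
    ids_by_key: HashMap<u64, Vec<usize>>,
    rows: Vec<String>,
}

impl DedupStore {
    pub fn new() -> Self {
        DedupStore { ids_by_key: HashMap::new(), rows: Vec::new() }
    }

    /// Returns the row id and whether a new row was inserted.
    pub fn store(&mut self, content: &str) -> (usize, bool) {
        let key = content_key(content);
        let rows = &self.rows;
        let bucket = self.ids_by_key.entry(key).or_default();
        // Compare content too, since a 64-bit stand-in hash can collide.
        if let Some(&id) = bucket.iter().find(|&&id| rows[id] == content) {
            return (id, false);
        }
        let id = self.rows.len();
        self.rows.push(content.to_string());
        bucket.push(id);
        (id, true)
    }

    pub fn len(&self) -> usize {
        self.rows.len()
    }
}
```

In the database the same effect would come from computing the hash before the insert and letting the unique index reject or short-circuit duplicates.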
#### Schema Extensions Support
The `memories` table can be optionally extended by companion applications like Codex-Dreams:
```sql
-- Core Codex schema (always present - shown above)
-- Optional extensions added by Codex-Dreams (ignored by Codex)
ALTER TABLE memories ADD COLUMN IF NOT EXISTS embedding_vector vector(768); -- For similarity search
ALTER TABLE memories ADD COLUMN IF NOT EXISTS semantic_cluster INTEGER; -- For cognitive grouping
ALTER TABLE memories ADD COLUMN IF NOT EXISTS importance_score FLOAT; -- For memory ranking
ALTER TABLE memories ADD COLUMN IF NOT EXISTS last_accessed TIMESTAMPTZ; -- For access patterns
-- Extensions don't affect Codex operations - all original commands work unchanged
```
**Important Notes:**
- Codex Memory only uses the core columns listed in the first schema
- Companion applications may add columns but never modify existing ones
- All Codex CLI commands and MCP operations continue working with extensions present
- Extensions can be removed without affecting stored data
## Testing
```bash
# Run all tests
cargo test
# Run with output
cargo test -- --nocapture
# Run specific test suite
cargo test integration
cargo test unit
cargo test edge_cases
# Run with coverage (requires cargo-tarpaulin)
cargo tarpaulin --out Html
```
## Performance
### Core Performance Benchmarks
| Operation | Latency | Notes |
|-----------|---------|-------|
| Store (small) | ~5ms | Including deduplication |
| Store (chunked) | ~10ms/chunk | 8KB chunks |
| Retrieve | ~2ms | By UUID |
| Delete | ~3ms | Single operation |
| Statistics | ~15ms | Aggregate query |
### Optimization Features
- Connection pooling (pool of 5 connections)
- Prepared statements
- Index optimization (B-tree and GIN indexes)
- SHA-256 content deduplication
- Async I/O throughout
### Performance with Companion Applications
When using applications like Codex-Dreams that extend the schema:
#### Impact Analysis
- **Storage Performance**: Unaffected (5ms store, 2ms retrieve maintained)
- **Additional Indexes**: pgvector HNSW indexes don't impact Codex operations
- **Memory Usage**: Additional columns increase row size but don't affect core operations
- **Query Performance**: Core Codex queries use different indexes, no interference
- **Compatibility**: All existing Codex CLI commands work at full speed
#### Storage Sizing
Extended schemas require additional storage:
- **Base Codex data**: ~1KB per memory average
- **With embeddings** (Codex-Dreams): ~4KB per memory (768-dim float vectors)
- **Indexes**: Additional ~20-30% storage overhead for similarity search
- **Example**: 10,000 memories = ~10MB base, ~40MB with embeddings
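
The sizing figures above follow from simple arithmetic: a 768-dimension vector of 32-bit floats is 768 × 4 = 3,072 bytes, which on top of a ~1KB base row gives roughly 4KB. A small sketch of that estimate, with hypothetical function names and index overhead left out:

```rust
/// Bytes needed for one embedding stored as 32-bit floats.
pub fn embedding_bytes(dims: u64) -> u64 {
    dims * 4
}

/// Rough table size in bytes, before any index overhead.
pub fn estimate_table_bytes(memories: u64, base_bytes: u64, embedding_dims: Option<u64>) -> u64 {
    let per_row = base_bytes + embedding_dims.map_or(0, embedding_bytes);
    memories * per_row
}
```

For 10,000 memories at 1,024 bytes each this gives 10,240,000 bytes (about 10MB), and 40,960,000 bytes (about 40MB) once 768-dimension embeddings are added; the 20-30% index overhead comes on top of that.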
## Development
### Building from Source
```bash
# Development build
cargo build
# Release build (optimized)
cargo build --release
# Run clippy lints
cargo clippy -- -D warnings
# Format code
cargo fmt
# Security audit
cargo audit
```
### Project Structure
```
codex-memory/
├── src/
│   ├── main.rs           # CLI entry point
│   ├── lib.rs            # Library exports
│   ├── storage.rs        # Core storage logic
│   ├── models.rs         # Data structures
│   ├── database/         # Database operations
│   ├── mcp_server/       # MCP protocol implementation
│   └── chunking.rs       # File chunking logic
├── tests/
│   ├── unit/             # Unit tests
│   ├── integration/      # Integration tests
│   └── edge_cases/       # Edge case tests
└── migrations/           # Database migrations
```
## FAQ
### Can I use other applications with my Codex database?
Yes! Codex stores data in a standard PostgreSQL table that other applications can read from and extend. Popular companion applications include:
- **[Codex-Dreams](https://github.com/Ladvien/codex-dreams)**: Adds semantic search and biological memory modeling
- Other applications can be built to extend the memories table with additional features
### Will companion applications break my Codex installation?
No. Well-designed companion applications like Codex-Dreams only ADD columns to the memories table and never modify or remove existing data. Your Codex CLI and MCP server continue working exactly as before.
### How do companion applications extend the schema?
Companion apps use PostgreSQL's `ALTER TABLE ADD COLUMN` to add their specific fields. For example:
```sql
-- Codex-Dreams adds optional columns for semantic search
ALTER TABLE memories ADD COLUMN IF NOT EXISTS embedding_vector vector(768);
```
These columns are ignored by Codex but enable advanced features in the companion app.
### How do I remove extensions added by companion applications?
Extensions are typically just additional columns that can be safely removed:
```sql
-- Example: Remove Codex-Dreams extensions (optional)
ALTER TABLE memories DROP COLUMN IF EXISTS embedding_vector;
ALTER TABLE memories DROP COLUMN IF EXISTS semantic_cluster;
-- Your core Codex data remains intact
```
### What if I want to use Codex-Dreams features but keep databases separate?
While the recommended setup is to share the database, you can configure Codex-Dreams to periodically sync from your Codex database if needed. See the Codex-Dreams documentation for sync configuration.
### Do I need to install companion applications?
No. Codex Memory is fully functional on its own. Companion applications are optional enhancements that add specialized features when you need them.
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
1. Fork the repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## Related Projects
- [**Codex-Dreams**](https://github.com/Ladvien/codex-dreams) - Companion app for AI-powered memory insights and cognitive processing
- [Claude Desktop](https://claude.ai/desktop) - Anthropic's Claude Desktop application
- [MCP Specification](https://modelcontextprotocol.io) - Model Context Protocol specification
## Support
- **Issues**: [GitHub Issues](https://github.com/Ladvien/codex-memory/issues)
- **Discussions**: [GitHub Discussions](https://github.com/Ladvien/codex-memory/discussions)
- **Documentation**: See [ARCHITECTURE.md](ARCHITECTURE.md) for detailed system design
## Acknowledgments
- Built with Rust and PostgreSQL
- MCP protocol for LLM integration
- Tokio for async runtime
- SQLx for database operations
---
**💡 Pro Tip**: Get the most from your memory system by using both applications together. Codex Memory handles fast storage and retrieval, while [Codex-Dreams](https://github.com/Ladvien/codex-dreams) adds intelligent analysis and insights generation.