memoryoss 0.1.1

Memory for AI Agents — store, recall, update, forget
memoryOSS is not a library; it ships as a standalone service.

memoryOSS

Persistent long-term memory for AI agents. memoryOSS runs as a local memory layer in front of the LLM API, with MCP always available for explicit memory tools.

Public Beta (v0.1.1) — memoryOSS is a public beta for evaluation and testing. Features, APIs, and configuration may change without notice. Do not use for critical or regulated workloads. Please keep your own backups. This notice does not limit any mandatory statutory rights.

memoryOSS is for project context, preferences, prior fixes, and working history — not for replacing general world knowledge the model already has.

In internal benchmark and soak runs, memoryOSS continued to retrieve early high-signal memories even as the stored corpus grew into the tens of thousands of entries.

Quickstart

# Install from source
cargo install --git https://github.com/memoryOSScom/memoryOSS.git

# Or build locally
cargo build --release

# Interactive setup (creates config, registers MCP, starts the hybrid gateway)
memoryoss setup

# Or start directly with an existing config
memoryoss -c memoryoss.toml serve

The setup wizard auto-detects your environment, registers MCP for Claude/Codex, and enables local proxy exports when they are safe for the selected auth mode. OAuth-first setups keep MCP enabled without forcing global BASE_URL overrides, so login flows keep working. On a fresh setup it starts in full mode. If existing memories are already present, the wizard asks which memory mode you want and defaults that prompt to full.

If your auth setup changes later — for example from OAuth to API key or the other way around — run memoryoss setup again so memoryOSS can safely update the integration path.

Hybrid Mode (Recommended)

For API-key setups, the wizard defaults to hybrid mode:

  • supported clients talk to the local memoryOSS gateway via BASE_URL
  • MCP is also registered for Claude/Codex
  • if the memory core is healthy, requests get recall/injection/extraction
  • if the memory core is unavailable, the gateway falls back to direct upstream passthrough instead of breaking the client

For OAuth-first Claude/Codex setups, the wizard keeps MCP enabled and skips global BASE_URL exports by default so provider login continues to work normally. Claude can still use the proxy in supported OAuth paths; Codex OAuth stays MCP-first by default, and proxy mode for Codex requires an OpenAI API key.

Background fact extraction is enabled automatically only when a real provider API key is available. OAuth alone is enough for passthrough traffic, but it is not treated as a reliable extraction credential.

So you get transparent memory when available, plus explicit MCP tools when needed.

After memoryoss setup

Start Claude Code or Codex normally. The wizard always registers MCP and writes local BASE_URL exports only when the chosen auth mode is proxy-safe.

Manual proxy mode (optional)

If you want to point clients at the gateway yourself:

Claude Code / Claude API

export ANTHROPIC_BASE_URL=http://127.0.0.1:8000/proxy/anthropic/v1
export ANTHROPIC_API_KEY=<your-existing-key-or-oauth-flow>

OpenAI / Codex CLI

export OPENAI_BASE_URL=http://127.0.0.1:8000/proxy/v1
export OPENAI_API_KEY=<your-openai-api-key>

Both the Chat Completions API (/v1/chat/completions) and the Responses API (/v1/responses) are supported in manual proxy mode. For Codex OAuth, the supported path remains MCP-first without forced global BASE_URL overrides.
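As a quick sanity check of the manual proxy endpoints, a request can be built in a few lines of Python. The payload below is the standard OpenAI Chat Completions shape, and the URL assumes the default gateway address from the Quickstart; the `ek_...` key is a placeholder for your gateway key.

```python
import json
import urllib.request

GATEWAY = "http://127.0.0.1:8000"

def chat_completion_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a Chat Completions request against the local gateway."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{GATEWAY}/proxy/v1/chat/completions",
        data=json.dumps(body).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = chat_completion_request("gpt-4o-mini", "What did we decide about the cache layer?", "ek_example")
print(req.full_url)  # http://127.0.0.1:8000/proxy/v1/chat/completions
```

Sending it with `urllib.request.urlopen(req)` works once the gateway is running and a valid key is configured.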

What memoryOSS Adds

memoryOSS is most useful when the missing context is specific to you or your work:

  • project conventions and architecture decisions
  • previous fixes, incidents, and deployment notes
  • user preferences and recurring workflows
  • facts learned in earlier sessions with Claude or Codex

It is not meant to inject generic facts the model already knows.

How It Works

Client (Claude/Codex)
    │
    ▼
memoryOSS Gateway (:8000)
    ├── 1. Try memory core
    ├── 2. Recall: find relevant project memories
    ├── 3. Inject: add scoped context to the request
    ├── 4. Forward: send to upstream LLM API
    ├── 5. Extract: pull candidate facts from response (async)
    └── 6. Fail open to direct upstream if the core is unavailable
    │
    ▼
Upstream API (Anthropic / OpenAI)
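The fail-open behavior in the diagram can be sketched as a small decision function. This is an illustrative toy, not the shipped implementation (the real gateway is Rust and performs health checks, scoped injection, and async extraction per request):

```python
from dataclasses import dataclass, field

@dataclass
class Gateway:
    core_healthy: bool
    memories: list = field(default_factory=list)

    def handle(self, request: str) -> dict:
        """Route a request: enrich via the memory core when healthy, else pass through."""
        if not self.core_healthy:
            # Step 6: fail open -- forward untouched rather than breaking the client
            return {"path": "passthrough", "prompt": request}
        recalled = [m for m in self.memories if m.lower() in request.lower()]  # step 2 (toy recall)
        prompt = "\n".join(recalled + [request])                              # step 3: inject
        return {"path": "memory", "prompt": prompt}                          # step 4: forward upstream

gw = Gateway(core_healthy=True, memories=["cache"])
print(gw.handle("What did we decide about the cache?")["path"])  # memory
print(Gateway(core_healthy=False).handle("hi")["path"])          # passthrough
```

The key property is that an unhealthy core changes which path a request takes, never whether it succeeds.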

Memory Modes

memoryOSS supports 4 memory modes, configurable per-server (in config) or per-request (via headers):

Mode      Recall          Store  Use Case
full      Yes             Yes    Full automatic memory — recall past context, store new facts
readonly  Yes             No     See past memories but don't save anything from this session
after     Yes (filtered)  Yes    Only recall memories after a specific date
off       No              No     Pure proxy passthrough, no memory involvement

On a fresh setup the wizard defaults to full. If existing memories are already present, the wizard asks which mode to use and defaults that prompt to full.

Per-Request Memory Control

Control memory behavior per request via headers:

Header          Values                         Effect
X-Memory-Mode   full / readonly / off / after  Set memory mode for this request
X-Memory-After  YYYY-MM-DD                     Only inject memories after this date (with mode after)

Server operators can set a default mode in config (default_memory_mode) and disable client overrides with allow_client_memory_control = false.

Configuration

Representative generated config

This is a representative hybrid config. The exact extraction provider/model and whether extraction is enabled depend on the tooling and real provider credentials the wizard detects.

[server]
host = "127.0.0.1"
port = 8000
hybrid_mode = true
core_port = 8001

[tls]
enabled = false
auto_generate = false

[auth]
jwt_secret = "..."
audit_hmac_secret = "..."

[[auth.api_keys]]
key = "ek_..."       # Generated by setup wizard
role = "admin"
namespace = "default"

[storage]
data_dir = "data"

[proxy]
enabled = true
passthrough_auth = true
passthrough_local_only = true              # Restrict passthrough to loopback clients by default
upstream_url = "https://api.openai.com/v1"
default_memory_mode = "full"               # Fresh setup default; existing installs are prompted
extraction_enabled = false                 # True only when a real provider API key is available
extract_provider = "openai"                # Or "claude", depending on detected auth
extract_model = "gpt-4o-mini"              # Or a provider-specific default such as Claude Haiku
allow_client_memory_control = true         # Allow X-Memory-Mode header (default: true)
max_memory_pct = 0.10                      # Max 10% of context window for memories
min_recall_score = 0.40                    # Minimum relevance score for injection (calibrated from internal query benchmarks)
min_channel_score = 0.15                   # Precision gate: min score in any channel (default: 0.15)
diversity_factor = 0.3                     # MMR diversity penalty (default: 0.3)

[[proxy.key_mapping]]
proxy_key = "ek_..."                       # Client-facing key
namespace = "default"                      # Memory isolation namespace
# upstream_key = "sk-..."                  # Optional per-client upstream key override

[logging]
level = "info"

[decay]
enabled = true
strategy = "age"
after_days = 14

[sharing]
allow_private_webhooks = false             # Keep localhost/private webhook targets blocked by default

API Endpoints

Proxy (transparent, no code changes)

Endpoint Method Description
/proxy/v1/chat/completions POST OpenAI Chat Completions proxy with memory
/proxy/v1/responses POST OpenAI Responses API proxy with memory (Codex CLI)
/proxy/v1/models GET Model list passthrough
/proxy/anthropic/v1/messages POST Anthropic proxy with memory
/proxy/v1/debug/stats GET Proxy metrics (auth required)

Memory API (direct access)

Endpoint Method Description
/v1/auth/token POST Get JWT from API key
/v1/store POST Store a memory
/v1/store/batch POST Store multiple memories
/v1/recall POST Semantic recall
/v1/recall/batch POST Batch recall
/v1/update PATCH Update a memory
/v1/forget DELETE Delete memories
/v1/consolidate POST Merge similar memories
/health GET Health check
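Direct Memory API calls are ordinary authenticated JSON POSTs. The field names below (content, namespace, query, limit) are illustrative assumptions, not a documented schema; check the actual request format before relying on them:

```python
import json

def store_payload(content: str, namespace: str = "default") -> bytes:
    # Hypothetical body shape for POST /v1/store -- field names are assumptions
    return json.dumps({"content": content, "namespace": namespace}).encode()

def recall_payload(query: str, limit: int = 5) -> bytes:
    # Hypothetical body shape for POST /v1/recall -- field names are assumptions
    return json.dumps({"query": query, "limit": limit}).encode()

print(json.loads(recall_payload("deploy notes")))
# {'query': 'deploy notes', 'limit': 5}
```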

Admin

Endpoint Method Description
/v1/admin/keys GET List API keys
/v1/admin/keys/rotate POST Rotate key
/v1/admin/keys/{id} DELETE Revoke a specific key
/v1/admin/tokens POST Create scoped tokens
/v1/admin/cache/flush POST Flush recall cache
/v1/admin/cache/stats GET Cache statistics
/v1/admin/trust-stats GET Memory trust scores
/v1/admin/index-health GET Index status
/v1/admin/idf-stats GET IDF index statistics
/v1/admin/space-stats GET Space index statistics
/v1/admin/query-explain POST Query debug/explain
/v1/admin/intent-cache/stats GET Intent cache statistics
/v1/admin/intent-cache/flush POST Flush intent cache
/v1/admin/prefetch/stats GET Prefetch statistics
/v1/inspect/{id} GET Inspect memory by ID
/v1/peek/{id} GET Peek at memory content
/v1/source GET AGPL-3.0 source code info
/metrics GET Prometheus-style metrics

Sharing (cross-namespace collaboration)

Endpoint Method Description
/v1/admin/sharing/create POST Create shared namespace
/v1/admin/sharing/list GET List shared namespaces
/v1/admin/sharing/{name} DELETE Delete shared namespace
/v1/admin/sharing/{name}/grants/add POST Add sharing grant
/v1/admin/sharing/{name}/grants/list GET List sharing grants
/v1/admin/sharing/{name}/grants/{grant_id} DELETE Remove sharing grant
/v1/sharing/accessible GET List accessible shared namespaces

GDPR Compliance

Endpoint Method Description
/v1/export GET Data export (all memories)
/v1/memories GET Data access (list memories)
/v1/forget/certified DELETE Certified deletion with audit trail

MCP Server

For Claude Desktop, Claude Code, or Codex MCP support:

{
  "mcpServers": {
    "memory": {
      "command": "memoryoss",
      "args": ["-c", "memoryoss.toml", "mcp-server"]
    }
  }
}

Provides 4 tools: store, recall, update, forget. In the default setup MCP runs alongside the gateway. It is not the transport failover path; it is the explicit memory-tool path.

CLI Commands

Command Description
memoryoss setup Interactive setup wizard
memoryoss serve Start the configured server mode (monolith or hybrid gateway)
memoryoss dev Start without TLS (development)
memoryoss mcp-server Start as MCP server (stdio, embedded)
memoryoss inspect <id> Inspect a memory
memoryoss backup -o backup.tar.zst Backup all data
memoryoss restore <path> Restore from backup
memoryoss decay Run memory decay (age-based cleanup)
memoryoss migrate Run schema migrations
memoryoss migrate-embeddings Re-embed all memories with a new model

Architecture

  • Storage: redb (embedded, crash-safe, single-file) — source of truth
  • Vector Index: usearch (384-dim, all-MiniLM-L6-v2)
  • Full-Text Search: tantivy (BM25 + structured metadata fields)
  • Recall: 4-channel retrieval (vector 0.30 + BM25 0.30 + exact match 0.25 + recency 0.15) with IDF identifier boosting, precision gate, MMR diversity, and trust weighting
  • Extraction: Async LLM-based fact extraction with quarantine (confidence scoring)
  • Indexer: Async outbox-based pipeline with crash recovery across all namespaces
  • Group Committer: Batches concurrent writes into single redb transactions
  • Trust Scoring: 4-signal Bayesian (recency decay, source reputation, embedding coherence, access frequency) — persisted to redb
  • Encryption: AES-256-GCM per-namespace (local key provider, AWS KMS and Vault stubs)
  • Security: Constant-time key comparison, NFKC injection filtering, secret redaction (API keys, tokens, passwords), rate limiting, body size limits, path traversal protection

Security

  • All proxy endpoints require authentication
  • Extracted memories start at confidence 0.2 (quarantine) — only memories with confidence ≥0.3 and passing trust scoring are injected
  • Memory content is filtered against prompt injection patterns (NFKC-normalized)
  • Secrets (API keys, tokens, passwords) are redacted before sending to extraction LLM
  • API keys are compared in constant time (SHA-256 hash comparison)
  • Request bodies capped at 2MB, responses at 10MB
  • Rate limiting on all endpoints including passthrough

Compatible With

Claude Code · OpenAI SDK · Codex CLI · Cursor · Aider · Continue · LangChain · Any OpenAI-compatible client

License

AGPL-3.0 — free to use, modify, and self-host. If you offer it as a network service (SaaS), you must publish your source code. Commercial licenses available for SaaS providers who prefer not to open-source their modifications.