Memory MCP Integration

MCP (Model Context Protocol) server integration for the self-learning memory system with secure code execution capabilities.

Features

MCP Server: Standard MCP protocol implementation with 19 tools
Episode Lifecycle Management: Programmatic episode creation, tracking, and completion (NEW in v0.1.13)
Secure Code Sandbox: WASM-based code execution with comprehensive security
Memory Integration: Query episodic memory and analyze learned patterns
Pattern Analysis: Advanced pattern extraction and recommendations
Embeddings Support: Multiple providers (OpenAI, Ollama, local models)
Progressive Tool Disclosure: Tools prioritized based on usage patterns
Execution Monitoring: Detailed statistics and performance tracking

Implementation Status

Phase 2A: Wasmtime WASM Sandbox ✅ COMPLETE

Status: Production-ready POC eliminating rquickjs GC crashes

✅ wasmtime 24.0.5 integration
✅ Concurrent execution without SIGABRT crashes
✅ 100-parallel stress test passing
✅ Semaphore-based pooling (max 20 concurrent)
✅ Comprehensive metrics and health monitoring
✅ All tests passing (5/5)

Key Achievement: Zero GC crashes under high concurrency (100 parallel executions)
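
The semaphore-based pooling listed above can be illustrated with a minimal standard-library sketch. It assumes a blocking counting semaphore built from a Mutex and a Condvar; the crate itself may well use an async semaphore instead. The names `Semaphore` and `run_pooled` are hypothetical, and the sleep stands in for a real sandbox run.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

/// Hypothetical counting semaphore (not the crate's own type).
struct Semaphore {
    permits: Mutex<usize>,
    available: Condvar,
}

impl Semaphore {
    fn new(permits: usize) -> Self {
        Semaphore { permits: Mutex::new(permits), available: Condvar::new() }
    }

    /// Block until a permit is free, then take it.
    fn acquire(&self) {
        let mut permits = self.permits.lock().unwrap();
        while *permits == 0 {
            permits = self.available.wait(permits).unwrap();
        }
        *permits -= 1;
    }

    /// Return a permit and wake one waiting thread.
    fn release(&self) {
        *self.permits.lock().unwrap() += 1;
        self.available.notify_one();
    }
}

/// Run `jobs` tasks with at most `max_concurrent` in flight.
/// Returns the highest concurrency actually observed.
fn run_pooled(jobs: usize, max_concurrent: usize) -> usize {
    let sem = Arc::new(Semaphore::new(max_concurrent));
    let in_flight = Arc::new(AtomicUsize::new(0));
    let peak = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..jobs)
        .map(|_| {
            let (sem, in_flight, peak) = (sem.clone(), in_flight.clone(), peak.clone());
            thread::spawn(move || {
                sem.acquire();
                let now = in_flight.fetch_add(1, Ordering::SeqCst) + 1;
                peak.fetch_max(now, Ordering::SeqCst);
                thread::sleep(Duration::from_millis(2)); // stand-in for a sandbox run
                in_flight.fetch_sub(1, Ordering::SeqCst);
                sem.release();
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    peak.load(Ordering::SeqCst)
}
```

Calling `run_pooled(100, 20)` mirrors the 100-parallel stress test with the cap of 20: every job eventually runs, but the observed peak never exceeds the number of permits.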

Phase 2B: JavaScript Support via Javy (Next)

Goal: Enable JavaScript/TypeScript execution through Javy compiler

⏳ Javy v8.0.0 integration (JavaScript→WASM)
⏳ WASI preview1 (stdout/stderr capture)
⏳ Fuel-based timeout enforcement
⏳ Performance benchmarking vs baseline

Note: The javy backend requires either a bundled javy-plugin.wasm plugin (set via JAVY_PLUGIN) or the javy CLI available on PATH. CI will attempt to install the CLI when running the javy-backend feature; if neither is present, Javy tests will be skipped gracefully.

Phase 1: rquickjs Migration ✅ COMPLETE

Problem Solved: rquickjs v0.6.2 had critical GC race conditions causing SIGABRT crashes under concurrent test execution.

Solution: Disabled the WASM sandbox in all tests (via MCP_USE_WASM=false) until the wasmtime replacement was complete.

Security Architecture

The sandbox implements defense-in-depth security with multiple layers:

1. Input Validation

Code length limits (100KB max)
Malicious pattern detection
Syntax validation

2. Process Isolation

Separate Node.js process per execution
Restricted global access
No require/import capabilities (by default)

3. Resource Limits

Configurable timeout (default: 5 seconds)
Memory limits (default: 128MB)
CPU usage constraints (default: 50%)
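
One way to picture the configurable timeout is a worker thread whose result is awaited for a bounded time. The following is a sketch using only the standard library, not the crate's implementation; `Outcome` and `run_with_timeout` are hypothetical names chosen for illustration.

```rust
use std::sync::mpsc::{self, RecvTimeoutError};
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical outcome type for this sketch.
#[derive(Debug, PartialEq)]
enum Outcome<T> {
    Finished(T),
    TimedOut { elapsed_ms: u64 },
    Failed, // the worker panicked before sending a result
}

/// Run `task` on a worker thread and wait at most `timeout` for its result.
/// On timeout the worker is detached, not killed, so this bounds how long
/// the caller waits rather than how long the work itself runs.
fn run_with_timeout<T, F>(task: F, timeout: Duration) -> Outcome<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    let started = Instant::now();
    thread::spawn(move || {
        // The send fails harmlessly if the receiver has already given up.
        let _ = tx.send(task());
    });
    match rx.recv_timeout(timeout) {
        Ok(value) => Outcome::Finished(value),
        Err(RecvTimeoutError::Timeout) => Outcome::TimedOut {
            elapsed_ms: started.elapsed().as_millis() as u64,
        },
        Err(RecvTimeoutError::Disconnected) => Outcome::Failed,
    }
}
```

Because the worker is only abandoned, a wall-clock timeout of this kind cannot reclaim CPU or memory from runaway code; hard enforcement needs cooperation from the execution engine.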

4. Access Controls

File System: Denied by default, whitelist approach when enabled
Network: Denied by default, no external connections
Subprocesses: Denied, no command execution
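
The deny-by-default whitelist for file system access can be sketched as a lexical path check. This is an illustration only: `normalize` and `is_path_allowed` are hypothetical helpers, and the check is purely lexical, so it does not resolve symlinks (a real implementation would likely canonicalize paths as well).

```rust
use std::path::{Component, Path, PathBuf};

/// Lexically normalise a path: drop `.` and resolve `..` without touching the filesystem.
fn normalize(path: &Path) -> PathBuf {
    let mut out = PathBuf::new();
    for component in path.components() {
        match component {
            Component::CurDir => {}
            Component::ParentDir => {
                // `pop` is a no-op at the root, so `..` can never climb above it.
                out.pop();
            }
            other => out.push(other.as_os_str()),
        }
    }
    out
}

/// Deny by default: allow only absolute paths that stay under a whitelisted directory.
fn is_path_allowed(requested: &str, allowed_paths: &[String]) -> bool {
    let requested = Path::new(requested);
    if !requested.is_absolute() {
        return false;
    }
    let normalized = normalize(requested);
    // `Path::starts_with` compares whole components, so `/tmp/safe-dir-evil`
    // does not match the prefix `/tmp/safe-dir`.
    allowed_paths
        .iter()
        .any(|allowed| normalized.starts_with(normalize(Path::new(allowed))))
}
```

With a whitelist of `/tmp/safe-dir`, a request for `/tmp/safe-dir/../../etc/passwd` normalises to `/etc/passwd` and is rejected, which is the path traversal case the security tests cover.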

5. Pattern Detection

Automatically blocks:

require('fs'), require('http'), require('https')
require('child_process'), exec(), spawn()
eval(), new Function()
while(true), for(;;) infinite loops
fetch(), WebSocket, XMLHttpRequest
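
As a rough illustration of how the length limit and pattern blocking might combine, here is a naive substring-based validator. It is a sketch, not the crate's actual rule set: `MAX_CODE_BYTES`, `BLOCKED_PATTERNS`, and `validate_code` are hypothetical names, and the pattern list simply mirrors the constructs listed above.

```rust
/// 100KB limit from the "Input Validation" list; the constant name is hypothetical.
const MAX_CODE_BYTES: usize = 100 * 1024;

/// Substrings mirroring the blocked constructs listed above (illustrative only).
const BLOCKED_PATTERNS: &[&str] = &[
    "require('fs')",
    "require('http')",
    "require('https')",
    "require('child_process')",
    "exec(",
    "spawn(",
    "eval(",
    "new Function(",
    "while(true)",
    "for(;;)",
    "fetch(",
    "WebSocket",
    "XMLHttpRequest",
];

/// Remove all whitespace so `while (true)` and `while(true)` compare equal.
fn strip_whitespace(text: &str) -> String {
    text.chars().filter(|c| !c.is_whitespace()).collect()
}

/// Reject code that is too long or contains a blocked pattern.
fn validate_code(code: &str) -> Result<(), String> {
    if code.len() > MAX_CODE_BYTES {
        return Err(format!("code exceeds {} bytes", MAX_CODE_BYTES));
    }
    let compact = strip_whitespace(code);
    for pattern in BLOCKED_PATTERNS {
        if compact.contains(&strip_whitespace(pattern)) {
            return Err(format!("blocked pattern: {}", pattern));
        }
    }
    Ok(())
}
```

Substring matching of this kind is easy to evade (for example `require("fs")` with double quotes) and can misfire on harmless code such as `regex.exec(...)`, which is why pattern detection is only one layer among several.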

Usage

Basic Example

use memory_mcp::{MemoryMCPServer, SandboxConfig, ExecutionContext};
use serde_json::json;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Create server with restrictive sandbox
    let server = MemoryMCPServer::new(SandboxConfig::restrictive()).await?;

    // Execute code securely
    let code = r#"
        const result = {
            sum: 1 + 1,
            message: "Hello from sandbox"
        };
        console.log("Calculating sum...");
        return result;
    "#;

    let context = ExecutionContext::new(
        "Calculate sum".to_string(),
        json!({"a": 1, "b": 1}),
    );

    let result = server.execute_agent_code(code.to_string(), context).await?;
    println!("Result: {:?}", result);

    Ok(())
}

Sandbox Configurations

Restrictive (Recommended for Untrusted Code)

let config = SandboxConfig::restrictive();
// - 3 second timeout
// - 64MB memory limit
// - 30% CPU limit
// - No network, no filesystem, no subprocesses

Default (Balanced)

let config = SandboxConfig::default();
// - 5 second timeout
// - 128MB memory limit
// - 50% CPU limit
// - No network, no filesystem, no subprocesses

Permissive (For Trusted Code)

let config = SandboxConfig::permissive();
// - 10 second timeout
// - 256MB memory limit
// - 80% CPU limit
// - Filesystem access to whitelisted paths

Custom Configuration

let config = SandboxConfig {
    max_execution_time_ms: 3000,
    max_memory_mb: 64,
    max_cpu_percent: 30,
    allowed_paths: vec!["/tmp/safe-dir".to_string()],
    allowed_network: vec![],
    allow_network: false,
    allow_filesystem: false,
    allow_subprocesses: false,
};

Available Tools

The MCP server provides 19 tools organized into categories:

Episode Lifecycle Management (NEW in v0.1.13)

Programmatically manage episodes through the MCP interface:

create_episode - Start tracking a new task with metadata
add_episode_step - Log execution steps to track progress
complete_episode - Finalize episode and trigger learning cycle
get_episode - Retrieve complete episode details
get_episode_timeline - Visualize chronological task progression
delete_episode - Remove episodes permanently (with safeguards)

📖 Complete Episode Lifecycle Documentation

Batch Operations Contract Status

The MCP JSON-RPC endpoint supports batch/execute (multi-operation transport). However, tool-level batch analytics names are currently deferred and not advertised:

batch_query_episodes
batch_pattern_analysis
batch_compare_episodes

These names intentionally return Tool not found until dedicated handlers are implemented.

📖 Batch Tool Status (WG-053)

Memory & Query Tools

query_memory - Query episodic memory for relevant past experiences
query_semantic_memory - Semantic search using embeddings
bulk_episodes - Retrieve multiple episodes efficiently

Code Execution

execute_agent_code - Execute TypeScript/JavaScript in secure WASM sandbox

Pattern Analysis

analyze_patterns - Analyze patterns from past episodes
advanced_pattern_analysis - Deep pattern analysis with statistical methods
search_patterns - Search for specific patterns
recommend_patterns - Get pattern recommendations for tasks

Embeddings & Configuration

configure_embeddings - Configure embedding providers (OpenAI, Ollama, local)
test_embeddings - Test embedding generation

Monitoring & Health

health_check - Server health and status
get_metrics - Performance metrics and statistics
quality_metrics - Episode quality assessment

Quick Reference

1. `query_memory`

{
  "query": "Search query describing task",
  "domain": "Task domain (e.g., 'web-api')",
  "task_type": "code_generation | debugging | refactoring | testing | analysis | documentation",
  "limit": 10
}

2. `execute_agent_code`

{
  "code": "TypeScript/JavaScript code to execute",
  "context": {
    "task": "Task description",
    "input": { "data": "as JSON" }
  }
}

3. `analyze_patterns`

{
  "task_type": "Type of task to analyze",
  "min_success_rate": 0.7,
  "limit": 20
}
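
The objects above are tool arguments. Assuming the server is reached through the standard MCP `tools/call` JSON-RPC method, a client would wrap them in a request envelope roughly as sketched below. `tools_call_request` is a hypothetical helper that builds the string by hand to stay dependency-free; real code would more naturally use `serde_json`.

```rust
/// Build an MCP `tools/call` JSON-RPC 2.0 request as a string.
/// `arguments_json` must already be valid JSON, and `tool` must not contain
/// characters that need JSON escaping (true for the tool names listed above).
fn tools_call_request(id: u64, tool: &str, arguments_json: &str) -> String {
    format!(
        r#"{{"jsonrpc":"2.0","id":{},"method":"tools/call","params":{{"name":"{}","arguments":{}}}}}"#,
        id, tool, arguments_json
    )
}
```

For example, `tools_call_request(1, "query_memory", r#"{"query":"x","limit":10}"#)` yields a single-line JSON object whose `params.name` is `query_memory` and whose `params.arguments` is the object passed in.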

Security Testing

The crate includes comprehensive security tests:

# Run all tests
cargo test --package do-memory-mcp

# Run only security tests
cargo test --package do-memory-mcp --test security_test

# Run integration tests
cargo test --package do-memory-mcp --test integration_test

Security Test Coverage

File system access blocking (12 tests)
Network access blocking (4 tests)
Process execution blocking (3 tests)
Infinite loop detection (2 tests)
Code injection blocking (2 tests)
Resource exhaustion (2 tests)
Path traversal attacks (3 tests)
Legitimate code execution (4 tests)

Execution Results

The sandbox returns detailed execution results:

pub enum ExecutionResult {
    Success {
        output: String,
        stdout: String,
        stderr: String,
        execution_time_ms: u64,
    },
    Error {
        message: String,
        error_type: ErrorType,
        stdout: String,
        stderr: String,
    },
    Timeout {
        elapsed_ms: u64,
        partial_output: Option<String>,
    },
    SecurityViolation {
        reason: String,
        violation_type: SecurityViolationType,
    },
}

Performance

Average execution time: ~50-200ms for simple code
Timeout overhead: <10ms
Memory footprint: ~5MB per execution
Concurrent executions: Supported via async runtime

Limitations

Node.js Required: The sandbox requires Node.js to be installed
Pattern-Based Detection: Some obfuscated attacks may bypass detection
Resource Monitoring: CPU/memory limits are advisory, not enforced
Async Timeout: Async code may run slightly beyond timeout

Best Practices

For Untrusted Code

// Use restrictive config
let config = SandboxConfig::restrictive();
let server = MemoryMCPServer::new(config).await?;

// Always check result type
match server.execute_agent_code(code, context).await? {
    ExecutionResult::Success { .. } => { /* handle success */ },
    ExecutionResult::SecurityViolation { reason, .. } => {
        eprintln!("Security violation: {}", reason);
    },
    _ => { /* handle other cases */ }
}

For Trusted Code

// Use permissive config with specific whitelist
let mut config = SandboxConfig::permissive();
config.allowed_paths = vec!["/app/data".to_string()];
config.allowed_network = vec!["api.example.com".to_string()];

let server = MemoryMCPServer::new(config).await?;

Error Handling

use memory_mcp::{ExecutionResult, ErrorType};

let result = server.execute_agent_code(code, context).await?;

match result {
    ExecutionResult::Success { output, .. } => {
        println!("Success: {}", output);
    },
    ExecutionResult::Error { error_type: ErrorType::Syntax, message, .. } => {
        eprintln!("Syntax error: {}", message);
    },
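    ExecutionResult::Error { error_type: ErrorType::Runtime, message, .. } => {
        eprintln!("Runtime error: {}", message);
    },
    // Catch-all so the match stays exhaustive if ErrorType has other variants
    ExecutionResult::Error { message, .. } => {
        eprintln!("Error: {}", message);
    },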
    ExecutionResult::Timeout { elapsed_ms, .. } => {
        eprintln!("Timeout after {}ms", elapsed_ms);
    },
    ExecutionResult::SecurityViolation { reason, violation_type, .. } => {
        eprintln!("Security violation ({:?}): {}", violation_type, reason);
    },
}

Contributing

When adding new features:

Security First: Always consider security implications
Test Coverage: Add tests for both success and failure cases
Documentation: Update README and inline docs
Performance: Profile code execution paths

License

MIT License - See LICENSE file for details

do-memory-mcp 0.1.30