# Memory MCP Integration
MCP (Model Context Protocol) server integration for the self-learning memory system with secure code execution capabilities.
## Features
- **MCP Server**: Standard MCP protocol implementation with 19 tools
- **Episode Lifecycle Management**: Programmatic episode creation, tracking, and completion (NEW in v0.1.13)
- **Secure Code Sandbox**: WASM-based code execution with comprehensive security
- **Memory Integration**: Query episodic memory and analyze learned patterns
- **Pattern Analysis**: Advanced pattern extraction and recommendations
- **Embeddings Support**: Multiple providers (OpenAI, Ollama, local models)
- **Progressive Tool Disclosure**: Tools prioritized based on usage patterns
- **Execution Monitoring**: Detailed statistics and performance tracking
## Implementation Status
### Phase 2A: Wasmtime WASM Sandbox ✅ **COMPLETE**
**Status**: Production-ready POC eliminating rquickjs GC crashes
- ✅ wasmtime 24.0.5 integration
- ✅ Concurrent execution without SIGABRT crashes
- ✅ 100-parallel stress test passing
- ✅ Semaphore-based pooling (max 20 concurrent)
- ✅ Comprehensive metrics and health monitoring
- ✅ All tests passing (5/5)
**Key Achievement**: Zero GC crashes under high concurrency (100 parallel executions)
### Phase 2B: JavaScript Support via Javy (Next)
**Goal**: Enable JavaScript/TypeScript execution through Javy compiler
- ⏳ Javy v8.0.0 integration (JavaScript→WASM)
- ⏳ WASI preview1 (stdout/stderr capture)
- ⏳ Fuel-based timeout enforcement
- ⏳ Performance benchmarking vs baseline
> **Note:** The `javy` backend requires either a bundled `javy-plugin.wasm` plugin (set via `JAVY_PLUGIN`) or the `javy` CLI available on PATH. CI will attempt to install the CLI when running the `javy-backend` feature; if neither is present, Javy tests will be skipped gracefully.
### Phase 1: rquickjs Migration ✅ **COMPLETE**
**Problem Solved**: rquickjs v0.6.2 had critical GC race conditions causing SIGABRT crashes under concurrent test execution.
**Solution**: Disabled WASM sandbox in all tests (via `MCP_USE_WASM=false`) until wasmtime replacement complete.
## Security Architecture
The sandbox implements **defense-in-depth** security with multiple layers:
### 1. Input Validation
- Code length limits (100KB max)
- Malicious pattern detection
- Syntax validation
### 2. Process Isolation
- Separate Node.js process per execution
- Restricted global access
- No require/import capabilities (by default)
### 3. Resource Limits
- Configurable timeout (default: 5 seconds)
- Memory limits (default: 128MB)
- CPU usage constraints (default: 50%)
### 4. Access Controls
- **File System**: Denied by default, whitelist approach when enabled
- **Network**: Denied by default, no external connections
- **Subprocesses**: Denied, no command execution
### 5. Pattern Detection
Automatically blocks:
- `require('fs')`, `require('http')`, `require('https')`
- `require('child_process')`, `exec()`, `spawn()`
- `eval()`, `new Function()`
- `while(true)`, `for(;;)` infinite loops
- `fetch()`, `WebSocket`, `XMLHttpRequest`
## Usage
### Basic Example
```rust
use memory_mcp::{MemoryMCPServer, SandboxConfig, ExecutionContext};
use serde_json::json;
#[tokio::main]
async fn main() -> anyhow::Result<()> {
// Create server with restrictive sandbox
let server = MemoryMCPServer::new(SandboxConfig::restrictive()).await?;
// Execute code securely
let code = r#"
const result = {
sum: 1 + 1,
message: "Hello from sandbox"
};
console.log("Calculating sum...");
return result;
"#;
let context = ExecutionContext::new(
"Calculate sum".to_string(),
json!({"a": 1, "b": 1}),
);
let result = server.execute_agent_code(code.to_string(), context).await?;
println!("Result: {:?}", result);
Ok(())
}
```
### Sandbox Configurations
#### Restrictive (Recommended for Untrusted Code)
```rust
let config = SandboxConfig::restrictive();
// - 3 second timeout
// - 64MB memory limit
// - 30% CPU limit
// - No network, no filesystem, no subprocesses
```
#### Default (Balanced)
```rust
let config = SandboxConfig::default();
// - 5 second timeout
// - 128MB memory limit
// - 50% CPU limit
// - No network, no filesystem, no subprocesses
```
#### Permissive (For Trusted Code)
```rust
let config = SandboxConfig::permissive();
// - 10 second timeout
// - 256MB memory limit
// - 80% CPU limit
// - Filesystem access to whitelisted paths
```
### Custom Configuration
```rust
let config = SandboxConfig {
max_execution_time_ms: 3000,
max_memory_mb: 64,
max_cpu_percent: 30,
allowed_paths: vec!["/tmp/safe-dir".to_string()],
allowed_network: vec![],
allow_network: false,
allow_filesystem: false,
allow_subprocesses: false,
};
```
## Available Tools
The MCP server provides **22 tools** organized into categories:
### Episode Lifecycle Management (NEW in v0.1.13)
Programmatically manage episodes through the MCP interface:
- **`create_episode`** - Start tracking a new task with metadata
- **`add_episode_step`** - Log execution steps to track progress
- **`complete_episode`** - Finalize episode and trigger learning cycle
- **`get_episode`** - Retrieve complete episode details
- **`get_episode_timeline`** - Visualize chronological task progression
- **`delete_episode`** - Remove episodes permanently (with safeguards)
📖 **[Complete Episode Lifecycle Documentation](EPISODE_LIFECYCLE_TOOLS.md)**
### Batch Operations Contract Status
The MCP JSON-RPC endpoint supports `batch/execute` (multi-operation transport).
However, tool-level batch analytics names are currently **deferred and not advertised**:
- `batch_query_episodes`
- `batch_pattern_analysis`
- `batch_compare_episodes`
These names intentionally return `Tool not found` until dedicated handlers are implemented.
📖 **[Batch Tool Status (WG-053)](BATCH_OPERATIONS_TOOLS.md)**
### Memory & Query Tools
- **`query_memory`** - Query episodic memory for relevant past experiences
- **`query_semantic_memory`** - Semantic search using embeddings
- **`bulk_episodes`** - Retrieve multiple episodes efficiently
### Code Execution
- **`execute_agent_code`** - Execute TypeScript/JavaScript in secure WASM sandbox
### Pattern Analysis
- **`analyze_patterns`** - Analyze patterns from past episodes
- **`advanced_pattern_analysis`** - Deep pattern analysis with statistical methods
- **`search_patterns`** - Search for specific patterns
- **`recommend_patterns`** - Get pattern recommendations for tasks
### Embeddings & Configuration
- **`configure_embeddings`** - Configure embedding providers (OpenAI, Ollama, local)
- **`test_embeddings`** - Test embedding generation
### Monitoring & Health
- **`health_check`** - Server health and status
- **`get_metrics`** - Performance metrics and statistics
- **`quality_metrics`** - Episode quality assessment
### Quick Reference
#### 1. `query_memory`
```json
{
"query": "Search query describing task",
"domain": "Task domain (e.g., 'web-api')",
}
```
#### 2. `execute_agent_code`
```json
{
"code": "TypeScript/JavaScript code to execute",
"context": {
"task": "Task description",
"input": { "data": "as JSON" }
}
}
```
#### 3. `analyze_patterns`
```json
{
"task_type": "Type of task to analyze",
"min_success_rate": 0.7,
"limit": 20
}
```
## Security Testing
The crate includes comprehensive security tests:
```bash
# Run all tests
cargo test --package do-memory-mcp
# Run only security tests
cargo test --package do-memory-mcp --test security_test
# Run integration tests
cargo test --package do-memory-mcp --test integration_test
```
### Security Test Coverage
- File system access blocking (12 tests)
- Network access blocking (4 tests)
- Process execution blocking (3 tests)
- Infinite loop detection (2 tests)
- Code injection blocking (2 tests)
- Resource exhaustion (2 tests)
- Path traversal attacks (3 tests)
- Legitimate code execution (4 tests)
## Execution Results
The sandbox returns detailed execution results:
```rust
pub enum ExecutionResult {
Success {
output: String,
stdout: String,
stderr: String,
execution_time_ms: u64,
},
Error {
message: String,
error_type: ErrorType,
stdout: String,
stderr: String,
},
Timeout {
elapsed_ms: u64,
partial_output: Option<String>,
},
SecurityViolation {
reason: String,
violation_type: SecurityViolationType,
},
}
```
## Performance
- **Average execution time**: ~50-200ms for simple code
- **Timeout overhead**: <10ms
- **Memory footprint**: ~5MB per execution
- **Concurrent executions**: Supported via async runtime
## Limitations
1. **Node.js Required**: The sandbox requires Node.js to be installed
2. **Pattern-Based Detection**: Some obfuscated attacks may bypass detection
3. **Resource Monitoring**: CPU/memory limits are advisory, not enforced
4. **Async Timeout**: Async code may run slightly beyond timeout
## Best Practices
### For Untrusted Code
```rust
// Use restrictive config
let config = SandboxConfig::restrictive();
let server = MemoryMCPServer::new(config).await?;
// Always check result type
match server.execute_agent_code(code, context).await? {
ExecutionResult::Success { .. } => { /* handle success */ },
ExecutionResult::SecurityViolation { reason, .. } => {
eprintln!("Security violation: {}", reason);
},
_ => { /* handle other cases */ }
}
```
### For Trusted Code
```rust
// Use permissive config with specific whitelist
let mut config = SandboxConfig::permissive();
config.allowed_paths = vec!["/app/data".to_string()];
config.allowed_network = vec!["api.example.com".to_string()];
let server = MemoryMCPServer::new(config).await?;
```
### Error Handling
```rust
use memory_mcp::{ExecutionResult, ErrorType};
let result = server.execute_agent_code(code, context).await?;
match result {
ExecutionResult::Success { output, .. } => {
println!("Success: {}", output);
},
ExecutionResult::Error { error_type: ErrorType::Syntax, message, .. } => {
eprintln!("Syntax error: {}", message);
},
ExecutionResult::Error { error_type: ErrorType::Runtime, message, .. } => {
eprintln!("Runtime error: {}", message);
},
ExecutionResult::Timeout { elapsed_ms, .. } => {
eprintln!("Timeout after {}ms", elapsed_ms);
},
ExecutionResult::SecurityViolation { reason, violation_type, .. } => {
eprintln!("Security violation ({:?}): {}", violation_type, reason);
},
}
```
## Contributing
When adding new features:
1. **Security First**: Always consider security implications
2. **Test Coverage**: Add tests for both success and failure cases
3. **Documentation**: Update README and inline docs
4. **Performance**: Profile code execution paths
## License
MIT License - See LICENSE file for details