AgenticVision-MCP
MCP server for AgenticVision — universal LLM access to persistent visual memory.
What it does
AgenticVision-MCP exposes the AgenticVision engine over the Model Context Protocol (JSON-RPC 2.0 over stdio). Any MCP-compatible LLM client gains persistent visual memory: it can capture screenshots, embed them with CLIP ViT-B/32, compare captures, and recall them later.
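To make the transport concrete, here is a minimal sketch of how a client might frame one tool call as a JSON-RPC 2.0 message for a stdio MCP server. The `tools/call` method and its `name`/`arguments` parameters come from the MCP specification; the helper `make_tool_call` and the `source` argument are assumptions for illustration, and the real `vision_capture` input schema may differ.

```python
import json


def make_tool_call(request_id, tool_name, arguments):
    """Build one JSON-RPC 2.0 request line for the MCP `tools/call` method."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    # The MCP stdio transport sends one JSON message per line, so the
    # serialized message is kept compact and terminated by a single newline.
    return json.dumps(message, separators=(",", ":")) + "\n"


# Hypothetical arguments: the real vision_capture schema may differ.
line = make_tool_call(1, "vision_capture", {"source": "screenshot.png"})
print(line, end="")
```

A client would write this line to the server's stdin and read the matching response (same `id`) from its stdout.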
Install
Configure Claude Desktop
Add to ~/Library/Application Support/Claude/claude_desktop_config.json:
Configure VS Code / Cursor
Add to .vscode/settings.json:
MCP Surface Area
| Category | Count | Examples |
|---|---|---|
| Tools | 10 | vision_capture, vision_similar, vision_diff, vision_compare, vision_query, vision_ocr, vision_track, vision_link, session_start, session_end |
| Resources | 6 | avis://capture/{id}, avis://session/{id}, avis://timeline/{start}/{end}, avis://similar/{id}, avis://stats, avis://recent |
| Prompts | 4 | observe, compare, track, describe |
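
The resource URIs in the table above are templates with `{placeholder}` segments. As a rough illustration of how a client or test harness might resolve a concrete URI against them, the sketch below converts each template into a regular expression. The helper names `compile_template` and `match_resource` are hypothetical and not part of AgenticVision-MCP; the server's actual routing may work differently.

```python
import re

# URI templates copied from the Resources row of the table.
TEMPLATES = [
    "avis://capture/{id}",
    "avis://session/{id}",
    "avis://timeline/{start}/{end}",
    "avis://similar/{id}",
    "avis://stats",
    "avis://recent",
]


def compile_template(template):
    """Turn 'avis://capture/{id}' into a regex with one named group per placeholder."""
    # re.split with a capture group alternates literal text and placeholder names.
    parts = re.split(r"\{(\w+)\}", template)
    pattern = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            pattern += re.escape(part)        # literal segment
        else:
            pattern += f"(?P<{part}>[^/]+)"   # placeholder: one path segment
    return re.compile(pattern)


def match_resource(uri):
    """Return (template, params) for the first matching template, or None."""
    for template in TEMPLATES:
        found = compile_template(template).fullmatch(uri)
        if found:
            return template, found.groupdict()
    return None


print(match_resource("avis://timeline/100/200"))
```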
How it works
- Capture: vision_capture accepts images from files, base64, or URLs. Embeds with CLIP ViT-B/32, stores in .avis binary format.
- Query: vision_query retrieves by time, description, or recency. vision_similar finds visually similar captures by cosine similarity.
- Compare: vision_compare for side-by-side LLM analysis. vision_diff for pixel-level differencing with 8×8 grid region detection.
- Link: vision_link connects captures to AgenticMemory cognitive graph nodes.
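
The two numeric steps above, cosine-similarity ranking and grid-based diffing, can be sketched in plain Python. This is not the engine's implementation: the function names are hypothetical, embeddings are shown as short lists rather than real CLIP vectors, and "8×8 grid" is assumed here to mean the image is split into 8 rows by 8 columns of cells, each flagged when its mean absolute pixel difference exceeds a threshold chosen for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_similar(query, embeddings, k=5):
    """embeddings maps capture id -> vector. Returns [(id, score)], best first."""
    scored = [(cid, cosine_similarity(query, vec)) for cid, vec in embeddings.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


def changed_cells(before, after, grid=8, threshold=10.0):
    """before/after are equal-sized 2D lists of grayscale values.

    Returns the (row, col) grid cells whose mean absolute difference
    exceeds the threshold (an assumed value, not taken from the engine).
    """
    height, width = len(before), len(before[0])
    cells = []
    for gr in range(grid):
        for gc in range(grid):
            r0, r1 = gr * height // grid, (gr + 1) * height // grid
            c0, c1 = gc * width // grid, (gc + 1) * width // grid
            total, count = 0, 0
            for r in range(r0, r1):
                for c in range(c0, c1):
                    total += abs(before[r][c] - after[r][c])
                    count += 1
            if count and total / count > threshold:
                cells.append((gr, gc))
    return cells


# Toy data: three 2-D "embeddings" and a 16x16 image with one changed corner.
store = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.7, 0.7]}
print(top_k_similar([1.0, 0.1], store, k=2))

blank = [[0] * 16 for _ in range(16)]
edited = [row[:] for row in blank]
for r in range(0, 2):
    for c in range(14, 16):
        edited[r][c] = 255
print(changed_cells(blank, edited))
```

Reporting changed cells instead of individual pixels keeps the diff result small enough to hand to an LLM as a short list of regions.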
Performance
| Operation | Time |
|---|---|
| MCP tool round-trip | 7.2 ms |
| Image capture | 47 ms |
| Similarity search (top-5) | 1-2 ms |
| Visual diff | <1 ms |
Links
- GitHub
- Core Library
- AgenticMemory — Persistent cognitive memory for AI agents
License
MIT