agentic-vision-mcp 0.1.1

MCP server for AgenticVision — universal LLM access to persistent visual memory

AgenticVision-MCP

What it does

AgenticVision-MCP exposes the AgenticVision engine over the Model Context Protocol (JSON-RPC 2.0 over stdio). Any MCP-compatible LLM gains persistent visual memory — capture screenshots, embed with CLIP ViT-B/32, compare, recall.
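
As a rough illustration of what "JSON-RPC 2.0 over stdio" means in practice, the sketch below builds a `tools/call` request the way an MCP client would frame it: one JSON object per line. The `tools/call` method and its `name`/`arguments` params come from the MCP specification. The argument keys passed to `vision_capture` here (`source`, `path`) are hypothetical, since this README does not list the tool's input schema.

```python
import json
from itertools import count

_next_id = count(1)  # JSON-RPC request ids only need to be unique per session


def make_tool_call(name, arguments):
    """Serialize an MCP `tools/call` request as one newline-terminated JSON line."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_next_id),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    # The stdio transport is newline-delimited, so the payload must not contain
    # raw newlines; json.dumps without `indent` escapes them inside strings.
    return json.dumps(request) + "\n"


# Hypothetical argument names: the real schema is whatever the server
# reports in its tools/list response.
line = make_tool_call("vision_capture", {"source": "file", "path": "screenshot.png"})
print(line, end="")
```

A client would write this line to the server's stdin and read the matching response (same `id`) from its stdout.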

Install

cargo install agentic-vision-mcp

Configure Claude Desktop

On macOS, add to ~/Library/Application Support/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "vision": {
      "command": "agentic-vision-mcp",
      "args": ["serve", "--vision", "~/vision.avis"]
    }
  }
}
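
If the config file already lists other servers, the entry has to be merged in rather than pasted over the whole file. A minimal sketch of that merge, assuming the file is plain JSON with a top-level `mcpServers` object as shown above. The helper name `add_vision_server` and the `other` server entry are illustrative and not part of the crate.

```python
import json


def add_vision_server(config, avis_path):
    """Return a copy of `config` with the "vision" server entry added or replaced."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))  # keep servers already configured
    servers["vision"] = {
        "command": "agentic-vision-mcp",
        "args": ["serve", "--vision", avis_path],
    }
    merged["mcpServers"] = servers
    return merged


# Stand-in for the contents of an existing config file.
existing = {"mcpServers": {"other": {"command": "other-server", "args": []}}}
updated = add_vision_server(existing, "~/vision.avis")
print(json.dumps(updated, indent=2))
```

To apply this to the real file, load it with `json.load`, pass the result through the helper, and write it back with `json.dump`.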

Configure VS Code / Cursor

Add to .vscode/settings.json:

{
  "mcp.servers": {
    "vision": {
      "command": "agentic-vision-mcp",
      "args": ["serve", "--vision", "${workspaceFolder}/.vision/project.avis"]
    }
  }
}

MCP Surface Area

Category   Count  Examples
Tools      10     vision_capture, vision_similar, vision_diff, vision_compare, vision_query, vision_ocr, vision_track, vision_link, session_start, session_end
Resources  6      avis://capture/{id}, avis://session/{id}, avis://timeline/{start}/{end}, avis://similar/{id}, avis://stats, avis://recent
Prompts    4      observe, compare, track, describe
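
The braces in the resource URIs are template parameters that a client fills in before reading. A small sketch of that expansion, wrapped in an MCP `resources/read` request (the method and its `uri` param come from the MCP specification). The `expand` helper and the example id `42` are illustrative; the real id format is whatever the server returns for a capture.

```python
import json
import re


def expand(template, **params):
    """Fill {name} placeholders in a resource URI template; fail on missing ones."""
    def fill(match):
        key = match.group(1)
        if key not in params:
            raise KeyError(f"missing template parameter: {key}")
        return str(params[key])
    return re.sub(r"\{(\w+)\}", fill, template)


def make_resource_read(uri, request_id):
    """Serialize an MCP `resources/read` request as one JSON line."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "resources/read",
        "params": {"uri": uri},
    })


uri = expand("avis://capture/{id}", id="42")  # "42" is a made-up capture id
print(make_resource_read(uri, request_id=1))
```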

How it works

  1. Capture: vision_capture accepts images from files, base64, or URLs, embeds them with CLIP ViT-B/32, and stores them in the .avis binary format.
  2. Query: vision_query retrieves captures by time, description, or recency. vision_similar finds visually similar captures by cosine similarity.
  3. Compare: vision_compare sets up side-by-side LLM analysis. vision_diff performs pixel-level differencing with 8×8 grid region detection.
  4. Link: vision_link connects captures to AgenticMemory cognitive graph nodes.
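
The similarity and diff steps above can be sketched in a few lines. This is not the crate's implementation. It assumes embeddings are plain float vectors (CLIP ViT-B/32 yields 512 values per image; the example uses 2 for brevity), that images are equal-sized grayscale pixel grids, and that a grid cell counts as changed when its mean absolute pixel difference exceeds a threshold. The function names and the threshold value are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_similar(query, stored, k=5):
    """Rank stored embeddings by cosine similarity to `query`, best first."""
    scored = [(cid, cosine(query, emb)) for cid, emb in stored.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


def changed_cells(img_a, img_b, grid=8, threshold=10.0):
    """Return (row, col) grid cells whose mean absolute pixel difference exceeds threshold."""
    height, width = len(img_a), len(img_a[0])
    cells = []
    for gr in range(grid):
        for gc in range(grid):
            rows = range(gr * height // grid, (gr + 1) * height // grid)
            cols = range(gc * width // grid, (gc + 1) * width // grid)
            diffs = [abs(img_a[r][c] - img_b[r][c]) for r in rows for c in cols]
            if diffs and sum(diffs) / len(diffs) > threshold:
                cells.append((gr, gc))
    return cells


# Toy 2-dimensional embeddings keyed by made-up capture ids.
stored = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.7, 0.7]}
print(top_k_similar([1.0, 0.1], stored, k=2))

# 16x16 grayscale images: each of the 8x8 grid cells covers 2x2 pixels.
before = [[0] * 16 for _ in range(16)]
after = [row[:] for row in before]
after[0][0] = after[0][1] = after[1][0] = after[1][1] = 255  # alter the top-left cell
print(changed_cells(before, after))
```

A linear scan like `top_k_similar` is the simplest design and stays fast for small stores; an index becomes worth considering only once the number of captures grows large.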

Performance

Operation                  Time
MCP tool round-trip        7.2 ms
Image capture              47 ms
Similarity search (top-5)  1-2 ms
Visual diff                <1 ms

License

MIT