Octomind 🤖 - AI-Powered Development Assistant
© 2025 Muvon Un Limited | Complete Documentation
Session-based AI development assistant with conversational codebase interaction, multimodal vision support, built-in MCP tools, and multi-provider AI integration
Octomind is a session-first AI development assistant that transforms how you interact with codebases through natural language conversations. Built on the Model Context Protocol (MCP), it provides seamless integration with development tools, multi-provider AI support, and intelligent cost optimization.
✨ Core Features
- 🎯 Session-First Architecture - Everything happens in interactive AI conversations with persistent context
- 🛠️ Built-in MCP Tools - File operations, code analysis, shell commands, web search via Model Context Protocol
- 🌐 Multi-Provider AI Support - OpenRouter, OpenAI, Anthropic, Google, Amazon, Cloudflare, DeepSeek
- 🖼️ Multimodal Vision Support - Analyze images, screenshots, diagrams with AI vision capabilities
- 💰 Cost Tracking & Optimization - Real-time usage monitoring, caching, and detailed cost reporting
- 🔧 Role-Based Configuration - Developer (full tools), Assistant (chat-only), and custom roles
- 🧠 Smart Session Continuation - Automatic context management when token limits are reached
- ⚡ Layered Processing - AI pipeline system for complex task decomposition and processing
🚀 Quick Start
Prerequisites
- API Key from supported AI provider
Installation
```shell
# One-line install (recommended)

# Set your AI provider API key (choose one)
# Multi-provider access
# Direct OpenAI
# Direct Anthropic

# Start your first session
```
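The commands themselves are missing from the snippet above; a minimal reconstruction, assuming standard environment-variable names (only `OPENROUTER_API_KEY` is confirmed elsewhere in this document) and `octomind` as the binary name:

```shell
# Multi-provider access via OpenRouter (key value is a placeholder)
export OPENROUTER_API_KEY="sk-or-your-key"

# Direct OpenAI (assumed variable name)
export OPENAI_API_KEY="sk-your-key"

# Direct Anthropic (assumed variable name)
export ANTHROPIC_API_KEY="sk-ant-your-key"

# Start your first session if the binary is on PATH
# ("octomind session" is an assumed entry point)
command -v octomind >/dev/null 2>&1 && octomind session || echo "octomind not installed"
```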
💬 How It Works
Octomind operates through interactive AI sessions with built-in development tools:
> "How does authentication work in this project?"
[AI analyzes project structure, finds auth-related files, explains implementation]
> "Add error handling to the login function"
[AI examines login code, implements error handling, shows changes]
> "Rename 'processData' to 'processUserData' across all files"
[AI finds all occurrences, performs batch edit across multiple files]
> /image screenshot.png
> "What's wrong with this UI layout?"
[AI analyzes the image, identifies layout issues, suggests CSS fixes]
> agent_context_gatherer(task="Analyze the authentication system architecture")
[Routes task to specialized context gathering AI agent with development tools]
> /report
[Shows: $0.02 spent, 3 requests, 5 tool calls, timing analysis]
Built-in MCP Tools
- Developer Tools: `shell()`, `ast_grep()` - Execute commands and search code patterns
- Filesystem Tools: `text_editor()`, `list_files()`, `batch_edit()` - File operations
- Web Tools: `web_search()`, `read_html()` - Web research and content analysis
- Agent Tools: `agent_*()` - Route tasks to specialized AI processing layers
Session Commands
- `/help` - Show available commands
- `/info` - Display token usage and costs
- `/image <path>` - Attach images for AI analysis
- `/mcp info` - Check MCP server status
- `/model <model>` - Switch AI models
- `/role <role>` - Change role (developer/assistant)
- `/cache` - Add cache checkpoint for cost optimization
🌐 Supported AI Providers
| Provider | Format | Features |
|---|---|---|
| OpenRouter | `openrouter:provider/model` | Multi-provider access, caching, vision models |
| OpenAI | `openai:model-name` | Direct API, cost calculation, GPT-4o vision |
| Anthropic | `anthropic:model-name` | Claude models, caching, Claude 3+ vision |
| Google | `google:model-name` | Vertex AI, Gemini 1.5+ vision support |
| Amazon | `amazon:model-name` | Bedrock models, AWS integration, Claude vision |
| Cloudflare | `cloudflare:model-name` | Edge AI, fast inference, Llama 3.2 vision |
| DeepSeek | `deepseek:model-name` | Cost-effective models, competitive performance |
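Model identifiers throughout Octomind use the `provider:model` format from the table above. A few illustrative examples (only the OpenRouter identifier appears verbatim elsewhere in this document; the other model names are plausible placeholders, not confirmed values):

```
openrouter:anthropic/claude-sonnet-4   # multi-provider access via OpenRouter
openai:gpt-4o                          # direct OpenAI, vision-capable
anthropic:claude-3-5-sonnet-latest     # direct Anthropic
deepseek:deepseek-chat                 # cost-effective DeepSeek
```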
🛠️ Installation & Setup
Prerequisites
- Rust 1.82+ and Cargo
- API Key from supported AI provider
Installation Options
```shell
# One-line install (recommended)

# Build from source (for development)

# Install via Cargo (when published)
```
API Key Setup
Set your AI provider API key (choose one or more):
```shell
# Multi-provider access (recommended)

# Direct provider access

# Optional: Web search capability
```
First Run
```shell
# Generate default configuration (optional)

# Start your first session

# Within the session, try:
```
🎮 Session Commands
Essential commands for interactive sessions:
Core Commands
- `/help` - Show available commands
- `/info` - Display token usage and costs
- `/image <path>` - Attach images for AI analysis
- `/model [model]` - View or change AI model
- `/role [role]` - Change role (developer/assistant)
Context Management
- `/cache` - Add cache checkpoint for cost optimization
- `/context [filter]` - Display session context
- `/truncate` - Manually truncate context
- `/done` - Finalize task with memorization
MCP Tools & Debugging
- `/mcp info` - Check MCP server status
- `/run <command>` - Execute custom commands
- `/layers` - Toggle layered processing
- `/loglevel [level]` - Set logging level
Session Management
- `/save` - Save current session
- `/clear` - Clear terminal screen
- `/exit` - Exit session
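A typical flow stitching these commands together might look like the following session sketch (all commands are documented above; the prompt text and model choice are illustrative):

```
> /model openrouter:anthropic/claude-sonnet-4
> /cache                    # checkpoint context before the expensive request
> "Refactor the config loader to return errors instead of panicking"
> /info                     # inspect token usage and cost so far
> /save                     # persist the session
> /exit
```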
🏗️ Architecture
Session-First Design: Everything happens in interactive AI conversations with persistent context and built-in development tools.
Core Components:
- MCP Tools: Built-in servers for development (shell, ast_grep), filesystem (text_editor, batch_edit), web (search, html), and agent routing
- Multi-Provider AI: Seamless switching between OpenRouter, OpenAI, Anthropic, Google, Amazon, Cloudflare, DeepSeek
- Role-Based Access: Developer (full tools), Assistant (chat-only), and custom role configurations
- Smart Caching: Automatic cost optimization with cache markers and intelligent context management
- Layered Processing: AI pipeline system for complex task decomposition and specialized processing
🔧 Configuration
Octomind uses a template-based configuration system with smart defaults:
```shell
# Generate default config (optional)

# View current settings

# Validate configuration
```
Configuration Features:
- Template-Based: All defaults in `config-templates/default.toml`
- Environment Overrides: Any setting can be overridden with `OCTOMIND_*` variables
- Role-Based: Different configurations for developer/assistant/custom roles
- MCP Integration: Built-in and external MCP server configurations
- Cost Controls: Spending thresholds and performance tuning
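For example, the environment-override mechanism could be used like this (the specific variable names below are assumptions; only the `OCTOMIND_*` prefix is stated in this document):

```shell
# Override the default model for this shell session (hypothetical variable name)
export OCTOMIND_MODEL="openrouter:anthropic/claude-sonnet-4"

# Raise the log level while debugging (hypothetical variable name)
export OCTOMIND_LOG_LEVEL="debug"
```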
📖 Documentation
📚 Complete Documentation - Comprehensive guides and references
Quick Navigation
- Installation Guide - Setup, prerequisites, and development
- Overview - Architecture and core concepts
- Configuration Guide - Configuration system and customization
- AI Providers - Provider setup and model selection
- Sessions Guide - Interactive sessions and commands
- Advanced Features - MCP tools and extensibility
- Command Layers - AI processing pipeline
- MCP Development - Tool development
🚀 Contributing
Contributions are welcome! Help make Octomind better for the development community.
Development Setup:
Development Areas:
- AI Providers: Add new providers in `src/providers/`
- MCP Tools: Extend built-in tools in `src/mcp/`
- Session Features: Enhance session management in `src/session/`
- Documentation: Improve guides and examples
Requirements: Rust 1.82+, API key from supported providers
🆘 Troubleshooting
Common Issues:
- Build Errors: Use `cargo check --message-format=short` for fast syntax checking
- Missing API Keys: Set `OPENROUTER_API_KEY` or provider-specific keys
- Invalid Model Format: Use `provider:model` format (e.g., `openrouter:anthropic/claude-sonnet-4`)
- MCP Tool Issues: Check `/mcp info` for server status
- Session Problems: Use `/loglevel debug` for detailed logging
Getting Help:
- 🐛 Issues: GitHub Issues
- 📖 Documentation: Complete Documentation
- 💬 Discussions: GitHub Discussions
- ✉️ Email: opensource@muvon.io
📞 Support & Contact
- 🏢 Company: Muvon Un Limited (Hong Kong)
- 🌐 Website: muvon.io
- 📦 Product Page: octomind.muvon.io
- 📧 Email: opensource@muvon.io
- 🐛 Issues: GitHub Issues
⚖️ License
Apache License 2.0 Copyright © 2025 Muvon Un Limited