vtcode-core 0.19.0

Core library for VTCode - a Rust-based terminal coding agent
VT Code

cargo install vtcode or brew install vinhnx/tap/vtcode (macOS) or npm install -g vtcode

VT Code is a Rust-based terminal coding agent with semantic code intelligence via Tree-sitter (parsers for Rust, Python, JavaScript/TypeScript, Go, Java) and ast-grep (structural pattern matching and refactoring).

It supports multiple LLM providers (OpenAI, Anthropic, xAI, DeepSeek, Gemini, and OpenRouter), all with automatic failover, prompt caching, and token-efficient context management. Configuration lives entirely in vtcode.toml, which sources constants from vtcode-core/src/config/constants.rs and model IDs from docs/models.json, keeping runs reproducible and free of hardcoded values.

Technical Motivation

VT Code addresses limitations in existing coding agents by prioritizing Rust's type safety, zero-cost abstractions, and async ecosystem for reliable, high-performance execution. Motivated by agentic AI research (e.g., Anthropic's context engineering principles), it integrates Tree-sitter for precise parsing and MCP for extensible tooling. This enables long-running sessions that maintain context integrity, recover from errors, and keep token overhead low. It builds on foundational work like perg while incorporating lessons from Anthropic's Context Engineering and Building Effective Agents guides, and draws inspiration from OpenAI's codex-cli.

System Architecture

The architecture divides into vtcode-core (reusable library) and src/ (CLI executable), leveraging Tokio for multi-threaded async runtime (#[tokio::main(flavor = "multi_thread")] for CPU-intensive tasks), anyhow for contextual error propagation, and clap for derive-based CLI parsing. Key design tenets include atomic operations, metadata-driven tool calls (to optimize context tokens), and phase-aware context curation.

Core Components (vtcode-core/)

  • LLM Abstractions (llm/): Provider traits enable uniform async interfaces:

    #[async_trait::async_trait]
    pub trait Provider: Send + Sync {
        async fn complete(&self, prompt: &str) -> anyhow::Result<Completion>;
        fn supports_caching(&self) -> bool;
    }
    

    Features: Streaming responses, model-specific optimizations (e.g., Anthropic's cache_control: { ttl: "5m" } for 5-minute TTL; OpenAI's prompt_tokens_details.cached_tokens reporting ~40% savings). Tokenization via tiktoken-rs ensures accurate budgeting across models.
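
    As a sketch of how a concrete provider can sit behind this trait, the MockProvider and Completion shapes below are illustrative assumptions, not the crate's actual types (the real ones live in llm/ and call provider HTTP APIs):

    use async_trait::async_trait;

    // Hypothetical completion payload for illustration only.
    pub struct Completion {
        pub text: String,
    }

    // Illustrative provider that echoes the prompt instead of calling an API.
    pub struct MockProvider;

    #[async_trait]
    impl Provider for MockProvider {
        async fn complete(&self, prompt: &str) -> anyhow::Result<Completion> {
            Ok(Completion { text: format!("echo: {prompt}") })
        }

        fn supports_caching(&self) -> bool {
            false
        }
    }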

  • Modular Tools (tools/): Trait-based extensibility:

    #[async_trait]
    pub trait Tool: Send + Sync {
        fn name(&self) -> &'static str;
        fn description(&self) -> &'static str;
        async fn execute(&self, args: serde_json::Value) -> anyhow::Result<serde_json::Value>;
    }
    

    Built-ins include read_file (chunked at 2000 lines, metadata-first), ast_grep_search (operations: search/transform/lint/refactor with preview_only=true), and run_terminal_cmd (modes: terminal/pty/streaming; 30s timeout default). Git integration via list_files uses walkdir with ignore crate for .gitignore-aware traversal and nucleo-matcher for fuzzy scoring.
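
    The same trait keeps custom tools straightforward; the LineCountTool below is a hypothetical example for illustration, not one of the built-ins:

    use async_trait::async_trait;
    use serde_json::{json, Value};

    // Hypothetical tool that counts lines in a "text" argument.
    pub struct LineCountTool;

    #[async_trait]
    impl Tool for LineCountTool {
        fn name(&self) -> &'static str {
            "line_count"
        }

        fn description(&self) -> &'static str {
            "Count the lines in the provided text"
        }

        async fn execute(&self, args: Value) -> anyhow::Result<Value> {
            let text = args["text"].as_str().unwrap_or_default();
            Ok(json!({ "lines": text.lines().count() }))
        }
    }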

  • Configuration Engine (config/): Deserializes vtcode.toml into structs with validation:

    [context.curation]
    enabled = true
    max_tokens_per_turn = 100000  # Enforce per-provider limits
    phase_detection = true  # Auto-classify: exploration/implementation/etc.
    

    Sections cover agents, tools (allow/deny), MCP (provider URLs), caching (quality_threshold=0.7), and safety (workspace_paths, max_file_size=1MB).
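
    As a sketch of how a table like [context.curation] maps onto a typed struct via serde and the toml crate; the CurationConfig name and helper are assumptions for illustration:

    use serde::Deserialize;

    // Mirrors the [context.curation] keys shown above.
    #[derive(Debug, Deserialize)]
    pub struct CurationConfig {
        pub enabled: bool,
        pub max_tokens_per_turn: usize,
        pub phase_detection: bool,
    }

    fn parse_curation(toml_src: &str) -> anyhow::Result<CurationConfig> {
        // The real engine deserializes the whole vtcode.toml; this parses just one table.
        let value: toml::Value = toml::from_str(toml_src)?;
        Ok(value["context"]["curation"].clone().try_into()?)
    }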

  • Context Engineering System: Implements iterative, per-turn curation based on conversation phase detection (e.g., exploration prioritizes search tools). Token budgeting: Real-time tracking with tiktoken-rs (~10μs/message), thresholds (0.75 warn/0.85 compact), and automatic summarization (LLM-driven, preserving decision ledger and errors; targets 30% compression ratio, saving ~29% tokens/turn). Decision ledger: Structured audit (Vec<DecisionEntry> with status: pending/in_progress/completed, confidence: 0-1). Error recovery: Pattern matching (e.g., parse failures) with fallback strategies and context preservation.
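
    A compact sketch of the ledger entry and budget thresholds described above; the exact field names and enum variants are assumptions:

    // Statuses named in the decision ledger description.
    pub enum DecisionStatus {
        Pending,
        InProgress,
        Completed,
    }

    // One audited decision; confidence is in the 0.0..=1.0 range.
    pub struct DecisionEntry {
        pub summary: String,
        pub status: DecisionStatus,
        pub confidence: f32,
    }

    // Token-budget thresholds from above: warn at 75% usage, compact at 85%.
    fn budget_action(used: usize, max: usize) -> &'static str {
        let ratio = used as f64 / max as f64;
        if ratio >= 0.85 { "compact" } else if ratio >= 0.75 { "warn" } else { "ok" }
    }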

  • Code Intelligence: Tree-sitter integration for AST traversal (e.g., symbol resolution in tools/ast_grep_search); ast-grep for rule-based transforms:

    # Example pattern in tool call
    pattern: "fn $NAME($PARAMS) { $BODY }"
    replacement: "async fn $NAME($PARAMS) -> Result<()> { $BODY.await }"
    

    Supports preview mode to avoid destructive applies.
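
    A sketch of what a preview-only transform request could look like as tool arguments; apart from pattern, replacement, and preview_only, the argument names here are assumptions:

    use serde_json::{json, Value};

    fn preview_transform_args() -> Value {
        // preview_only = true asks ast_grep_search for a dry run rather than an in-place rewrite.
        json!({
            "operation": "transform",
            "pattern": "fn $NAME($PARAMS) { $BODY }",
            "replacement": "async fn $NAME($PARAMS) -> Result<()> { $BODY.await }",
            "preview_only": true
        })
    }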

  • MCP Integration: Client uses official Rust SDK for protocol-compliant calls:

    let client = McpClient::new("ws://localhost:8080");
    let docs = client.call("get-library-docs", json!({
        "context7CompatibleLibraryID": "/tokio/docs",
        "tokens": 5000,
        "topic": "async runtime"
    })).await?;
    

    Discovers tools dynamically (e.g., mcp_resolve-library-id for Context7 IDs, mcp_sequentialthinking for chain-of-thought reasoning with branch/revision support, mcp_get_current_time for timezone-aware ops). Connection pooling and failover for multi-provider setups.

CLI Execution (src/)

  • User Interface: Ratatui for reactive TUI (mouse-enabled, ANSI escape sequences for colors: e.g., \x1b[34m for blue tool banners; a minimal banner sketch follows this list). Real-time PTY via vte crate for command streaming; slash commands parsed with fuzzy matching.
  • Runtime: Tokio executor handles concurrent tool calls; human-in-the-loop via confirmation prompts for high-risk ops (e.g., rm -rf denials).
  • Observability: Logs to file/console with structured format; metrics (e.g., cache hit rate, token usage) exposed via debug flags.
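
The blue tool banner mentioned in the user-interface bullet reduces to a plain ANSI escape; a minimal, self-contained sketch (the helper name is illustrative):

// \x1b[34m switches the terminal foreground to blue; \x1b[0m resets it.
fn print_tool_banner(tool_name: &str) {
    println!("\x1b[34m[tool] {tool_name}\x1b[0m");
}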

Performance notes: Multi-threaded Tokio reduces latency for I/O-bound tasks (~20% faster than single-thread); context compression yields 50-80% token savings in long sessions. See docs/ARCHITECTURE.md for dependency graph and profiling data.

Key Capabilities

  • LLM Orchestration: Failover logic (e.g., Gemini primary, OpenAI fallback; see the sketch after this list); reasoning control (low/medium/high effort via provider params); caching with quality gating (cache only >70% confidence, TTL=30 days).
  • Code Analysis & Editing: Semantic search (AST-grep similarity mode, threshold=0.7); targeted edits (exact string match in edit_file, preserving whitespace); multi-file patches via apply_patch.
  • Context & Session Management: Phase-adaptive tool selection (e.g., validation phase favors run_terminal_cmd with cargo test); ledger injection for coherence (max 12 entries); summarization triggers at 20 turns or 85% budget.
  • Extensibility: Custom tools via trait impls; MCP for domain-specific extensions (e.g., library docs resolution: resolve-library-id followed by get-library-docs with max_tokens=5000).
  • Security Posture: Path validation (no escapes outside WORKSPACE_DIR); sandboxed network (curl HTTPS only, no localhost); allowlists (e.g., deny rm, permit cargo); env-var secrets (no file storage).
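
The failover behavior in the LLM Orchestration bullet can be sketched against the Provider trait shown earlier; the ordering and error handling below are illustrative, not the crate's actual routing logic:

// Try providers in priority order (e.g., Gemini first, then OpenAI) and
// return the first completion that succeeds.
async fn complete_with_failover(
    providers: &[Box<dyn Provider>],
    prompt: &str,
) -> anyhow::Result<Completion> {
    let mut last_err = None;
    for provider in providers {
        match provider.complete(prompt).await {
            Ok(completion) => return Ok(completion),
            Err(err) => last_err = Some(err),
        }
    }
    Err(last_err.unwrap_or_else(|| anyhow::anyhow!("no providers configured")))
}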

Installation and Initialization

Binaries on GitHub Releases support macOS (aarch64/x86_64-apple-darwin), Linux (x86_64/aarch64-unknown-linux-gnu), Windows (x86_64-pc-windows-msvc).

Initialize environment:

# API keys (required for providers)
export OPENAI_API_KEY="your_api_key"  # Validate via curl test
# Recommended: store keys in a .env file at your project root, e.g. OPENAI_API_KEY=sk-...

# Workspace setup
cd my-project
cp /path/to/vtcode.toml.example .vtcode/vtcode.toml  # Customize [mcp.enabled=true] etc.

Execution:

vtcode  # Launch TUI (loads vtcode.toml defaults)
vtcode --provider openai --model gpt-5-codex ask "Refactor async fn in src/lib.rs using ast-grep"  # Tool-enabled query
vtcode --debug --no-tools ask "Compute token budget for current context"  # Dry-run analysis

Configuration validation: On load, checks TOML against schema (e.g., model in docs/models.json); logs warnings for deprecated keys.
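
As a sketch of the model check described above, assuming the known IDs from docs/models.json have already been loaded into a set (the function name and error message are illustrative):

use std::collections::HashSet;

// Reject a configured model ID that is not in the known set.
fn validate_model(model: &str, known_models: &HashSet<String>) -> anyhow::Result<()> {
    if known_models.contains(model) {
        Ok(())
    } else {
        anyhow::bail!("unknown model id '{model}'; expected one listed in docs/models.json")
    }
}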

Command-Line Interface

Parsed via clap derives:

vtcode [OPTIONS] <SUBCOMMAND>

SUBCOMMANDS:
    ask <QUERY>     Non-interactive query with optional tools
    OPTIONS:
        --provider <PROVIDER>    LLM provider (default: from config)
        --model <MODEL>          Model ID (e.g., gemini-1.5-pro; from constants.rs)
        --no-tools               Disable tool execution (analysis only)
        --debug                  Enable verbose logging and metrics
        --help                   Print usage

Development runs: ./run.sh (release profile: lto=true, codegen-units=1 for opt); ./run-debug.sh (debug symbols, hot-reload via cargo-watch).

Development Practices

  • Validation Pipeline: cargo check for incremental builds; cargo clippy with custom lints (e.g., no unsafe, prefer ? over unwrap); cargo fmt --check for style enforcement (4-space indents, max_width=100).
  • Testing: Unit tests in #[cfg(test)] modules; integration in tests/ (e.g., mock LLM responses with wiremock); coverage via cargo tarpaulin. Property testing for tools (e.g., proptest for path sanitization).
  • Error Handling: Uniform anyhow::Result<T> with context:
    use std::{fs, path::Path};
    use anyhow::Context;

    fn load_config(path: &Path) -> anyhow::Result<Config> {
        // `?` converts the io::Error from read_to_string; with_context labels TOML parse failures.
        toml::from_str(&fs::read_to_string(path)?)
            .with_context(|| format!("Invalid TOML in {}", path.display()))
    }
    
  • Benchmarking: Criterion in benches/ (e.g., token counting: <50μs for 1k tokens); flamegraphs for async bottlenecks.
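
    A minimal bench sketch in that style, assuming tiktoken-rs's cl100k_base encoder as the unit under test; the bench name and input are illustrative:

    use criterion::{criterion_group, criterion_main, Criterion};
    use tiktoken_rs::cl100k_base;

    fn bench_token_count(c: &mut Criterion) {
        let bpe = cl100k_base().expect("load encoder");
        let text = "fn main() { println!(\"hello\"); }\n".repeat(64);
        c.bench_function("count_tokens", |b| {
            // Measures pure tokenization cost for the repeated snippet.
            b.iter(|| bpe.encode_with_special_tokens(&text).len())
        });
    }

    criterion_group!(benches, bench_token_count);
    criterion_main!(benches);
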
  • Documentation: Rustdoc for public APIs; Markdown in ./docs/ (e.g., context engineering impl). No root-level Markdown except README.

Adhere to CONTRIBUTING.md: Conventional commits, PR templates with benchmarks.

License

MIT License. See LICENSE for full terms.