# VT Code

```bash
cargo install vtcode
# or (macOS)
brew install vinhnx/tap/vtcode
# or
npm install -g vtcode
```
VT Code is a Rust-based terminal coding agent with semantic code intelligence via Tree-sitter (parsers for Rust, Python, JavaScript/TypeScript, Go, Java) and ast-grep (structural pattern matching and refactoring).
It supports multiple LLM providers (OpenAI, Anthropic, xAI, DeepSeek, Gemini, OpenRouter), all with automatic failover, prompt caching, and token-efficient context management. Configuration lives entirely in `vtcode.toml`, sourcing constants from `vtcode-core/src/config/constants.rs` and model IDs from `docs/models.json` to ensure reproducibility and avoid hardcoding.
## Technical Motivation

VT Code addresses limitations of existing coding agents by prioritizing Rust's type safety, zero-cost abstractions, and async ecosystem for reliable, high-performance execution. Motivated by agentic AI research (e.g., Anthropic's context engineering principles), it integrates Tree-sitter for precise parsing and MCP for extensible tooling. This enables long-running sessions with maintained context integrity, error resilience, and minimal token overhead. It builds on foundational work like perg, incorporates lessons from Anthropic's "Context Engineering" and "Building Effective Agents" guides, and takes inspiration from OpenAI's codex-cli.
## System Architecture

The architecture divides into `vtcode-core` (reusable library) and `src/` (CLI executable), leveraging Tokio for a multi-threaded async runtime (`#[tokio::main(flavor = "multi_thread")]` for CPU-intensive tasks), anyhow for contextual error propagation, and clap for derive-based CLI parsing. Key design tenets include atomic operations, metadata-driven tool calls (to optimize context tokens), and phase-aware context curation.
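A minimal sketch of this entry-point wiring, assuming only the runtime attribute and crates named above (the config handling shown is illustrative):

```rust
use anyhow::{Context, Result};

#[tokio::main(flavor = "multi_thread")]
async fn main() -> Result<()> {
    // Multi-threaded runtime: blocking or CPU-heavy work can be offloaded
    // without stalling the interactive session.
    let raw = tokio::fs::read_to_string("vtcode.toml")
        .await
        .context("failed to read vtcode.toml")?; // anyhow attaches contextual detail
    let parsed: toml::Value = raw.parse().context("invalid TOML")?;
    println!("config loaded: {} top-level keys", parsed.as_table().map_or(0, |t| t.len()));
    Ok(())
}
```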
### Core Components (`vtcode-core/`)

- **LLM Abstractions** (`llm/`): Provider traits enable uniform async interfaces (see the sketch below). Features: streaming responses and model-specific optimizations (e.g., Anthropic's `cache_control: { ttl: "5m" }` for a 5-minute TTL; OpenAI's `prompt_tokens_details.cached_tokens` reporting ~40% savings). Tokenization via `tiktoken-rs` ensures accurate budgeting across models.
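  What such a provider trait could look like (names are illustrative, not the crate's actual API):

  ```rust
  use anyhow::Result;
  use futures::stream::BoxStream;

  /// Hypothetical uniform interface over OpenAI, Anthropic, Gemini, etc.
  #[async_trait::async_trait]
  pub trait LlmProvider: Send + Sync {
      /// Stream completion chunks for a prompt.
      async fn stream_completion(&self, prompt: &str) -> Result<BoxStream<'static, Result<String>>>;
      /// Count tokens so the context budgeter can plan ahead.
      fn count_tokens(&self, text: &str) -> usize;
  }
  ```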
- **Modular Tools** (`tools/`): Trait-based extensibility (see the sketch below). Built-ins include `read_file` (chunked at 2000 lines, metadata-first), `ast_grep_search` (operations: search/transform/lint/refactor with `preview_only=true`), and `run_terminal_cmd` (modes: terminal/pty/streaming; 30s default timeout). Git integration via `list_files` uses `walkdir` with the `ignore` crate for .gitignore-aware traversal and `nucleo-matcher` for fuzzy scoring.
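  A sketch of the tool trait shape this implies (signatures hypothetical):

  ```rust
  use anyhow::Result;
  use serde_json::Value;

  /// Hypothetical extension point: each built-in (`read_file`, `ast_grep_search`, ...)
  /// would implement this and register itself with the tool registry.
  #[async_trait::async_trait]
  pub trait Tool: Send + Sync {
      /// Name surfaced to the LLM in tool-call metadata.
      fn name(&self) -> &'static str;
      /// Execute with JSON arguments, returning a JSON result.
      async fn execute(&self, args: Value) -> Result<Value>;
  }
  ```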
- **Configuration Engine** (`config/`): Deserializes `vtcode.toml` into structs with validation; a representative snippet (key names illustrative):

  ```toml
  [context]
  enabled = true
  max_tokens = 100000    # Enforce per-provider limits
  phase_detection = true # Auto-classify: exploration/implementation/etc.
  ```

  Sections cover agents, tools (allow/deny), MCP (provider URLs), caching (`quality_threshold = 0.7`), and safety (`workspace_paths`, `max_file_size = 1MB`).
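  A deserialization sketch under the same assumptions (serde + toml; struct and field names illustrative):

  ```rust
  use serde::Deserialize;

  #[derive(Deserialize)]
  struct VtConfig {
      context: ContextConfig,
  }

  #[derive(Deserialize)]
  struct ContextConfig {
      enabled: bool,
      max_tokens: u32,       // validated against the provider's limit on load
      phase_detection: bool,
  }

  fn load(path: &str) -> anyhow::Result<VtConfig> {
      let raw = std::fs::read_to_string(path)?;
      Ok(toml::from_str(&raw)?) // serde handles the struct mapping
  }
  ```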
- **Context Engineering System**: Implements iterative, per-turn curation based on conversation-phase detection (e.g., exploration prioritizes search tools). Token budgeting: real-time tracking with `tiktoken-rs` (~10μs/message), thresholds (warn at 0.75, compact at 0.85), and automatic summarization (LLM-driven, preserving the decision ledger and errors; targets a 30% compression ratio, saving ~29% tokens/turn). Decision ledger: structured audit trail (`Vec<DecisionEntry>` with status pending/in_progress/completed and confidence 0-1; see the sketch below). Error recovery: pattern matching (e.g., on parse failures) with fallback strategies and context preservation.
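  The ledger entry described above could be modeled roughly as:

  ```rust
  /// Illustrative shape of a decision-ledger entry (`Vec<DecisionEntry>`).
  pub enum DecisionStatus {
      Pending,
      InProgress,
      Completed,
  }

  pub struct DecisionEntry {
      pub description: String,
      pub status: DecisionStatus,
      pub confidence: f32, // 0.0-1.0, as noted above
  }
  ```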
- **Code Intelligence**: Tree-sitter integration for AST traversal (e.g., symbol resolution in `tools/ast_grep_search`); ast-grep for rule-based transforms:

  ```yaml
  # Example pattern in a tool call
  pattern: "fn $NAME($PARAMS) { $BODY }"
  replacement: "async fn $NAME($PARAMS) -> Result<()> { $BODY.await }"
  ```

  Supports preview mode to avoid destructive applies.
- **MCP Integration**: Client uses the official Rust SDK for protocol-compliant calls; in outline (identifiers illustrative, not the SDK's exact API):

  ```rust
  let client = McpClient::new(server_config).await?;
  let docs = client.call_tool("get-library-docs", args).await?;
  ```

  Discovers tools dynamically (e.g., `mcp_resolve-library-id` for Context7 IDs, `mcp_sequentialthinking` for chain-of-thought reasoning with branch/revision support, `mcp_get_current_time` for timezone-aware ops). Connection pooling and failover for multi-provider setups.
### CLI Execution (`src/`)

- **User Interface**: Ratatui for a reactive TUI (mouse-enabled, ANSI escape sequences for colors, e.g., `\x1b[34m` for blue tool banners). Real-time PTY via the `vte` crate for command streaming; slash commands parsed with fuzzy matching.
- **Runtime**: Tokio executor handles concurrent tool calls; human-in-the-loop confirmation prompts for high-risk ops (e.g., `rm -rf` denials).
- **Observability**: Structured logs to file/console; metrics (e.g., cache hit rate, token usage) exposed via debug flags.
Performance notes: multi-threaded Tokio reduces latency for I/O-bound tasks (~20% faster than single-threaded); context compression yields 50-80% token savings in long sessions. See `docs/ARCHITECTURE.md` for the dependency graph and profiling data.
## Key Capabilities

- **LLM Orchestration**: Failover logic (e.g., Gemini primary, OpenAI fallback; see the sketch after this list); reasoning control (low/medium/high effort via provider params); caching with quality gating (cache only above 70% confidence, TTL = 30 days).
- **Code Analysis & Editing**: Semantic search (ast-grep similarity mode, threshold = 0.7); targeted edits (exact string match in `edit_file`, preserving whitespace); multi-file patches via `apply_patch`.
- **Context & Session Management**: Phase-adaptive tool selection (e.g., the validation phase favors `run_terminal_cmd` with `cargo test`); ledger injection for coherence (max 12 entries); summarization triggers at 20 turns or 85% of budget.
- **Extensibility**: Custom tools via trait impls; MCP for domain-specific extensions (e.g., library docs resolution: `resolve-library-id` → `get-library-docs` with `max_tokens=5000`).
- **Security Posture**: Path validation (no escapes outside `WORKSPACE_DIR`); sandboxed network (HTTPS-only curl, no localhost); allowlists (e.g., deny `rm`, permit `cargo`); secrets via env vars (no file storage).
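A sketch of the failover pattern from the orchestration bullet (provider names from the example; trait and control flow illustrative):

```rust
use anyhow::Result;

/// Minimal local trait for the sketch; VT Code's real abstraction differs.
#[async_trait::async_trait]
trait Complete {
    async fn complete(&self, prompt: &str) -> Result<String>;
}

/// Try the primary provider (e.g., Gemini); on error, fall back (e.g., OpenAI).
async fn complete_with_failover(
    primary: &dyn Complete,
    fallback: &dyn Complete,
    prompt: &str,
) -> Result<String> {
    match primary.complete(prompt).await {
        Ok(response) => Ok(response),
        Err(err) => {
            eprintln!("primary provider failed ({err}); falling back");
            fallback.complete(prompt).await
        }
    }
}
```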
## Installation and Initialization

Binaries on GitHub Releases support macOS (`aarch64`/`x86_64-apple-darwin`), Linux (`x86_64`/`aarch64-unknown-linux-gnu`), and Windows (`x86_64-pc-windows-msvc`).
Initialize environment (the commands shown are one way to do this; only the comments are prescriptive):

```bash
# API keys (required for providers)
export OPENAI_API_KEY=sk-...
export ANTHROPIC_API_KEY=sk-ant-...

# Validate via curl test, e.g.:
curl -s https://api.openai.com/v1/models -H "Authorization: Bearer $OPENAI_API_KEY" | head

# [RECOMMENDED] use a .env file in your project root: OPENAI_API_KEY=sk-...

# Workspace setup
cd /path/to/your/project
```

Execution:

```bash
vtcode
```

Configuration validation: on load, the TOML is checked against its schema (e.g., the model must appear in `docs/models.json`); warnings are logged for deprecated keys.
## Command-Line Interface

Parsed via clap derives; a minimal sketch of the shape (flags illustrative, not the actual definition):

```rust
use clap::Parser;

/// Illustrative CLI definition; the real one lives in `src/`.
#[derive(Parser)]
#[command(name = "vtcode")]
struct Cli {
    /// Path to the configuration file
    #[arg(long, default_value = "vtcode.toml")]
    config: std::path::PathBuf,

    /// Override the configured LLM provider (e.g., openai, anthropic)
    #[arg(long)]
    provider: Option<String>,

    /// Enable verbose debug output
    #[arg(long)]
    debug: bool,
}
```
Development runs: `./run.sh` (release profile: `lto = true`, `codegen-units = 1` for optimization); `./run-debug.sh` (debug symbols, hot reload via cargo-watch).
## Development Practices

- **Validation Pipeline**: `cargo check` for incremental builds; `cargo clippy` with custom lints (e.g., no `unsafe`, prefer `?` over `unwrap`); `cargo fmt --check` for style enforcement (4-space indents, `max_width = 100`).
- **Testing**: Unit tests in `#[cfg(test)]` modules; integration tests in `tests/` (e.g., mock LLM responses with wiremock); coverage via `cargo tarpaulin`. Property testing for tools (e.g., proptest for path sanitization).
- **Error Handling**: Uniform `anyhow::Result<T>` with context, for example (illustrative):
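  ```rust
  use anyhow::{Context, Result};

  fn read_tool_config(path: &str) -> Result<String> {
      // `.with_context` attaches a lazy, human-readable message to the error chain.
      std::fs::read_to_string(path)
          .with_context(|| format!("failed to read tool config at {path}"))
  }
  ```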
- **Benchmarking**: Criterion in `benches/` (e.g., token counting: <50μs for 1k tokens); flamegraphs for async bottlenecks.
- **Documentation**: Rustdoc for public APIs; Markdown in `./docs/` (e.g., the context engineering implementation). No root-level Markdown except the README.
Adhere to CONTRIBUTING.md: Conventional commits, PR templates with benchmarks.
## References
- User Guides: Getting Started, Configuration.
- Technical Docs: Context Engineering, MCP Setup, Prompt Caching.
- API: vtcode-core (full crate docs).
- Changelog: CHANGELOG.
## License

MIT License; see LICENSE for full terms.