vibe-tests

Integration test framework for Model Context Protocol (MCP) servers, driven by LLM-powered tool calling.

AI models test your MCP tools automatically: they decide which tool to call and with which arguments. An isolated Docker environment, structured tracing, and JSON reports are included out of the box.

Quick Start

1. Add dependency

[dev-dependencies]
vibe-tests = "0.0.1"
# Required by engine_config! macro for pre-test initialization
ctor = "1.0.1"

2. Configure test engine

vibe_tests::engine_config! {
    EngineTests::builder()
        .mcp_host("http://localhost:9021/mcp/v1")
        .ollama_host("http://localhost:11434")
        .ollama_models(&["qwen2.5-coder:3b-instruct"])
        .on_start(|env| async move {
            // Start your MCP server
            std::process::Command::new("my-mcp-server")
                .stdout(env.tee.clone())
                .stderr(env.tee.clone())
                .spawn()?;
            Ok(())
        })
        .on_stop(|env| {
            // Cleanup and save reports
        })
        .build()
        .expect("Failed to build engine")
}
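The on_start hook above returns as soon as the process is spawned, so the first test may run before the server is listening. One way to guard against that is to poll the port before returning. The following is a minimal sketch using only the standard library; `wait_for_port` is a hypothetical helper, not part of vibe-tests, and it blocks the calling thread, so inside an async hook it would need to run on a blocking task or be swapped for an async equivalent.

```rust
use std::net::{SocketAddr, TcpStream};
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Hypothetical helper (not part of vibe-tests): polls `addr` until a TCP
/// connection succeeds or `timeout` elapses.
/// Returns true if the port accepted a connection in time.
pub fn wait_for_port(addr: SocketAddr, timeout: Duration) -> bool {
    let deadline = Instant::now() + timeout;
    loop {
        // Each attempt is capped so a silent host cannot stall the loop.
        if TcpStream::connect_timeout(&addr, Duration::from_millis(200)).is_ok() {
            return true;
        }
        if Instant::now() >= deadline {
            return false;
        }
        // Short pause between attempts to avoid a busy loop.
        sleep(Duration::from_millis(50));
    }
}
```

With the configuration above, the address to poll would be the host and port from mcp_host (localhost:9021), assuming the server binds there.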

3. Write tests

#[tokio::test]
async fn test_my_tool() {
    // Arrange
    let engine = vibe_tests::engine().await;
    // Act
    let result = engine.test("Show available projects").await;
    // Assert: the run succeeded and every model chose the expected tool
    assert!(result.success);
    assert!(result.models.iter().all(|m| m.tool.as_deref() == Some("show_projects")));
}
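When several models are configured, a failing `all(...)` assertion does not say which model diverged. A small helper can report the offenders by name. This is only a sketch: `ModelResult` below is a hypothetical stand-in for the per-model result type, with the `tool` field taken from the example above and an assumed `model` name field; the real type in vibe-tests may differ.

```rust
/// Hypothetical stand-in for a per-model result. Only `tool` appears in the
/// example above; `model` is an assumed field used for illustration.
#[derive(Debug, Clone)]
pub struct ModelResult {
    pub model: String,
    pub tool: Option<String>,
}

/// Returns the names of the models that did not call `expected`
/// (including models that called no tool at all).
pub fn models_with_wrong_tool<'a>(results: &'a [ModelResult], expected: &str) -> Vec<&'a str> {
    results
        .iter()
        .filter(|m| m.tool.as_deref() != Some(expected))
        .map(|m| m.model.as_str())
        .collect()
}
```

In a test, asserting that the returned list is empty and printing it in the failure message would show which models picked a different tool.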

Features

Agentic testing: AI models call MCP tools based on natural language queries
Isolated environment: Docker Compose with automatic up/down and temporary directories
Structured tracing: log file plus real-time callback, in a parseable log format
JSON reports: per-query details (model, tool, arguments, response, duration, error codes)
Multi-model: run the same queries against multiple Ollama models
Zero-config defaults: sensible defaults with minimal setup

How it works

1. engine_config! runs one-time setup before all tests.
2. on_start launches your MCP server (as a Docker container, a local process, or anything else).
3. engine.test("query") sends the query and the list of available tools to the LLM, which calls one tool and returns the result.
4. on_stop is where you clean up, save logs, and parse the JSON report.
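The query step above can be illustrated with a toy model in which a keyword matcher stands in for the LLM. Everything here is hypothetical and exists only to show the shape of the flow (query plus tool list in, one chosen tool out); `Tool`, `QueryResult`, `choose_tool`, and `run_query` are not vibe-tests APIs, and a real model's choice is not a word-overlap score.

```rust
/// A tool as advertised to the model: a name plus a description.
pub struct Tool {
    pub name: &'static str,
    pub description: &'static str,
}

/// Illustrative result shape: whether a tool was chosen, and which one.
pub struct QueryResult {
    pub success: bool,
    pub tool: Option<String>,
}

/// Stand-in for the LLM: picks the tool whose description shares the most
/// words with the query, or None when no description overlaps at all.
pub fn choose_tool<'a>(query: &str, tools: &'a [Tool]) -> Option<&'a Tool> {
    let query_words: Vec<String> = query
        .split_whitespace()
        .map(|w| w.to_lowercase())
        .collect();
    tools
        .iter()
        .map(|t| {
            // Count description words that also occur in the query.
            let score = t
                .description
                .split_whitespace()
                .filter(|w| query_words.contains(&w.to_lowercase()))
                .count();
            (score, t)
        })
        .filter(|(score, _)| *score > 0)
        .max_by_key(|(score, _)| *score)
        .map(|(_, t)| t)
}

/// One query round: choose a tool and record the outcome.
pub fn run_query(query: &str, tools: &[Tool]) -> QueryResult {
    let chosen = choose_tool(query, tools);
    QueryResult {
        success: chosen.is_some(),
        tool: chosen.map(|t| t.name.to_string()),
    }
}
```

The point of the sketch is that the test asserts on which tool was selected, not on how the selection was made, which is why the same assertion can be reused across different models.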

Why

Testing MCP servers manually is slow and error-prone. vibe-tests lets AI models test your tools automatically: they understand natural language queries, decide which tool to call, and verify responses.

Catch regressions, validate tool schemas, and confirm that your MCP server works before users run into problems.

Part of Vibe tools for Agentic RAG — Vibe Analyzer.

vibe-tests 0.0.1