Crate adk_bench

§adk-bench

A comprehensive benchmarking framework for ADK-Rust that measures framework-level runtime performance using real LLM APIs.

adk-bench isolates framework overhead from LLM latency through per-call instrumentation. It also supports concurrent agent throughput testing, memory profiling, and comparison against external frameworks, which run as subprocesses and report results through a standardized JSON protocol (the External Benchmark Protocol).
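To make the idea of separating framework overhead from LLM latency concrete, here is a minimal sketch. It assumes each agent turn records a total wall-clock time and the time spent waiting on LLM calls; `TurnTiming`, `framework_overhead`, and `mean_overhead` are hypothetical names for illustration and are not part of the adk-bench API.

```rust
use std::time::Duration;

/// Timing for one agent turn (hypothetical type for this sketch).
#[derive(Debug, Clone, Copy)]
pub struct TurnTiming {
    /// Wall-clock time for the whole turn.
    pub total: Duration,
    /// Time spent waiting on LLM API calls during the turn.
    pub llm: Duration,
}

/// Framework overhead is the turn time minus the time spent inside LLM calls.
/// `saturating_sub` returns zero if timer jitter makes `llm` exceed `total`.
pub fn framework_overhead(turn: TurnTiming) -> Duration {
    turn.total.saturating_sub(turn.llm)
}

/// Mean overhead across turns, or `None` when there are no turns.
pub fn mean_overhead(turns: &[TurnTiming]) -> Option<Duration> {
    if turns.is_empty() {
        return None;
    }
    let sum: Duration = turns.iter().map(|t| framework_overhead(*t)).sum();
    Some(sum / turns.len() as u32)
}
```

With turns of 1200 ms total / 1150 ms LLM and 800 ms total / 770 ms LLM, the per-turn overheads are 50 ms and 30 ms, so the mean is 40 ms.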

§Features

Cold start measurement: Time from binary launch to the first LLM call
Agent loop overhead: Per-turn framework processing latency (excluding LLM time)
Concurrent throughput: Agents/second under Tokio async load
Memory footprint: Platform-specific RSS sampling (Linux/macOS)
Tool invocation latency: Deserialization, validation, and dispatch timing
Token overhead: Framework-injected token cost analysis
External comparison: Subprocess-based competitor framework benchmarking
Regression detection: Baseline save/compare with configurable tolerance
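As an illustration of regression detection with a configurable tolerance, the sketch below compares a current measurement against a saved baseline. It assumes a lower-is-better metric (such as latency or memory) and a tolerance expressed as a fraction; `Verdict` and `check_regression` are hypothetical names, and the actual logic behind `RegressionReport` may differ.

```rust
/// Outcome of comparing one metric against its baseline (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum Verdict {
    /// The current value is within tolerance of the baseline.
    Ok,
    /// The current value exceeds the baseline by more than the tolerance.
    Regressed { relative_change: f64 },
}

/// Flags a regression when `current` is worse than `baseline` by more than
/// `tolerance`, where 0.10 means 10%. Assumes lower values are better.
pub fn check_regression(baseline: f64, current: f64, tolerance: f64) -> Verdict {
    // Without a positive baseline there is nothing meaningful to compare against.
    if baseline <= 0.0 {
        return Verdict::Ok;
    }
    let relative_change = (current - baseline) / baseline;
    if relative_change > tolerance {
        Verdict::Regressed { relative_change }
    } else {
        Verdict::Ok
    }
}
```

With a baseline of 100.0 and a tolerance of 0.10, a current value of 105.0 passes, while 125.0 is reported as a regression with a relative change of 0.25.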

§Feature Flags

tau2 — Enables the τ²-bench task quality adapter
bfcl — Enables the BFCL (Berkeley Function Calling Leaderboard) adapter

§Quick Start

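use adk_bench::{BenchConfig, BenchRunner};

// `.await` and `?` need an async function returning a Result; a Tokio entry point is assumed here.
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let config = BenchConfig::default();
    let runner = BenchRunner::new(config);
    let results = runner.run().await?;
    Ok(())
}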

Re-exports§

pub use adapters::CaseResult;
pub use adapters::TaskQualityAdapter;
pub use adapters::TaskQualityResult;
pub use config::BenchConfig;
pub use config::ExternalFrameworkConfig;
pub use config::OutputFormat;
pub use config::TaskSuite;
pub use error::BenchError;
pub use error::Result;
pub use external::ExternalConfigFile;
pub use external::ExternalDurationStats;
pub use external::ExternalMetricsOutput;
pub use external::ExternalRunner;
pub use external::ExternalTokenOverhead;
pub use external::load_external_configs;
pub use formatter::ComparisonResult;
pub use formatter::format_comparison;
pub use formatter::format_result;
pub use instrumented_llm::DeterministicConfig;
pub use instrumented_llm::InstrumentedLlm;
pub use instrumented_llm::LlmCallRecord;
pub use metrics::BenchmarkResult;
pub use metrics::ConcurrencyLevel;
pub use metrics::DurationStats;
pub use metrics::MemoryMetrics;
pub use metrics::MetricCollector;
pub use metrics::RunMetadata;
pub use metrics::ThroughputMetrics;
pub use metrics::TokenBreakdown;
pub use metrics::TokenOverheadMetrics;
pub use metrics::ToolInvocationMetrics;
pub use metrics::compute_stats;
pub use runner::BenchRunner;
pub use runner::RegressionReport;
pub use workload::AgentConfig;
pub use workload::ToolDefinition;
pub use workload::Workload;
pub use workload::builtin_workloads;
pub use workload::load_workload;
pub use workload::multi_agent_delegation_workload;

Modules§

adapters: Task quality adapters for established benchmark suites.
config: Benchmark configuration types mapped from CLI flags.
error: Error types for the adk-bench benchmarking framework.
external: External framework comparison via subprocess execution.
formatter: Result output formatting (JSON, table, markdown).
instrumented_llm: Instrumented LLM wrapper for capturing per-call timing metrics.
memory: Platform-specific memory (RSS) sampling.
metrics: Metric collection and statistical computation.
runner: Benchmark runner orchestrator.
workload: Workload schema, loading, and validation.
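The memory module is described as platform-specific RSS sampling. One common way to do this on Linux is to read the `VmRSS` field from `/proc/self/status`; the sketch below shows that approach. Whether adk-bench samples memory this way is not stated here, and `parse_vm_rss_kib` and `current_rss_kib` are hypothetical names.

```rust
use std::fs;

/// Extracts the resident set size in KiB from `/proc/self/status`-style text.
/// Returns `None` if no parsable `VmRSS:` line is present.
pub fn parse_vm_rss_kib(status: &str) -> Option<u64> {
    status
        .lines()
        // Keep only the text after the "VmRSS:" label, e.g. "\t  12345 kB".
        .find_map(|line| line.strip_prefix("VmRSS:"))
        // The first whitespace-separated token is the number.
        .and_then(|rest| rest.split_whitespace().next())
        .and_then(|number| number.parse().ok())
}

/// Reads the current process RSS in KiB on Linux.
/// Returns `None` on platforms without `/proc` or if the read fails.
pub fn current_rss_kib() -> Option<u64> {
    let status = fs::read_to_string("/proc/self/status").ok()?;
    parse_vm_rss_kib(&status)
}
```

Keeping the parsing separate from the file read makes the logic testable on any platform; a macOS sampler would need a different mechanism entirely.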
