§adk-bench
A comprehensive benchmarking framework for ADK-Rust that measures framework-level runtime performance using real LLM APIs.
adk-bench isolates framework overhead from LLM latency through precise
per-call instrumentation, supports concurrent agent throughput testing,
memory profiling, and external framework comparison via subprocess
execution with a standardized JSON protocol (External Benchmark Protocol).
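The idea of separating framework overhead from LLM latency can be illustrated with a small standalone sketch. It does not use the adk-bench API: `CallTimer`, `record`, and `framework_overhead` are hypothetical names, and a short sleep stands in for a real LLM request. The sketch times each wrapped call and subtracts the accumulated LLM time from the total turn time.

```rust
use std::time::{Duration, Instant};

/// Hypothetical recorder (not an adk-bench type): accumulates the
/// wall-clock time spent inside wrapped LLM calls.
#[derive(Default)]
struct CallTimer {
    llm_time: Duration,
    calls: u32,
}

impl CallTimer {
    /// Runs `call`, adds its duration to the LLM total, and returns its value.
    fn record<T>(&mut self, call: impl FnOnce() -> T) -> T {
        let start = Instant::now();
        let out = call();
        self.llm_time += start.elapsed();
        self.calls += 1;
        out
    }
}

/// Framework overhead is the total turn time minus the time attributed to
/// LLM calls. Saturates at zero so clock jitter cannot produce a panic.
fn framework_overhead(total: Duration, llm_time: Duration) -> Duration {
    total.saturating_sub(llm_time)
}

fn main() {
    let mut timer = CallTimer::default();
    let turn_start = Instant::now();

    // Stand-in for a real LLM request (assumption: ~20 ms of latency).
    let reply = timer.record(|| {
        std::thread::sleep(Duration::from_millis(20));
        "ok"
    });

    let total = turn_start.elapsed();
    let overhead = framework_overhead(total, timer.llm_time);
    println!(
        "reply={reply} calls={} llm={:?} overhead={:?}",
        timer.calls, timer.llm_time, overhead
    );
}
```

Keeping the subtraction in its own function makes the overhead definition explicit and easy to test independently of any real timing.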
§Features
- Cold start measurement: Binary launch to first LLM call timing
- Agent loop overhead: Per-turn framework processing latency (excluding LLM time)
- Concurrent throughput: Agents/second under Tokio async load
- Memory footprint: Platform-specific RSS sampling (Linux/macOS)
- Tool invocation latency: Deserialization, validation, and dispatch timing
- Token overhead: Framework-injected token cost analysis
- External comparison: Subprocess-based competitor framework benchmarking
- Regression detection: Baseline save/compare with configurable tolerance
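As a rough illustration of baseline comparison with a configurable tolerance, the following standalone sketch flags a metric as regressed when it exceeds its baseline by more than a given percentage. The names `Verdict` and `check_regression` are hypothetical and are not taken from adk-bench; the sketch also assumes a lower-is-better metric such as latency.

```rust
/// Hypothetical outcome of comparing one metric against a saved baseline.
#[derive(Debug, PartialEq)]
enum Verdict {
    Ok,
    Regressed { change_pct: f64 },
}

/// Flags a regression when `current` exceeds `baseline` by more than
/// `tolerance_pct` percent. Assumes lower values are better (e.g. latency).
fn check_regression(baseline: f64, current: f64, tolerance_pct: f64) -> Verdict {
    // Without a positive baseline a relative change is undefined,
    // so this sketch treats the comparison as passing.
    if baseline <= 0.0 {
        return Verdict::Ok;
    }
    let change_pct = (current - baseline) / baseline * 100.0;
    if change_pct > tolerance_pct {
        Verdict::Regressed { change_pct }
    } else {
        Verdict::Ok
    }
}

fn main() {
    // 4% slower with a 5% tolerance: within bounds.
    println!("{:?}", check_regression(100.0, 104.0, 5.0));
    // 12% slower with a 5% tolerance: reported as a regression.
    println!("{:?}", check_regression(100.0, 112.0, 5.0));
}
```

A relative tolerance, rather than an absolute one, lets a single setting apply to metrics of very different magnitudes.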
§Feature Flags
- tau2: Enables the τ²-bench task quality adapter
- bfcl: Enables the BFCL (Berkeley Function Calling Leaderboard) adapter
§Quick Start
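use adk_bench::{BenchConfig, BenchRunner};

// Must run inside an async context (for example, a Tokio runtime)
// whose return type can propagate the error with `?`.
let config = BenchConfig::default();
let runner = BenchRunner::new(config);
let results = runner.run().await?;
Re-exports§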
pub use adapters::CaseResult;
pub use adapters::TaskQualityAdapter;
pub use adapters::TaskQualityResult;
pub use config::BenchConfig;
pub use config::ExternalFrameworkConfig;
pub use config::OutputFormat;
pub use config::TaskSuite;
pub use error::BenchError;
pub use error::Result;
pub use external::ExternalConfigFile;
pub use external::ExternalDurationStats;
pub use external::ExternalMetricsOutput;
pub use external::ExternalRunner;
pub use external::ExternalTokenOverhead;
pub use external::load_external_configs;
pub use formatter::ComparisonResult;
pub use formatter::format_comparison;
pub use formatter::format_result;
pub use instrumented_llm::DeterministicConfig;
pub use instrumented_llm::InstrumentedLlm;
pub use instrumented_llm::LlmCallRecord;
pub use metrics::BenchmarkResult;
pub use metrics::ConcurrencyLevel;
pub use metrics::DurationStats;
pub use metrics::MemoryMetrics;
pub use metrics::MetricCollector;
pub use metrics::RunMetadata;
pub use metrics::ThroughputMetrics;
pub use metrics::TokenBreakdown;
pub use metrics::TokenOverheadMetrics;
pub use metrics::ToolInvocationMetrics;
pub use metrics::compute_stats;
pub use runner::BenchRunner;
pub use runner::RegressionReport;
pub use workload::AgentConfig;
pub use workload::ToolDefinition;
pub use workload::Workload;
pub use workload::builtin_workloads;
pub use workload::load_workload;
pub use workload::multi_agent_delegation_workload;
Modules§
- adapters
- Task quality adapters for established benchmark suites.
- config
- Benchmark configuration types mapped from CLI flags.
- error
- Error types for the adk-bench benchmarking framework.
- external
- External framework comparison via subprocess execution.
- formatter
- Result output formatting (JSON, table, markdown).
- instrumented_llm
- Instrumented LLM wrapper for capturing per-call timing metrics.
- memory
- Platform-specific memory (RSS) sampling.
- metrics
- Metric collection and statistical computation.
- runner
- Benchmark runner orchestrator.
- workload
- Workload schema, loading, and validation.
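To make the "statistical computation" mentioned for the metrics module concrete, here is a minimal standalone sketch of summarizing latency samples. It is not the crate's `compute_stats` or `DurationStats`; the `Summary` struct and `summarize` function are hypothetical, and the nearest-rank percentile method is an assumption chosen for simplicity.

```rust
use std::time::Duration;

/// Hypothetical summary of a set of latency samples (not an adk-bench type).
#[derive(Debug)]
struct Summary {
    min: Duration,
    max: Duration,
    mean: Duration,
    p50: Duration,
    p95: Duration,
}

/// Summarizes samples using nearest-rank percentiles.
/// Returns `None` for an empty slice, where no statistic is defined.
fn summarize(samples: &[Duration]) -> Option<Summary> {
    if samples.is_empty() {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort();
    let n = sorted.len();

    // Nearest-rank: the 1-based rank is ceil(p / 100 * n), clamped to [1, n].
    let pct = |p: f64| {
        let rank = ((p / 100.0) * n as f64).ceil() as usize;
        sorted[rank.clamp(1, n) - 1]
    };

    let total: Duration = sorted.iter().sum();
    Some(Summary {
        min: sorted[0],
        max: sorted[n - 1],
        mean: total / n as u32,
        p50: pct(50.0),
        p95: pct(95.0),
    })
}

fn main() {
    let samples: Vec<Duration> = [10u64, 20, 30, 40]
        .iter()
        .map(|ms| Duration::from_millis(*ms))
        .collect();
    println!("{:?}", summarize(&samples));
}
```

Nearest-rank always returns an observed sample rather than an interpolated value, which keeps the result easy to verify by hand on small inputs.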