Module benchmark


Benchmark event protocol for Smith agent optimization

This module defines the complete event schema for collecting agent performance data, which is used to optimize Smith’s results on coding benchmarks such as SWE-bench.
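To make the idea of an event schema concrete, here is a minimal, self-contained sketch of how a stream of tracked events might be modelled and summarized. It does not use this module's actual types: `SketchEventType`, `SketchEvent`, their fields (`run_id`, `sequence`, `event_type`), and the helper functions are all hypothetical names invented for illustration, since the real field layout is not shown on this page.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for an event-type enum (not the crate's `BenchmarkEventType`).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum SketchEventType {
    RunStarted,
    StepCompleted,
    RunFinished,
}

/// Hypothetical stand-in for an event with tracking fields.
/// The field names are assumptions, not the crate's `BenchmarkEvent` layout.
#[derive(Debug, Clone)]
struct SketchEvent {
    run_id: String,
    sequence: u64,
    event_type: SketchEventType,
}

/// Count how many events of each type were recorded for a single run.
fn count_by_type(events: &[SketchEvent], run_id: &str) -> BTreeMap<SketchEventType, usize> {
    let mut counts = BTreeMap::new();
    for event in events.iter().filter(|e| e.run_id == run_id) {
        *counts.entry(event.event_type).or_insert(0) += 1;
    }
    counts
}

/// Check that sequence numbers are strictly increasing across the stream.
fn is_in_sequence(events: &[SketchEvent]) -> bool {
    events.windows(2).all(|pair| pair[0].sequence < pair[1].sequence)
}

/// Build a small sample stream: four events for "run-1" and one for "run-2".
fn sample_events() -> Vec<SketchEvent> {
    let kinds = [
        ("run-1", SketchEventType::RunStarted),
        ("run-1", SketchEventType::StepCompleted),
        ("run-1", SketchEventType::StepCompleted),
        ("run-1", SketchEventType::RunFinished),
        ("run-2", SketchEventType::RunStarted),
    ];
    kinds
        .iter()
        .enumerate()
        .map(|(index, (run_id, event_type))| SketchEvent {
            run_id: run_id.to_string(),
            sequence: index as u64,
            event_type: *event_type,
        })
        .collect()
}
```

Calling `count_by_type(&sample_events(), "run-1")` groups the sample stream by event type for one run. A real consumer would presumably work with the structs and enums listed below instead of these stand-ins.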

Structs§

BenchmarkEvent
Core benchmark event with required tracking fields
ConfigSuggestion
Optimizer configuration suggestions
ErrorState
Final error state classification
EvidenceFootprint
Evidence of tool’s impact on the codebase
FailureAnalysis
Structured failure analysis for learning
PruningDecision
Context pruning decisions for optimization
RecoveryAttempt
Recovery attempt during failure handling
RetryPolicy
Retry policy configuration
RunConfig
Configuration for a benchmark run
RunResult
Results from a completed benchmark run
SandboxLimits
Sandbox resource limits
StepData
Individual reasoning/action step data
TaskFeatures
Task features for contextual optimization
ToolPerformance
Tool performance metrics for optimization

Enums§

BenchmarkEventType
All benchmark event types for agent optimization
ExitKind
FailureRoot
SegmentType
StepType