Crate agent_sdk_eval

Optional evaluation framework primitives for Agent SDK runs.

This crate owns post-hoc evaluation contracts over core traces and evidence. It does not run agents, append journals, publish events, choose evaluator models, or define product-specific success rubrics.
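
As a rough illustration of what "post-hoc" means here, the sketch below computes a value from a trace that has already been recorded, without running an agent or writing anything back. `ToolCall` and `failed_call_ratio` are hypothetical stand-ins invented for this example, not items from this crate; the crate's real types are defined in its own modules.

```rust
/// Hypothetical stand-in for one recorded tool call in a finished trace.
/// This is NOT a type from agent_sdk_eval.
#[derive(Debug, Clone)]
pub struct ToolCall {
    pub tool: String,
    pub succeeded: bool,
}

/// Post-hoc metric (hypothetical): fraction of recorded tool calls that failed.
/// It only reads the trace and returns `None` when the trace is empty.
pub fn failed_call_ratio(trace: &[ToolCall]) -> Option<f64> {
    if trace.is_empty() {
        return None;
    }
    let failed = trace.iter().filter(|call| !call.succeeded).count();
    Some(failed as f64 / trace.len() as f64)
}
```

The function takes a shared slice, so it cannot mutate the recorded trace, which matches the read-only role described above.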

Re-exports

pub use comparison::ComparisonDesign;
pub use evaluator::Evaluator;
pub use evidence::EvidenceBundle;
pub use evidence::EvidenceItem;
pub use evidence::EvidenceRole;
pub use evidence::SupportRefValidation;
pub use identity::EvaluationId;
pub use metrics::ToolTraceMetric;
pub use metrics::TraceMetrics;
pub use metrics::TraceMetricsComparison;
pub use report::EvaluationConfidence;
pub use report::EvaluationMetricDelta;
pub use report::EvaluationReport;
pub use report::EvaluationVerdict;
pub use report::EvaluatorJudgment;
pub use request::EvaluationBudget;
pub use request::EvaluationRequest;
pub use request::EvaluationUsage;
pub use scope::EvaluationCriterion;
pub use scope::EvaluationScope;
pub use scope::EvaluationSubject;
pub use scope::EvaluationSubjectRole;
pub use scope::ExpectedOutcome;
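
Each `pub use` line above makes an item reachable from the crate root as well as from its defining module. The sketch below mirrors that pattern with a local stand-in module tree so it compiles on its own; the module name `sketch`, the helper `id_len`, and the tuple-struct body of `EvaluationId` are assumptions made for illustration, since this page does not show the real definition.

```rust
// Stand-in module tree mirroring one re-export from the list above.
// The real crate is `agent_sdk_eval`; everything inside `sketch` is a placeholder.
pub mod sketch {
    pub mod identity {
        /// Placeholder body; the real EvaluationId's fields are not shown on this page.
        #[derive(Debug, Clone, PartialEq, Eq)]
        pub struct EvaluationId(pub String);
    }

    // Re-export so callers can write `sketch::EvaluationId`
    // instead of `sketch::identity::EvaluationId`.
    pub use self::identity::EvaluationId;
}

/// Hypothetical helper that names the type through its defining-module path.
pub fn id_len(id: &sketch::identity::EvaluationId) -> usize {
    id.0.len()
}
```

A value built through the short path is accepted where the long path is named, because both paths refer to the same type.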

Modules

comparison
Comparison designs for evaluation reports.
evaluator
Evaluator trait for post-hoc evaluation implementations.
evidence
Evidence bundles derived from core traces.
identity
Stable identifiers for evaluation framework records.
metrics
Deterministic metrics derived from core traces and journal records.
report
Evaluation report records and confidence validation.
request
Evaluation request, budget, and usage records.
scope
Evaluation scopes, subjects, and expected outcomes.
testing
Deterministic evaluator fakes for SDK consumers.
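
To give a sense of what a deterministic evaluator fake can look like, here is a minimal self-contained sketch. The trait `Evaluate`, the enum `Verdict`, and the struct `FixedVerdictEvaluator` are all hypothetical names invented for this example; the signatures of the crate's actual `Evaluator` trait and of the items in `testing` are not shown on this page and may differ.

```rust
use std::cell::Cell;

/// Hypothetical verdict type; the crate's EvaluationVerdict may look different.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    Pass,
    Fail,
}

/// Hypothetical evaluator trait, assumed for this sketch only.
pub trait Evaluate {
    fn evaluate(&self, evidence: &[String]) -> Verdict;
}

/// Deterministic fake: ignores its input, always returns the configured
/// verdict, and counts how many times it was called.
pub struct FixedVerdictEvaluator {
    verdict: Verdict,
    calls: Cell<usize>,
}

impl FixedVerdictEvaluator {
    pub fn new(verdict: Verdict) -> Self {
        Self { verdict, calls: Cell::new(0) }
    }

    /// Number of `evaluate` calls made so far.
    pub fn calls(&self) -> usize {
        self.calls.get()
    }
}

impl Evaluate for FixedVerdictEvaluator {
    fn evaluate(&self, _evidence: &[String]) -> Verdict {
        self.calls.set(self.calls.get() + 1);
        self.verdict
    }
}
```

A fake of this shape lets consumer tests assert on their own handling of a verdict without depending on a model-backed evaluator, whose output could vary between runs.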