Crate nous_judge

Async LLM-as-judge evaluators for Nous.

These evaluators run asynchronously after agent runs complete. They use a separate model call to assess quality dimensions that require language understanding.
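As a rough illustration of the pattern described above, the sketch below scores a finished run by sending a grading prompt to a judge and parsing a single number from the reply. It is not this crate's API: `SketchJudge`, `CannedJudge`, `build_prompt`, and `evaluate_completion` are hypothetical names, the prompt wording and the 0.0 to 1.0 scale are assumptions, and the call is synchronous to keep the example dependency-free, whereas the evaluators documented here are async.

```rust
/// Hypothetical stand-in for a judge backend (NOT the crate's `JudgeProvider`):
/// takes a prompt and returns the model's raw text reply.
trait SketchJudge {
    fn complete(&self, prompt: &str) -> Result<String, String>;
}

/// Judge that always returns a canned reply, so the example needs no network.
struct CannedJudge {
    reply: String,
}

impl SketchJudge for CannedJudge {
    fn complete(&self, _prompt: &str) -> Result<String, String> {
        Ok(self.reply.clone())
    }
}

/// Build a grading prompt from a finished run. The wording is an assumption.
fn build_prompt(goal: &str, transcript: &str) -> String {
    format!(
        "Goal:\n{}\n\nTranscript:\n{}\n\nReply with a single score between 0.0 and 1.0.",
        goal, transcript
    )
}

/// Ask the judge to score one finished run.
/// Rejects replies that are not a finite number and clamps the rest to [0.0, 1.0].
fn evaluate_completion(
    judge: &dyn SketchJudge,
    goal: &str,
    transcript: &str,
) -> Result<f64, String> {
    let reply = judge.complete(&build_prompt(goal, transcript))?;
    let score: f64 = reply
        .trim()
        .parse()
        .map_err(|e| format!("unparseable score {:?}: {}", reply, e))?;
    if !score.is_finite() {
        return Err(format!("non-finite score {:?}", reply));
    }
    Ok(score.clamp(0.0, 1.0))
}
```

Putting the judge behind a trait means the same evaluation code can run against a canned reply in tests and a real model in production, which is presumably the role the mock and Anthropic-backed providers play in this crate.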

Re-exports

pub use anthropic_judge::AnthropicJudgeProvider;
pub use judge_provider::JudgeProvider;
pub use judge_provider::MockJudgeProvider;
pub use judge_provider::parse_judge_scores;
pub use plan_adherence::PlanAdherence;
pub use plan_quality::PlanQuality;
pub use task_completion::TaskCompletion;
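This page does not show the signature of `parse_judge_scores` or the reply format it expects. Purely to illustrate what score parsing of this kind can look like, the sketch below assumes the judge replies with one `dimension: score` pair per line. The function name `parse_scores_sketch`, the line format, and the choice to skip malformed lines rather than fail are all assumptions, not the crate's behavior.

```rust
use std::collections::BTreeMap;

/// Hypothetical parser (NOT the crate's `parse_judge_scores`).
/// Assumes one `dimension: score` pair per line and silently skips
/// lines that lack a colon, a name, or a finite numeric score.
fn parse_scores_sketch(reply: &str) -> BTreeMap<String, f64> {
    let mut scores = BTreeMap::new();
    for line in reply.lines() {
        // Split at the first colon; lines without one are ignored.
        if let Some((name, value)) = line.split_once(':') {
            let name = name.trim();
            if name.is_empty() {
                continue;
            }
            if let Ok(score) = value.trim().parse::<f64>() {
                if score.is_finite() {
                    // Lowercase the key so "Coherence" and "coherence" match.
                    scores.insert(name.to_lowercase(), score);
                }
            }
        }
    }
    scores
}
```

Skipping unparseable lines makes the parser tolerant of the extra commentary that models often add around their scores. A stricter design would return an error when an expected dimension is missing.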

Modules

anthropic_judge
Anthropic API-backed judge provider for real LLM-as-judge evaluation.
judge_provider
LLM call wrapper for evaluation.
plan_adherence
Plan adherence evaluator — did the agent follow its stated plan?
plan_quality
Plan quality evaluator — LLM-as-judge for reasoning coherence.
task_completion
Task completion evaluator — did the agent achieve its goal?