Crate nous_judge

Async LLM-as-judge evaluators for Nous.

These evaluators run asynchronously after agent runs complete. They use a separate model call to assess quality dimensions that require language understanding.
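The general LLM-as-judge pattern described above can be sketched as follows. This is a minimal, dependency-free illustration, not the crate's API: the `SketchJudge` trait, `CannedJudge`, `parse_scores_sketch`, `evaluate_transcript`, and the `name: score` reply format are all hypothetical names and assumptions made for this example. The actual signatures of `JudgeProvider`, `MockJudgeProvider`, and `parse_judge_scores` are not shown on this page. The sketch is also synchronous for simplicity, whereas the real evaluators run asynchronously.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a judge backend (NOT the crate's `JudgeProvider`).
/// It takes a prompt and returns the judge model's raw text reply.
trait SketchJudge {
    fn judge(&self, prompt: &str) -> String;
}

/// Hypothetical mock that always returns a canned reply, so no network is needed.
struct CannedJudge {
    reply: String,
}

impl SketchJudge for CannedJudge {
    fn judge(&self, _prompt: &str) -> String {
        self.reply.clone()
    }
}

/// Parse lines of the assumed form `dimension: score` into a map.
/// Lines without a colon or without a numeric score are skipped.
fn parse_scores_sketch(reply: &str) -> HashMap<String, f64> {
    let mut scores = HashMap::new();
    for line in reply.lines() {
        if let Some((name, value)) = line.split_once(':') {
            if let Ok(score) = value.trim().parse::<f64>() {
                scores.insert(name.trim().to_string(), score);
            }
        }
    }
    scores
}

/// Evaluate a finished agent run: build a prompt from the transcript,
/// make one separate judge call, and parse the scores from the reply.
fn evaluate_transcript(judge: &dyn SketchJudge, transcript: &str) -> HashMap<String, f64> {
    let prompt = format!("Rate the following agent run, one `name: score` per line.\n{transcript}");
    parse_scores_sketch(&judge.judge(&prompt))
}

fn main() {
    let judge = CannedJudge {
        reply: "coherence: 0.8\ncompleteness: 0.5\nnot a score".to_string(),
    };
    let scores = evaluate_transcript(&judge, "agent planned, then acted");
    println!("{scores:?}");
}
```

Keeping the judge behind a trait is what lets a mock replace the real model call in tests, which is presumably the role `MockJudgeProvider` plays alongside `AnthropicJudgeProvider`.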

Re-exports

pub use anthropic_judge::AnthropicJudgeProvider;
pub use judge_provider::JudgeProvider;
pub use judge_provider::MockJudgeProvider;
pub use judge_provider::parse_judge_scores;
pub use plan_adherence::PlanAdherence;
pub use plan_quality::PlanQuality;
pub use task_completion::TaskCompletion;

Modules

anthropic_judge: Anthropic API-backed judge provider for real LLM-as-judge evaluation.
judge_provider: LLM call wrapper for evaluation.
plan_adherence: Plan adherence evaluator — did the agent follow its stated plan?
plan_quality: Plan quality evaluator — LLM-as-judge for reasoning coherence.
task_completion: Task completion evaluator — did the agent achieve its goal?
