Async LLM-as-judge evaluators for Nous.
These evaluators run asynchronously after an agent run completes. Each uses a separate model call to assess quality dimensions that require language understanding.
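
The pattern described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Nous implementation: `AgentRun`, `JudgeResult`, `judge_run`, `evaluate_completed_runs`, and `stub_judge_model` are hypothetical names, and the stub stands in for a real model call so the example runs offline.

```python
import asyncio
import json
from dataclasses import dataclass
from typing import Awaitable, Callable, List

# Hypothetical types for illustration only; not part of the Nous API.
@dataclass
class AgentRun:
    run_id: str
    prompt: str
    output: str

@dataclass
class JudgeResult:
    run_id: str
    dimension: str
    score: float
    rationale: str

# A judge model is any async callable that maps a prompt to a reply string.
JudgeModel = Callable[[str], Awaitable[str]]

async def stub_judge_model(prompt: str) -> str:
    """Stand-in for a real model call; always returns the same JSON verdict."""
    await asyncio.sleep(0)  # yield to the event loop, as a network call would
    return json.dumps({"score": 0.8, "rationale": "Addresses the question."})

async def judge_run(run: AgentRun, dimension: str, model: JudgeModel) -> JudgeResult:
    """Ask the judge model to score one quality dimension of a finished run."""
    judge_prompt = (
        f"Rate the {dimension} of the response on a scale from 0 to 1.\n"
        f"Task: {run.prompt}\n"
        f"Response: {run.output}\n"
        'Reply as JSON: {"score": <float>, "rationale": <string>}'
    )
    reply = json.loads(await model(judge_prompt))
    return JudgeResult(run.run_id, dimension, float(reply["score"]), reply["rationale"])

async def evaluate_completed_runs(
    runs: List[AgentRun],
    dimensions: List[str],
    model: JudgeModel = stub_judge_model,
) -> List[JudgeResult]:
    """Score every (run, dimension) pair concurrently, after the runs have finished."""
    tasks = [judge_run(run, dim, model) for run in runs for dim in dimensions]
    return list(await asyncio.gather(*tasks))

if __name__ == "__main__":
    finished = [AgentRun("r1", "What is 2+2?", "4")]
    for result in asyncio.run(evaluate_completed_runs(finished, ["helpfulness"])):
        print(result)
```

Keeping the judge behind a plain async callable means the agent's own model and the judge model stay separate, and a real client can be swapped in for the stub without changing the evaluation loop.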