Function for evaluating LLM responses and generating metrics.
The primary use case for evaluate_llm is to take a list of data samples, typically containing inputs and outputs
from LLM systems, and evaluate them against user-defined metrics in an LLM-as-a-judge pipeline. The user is expected
to provide a list of dict objects and a list of LLMEval metrics. These eval metrics are used to build a workflow, which
is then executed in an async context. All eval scores are extracted and returned to the user.
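Example (a minimal sketch of how evaluate_llm might be called; the dict keys, the LLMEval constructor arguments, and
the import path are illustrative assumptions, only evaluate_llm and LLMEval are taken from this documentation):

    # from your_package import evaluate_llm, LLMEval  # actual import path depends on the package layout

    # Data samples: each dict holds the input sent to the LLM system and the output it produced.
    samples = [
        {"input": "What is the capital of France?", "output": "The capital of France is Paris."},
        {"input": "Summarize Hamlet in one sentence.", "output": "A Danish prince avenges his father's murder at great cost."},
    ]

    # Hypothetical metric definitions; real LLMEval fields may differ.
    metrics = [
        LLMEval(name="correctness", prompt="Is the output factually correct given the input?"),
        LLMEval(name="conciseness", prompt="Is the output concise and on-topic?"),
    ]

    # evaluate_llm builds the judge workflow from the metrics, runs it asynchronously
    # over the samples, and returns the extracted eval scores.
    scores = evaluate_llm(samples, metrics)
    print(scores)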