Eval framework for agent UX friction testing.
Defines task scenarios that evaluate how well agents can use maw without knowledge of git internals. Each scenario has preconditions, a plain-English task prompt, expected outcomes, and a scoring rubric.
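The four parts of a scenario listed above can be pictured as a small data structure. The following is a minimal sketch only: the type names (`Scenario`, `RubricItem`), field names, the point-based scoring, and the example content are all hypothetical illustrations, not the actual definitions in the `scenarios` module.

```rust
// Hypothetical sketch: none of these names are taken from maw's real API.

/// One rubric line: a check and the points awarded when it passes.
#[derive(Debug, Clone)]
struct RubricItem {
    description: &'static str,
    points: u32,
}

/// A scenario bundles preconditions, a plain-English prompt,
/// expected outcomes, and a scoring rubric.
#[derive(Debug, Clone)]
struct Scenario {
    id: &'static str,
    preconditions: Vec<&'static str>,
    task_prompt: &'static str,
    expected_outcomes: Vec<&'static str>,
    rubric: Vec<RubricItem>,
}

/// Sum the points of rubric items marked satisfied.
/// `satisfied[i]` pairs with `scenario.rubric[i]`; missing entries count as unsatisfied.
fn score(scenario: &Scenario, satisfied: &[bool]) -> u32 {
    scenario
        .rubric
        .iter()
        .zip(satisfied.iter())
        .filter(|(_, ok)| **ok)
        .map(|(item, _)| item.points)
        .sum()
}

/// Highest score the rubric allows.
fn max_score(scenario: &Scenario) -> u32 {
    scenario.rubric.iter().map(|item| item.points).sum()
}

/// Illustrative scenario; the wording is invented for this sketch.
fn example_scenario() -> Scenario {
    Scenario {
        id: "edit-and-integrate",
        preconditions: vec!["a repository with at least one commit"],
        task_prompt: "Change the README title and get the change integrated.",
        expected_outcomes: vec!["the README title is updated in the main line of work"],
        rubric: vec![
            RubricItem { description: "change is integrated", points: 2 },
            RubricItem { description: "no raw git commands were needed", points: 1 },
            RubricItem { description: "no failed commands along the way", points: 2 },
        ],
    }
}

fn main() {
    let s = example_scenario();
    // Suppose the agent passed the first and third checks only.
    let got = score(&s, &[true, false, true]);
    println!("{}: {}/{}", s.id, got, max_score(&s));
    println!("prompt: {}", s.task_prompt);
    println!("preconditions: {:?}", s.preconditions);
    println!("expected: {:?}", s.expected_outcomes);
    for item in &s.rubric {
        println!("  [{} pts] {}", item.points, item.description);
    }
}
```

Keeping the prompt free of git terminology, as in the sketch, matches the stated goal of measuring how far an agent gets without knowledge of git internals.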
Modules
- scenarios: Agent task scenarios and scoring rubric for UX friction evaluation.