Run results and the JSON report. The serialized shape here is the stable
contract the language SDKs parse. These types are the source of truth:
their JSON Schemas (via skilltest schema, goldens in schemas/) are what
the SDK contract tests compare their Pydantic/Zod models against.
Structs
- CaseRun
- The result of running one test case on one (platform, model) pair.
- Report
- The top-level report for a skilltest run invocation.
- Summary
- Aggregate pass/fail counts for a report.
- ValidationFinding
- One problem found while validating a skill, as serialized in the skilltest validate --format json output.
- ValidationReport
- The top-level report for a skilltest validate invocation.