tandem-eval 0.6.5

Evaluation harness and regression tooling for Tandem