Python to Rust Single-Shot Compile Benchmark (10 Levels)
Canonical benchmark for evaluating code translation models. Measures success rate by turn and finds the smallest model that meets the thresholds.
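As a rough illustration of the two measurements named above, the following sketch computes a cumulative success rate by turn and picks the smallest model that clears a threshold. It is not this module's API: `ModelRun`, `success_rate_by_turn`, `smallest_model_meeting`, the `params_billions` field, and the interpretation of "success by turn" as "compiled on or before turn k" are all assumptions made for the example.

```rust
/// Hypothetical record of one model's results (not a type from this module).
/// Each entry is the 1-based turn at which a task first compiled,
/// or `None` if it never compiled.
#[derive(Debug, Clone)]
pub struct ModelRun {
    pub name: String,
    /// Assumed size measure used to rank models from small to large.
    pub params_billions: f64,
    pub first_success_turn: Vec<Option<u32>>,
}

/// Fraction of tasks that compiled on or before `turn`.
/// Returns 0.0 for a run with no tasks.
pub fn success_rate_by_turn(run: &ModelRun, turn: u32) -> f64 {
    if run.first_success_turn.is_empty() {
        return 0.0;
    }
    let solved = run
        .first_success_turn
        .iter()
        .flatten() // skip tasks that never compiled
        .filter(|&&n| n <= turn)
        .count();
    solved as f64 / run.first_success_turn.len() as f64
}

/// Smallest model (by assumed parameter count) whose success rate
/// at `turn` is at least `threshold`, if any model qualifies.
pub fn smallest_model_meeting(
    runs: &[ModelRun],
    turn: u32,
    threshold: f64,
) -> Option<&ModelRun> {
    runs.iter()
        .filter(|r| success_rate_by_turn(r, turn) >= threshold)
        .min_by(|a, b| a.params_billions.total_cmp(&b.params_billions))
}
```

Calling `smallest_model_meeting(&runs, 1, 0.8)` under these assumptions would return the smallest model with at least an 80% first-turn compile rate, or `None` when no model reaches it.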
§Levels
- Hello World
- Variables & Arithmetic
- Functions & Ownership
- Collections & Iterators
- Control Flow & Borrowing
- Error Handling (Result)
- OOP → Traits
- Concurrency (async/rayon)
- FFI/Unsafe
- Metaprogramming (proc macros)
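The ten levels above could be modelled as a plain enum. The sketch below is illustrative only: `SketchLevel`, its variant names, and its methods are hypothetical and are not the module's `Py2RsLevel`, and it assumes the list order corresponds to level numbers 1 through 10.

```rust
/// Hypothetical stand-in for a benchmark level (not the module's own enum).
/// Discriminants assume the documented list order maps to levels 1-10.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SketchLevel {
    HelloWorld = 1,
    VariablesArithmetic,
    FunctionsOwnership,
    CollectionsIterators,
    ControlFlowBorrowing,
    ErrorHandling,
    OopTraits,
    Concurrency,
    FfiUnsafe,
    Metaprogramming,
}

impl SketchLevel {
    /// All levels in ascending order.
    pub const ALL: [SketchLevel; 10] = [
        SketchLevel::HelloWorld,
        SketchLevel::VariablesArithmetic,
        SketchLevel::FunctionsOwnership,
        SketchLevel::CollectionsIterators,
        SketchLevel::ControlFlowBorrowing,
        SketchLevel::ErrorHandling,
        SketchLevel::OopTraits,
        SketchLevel::Concurrency,
        SketchLevel::FfiUnsafe,
        SketchLevel::Metaprogramming,
    ];

    /// Level number in the range 1..=10.
    pub fn number(self) -> u8 {
        self as u8
    }

    /// Look up a level by its 1-based number; `None` if out of range.
    pub fn from_number(n: u8) -> Option<Self> {
        let index = usize::from(n).checked_sub(1)?;
        Self::ALL.get(index).copied()
    }

    /// Title as it appears in the level list.
    pub fn title(self) -> &'static str {
        match self {
            SketchLevel::HelloWorld => "Hello World",
            SketchLevel::VariablesArithmetic => "Variables & Arithmetic",
            SketchLevel::FunctionsOwnership => "Functions & Ownership",
            SketchLevel::CollectionsIterators => "Collections & Iterators",
            SketchLevel::ControlFlowBorrowing => "Control Flow & Borrowing",
            SketchLevel::ErrorHandling => "Error Handling (Result)",
            SketchLevel::OopTraits => "OOP → Traits",
            SketchLevel::Concurrency => "Concurrency (async/rayon)",
            SketchLevel::FfiUnsafe => "FFI/Unsafe",
            SketchLevel::Metaprogramming => "Metaprogramming (proc macros)",
        }
    }
}
```

Keeping an explicit `ALL` array makes it easy to iterate levels in order and to reject out-of-range numbers without unsafe casts.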
Structs§
- LevelResult - Result for a single level
- Py2RsScore - Score for Python→Rust benchmark
Enums§
- Py2RsLevel - Python→Rust benchmark level (1-10)
Functions§
- compare_models - Compare multiple models on Py2Rs benchmark
- format_comparison_table - Format comparison as table
- generate_canonical_examples - Generate canonical Py2Rs examples
- run_benchmark - Run Py2Rs benchmark on a model (mock implementation)