gpu-trace-perf 1.4.0

Plays a collection of GPU traces under different environments to evaluate driver changes on performance.

## gpu-trace-perf

This is a Rust rewrite of some tooling I built for comparing
performance between different graphics driver settings on graphics
traces.  The goal is for a driver developer to be able to quickly
experiment and find how their changes affect the performance of actual
rendering.

Right now apitrace, renderdoc, gfxretrace, and angle_trace_tests are supported.
You pass the tool a collection of GPU traces (or the path to the
`angle_trace_tests` binary, which will then enumerate tests), and it will run
through all of them, rerunning to get better stats over time, and give you an
estimate of the change in FPS from your driver change.
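
To make "estimate of the change in FPS" concrete, here is a minimal sketch that
compares the mean FPS of two sets of samples.  It is not the tool's actual
statistics code: the function names and the sample values are made up for
illustration, and a real comparison would also account for run-to-run noise.

```rust
/// Arithmetic mean of a slice of FPS samples, or None if there are none.
fn mean(samples: &[f64]) -> Option<f64> {
    if samples.is_empty() {
        return None;
    }
    Some(samples.iter().sum::<f64>() / samples.len() as f64)
}

/// Percent change in mean FPS going from the `before` samples to the
/// `after` samples.  Positive means the "after" driver is faster.
fn percent_fps_change(before: &[f64], after: &[f64]) -> Option<f64> {
    let b = mean(before)?;
    let a = mean(after)?;
    if b == 0.0 {
        return None; // avoid dividing by zero
    }
    Some((a - b) / b * 100.0)
}

fn main() {
    // Made-up samples: each entry is the FPS of one replay of a trace.
    let before = [50.0, 49.0, 51.0];
    let after = [55.0, 54.0, 56.0];
    match percent_fps_change(&before, &after) {
        Some(pct) => println!("FPS change: {:+.1}%", pct),
        None => println!("not enough samples"),
    }
}
```

Collecting more samples per trace narrows the uncertainty on each mean, which is
why rerunning the traces improves the estimate over time.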

### Installing

```
apt-get install cargo
cargo install gpu-trace-perf
```

For apitrace traces (`*.trace`), you also need apitrace installed.  I
recommend having apitrace's waffle backend enabled, and
`WAFFLE_PLATFORM=gbm` set in the environment to not flicker windows on
the screen constantly.

For renderdoc traces (`*.rdc`), you need:

- python3
- renderdoc installed (`sudo apt-get install renderdoc`)
- renderdoc's python module findable from python3.

### Example usage

`gpu-trace-perf run --traces $HOME/src/traces-db beforedriver afterdriver`

This command will find all the traces in
[traces-db](https://gitlab.freedesktop.org/gfx-ci/tracie/traces-db/)
and run them in a loop printing stats until you feel ready to hit ^C.

The `beforedriver` and `afterdriver` arguments are scripts in your
`PATH` that set up the environment for the driver under test and then
run the command passed to them.  For example, to use your new driver:

```
#!/bin/sh

export LD_LIBRARY_PATH=$HOME/src/prefix/lib
"$@"
```
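
As a rough illustration of how such a wrapper is used, the sketch below builds
the command line for one replay by putting the wrapper script in front of the
replay command, so the script's `"$@"` runs the replay in the script's
environment.  The helper name `wrapped_command` and the `apitrace replay`
arguments are assumptions for the example, not the tool's actual invocation.

```rust
use std::process::Command;

/// Build (but do not run) the command for one replay under a wrapper
/// script.  The wrapper sets its environment and then runs "$@".
fn wrapped_command(wrapper: &str, replay_cmd: &[&str]) -> Command {
    let mut cmd = Command::new(wrapper);
    cmd.args(replay_cmd);
    cmd
}

fn main() {
    // Hypothetical replay command; the tool's real invocation may differ.
    let replay = ["apitrace", "replay", "example.trace"];
    for wrapper in ["beforedriver", "afterdriver"] {
        let cmd = wrapped_command(wrapper, &replay);
        // Only print the command here; calling cmd.status() would run it.
        println!("{:?}", cmd);
    }
}
```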

Since a traces db may be large and a change being tested may only affect a
subset of the traces, you can filter down which traces are replayed repeatedly
to only those whose stderr output is changed by some debug environment
variables:

```
    # Only re-run traces that had their shader compiler or command stream output changed on NVK.
    gpu-trace-perf run --traces $HOME/src/traces-db beforedriver afterdriver \
        --debug-filter "NVK_DEBUG=push_dump" --debug-filter "NAK_DEBUG=print"
```
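
The idea behind the filter can be sketched as follows: capture each trace's
stderr in both environments with the debug variables set, and keep only the
traces where the two outputs differ.  The `DebugRun` type and
`affected_traces` function are hypothetical names for this example and do not
reflect how the tool stores its data.

```rust
/// Stderr captured from one trace in both environments with the debug
/// variables set.  Hypothetical type, for illustration only.
struct DebugRun<'a> {
    trace: &'a str,
    stderr_before: &'a str,
    stderr_after: &'a str,
}

/// Keep only the traces whose debug output differs between environments.
fn affected_traces<'a>(runs: &[DebugRun<'a>]) -> Vec<&'a str> {
    runs.iter()
        .filter(|r| r.stderr_before != r.stderr_after)
        .map(|r| r.trace)
        .collect()
}

fn main() {
    // Made-up trace names and outputs.
    let runs = [
        DebugRun { trace: "a.trace", stderr_before: "shader v1", stderr_after: "shader v2" },
        DebugRun { trace: "b.trace", stderr_before: "same", stderr_after: "same" },
    ];
    for trace in affected_traces(&runs) {
        println!("would re-run {}", trace);
    }
}
```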

### Running ANGLE traces

To include ANGLE traces in your test list, include the angle_trace_tests binary
in the `-t` list (or the traces directory).  You can also select a specific
ANGLE trace with (for example):

```
gpu-trace-perf -t <path>/angle_trace_tests/TraceTest.minetest
```

### utrace timings

Normally, the trace tool-provided timings are used for the FPS result.  However,
sometimes you only care about a specific subset of your command buffers, and
watching just those can reduce the noise of the timings.  If the driver supports
utrace, you can pass `--utrace <comma_separated_utrace_events>` to use those
instead of the trace tool's timings.  This can be a particular help with
renderdoc traces, which have high CPU overhead for per-renderpass setup on
Vulkan.  Note that if your driver has a nonstandard prefix for start/end events,
you may need to add it to `u_trace::Frame::event_times()`.  You can also specify
`drawcalls` as the event to expand to a list of common draw and compute events
across several drivers.
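
To illustrate how a comma-separated event list with a `drawcalls` alias might
be expanded, here is a small sketch.  The event names in `DRAWCALL_EVENTS` are
placeholders rather than the tool's real list, and `parse_utrace_events` is a
hypothetical helper, not a function from the source.

```rust
/// Placeholder event names, NOT the tool's real list of draw and
/// compute events.
const DRAWCALL_EVENTS: &[&str] = &["example_draw", "example_compute"];

/// Split a comma-separated event list, expanding the "drawcalls" alias
/// into the predefined set of event names.
fn parse_utrace_events(arg: &str) -> Vec<String> {
    let mut events = Vec::new();
    for name in arg.split(',').map(str::trim).filter(|s| !s.is_empty()) {
        if name == "drawcalls" {
            events.extend(DRAWCALL_EVENTS.iter().map(|s| s.to_string()));
        } else {
            events.push(name.to_string());
        }
    }
    events
}

fn main() {
    // "my_event" is a made-up event name.
    for event in parse_utrace_events("my_event,drawcalls") {
        println!("{}", event);
    }
}
```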

### Shader heuristic analysis

Sometimes as a developer, you want to select between two modes of compiling or
running a shader based on some heuristic.  This tool lets you generate the A/B
times per shader once, then iterate on your heuristic in the quick-to-run Rust
code instead of doing lengthy trace replay runs for each idea you come up with.

The requirements are:

- The driver has utrace events for the start/end of draw call times.
- The draw events include the hashes of the shaders for each stage involved in
  the draw.
- The change to the shader is represented in the shader hash.
- `check_debug_filter()` includes the shader dumping env vars for your driver.
- Your before/after scripts only append to any shader debug env vars also
  involved in shader dumping.
- `capture_draw_times()` has the utrace event names and the shader stage names
  for your driver.
- `shader_parser.rs` supports parsing your driver's shader outputs.

Currently, turnip is supported; some code exists to support other drivers, but
it is incomplete. If you meet all the requirements, running looks like:

```
    gpu-trace-perf run append beforedriver afterdriver --output results/ \
        --capture-shaders  --traces $HOME/src/traces-db

    gpu-trace-perf shader-analyze results/
```

For any traces with shaders that changed modes, it will dump the per-trace
per-shader times between the before and after environments, and a table showing
the overall effect on trace times between the available heuristics (always
choose driver B, always choose optimally, and a dummy heuristic as an example).
Then, edit shader_analyze.rs to replace the dummy heuristic with your own,
potentially adding multiple to the table.
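
As a rough sketch of what comparing heuristics means here, the example below
sums per-shader times under different selection rules: always B, always the
faster of the two, and a custom rule.  The `ShaderTimes` struct, its
`instr_count` field, and the threshold are assumptions for illustration; the
real data structures in `shader_analyze.rs` may look quite different.

```rust
/// Hypothetical per-shader record: the time measured in environments A
/// and B, plus one feature a heuristic could look at.
struct ShaderTimes {
    instr_count: u32,
    time_a: f64,
    time_b: f64,
}

/// Total time if `choose_b` decides, per shader, whether mode B is used.
fn total_time(shaders: &[ShaderTimes], choose_b: impl Fn(&ShaderTimes) -> bool) -> f64 {
    shaders
        .iter()
        .map(|s| if choose_b(s) { s.time_b } else { s.time_a })
        .sum()
}

fn main() {
    // Made-up measurements.
    let shaders = [
        ShaderTimes { instr_count: 100, time_a: 2.0, time_b: 1.0 },
        ShaderTimes { instr_count: 20, time_a: 1.0, time_b: 3.0 },
    ];
    let always_b = total_time(&shaders, |_| true);
    let optimal = total_time(&shaders, |s| s.time_b < s.time_a);
    // Example heuristic: use mode B only for larger shaders.
    let custom = total_time(&shaders, |s| s.instr_count > 50);
    println!("always B: {always_b}, optimal: {optimal}, custom: {custom}");
}
```

The "optimal" row is a lower bound: a heuristic that gets close to it without
looking at the measured times is the one worth keeping.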

### Cross building for your embedded device

Add the following to `~/.cargo/config.toml`:

```
[target.armv7-unknown-linux-gnueabihf]
linker = "arm-linux-gnueabihf-gcc"

[target.aarch64-unknown-linux-gnu]
linker = "aarch64-linux-gnu-gcc"
```

And set up the new toolchain and build:

```
rustup target add aarch64-unknown-linux-gnu
cargo build --release --target aarch64-unknown-linux-gnu -p gpu-trace-perf
scp target/aarch64-unknown-linux-gnu/release/gpu-trace-perf device:bin/
```

### License

Licensed under the MIT license
   ([LICENSE-MIT](LICENSE-MIT) or http://opensource.org/licenses/MIT)