Crate wgpu_profiler

Easy to use profiler scopes for wgpu using timer queries.

wgpu_profiler manages all the necessary wgpu::QuerySet and wgpu::Buffer objects behind the scenes and allows you to create timer scopes with minimal overhead!

§How to use

use wgpu_profiler::*;

// ...

let mut profiler = GpuProfiler::new(GpuProfilerSettings::default()).unwrap();

// ...

{
    // You can now open profiling scopes on any encoder or pass:
    let mut scope = profiler.scope("name of your scope", &mut encoder, &device);

    // Scopes can be nested arbitrarily!
    let mut nested_scope = scope.scope("nested!", &device);

    // Scopes on encoders can be used to easily create profiled passes!
    let mut compute_pass = nested_scope.scoped_compute_pass("profiled compute", &device);


    // Scopes expose the underlying encoder or pass they wrap:
    compute_pass.set_pipeline(&pipeline);
    // ...

    // Scopes created this way are automatically closed when dropped.
}

// Wgpu-profiler needs to insert buffer copy commands.
profiler.resolve_queries(&mut encoder);

// ...

// And finally, to end a profiling frame, call `end_frame`.
// This does a few checks and will let you know if something is off!
profiler.end_frame().unwrap();

// Retrieving the oldest available frame and writing it out to a chrome trace file.
if let Some(profiling_data) = profiler.process_finished_frame(queue.get_timestamp_period()) {
    // You usually want to write to disk only under some condition, e.g. press of a key.
    if button_pressed {
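        // Writing the trace can fail (e.g. on I/O errors), so handle the result.
        wgpu_profiler::chrometrace::write_chrometrace(
            std::path::Path::new("mytrace.json"), &profiling_data)
            .expect("failed to write chrome trace");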
    }
}

See also the example, where everything can be seen in action.
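The value passed to process_finished_frame above is the timestamp period, which wgpu reports as the number of nanoseconds per timestamp tick. As a rough illustration of why it is needed, the sketch below converts a pair of raw tick values into milliseconds. The function ticks_to_ms and the sample numbers are hypothetical and not part of wgpu_profiler; the crate performs its own conversion internally.

```rust
/// Converts a pair of raw GPU timestamp ticks into a duration in milliseconds.
/// `timestamp_period_ns` is the number of nanoseconds per tick
/// (hypothetical helper, not part of wgpu_profiler).
fn ticks_to_ms(start_ticks: u64, end_ticks: u64, timestamp_period_ns: f32) -> f64 {
    // Guard against a reversed pair instead of underflowing.
    let elapsed_ticks = end_ticks.saturating_sub(start_ticks);
    // ticks * (ns / tick) = ns; divide by 1e6 to get milliseconds.
    elapsed_ticks as f64 * timestamp_period_ns as f64 / 1_000_000.0
}

fn main() {
    // Made-up sample values: 500_000 ticks at 2 ns per tick.
    let ms = ticks_to_ms(1_000, 501_000, 2.0);
    println!("scope took {ms} ms");
}
```

Because the period differs between adapters, raw tick counts from two devices are not comparable until they have been scaled this way.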

§Internals

For every frame that hasn’t completely finished processing yet (i.e. hasn’t returned results via GpuProfiler::process_finished_frame) we keep a PendingFrame around.

Whenever a profiling scope is opened, we allocate two queries. This is done by either using the most recent QueryPool or, if that pool is exhausted or none exists yet, creating a new one. Ideally, we only ever need a single QueryPool per frame! In order to converge to this, each new query pool is allocated with the combined size of all query pools already in use in that frame, which effectively doubles the total capacity. On GpuProfiler::end_frame, we memorize the total size of all QueryPools in the current frame and make this the new minimum pool size.

QueryPools from finished frames are reused, unless they are deemed too small.
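The growth and reuse strategy described above can be sketched roughly as follows. This is a simplified model, not the crate's actual implementation: Pool and FrameAllocator are hypothetical types that only track capacities, no wgpu objects are created, and pools are recycled immediately at the end of a frame instead of waiting for the GPU results to arrive.

```rust
/// Hypothetical stand-in for a query pool: only tracks capacity and usage.
struct Pool {
    capacity: u32,
    used: u32,
}

/// Hypothetical allocator modelling the sizing strategy for one frame at a time.
struct FrameAllocator {
    active: Vec<Pool>,   // pools used by the current frame
    recycled: Vec<Pool>, // pools from finished frames that are large enough to reuse
    min_pool_size: u32,  // minimum capacity for any newly requested pool
}

impl FrameAllocator {
    fn new(min_pool_size: u32) -> Self {
        Self { active: Vec::new(), recycled: Vec::new(), min_pool_size }
    }

    /// Reserves two queries (scope begin and end).
    /// Returns (index of the pool in this frame, index of the first query).
    fn reserve_query_pair(&mut self) -> (usize, u32) {
        let exhausted = self.active.last().map_or(true, |p| p.capacity - p.used < 2);
        if exhausted {
            // A new pool is as large as all pools of this frame combined,
            // but never smaller than the remembered minimum.
            let total: u32 = self.active.iter().map(|p| p.capacity).sum();
            let wanted = total.max(self.min_pool_size).max(2);
            let pool = match self.recycled.iter().position(|p| p.capacity >= wanted) {
                Some(i) => {
                    let mut reused = self.recycled.swap_remove(i);
                    reused.used = 0;
                    reused
                }
                None => Pool { capacity: wanted, used: 0 },
            };
            self.active.push(pool);
        }
        let index = self.active.len() - 1;
        let pool = &mut self.active[index];
        let first_query = pool.used;
        pool.used += 2;
        (index, first_query)
    }

    /// Ends the frame: the total capacity becomes the new minimum pool size,
    /// and pools smaller than that are dropped instead of being recycled.
    fn end_frame(&mut self) -> u32 {
        let total: u32 = self.active.iter().map(|p| p.capacity).sum();
        self.min_pool_size = total;
        self.recycled.extend(self.active.drain(..));
        self.recycled.retain(|p| p.capacity >= total);
        total
    }
}

fn main() {
    let mut alloc = FrameAllocator::new(4);
    // Frame 1: eight scopes need 16 queries, so the pools grow 4, 4, 8.
    for _ in 0..8 {
        alloc.reserve_query_pair();
    }
    assert_eq!(alloc.active.len(), 3);
    assert_eq!(alloc.end_frame(), 16);
    // Frame 2: the same workload now fits into a single pool of 16 queries.
    for _ in 0..8 {
        alloc.reserve_query_pair();
    }
    assert_eq!(alloc.active.len(), 1);
    println!("pools used in second frame: {}", alloc.active.len());
}
```

In this model a steady workload settles on one pool per frame after the first frame, which is the convergence the paragraph above aims for.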

Traits§

  • Trait for exposing the methods of wgpu::CommandEncoder, wgpu::RenderPass and wgpu::ComputePass that are used by the profiler.