Cross-platform GPU compute pipeline for OxiMedia using WGPU.
This crate provides GPU-accelerated media processing via the wgpu portability layer, which selects the best available native backend at runtime — no compile-time feature flags are required:
| Platform | Backend selected by wgpu |
|---|---|
| Linux | Vulkan (preferred), then OpenGL ES |
| macOS | Metal |
| Windows | DirectX 12, then Vulkan |
| Web | WebGPU |
| All | CPU software fallback when no GPU adapter is found |
§Compute kernels
Color space:
- RGB ↔ YUV (BT.601, BT.709, BT.2020): ops::ColorSpaceConversion
- Chroma subsampling (4:2:0, 4:2:2, 4:4:4): ops::ChromaOps
- Tone mapping (Reinhard, Hable, ACES, Drago): ops::tonemap
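To make the RGB ↔ YUV item concrete, here is a minimal CPU-side sketch of the per-pixel arithmetic such a kernel performs. It is not the crate's implementation: `LumaCoefficients`, `rgb_to_ycbcr`, and `ycbcr_to_rgb` are hypothetical names, and the sketch assumes normalised, full-range values with no quantisation. The Kr/Kb constants are the ones published in the respective ITU-R recommendations.

```rust
/// Hypothetical helper type: luma coefficients (Kr, Kb) of a YCbCr matrix.
/// Kg is derived as 1 - Kr - Kb.
#[derive(Clone, Copy, Debug)]
pub struct LumaCoefficients {
    pub kr: f32,
    pub kb: f32,
}

pub const BT601: LumaCoefficients = LumaCoefficients { kr: 0.299, kb: 0.114 };
pub const BT709: LumaCoefficients = LumaCoefficients { kr: 0.2126, kb: 0.0722 };
pub const BT2020: LumaCoefficients = LumaCoefficients { kr: 0.2627, kb: 0.0593 };

/// Converts one normalised RGB pixel (channels in 0.0..=1.0) to full-range
/// YCbCr, with Y in 0.0..=1.0 and Cb/Cr in -0.5..=0.5.
pub fn rgb_to_ycbcr(rgb: [f32; 3], c: LumaCoefficients) -> [f32; 3] {
    let [r, g, b] = rgb;
    let kg = 1.0 - c.kr - c.kb;
    let y = c.kr * r + kg * g + c.kb * b;
    let cb = (b - y) / (2.0 * (1.0 - c.kb));
    let cr = (r - y) / (2.0 * (1.0 - c.kr));
    [y, cb, cr]
}

/// Inverse of `rgb_to_ycbcr` for the same coefficient set.
pub fn ycbcr_to_rgb(ycc: [f32; 3], c: LumaCoefficients) -> [f32; 3] {
    let [y, cb, cr] = ycc;
    let kg = 1.0 - c.kr - c.kb;
    let r = y + 2.0 * (1.0 - c.kr) * cr;
    let b = y + 2.0 * (1.0 - c.kb) * cb;
    let g = (y - c.kr * r - c.kb * b) / kg;
    [r, g, b]
}
```

A GPU kernel would apply the same arithmetic per invocation; limited-range scaling and integer quantisation are deliberately left out of this sketch.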
Geometry and scale:
- Image scaling: Bilinear, Bicubic, Lanczos-3: ops::ScaleOperation
- Convolution filters: blur, sharpen, edge-detect: ops::FilterOperation
- Perspective transform, mipmap generation
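As an illustration of the simplest scaling mode listed above, the following sketch shows bilinear resampling on the CPU for a single-channel `f32` image. The function names and the pixel-centre coordinate mapping are assumptions made for this example; the crate's kernels may use a different convention.

```rust
/// Samples a single-channel image at fractional coordinates (x, y),
/// clamping to the image edges.
pub fn sample_bilinear(src: &[f32], width: usize, height: usize, x: f32, y: f32) -> f32 {
    assert!(width > 0 && height > 0 && src.len() == width * height);
    let x = x.clamp(0.0, (width - 1) as f32);
    let y = y.clamp(0.0, (height - 1) as f32);
    let x0 = x.floor() as usize;
    let y0 = y.floor() as usize;
    let x1 = (x0 + 1).min(width - 1);
    let y1 = (y0 + 1).min(height - 1);
    let fx = x - x0 as f32;
    let fy = y - y0 as f32;
    // Interpolate horizontally on the two neighbouring rows, then vertically.
    let top = src[y0 * width + x0] * (1.0 - fx) + src[y0 * width + x1] * fx;
    let bottom = src[y1 * width + x0] * (1.0 - fx) + src[y1 * width + x1] * fx;
    top * (1.0 - fy) + bottom * fy
}

/// Resizes a single-channel image from (sw x sh) to (dw x dh).
pub fn scale_bilinear(src: &[f32], sw: usize, sh: usize, dw: usize, dh: usize) -> Vec<f32> {
    let mut dst = Vec::with_capacity(dw * dh);
    for dy in 0..dh {
        // Map the destination pixel centre into source coordinates.
        let sy = (dy as f32 + 0.5) * sh as f32 / dh as f32 - 0.5;
        for dx in 0..dw {
            let sx = (dx as f32 + 0.5) * sw as f32 / dw as f32 - 0.5;
            dst.push(sample_bilinear(src, sw, sh, sx, sy));
        }
    }
    dst
}
```

On a GPU each destination pixel maps to one shader invocation, so the two nested loops become the dispatch grid.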
Signal processing:
- DCT and FFT transforms: ops::TransformOperation
- Histogram equalization (CLAHE): HistogramEqualizer
- Optical flow estimation: optical_flow
- Motion detection: MotionDetector
- Film grain synthesis: film_grain
- Bilateral / NLM denoising: ops::denoise
Quality metrics:
- PSNR, SSIM, MS-SSIM: ops::quality_metrics
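The crate re-exports `compute_psnr`, `compute_ssim`, and `compute_ms_ssim` from `ops::quality_metrics`; their signatures are not shown on this page. As a reference for what PSNR measures, here is a small standalone sketch for 8-bit samples. The function name `psnr_u8` and its `Option<f64>` return type are assumptions for illustration, not the crate's API.

```rust
/// Peak signal-to-noise ratio in dB between two 8-bit sample buffers.
/// Returns `None` if the buffers are empty or differ in length, and
/// `Some(f64::INFINITY)` if they are identical.
pub fn psnr_u8(a: &[u8], b: &[u8]) -> Option<f64> {
    if a.is_empty() || a.len() != b.len() {
        return None;
    }
    // Sum of squared differences, accumulated in f64.
    let sse: f64 = a
        .iter()
        .zip(b)
        .map(|(&x, &y)| {
            let d = f64::from(x) - f64::from(y);
            d * d
        })
        .sum();
    let mse = sse / a.len() as f64;
    if mse == 0.0 {
        return Some(f64::INFINITY);
    }
    // PSNR = 10 * log10(MAX^2 / MSE), with MAX = 255 for 8-bit data.
    Some(10.0 * (255.0_f64 * 255.0 / mse).log10())
}
```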
§TexturePool — LRU eviction
TexturePool maintains a byte-budget and slot-count capacity. When both
limits are exhausted, TexturePool::allocate_with_lru_eviction evicts the
least-recently-used texture in a loop until enough space is reclaimed. LRU
order is tracked with a monotonic access_clock counter; the slot with the
smallest timestamp is selected by TexturePool::lru_handle. Call
TexturePool::touch after each use to update the timestamp.
Supported TextureFormats: Rgba8, Rgba16f, Rgb10A2, R8, Rg8,
Yuv420, Nv12.
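The eviction scheme described above can be sketched with a plain `HashMap` and a monotonic counter. This is a simplified model, not the crate's `TexturePool`: the `LruPool` type, its `u32` handles, and its `Option` return values are assumptions, and the sketch evicts whenever either the byte budget or the slot count would be exceeded.

```rust
use std::collections::HashMap;

struct Slot {
    size: usize,
    last_access: u64,
}

/// Hypothetical stand-in for a byte-budgeted, slot-limited texture pool.
pub struct LruPool {
    byte_budget: usize,
    max_slots: usize,
    used_bytes: usize,
    access_clock: u64,
    next_handle: u32,
    slots: HashMap<u32, Slot>,
}

impl LruPool {
    pub fn new(byte_budget: usize, max_slots: usize) -> Self {
        Self { byte_budget, max_slots, used_bytes: 0, access_clock: 0, next_handle: 0, slots: HashMap::new() }
    }

    /// Advances the monotonic clock and returns the new timestamp.
    fn tick(&mut self) -> u64 {
        self.access_clock += 1;
        self.access_clock
    }

    /// Marks `handle` as most recently used; returns false if it is unknown.
    pub fn touch(&mut self, handle: u32) -> bool {
        let now = self.tick();
        match self.slots.get_mut(&handle) {
            Some(slot) => {
                slot.last_access = now;
                true
            }
            None => false,
        }
    }

    /// Handle of the slot with the smallest timestamp, if any.
    pub fn lru_handle(&self) -> Option<u32> {
        self.slots.iter().min_by_key(|(_, s)| s.last_access).map(|(&h, _)| h)
    }

    /// Evicts least-recently-used slots in a loop until `size` bytes fit.
    pub fn allocate_with_lru_eviction(&mut self, size: usize) -> Option<u32> {
        if size > self.byte_budget || self.max_slots == 0 {
            return None; // can never fit, even in an empty pool
        }
        while self.used_bytes + size > self.byte_budget || self.slots.len() >= self.max_slots {
            let victim = self.lru_handle()?;
            if let Some(slot) = self.slots.remove(&victim) {
                self.used_bytes -= slot.size;
            }
        }
        let handle = self.next_handle;
        self.next_handle += 1;
        let now = self.tick();
        self.slots.insert(handle, Slot { size, last_access: now });
        self.used_bytes += size;
        Some(handle)
    }

    pub fn contains(&self, handle: u32) -> bool {
        self.slots.contains_key(&handle)
    }
}
```

The linear scan in `lru_handle` keeps the sketch short; a pool with many slots would more likely keep an ordered index keyed by timestamp.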
§Shader cache
shader_cache::GpuShaderCache maintains two levels of caching:
- In-memory: LRU, LFU, or OldestFirst eviction (configurable via shader_cache::EvictionPolicy). Hit/miss counters are tracked.
- Disk-persistent: Cache entries are stored as <hex_hash>_<backend>_<flags>.shd (compiled bytecode) plus a <hex_hash>_<backend>_<flags>.meta sidecar. The cache key is a shader_cache::ShaderVersion containing source_hash: u64, backend: String, and feature_flags: u32.
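The on-disk naming scheme can be illustrated with a small sketch. `ShaderKey` below is a hypothetical struct that mirrors the three documented fields of `shader_cache::ShaderVersion`. The zero-padded hex widths, the hex rendering of the flags, and the use of FNV-1a as the source hash are all assumptions made for the example; the page does not say how the crate hashes sources or formats the fields.

```rust
/// Hypothetical key mirroring the documented fields of `ShaderVersion`.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct ShaderKey {
    pub source_hash: u64,
    pub backend: String,
    pub feature_flags: u32,
}

/// 64-bit FNV-1a, used here only as a simple deterministic hash.
pub fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    for &byte in bytes {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    hash
}

impl ShaderKey {
    pub fn from_source(source: &str, backend: &str, feature_flags: u32) -> Self {
        Self {
            source_hash: fnv1a64(source.as_bytes()),
            backend: backend.to_owned(),
            feature_flags,
        }
    }

    /// Shared `<hex_hash>_<backend>_<flags>` stem (assumed formatting).
    fn stem(&self) -> String {
        format!("{:016x}_{}_{:08x}", self.source_hash, self.backend, self.feature_flags)
    }

    /// File name for the compiled bytecode.
    pub fn bytecode_file(&self) -> String {
        format!("{}.shd", self.stem())
    }

    /// File name for the metadata sidecar.
    pub fn meta_file(&self) -> String {
        format!("{}.meta", self.stem())
    }
}
```

Because backend and feature flags are part of the key, the same shader source compiled for two backends yields two distinct cache entries.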
§Pipeline system
GpuPipeline is a DAG-based processing pipeline with built-in barrier
management. Stages: Decode → Colorspace → Filter → Encode → Display.
BarrierBatcher supports three strategies — Eager, Batched, and
Deferred — to minimise synchronisation overhead.
BatchedComputePass and ComputeShaderSimulator provide structured
compute dispatch with recorded DispatchCommand queues.
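The page names the three barrier strategies but does not define them. The sketch below shows one plausible reading, stated as an assumption: Eager flushes every barrier immediately, Batched flushes once a threshold of pending barriers is reached, and Deferred waits for an explicit flush. `Strategy` and `Batcher` are hypothetical types, not the crate's `BarrierStrategy` and `BarrierBatcher`.

```rust
/// Assumed semantics of the three strategies (not taken from the crate).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Strategy {
    /// Flush each barrier as soon as it is recorded.
    Eager,
    /// Flush once `max_pending` barriers have accumulated.
    Batched { max_pending: usize },
    /// Flush only when the caller asks for it.
    Deferred,
}

/// Collects barriers (identified here by a buffer id) and groups them
/// into flushes according to the chosen strategy.
pub struct Batcher {
    strategy: Strategy,
    pending: Vec<u32>,
    flushes: Vec<Vec<u32>>,
}

impl Batcher {
    pub fn new(strategy: Strategy) -> Self {
        Self { strategy, pending: Vec::new(), flushes: Vec::new() }
    }

    /// Records a barrier and flushes if the strategy requires it.
    pub fn push(&mut self, buffer_id: u32) {
        self.pending.push(buffer_id);
        let flush_now = match self.strategy {
            Strategy::Eager => true,
            Strategy::Batched { max_pending } => self.pending.len() >= max_pending,
            Strategy::Deferred => false,
        };
        if flush_now {
            self.flush();
        }
    }

    /// Emits all pending barriers as one batch; a no-op when none are pending.
    pub fn flush(&mut self) {
        if !self.pending.is_empty() {
            let batch = std::mem::take(&mut self.pending);
            self.flushes.push(batch);
        }
    }

    /// Number of synchronisation points emitted so far.
    pub fn flush_count(&self) -> usize {
        self.flushes.len()
    }
}
```

Under this reading, fewer flushes mean fewer synchronisation points, which is the overhead the batching strategies aim to reduce.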
§GPU buffer management
SubAllocator implements a bump-pointer sub-allocator with defragmentation
for the GPU buffer pool. memory_pool::DefragResult reports how many bytes
were compacted per defrag pass.
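A bump-pointer allocator with a compaction pass can be sketched as follows. This models offsets only and is not the crate's `SubAllocator`: the `BumpAllocator` type, its method names, and the choice to return reclaimed bytes from `defragment` are assumptions for illustration. A real GPU allocator would also have to copy buffer contents to their new offsets.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Allocation {
    pub id: u32,
    pub offset: usize,
    pub size: usize,
    pub align: usize,
}

fn align_up(value: usize, align: usize) -> usize {
    (value + align - 1) & !(align - 1)
}

/// Hypothetical bump-pointer sub-allocator over a fixed-size buffer.
pub struct BumpAllocator {
    capacity: usize,
    cursor: usize,
    next_id: u32,
    live: Vec<Allocation>, // kept in ascending offset order
}

impl BumpAllocator {
    pub fn new(capacity: usize) -> Self {
        Self { capacity, cursor: 0, next_id: 0, live: Vec::new() }
    }

    /// Allocates at the current cursor; freed holes are never reused here.
    pub fn alloc(&mut self, size: usize, align: usize) -> Option<Allocation> {
        if size == 0 || !align.is_power_of_two() {
            return None;
        }
        let offset = align_up(self.cursor, align);
        let end = offset.checked_add(size)?;
        if end > self.capacity {
            return None;
        }
        let allocation = Allocation { id: self.next_id, offset, size, align };
        self.next_id += 1;
        self.cursor = end;
        self.live.push(allocation);
        Some(allocation)
    }

    /// Releases an allocation; its space is only reclaimed by `defragment`.
    pub fn free(&mut self, id: u32) -> bool {
        let before = self.live.len();
        self.live.retain(|a| a.id != id);
        self.live.len() != before
    }

    /// Slides live allocations toward offset 0 and returns the bytes reclaimed.
    pub fn defragment(&mut self) -> usize {
        let old_cursor = self.cursor;
        let mut cursor = 0;
        for a in &mut self.live {
            a.offset = align_up(cursor, a.align);
            cursor = a.offset + a.size;
        }
        self.cursor = cursor;
        old_cursor - cursor
    }

    pub fn offset_of(&self, id: u32) -> Option<usize> {
        self.live.iter().find(|a| a.id == id).map(|a| a.offset)
    }
}
```

Allocation stays a constant-time pointer bump, and fragmentation is paid for only when a defrag pass is run.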
§Example
use oximedia_gpu::GpuContext;
let ctx = GpuContext::new()?;
let input = vec![0u8; 1920 * 1080 * 4];
let mut output = vec![0u8; 1920 * 1080 * 4];
ctx.rgb_to_yuv(&input, &mut output)?;
Re-exports§
pub use accelerator::AcceleratorBuilder;
pub use accelerator::CpuAccelerator;
pub use accelerator::GpuAccelerator;
pub use accelerator::WgpuAccelerator;
pub use buffer::BufferType;
pub use buffer::GpuBuffer;
pub use device::GpuDevice;
pub use device::GpuDeviceInfo;
pub use ops::quality_metrics::compute_ms_ssim;
pub use ops::quality_metrics::compute_psnr;
pub use ops::quality_metrics::compute_ssim;
pub use ops::quality_metrics::MsSsimResult;
pub use ops::quality_metrics::PsnrResult;
pub use ops::quality_metrics::SsimResult;
pub use ops::ChromaOps;
pub use ops::ChromaSubsampling;
pub use ops::ColorSpaceConversion;
pub use ops::FilterOperation;
pub use ops::ScaleOperation;
pub use ops::TransformOperation;
pub use ops::YcbcrCoefficients;
pub use backend::Backend;
pub use backend::BackendCapabilities;
pub use backend::BackendType;
pub use backend::CpuBackend;
pub use backend::VulkanBackend;
pub use cache::CacheStats;
pub use cache::PipelineCache;
pub use cache::ShaderCache;
pub use compiler::CompilationError;
pub use compiler::CompilationOptions;
pub use compiler::OptimizationLevel;
pub use compiler::ShaderCompiler;
pub use compiler::ShaderPreprocessor;
pub use compute::ComputeExecutor;
pub use compute::ComputePassBuilder;
pub use compute::ComputePipelineHandle;
pub use compute::ComputePipelineManager;
pub use compute::DispatchHelper;
pub use kernels::ColorConversionKernel;
pub use kernels::ConvolutionKernel;
pub use kernels::FilterKernel;
pub use kernels::ReduceKernel;
pub use kernels::ReduceOp;
pub use kernels::ResizeFilter;
pub use kernels::ResizeKernel;
pub use kernels::TransformKernel;
pub use kernels::TransformType;
pub use memory::ManagedBuffer;
pub use memory::MemoryAllocator;
pub use memory::MemoryPool;
pub use memory::MemoryStats;
pub use queue::AsyncSubmission;
pub use queue::BatchSubmitter;
pub use queue::CommandBufferBuilder;
pub use queue::CommandQueue;
pub use queue::QueueManager;
pub use queue::QueueType;
pub use sync::Barrier;
pub use sync::Event;
pub use sync::Fence;
pub use sync::Semaphore;
pub use workgroup::DeviceLimits;
pub use workgroup::WorkgroupAutoTuner;
pub use memory_pool::CompactionPlan;
pub use memory_pool::DefragResult;
pub use memory_pool::MigrationEntry;
pub use buffer_pool::SubAllocator;
pub use compute_pass::BatchedComputePass;
pub use compute_pass::DispatchCommand;
pub use histogram::ChannelHistogram;
pub use histogram::ImageHistogram;
pub use motion_detect::MotionAnalysis;
pub use motion_detect::MotionDetector;
pub use motion_detect::MotionRegion;
pub use motion_detect::Sensitivity;
pub use pipeline::BarrierBatcher;
pub use pipeline::BarrierKind;
pub use pipeline::BarrierStrategy;
pub use pipeline::BufferBarrier;
pub use pipeline::FlushRecord;
pub use pipeline::GpuPipeline;
pub use pipeline::PipelineMetrics;
pub use pipeline::PipelineNode;
pub use pipeline::PipelineStage;
pub use texture::TextureDescriptor;
pub use texture::TextureFormat;
pub use texture::TexturePool;
pub use video_process::FrameProcessConfig;
pub use video_process::FrameProcessResult;
pub use video_process::VideoFrameProcessor;
pub use compute_shader::ComputeShaderSimulator;
pub use compute_shader::ShaderKernel;
pub use compute_shader::ThreadGroupContext;
pub use histogram_equalization::ClaheConfig;
pub use histogram_equalization::EqualizationStats;
pub use histogram_equalization::HistogramEqualizer;
Modules§
- accelerator: GpuAccelerator trait and hardware acceleration abstraction.
- async_compute: Async compute queue for overlapping compute and transfer operations.
- backend: Backend implementations for GPU and CPU compute
- barrier_manager: GPU barrier and synchronization management.
- blend_kernel: GPU blend kernels (CPU simulation via Rayon).
- buffer: GPU buffer management for staging and device memory
- buffer_copy: GPU buffer copy and blit operations.
- buffer_pool: Zero-copy buffer pool for GPU-style memory management.
- cache: Pipeline cache management for faster startup and reduced compilation overhead
- color_convert_kernel: GPU color space conversion kernels (CPU simulation).
- command_buffer: GPU command buffer recording and submission.
- compiler: Runtime shader compilation and management
- compute: Compute pipeline management for GPU operations
- compute_dispatch: Compute shader dispatch helpers.
- compute_graph: GPU compute graph, a typed, topologically-ordered execution planner.
- compute_kernels: Pure-Rust SIMD-optimised compute kernels for image processing.
- compute_pass: GPU compute pass management (pass types, buffer bindings, and pass queues).
- compute_shader: GPU compute shader simulator.
- descriptor_set: Descriptor set and layout management for GPU pipeline bindings.
- device: GPU device management and enumeration
- double_buffer: Double-buffered GPU command submission.
- fence_pool: GPU fence pool for efficient synchronization primitive reuse.
- film_grain: GPU-accelerated film grain synthesis.
- gpu_buffer: GPU buffer management for oximedia-gpu.
- gpu_cpu_verify: GPU vs CPU output comparison and verification utilities.
- gpu_fence: GPU fence (synchronization primitive) management for oximedia-gpu.
- gpu_profiler: GPU profiling and timing utilities.
- gpu_stats: GPU statistics collection and monitoring.
- gpu_timer: GPU timing and profiling utilities.
- histogram: Multi-channel image histogram analysis.
- histogram_equalization: GPU-accelerated histogram equalization.
- indirect_dispatch: Indirect dispatch support for GPU compute kernels.
- kernel: GPU kernel management (kernel types, specs, and caching).
- kernel_scheduler: GPU kernel scheduling simulation.
- kernels: GPU compute kernels library
- memory: GPU memory management and allocation tracking
- memory_pool: GPU memory pool allocator.
- mipmap_gen: Mipmap generation utilities for GPU textures.
- motion_detect: GPU-accelerated motion detection.
- motion_estimation: GPU-accelerated motion estimation for AV1 and VP9 video codecs.
- multi_gpu: Multi-GPU load balancing with automatic frame distribution.
- occupancy: GPU occupancy calculator and optimization.
- ops: GPU compute operations
- optical_flow: GPU-accelerated optical flow computation for motion interpolation.
- perspective_transform: GPU-accelerated perspective transform and lens distortion correction.
- pipeline: GPU processing pipeline management
- pipeline_cache: Pipeline object cache for the GPU crate.
- pipeline_stages: GPU-style image processing pipeline stage abstraction.
- queue: Command queue management for GPU operations
- readback: GPU readback utilities.
- render_pass: Render pass configuration and builder for oximedia-gpu.
- resource_manager: GPU resource allocation and lifetime tracking.
- sampler: Texture sampler configuration and caching.
- scale_kernel: GPU scaling kernels (CPU simulation via Rayon).
- shader: Shader compilation and pipeline management
- shader_cache: GPU shader cache management.
- shader_params: Shader parameter management (param types, individual params, and uniform blocks).
- sync: Synchronization primitives for GPU operations
- sync_primitive: GPU synchronisation primitives (semaphores, fences, barriers).
- texture: GPU texture management
- texture_atlas: Texture atlas packing for GPU uploads.
- texture_cache: 2D tile-based texture cache simulation.
- tone_curve: GPU-accelerated tone curve application.
- upload_queue: Asynchronous upload queue for staging CPU data to GPU buffers.
- vertex_buffer: Vertex buffer layout descriptions and management.
- video_process: GPU-accelerated video frame processing.
- viewport: Viewport and scissor-rectangle management for GPU render passes.
- workgroup: GPU workgroup configuration and dispatch sizing.
Structs§
- GpuContext: GPU context for compute operations
Enums§
- GpuError: Error types for GPU operations