oximedia-gpu
Cross-platform GPU compute pipeline for OxiMedia using wgpu.
Part of the oximedia workspace — a comprehensive pure-Rust media processing framework.
Version: 0.1.5 — 2026-04-21 — 1,237 tests
Backend selection
oximedia-gpu uses wgpu to access the GPU. The backend is
chosen at runtime — no compile-time feature flags are needed:
| Platform | Backend chosen |
|---|---|
| Linux | Vulkan (preferred), OpenGL ES (fallback) |
| macOS | Metal |
| Windows | DirectX 12, Vulkan (fallback) |
| Web | WebGPU |
| All | CPU software fallback when no GPU adapter is present |
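The table above amounts to an ordered preference list per platform, with the CPU path as the last resort. The following is a minimal std-only sketch of that selection logic. The Backend and Platform enums, preference_order, and select_backend are hypothetical names made up for illustration; they are not the crate's API, and the real adapter probing is done by wgpu.

```rust
/// Backends named in the table above (hypothetical enum for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    Vulkan,
    OpenGlEs,
    Metal,
    Dx12,
    WebGpu,
    Cpu,
}

/// Host platforms from the table (hypothetical enum for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Platform {
    Linux,
    MacOs,
    Windows,
    Web,
}

/// Backends to try, in order of preference. The CPU fallback is always last.
pub fn preference_order(platform: Platform) -> &'static [Backend] {
    match platform {
        Platform::Linux => &[Backend::Vulkan, Backend::OpenGlEs, Backend::Cpu],
        Platform::MacOs => &[Backend::Metal, Backend::Cpu],
        Platform::Windows => &[Backend::Dx12, Backend::Vulkan, Backend::Cpu],
        Platform::Web => &[Backend::WebGpu, Backend::Cpu],
    }
}

/// Pick the first preferred backend that `is_available` accepts.
/// The CPU fallback is treated as always available.
pub fn select_backend(platform: Platform, is_available: impl Fn(Backend) -> bool) -> Backend {
    preference_order(platform)
        .iter()
        .copied()
        .find(|b| *b == Backend::Cpu || is_available(*b))
        .unwrap_or(Backend::Cpu)
}
```

In this sketch the availability probe is passed in as a closure, so the ordering logic can be exercised without a GPU present.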
Features
Color operations:
- Color Space Conversions — RGB ↔ YUV with BT.601, BT.709, BT.2020 matrices
- Chroma Subsampling — 4:2:0, 4:2:2, 4:4:4 subsampling/upsampling
- Tone Mapping — Reinhard, Hable, ACES, Drago algorithms
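As a CPU reference for what the color kernels compute, here is a small sketch of a full-range R'G'B' to Y'CbCr conversion driven by the standard luma coefficients, plus the basic Reinhard curve. The function and enum names are hypothetical and chosen for illustration; the crate's own shaders and signatures may differ, and range handling (limited vs. full) is not covered here.

```rust
/// Matrix standards listed above (hypothetical enum for this sketch).
#[derive(Debug, Clone, Copy)]
pub enum Matrix {
    Bt601,
    Bt709,
    Bt2020,
}

impl Matrix {
    /// Luma coefficients (Kr, Kb); Kg follows as 1 - Kr - Kb.
    fn kr_kb(self) -> (f32, f32) {
        match self {
            Matrix::Bt601 => (0.299, 0.114),
            Matrix::Bt709 => (0.2126, 0.0722),
            Matrix::Bt2020 => (0.2627, 0.0593),
        }
    }
}

/// Convert normalized R'G'B' in [0, 1] to full-range Y'CbCr.
/// Y is in [0, 1]; Cb and Cr are in [-0.5, 0.5].
pub fn rgb_to_ycbcr(rgb: [f32; 3], m: Matrix) -> [f32; 3] {
    let (kr, kb) = m.kr_kb();
    let kg = 1.0 - kr - kb;
    let y = kr * rgb[0] + kg * rgb[1] + kb * rgb[2];
    let cb = (rgb[2] - y) / (2.0 * (1.0 - kb));
    let cr = (rgb[0] - y) / (2.0 * (1.0 - kr));
    [y, cb, cr]
}

/// Inverse of `rgb_to_ycbcr` for the same matrix.
pub fn ycbcr_to_rgb(ycc: [f32; 3], m: Matrix) -> [f32; 3] {
    let (kr, kb) = m.kr_kb();
    let kg = 1.0 - kr - kb;
    let r = ycc[0] + 2.0 * (1.0 - kr) * ycc[2];
    let b = ycc[0] + 2.0 * (1.0 - kb) * ycc[1];
    let g = (ycc[0] - kr * r - kb * b) / kg;
    [r, g, b]
}

/// Basic Reinhard operator: maps luminance in [0, inf) into [0, 1).
pub fn reinhard(luminance: f32) -> f32 {
    luminance / (1.0 + luminance)
}
```

A reference like this is mainly useful for validating GPU output against a known-good CPU result within a small tolerance.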
Geometry and scale:
- Image Scaling — Bilinear, bicubic, and Lanczos-3 interpolation on GPU
- Convolution Filters — Blur, sharpen, edge-detect, custom kernels
- Transform Operations — DCT and FFT on GPU
- Perspective Transform — Projective image warping
- Mipmap Generation — Automatic mipmap chain computation
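To make the Lanczos-3 option concrete, the sketch below shows the kernel weight function and a one-dimensional CPU resampler built on it. This is an illustrative reference only: lanczos3 and resample_row are hypothetical names, edges are handled by clamping (an assumption), and the GPU path in the crate may be organized quite differently.

```rust
use std::f32::consts::PI;

/// Lanczos kernel with a = 3: sinc(x) * sinc(x / 3) for |x| < 3, else 0.
pub fn lanczos3(x: f32) -> f32 {
    const A: f32 = 3.0;
    if x == 0.0 {
        return 1.0;
    }
    if x.abs() >= A {
        return 0.0;
    }
    let px = PI * x;
    A * px.sin() * (px / A).sin() / (px * px)
}

/// Resample one row of samples to `dst_len` samples.
/// Weights are normalized, and source indices are clamped at the edges.
pub fn resample_row(src: &[f32], dst_len: usize) -> Vec<f32> {
    if src.is_empty() || dst_len == 0 {
        return Vec::new();
    }
    let scale = src.len() as f32 / dst_len as f32;
    // When downscaling, widen the kernel so it also acts as a low-pass filter.
    let support_scale = scale.max(1.0);
    let radius = 3.0 * support_scale;
    let last = src.len() as isize - 1;

    (0..dst_len)
        .map(|i| {
            // Position of the destination sample center in source coordinates.
            let center = (i as f32 + 0.5) * scale - 0.5;
            let lo = (center - radius).floor() as isize;
            let hi = (center + radius).ceil() as isize;
            let mut sum = 0.0f32;
            let mut weight_sum = 0.0f32;
            for j in lo..=hi {
                let w = lanczos3((j as f32 - center) / support_scale);
                let idx = j.clamp(0, last) as usize;
                sum += w * src[idx];
                weight_sum += w;
            }
            if weight_sum != 0.0 { sum / weight_sum } else { 0.0 }
        })
        .collect()
}
```

A two-dimensional scaler can apply the same routine separably, first along rows and then along columns.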
Signal and media processing:
- Histogram Equalization — CLAHE (Contrast-Limited Adaptive HE)
- Motion Detection — GPU-accelerated motion analysis with sensitivity levels
- Optical Flow — Dense optical flow estimation
- Film Grain — Perceptual grain synthesis
- Denoising — Bilateral and NLM denoising kernels
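One simple way to think about motion detection with sensitivity levels is frame differencing: count the pixels whose luma changed by more than a threshold, where the threshold depends on the sensitivity. The sketch below illustrates that idea on the CPU. It is not the crate's MotionDetector; the Sensitivity variants, the threshold values, and motion_ratio are all assumptions made for this example.

```rust
/// Sensitivity levels (hypothetical variants for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Sensitivity {
    Low,
    Medium,
    High,
}

impl Sensitivity {
    /// Per-pixel luma difference threshold. Values are illustrative only:
    /// higher sensitivity means a smaller change counts as motion.
    fn pixel_threshold(self) -> u8 {
        match self {
            Sensitivity::Low => 48,
            Sensitivity::Medium => 24,
            Sensitivity::High => 8,
        }
    }
}

/// Fraction of pixels whose absolute luma difference exceeds the threshold.
/// Returns `None` if the frames are empty or differ in length.
pub fn motion_ratio(prev: &[u8], curr: &[u8], sensitivity: Sensitivity) -> Option<f32> {
    if prev.is_empty() || prev.len() != curr.len() {
        return None;
    }
    let threshold = sensitivity.pixel_threshold();
    let changed = prev
        .iter()
        .zip(curr)
        .filter(|(a, b)| a.abs_diff(**b) > threshold)
        .count();
    Some(changed as f32 / prev.len() as f32)
}
```

A caller would typically compare the returned ratio against a second, frame-level threshold to decide whether motion occurred.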
Quality metrics:
- PSNR, SSIM, MS-SSIM — Compute image quality metrics on GPU
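For reference, PSNR for 8-bit data is 10 * log10(255^2 / MSE). The sketch below computes it on the CPU, which can serve as a cross-check for GPU results. The function names mse and psnr are hypothetical and do not reflect the crate's compute_psnr signature.

```rust
/// Mean squared error between two 8-bit buffers of equal, non-zero length.
pub fn mse(a: &[u8], b: &[u8]) -> Option<f64> {
    if a.is_empty() || a.len() != b.len() {
        return None;
    }
    let sum: f64 = a
        .iter()
        .zip(b)
        .map(|(x, y)| {
            let d = f64::from(*x) - f64::from(*y);
            d * d
        })
        .sum();
    Some(sum / a.len() as f64)
}

/// PSNR in decibels for 8-bit samples (peak value 255).
/// Identical inputs yield positive infinity.
pub fn psnr(a: &[u8], b: &[u8]) -> Option<f64> {
    let err = mse(a, b)?;
    if err == 0.0 {
        return Some(f64::INFINITY);
    }
    Some(10.0 * (255.0f64 * 255.0 / err).log10())
}
```

Note that identical images produce an infinite PSNR, so callers that serialize or average the result may want to cap it.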
Infrastructure:
- TexturePool — LRU-evicting byte-budget pool (see below)
- Shader Cache — Two-level in-memory + disk-persistent cache (see below)
- Pipeline DAG — Barrier-managed processing pipeline
- SubAllocator — Bump-pointer GPU buffer sub-allocator with defragmentation
- BatchedComputePass — Recorded dispatch queue for compute workloads
- Automatic CPU Fallback — Graceful degradation when GPU unavailable
- Multi-GPU Support — Enumerate and select GPU devices
- Command Buffer — Batched GPU command recording
- Compute Pass — Structured compute pass dispatch
- Descriptor Sets — Resource binding management
- Render Pass — GPU render pass management
- Fence Pool — GPU fence lifecycle management
- Vertex Buffer — Vertex data management
- Sampler — Texture sampler configuration
- Profiling — GPU timer, stats, and profiler
- Occupancy — Compute occupancy analysis
- Workgroup — Automatic workgroup sizing and dispatch
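To illustrate the bump-pointer idea behind a buffer sub-allocator, here is a minimal std-only sketch: a cursor advances through a fixed-size region, each request is aligned up, and the whole region is reclaimed at once with a reset. BumpAllocator and Allocation are hypothetical types written for this example. The crate's SubAllocator also supports defragmentation, which this sketch does not attempt.

```rust
/// A sub-range handed out by the allocator.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Allocation {
    pub offset: u64,
    pub size: u64,
}

/// Bump-pointer allocator over a region of `capacity` bytes.
pub struct BumpAllocator {
    capacity: u64,
    cursor: u64,
}

impl BumpAllocator {
    pub fn new(capacity: u64) -> Self {
        Self { capacity, cursor: 0 }
    }

    /// Reserve `size` bytes at an offset that is a multiple of `align`.
    /// `align` must be a power of two. Returns `None` when the request is
    /// invalid or does not fit in the remaining space.
    pub fn allocate(&mut self, size: u64, align: u64) -> Option<Allocation> {
        if size == 0 || !align.is_power_of_two() {
            return None;
        }
        // Round the cursor up to the next multiple of `align`.
        let offset = self.cursor.checked_add(align - 1)? & !(align - 1);
        let end = offset.checked_add(size)?;
        if end > self.capacity {
            return None;
        }
        self.cursor = end;
        Some(Allocation { offset, size })
    }

    /// Bytes consumed so far, including alignment padding.
    pub fn used(&self) -> u64 {
        self.cursor
    }

    /// Release every allocation at once by rewinding the cursor.
    pub fn reset(&mut self) {
        self.cursor = 0;
    }
}
```

The trade-off is typical of bump allocation: allocation is a few integer operations, but individual ranges cannot be freed, which is why real sub-allocators pair it with resets or compaction.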
Usage
Add to your Cargo.toml:
[dependencies]
oximedia-gpu = "0.1.5"
use oximedia_gpu::GpuContext;
TexturePool — LRU eviction
TexturePool::new(max_gb) creates a pool bounded by a byte budget and a slot
count. When both limits are exhausted,
TexturePool::allocate_with_lru_eviction() evicts textures in a loop until
enough capacity is reclaimed.
LRU order is tracked with a monotonic access_clock counter stored per slot.
lru_handle() returns the slot with the smallest timestamp. Call
TexturePool::touch(handle) after each access to refresh the timestamp.
Supported texture formats: Rgba8, Rgba16f, Rgb10A2, R8, Rg8,
Yuv420, Nv12.
use oximedia_gpu::texture::{TexturePool, …};
let mut pool = TexturePool::new(…); // 2 GiB budget
let desc = TextureDescriptor { … };
if let Some(handle) = pool.allocate(…) { … }
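The LRU bookkeeping described in this section can be sketched without any GPU types. The toy pool below tracks only byte sizes, keeps a monotonic access_clock, and evicts the slot with the smallest timestamp in a loop until a new allocation fits. LruPool, its constructor arguments, and the Handle alias are hypothetical. Method names mirror the ones mentioned above for readability, but the signatures are assumptions, and this sketch evicts as soon as either the byte budget or the slot count would be exceeded, which is also an assumption.

```rust
use std::collections::HashMap;

/// Opaque slot identifier (hypothetical alias for this sketch).
pub type Handle = u32;

struct Slot {
    bytes: u64,
    last_access: u64,
}

/// Toy byte-budget pool with LRU eviction; it tracks sizes only.
pub struct LruPool {
    budget_bytes: u64,
    max_slots: usize,
    used_bytes: u64,
    access_clock: u64,
    next_handle: Handle,
    slots: HashMap<Handle, Slot>,
}

impl LruPool {
    pub fn new(budget_bytes: u64, max_slots: usize) -> Self {
        Self {
            budget_bytes,
            max_slots,
            used_bytes: 0,
            access_clock: 0,
            next_handle: 0,
            slots: HashMap::new(),
        }
    }

    /// Advance the monotonic clock and return the new timestamp.
    fn tick(&mut self) -> u64 {
        self.access_clock += 1;
        self.access_clock
    }

    /// Refresh the timestamp of `handle`. Returns false if it is unknown.
    pub fn touch(&mut self, handle: Handle) -> bool {
        let now = self.tick();
        match self.slots.get_mut(&handle) {
            Some(slot) => {
                slot.last_access = now;
                true
            }
            None => false,
        }
    }

    /// Slot with the smallest timestamp, i.e. the least recently used one.
    pub fn lru_handle(&self) -> Option<Handle> {
        self.slots
            .iter()
            .min_by_key(|(_, slot)| slot.last_access)
            .map(|(handle, _)| *handle)
    }

    /// Allocate `bytes`, evicting LRU slots until the request fits.
    pub fn allocate_with_lru_eviction(&mut self, bytes: u64) -> Option<Handle> {
        if bytes > self.budget_bytes || self.max_slots == 0 {
            return None; // can never fit, even in an empty pool
        }
        while self.used_bytes + bytes > self.budget_bytes || self.slots.len() >= self.max_slots {
            let victim = self.lru_handle()?;
            if let Some(slot) = self.slots.remove(&victim) {
                self.used_bytes -= slot.bytes;
            }
        }
        let handle = self.next_handle;
        self.next_handle += 1;
        let now = self.tick();
        self.slots.insert(handle, Slot { bytes, last_access: now });
        self.used_bytes += bytes;
        Some(handle)
    }

    pub fn contains(&self, handle: Handle) -> bool {
        self.slots.contains_key(&handle)
    }
}
```

Finding the LRU slot here is a linear scan, which is fine for small pools; a larger pool might keep an ordered structure keyed by timestamp instead.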
Shader cache
shader_cache::GpuShaderCache provides two caching levels:
In-memory cache:
- Configurable eviction policy via EvictionPolicy: Lru, Lfu, or OldestFirst
- Hit/miss counters accessible via GpuShaderCache::stats()
- Cache key: ShaderVersion { source_hash: u64, backend: String, feature_flags: u32 }
Disk-persistent cache:
- Compiled bytecode stored as <hex_hash>_<backend>_<flags>.shd
- Metadata sidecar at <hex_hash>_<backend>_<flags>.meta
- Cache is invalidated when any component of ShaderVersion changes
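The sketch below shows how a key with the three ShaderVersion fields can drive both levels: it hashes into an in-memory map with hit/miss counters, and it formats into a file stem for the on-disk entries. Only the field names come from this section. MemoryCache, get_or_compile, file_stem, the tuple returned by stats, and the zero-padded hex widths are assumptions for illustration; eviction policies and actual disk I/O are left out.

```rust
use std::collections::HashMap;

/// Cache key; any field change yields a different entry.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct ShaderVersion {
    pub source_hash: u64,
    pub backend: String,
    pub feature_flags: u32,
}

impl ShaderVersion {
    /// File stem in the form <hex_hash>_<backend>_<flags>.
    /// The hex formatting and padding widths are assumptions.
    pub fn file_stem(&self) -> String {
        format!(
            "{:016x}_{}_{:08x}",
            self.source_hash, self.backend, self.feature_flags
        )
    }
}

/// In-memory level only: bytecode by key, plus hit/miss counters.
#[derive(Default)]
pub struct MemoryCache {
    entries: HashMap<ShaderVersion, Vec<u8>>,
    hits: u64,
    misses: u64,
}

impl MemoryCache {
    /// Return cached bytecode, calling `compile` only on a miss.
    pub fn get_or_compile(
        &mut self,
        key: &ShaderVersion,
        compile: impl FnOnce() -> Vec<u8>,
    ) -> &[u8] {
        if self.entries.contains_key(key) {
            self.hits += 1;
        } else {
            self.misses += 1;
            self.entries.insert(key.clone(), compile());
        }
        &self.entries[key]
    }

    /// (hits, misses) observed so far.
    pub fn stats(&self) -> (u64, u64) {
        (self.hits, self.misses)
    }
}
```

Because the key derives Eq and Hash over all three fields, changing the source hash, the backend, or the feature flags naturally misses the old entry, which matches the invalidation rule stated above.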
API Overview
Core types:
- GpuContext — Main GPU context and entry point
- GpuBuffer, GpuFence — GPU resource types
Device and backend:
- device — GPU device enumeration and selection
- backend — Backend type information (BackendType: Vulkan, Metal, DX12, CPU)
- accelerator — High-level acceleration interface (WgpuAccelerator, CpuAccelerator)
Buffer and memory:
- buffer, gpu_buffer — Buffer allocation and management
- memory, memory_pool — GPU memory pool with SubAllocator defragmentation
- vertex_buffer — Vertex buffer management
- buffer_copy — Buffer copy operations
- upload_queue — Staging buffer upload queue
Shader management:
- shader, shader_cache, shader_params — Shader compilation and caching
- compiler — ShaderCompiler with OptimizationLevel (None/Speed/Size)
Compute pipeline:
- compute, compute_pass, compute_dispatch — Compute operations
- pipeline — GpuPipeline DAG; BarrierBatcher (Eager/Batched/Deferred)
- kernels, kernel — Compute kernel definitions
- descriptor_set — Resource descriptor binding
- workgroup — WorkgroupAutoTuner for optimal dispatch sizing
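Dispatch sizing comes down to a ceiling division: enough workgroups must be launched to cover the image, and any overhang is wasted invocations. The sketch below shows that arithmetic and a naive way to choose among candidate workgroup shapes by minimizing waste. The function names and the selection heuristic are assumptions for illustration and are not how WorkgroupAutoTuner is documented to work.

```rust
/// Number of workgroups needed to cover `size` with `workgroup`-sized tiles.
/// Returns `None` if either workgroup dimension is zero.
pub fn dispatch_counts(size: (u32, u32), workgroup: (u32, u32)) -> Option<(u32, u32)> {
    if workgroup.0 == 0 || workgroup.1 == 0 {
        return None;
    }
    // Ceiling division without overflow.
    let ceil_div = |n: u32, d: u32| n / d + u32::from(n % d != 0);
    Some((ceil_div(size.0, workgroup.0), ceil_div(size.1, workgroup.1)))
}

/// Invocations launched beyond the `size.0 * size.1` that are needed.
pub fn wasted_invocations(size: (u32, u32), workgroup: (u32, u32)) -> Option<u64> {
    let (gx, gy) = dispatch_counts(size, workgroup)?;
    let launched = u64::from(gx) * u64::from(workgroup.0) * u64::from(gy) * u64::from(workgroup.1);
    let needed = u64::from(size.0) * u64::from(size.1);
    Some(launched - needed)
}

/// Pick the candidate workgroup shape with the fewest wasted invocations.
/// Ties resolve to the earliest candidate in the slice.
pub fn pick_workgroup(size: (u32, u32), candidates: &[(u32, u32)]) -> Option<(u32, u32)> {
    candidates
        .iter()
        .copied()
        .filter_map(|wg| wasted_invocations(size, wg).map(|waste| (waste, wg)))
        .min_by_key(|(waste, _)| *waste)
        .map(|(_, wg)| wg)
}
```

Whatever shape is chosen, the shader still needs a bounds check on the global invocation ID, since the overhanging invocations do run.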
Ops (high-level media kernels):
- ops::colorspace — ColorSpaceConversion (BT601/709/2020)
- ops::chroma — ChromaOps subsampling/upsampling
- ops::scale — ScaleOperation with ScaleFilter (Bilinear/Bicubic/Lanczos3)
- ops::filter — FilterOperation convolution kernels
- ops::tonemap — TonemapAlgorithm (Reinhard/Hable/ACES/Drago)
- ops::denoise — DenoiseKernel
- ops::histogram_eq — HistogramEqualizer with ClaheConfig
- ops::quality_metrics — compute_psnr, compute_ssim, compute_ms_ssim
- ops::transform — TransformOperation DCT/FFT
- ops::composite — Layer compositing
Texture and rendering:
- texture — TexturePool with LRU eviction, TextureFormat enum
- render_pass — GPU render pass
- sampler — Sampler configuration
- viewport — Viewport configuration
- texture_atlas, texture_cache, mipmap_gen — Texture utilities
Synchronization:
- queue — Command queue management
- sync, sync_primitive — Fence and semaphore synchronization
- fence_pool — Fence lifecycle management
Video processing:
- video_process — VideoFrameProcessor frame pipeline
- histogram — ImageHistogram / ChannelHistogram
- motion_detect — MotionDetector with Sensitivity levels
- optical_flow — Dense optical flow
Profiling:
- gpu_profiler — GPU profiling
- gpu_timer — GPU timing queries
- gpu_stats — GPU statistics collection
- resource_manager — GPU resource lifecycle tracking
- occupancy — Compute occupancy analysis
License
Apache-2.0 — Copyright 2024-2026 COOLJAPAN OU (Team Kitasan)