oximedia-gpu 0.1.7

GPU compute pipeline using WGPU for OxiMedia - cross-platform acceleration
Documentation

oximedia-gpu

Status: Stable Version: 0.1.7

Cross-platform GPU compute pipeline for OxiMedia using WGPU.

Part of the oximedia workspace — a comprehensive pure-Rust media processing framework.

Version: 0.1.7 — 2026-05-16 — 1,237 tests

Backend selection

oximedia-gpu uses wgpu to access the GPU. The backend is chosen at runtime — no compile-time feature flags are needed:

Platform Backend chosen
Linux Vulkan (preferred), OpenGL ES (fallback)
macOS Metal
Windows DirectX 12, Vulkan (fallback)
Web WebGPU
All CPU software fallback when no GPU adapter is present

Features

Color operations:

  • Color Space Conversions — RGB ↔ YUV with BT.601, BT.709, BT.2020 matrices
  • Chroma Subsampling — 4:2:0, 4:2:2, 4:4:4 subsampling/upsampling
  • Tone Mapping — Reinhard, Hable, ACES, Drago algorithms

Geometry and scale:

  • Image Scaling — Bilinear, bicubic, and Lanczos-3 interpolation on GPU
  • Convolution Filters — Blur, sharpen, edge-detect, custom kernels
  • Transform Operations — DCT and FFT on GPU
  • Perspective Transform — Projective image warping
  • Mipmap Generation — Automatic mipmap chain computation

Signal and media processing:

  • Histogram Equalization — CLAHE (Contrast-Limited Adaptive HE)
  • Motion Detection — GPU-accelerated motion analysis with sensitivity levels
  • Optical Flow — Dense optical flow estimation
  • Film Grain — Perceptual grain synthesis
  • Denoising — Bilateral and NLM denoising kernels

Quality metrics:

  • PSNR, SSIM, MS-SSIM — Compute image quality metrics on GPU

Infrastructure:

  • TexturePool — LRU-evicting byte-budget pool (see below)
  • Shader Cache — Two-level in-memory + disk-persistent cache (see below)
  • Pipeline DAG — Barrier-managed processing pipeline
  • SubAllocator — Bump-pointer GPU buffer sub-allocator with defragmentation
  • BatchedComputePass — Recorded dispatch queue for compute workloads
  • Automatic CPU Fallback — Graceful degradation when GPU unavailable
  • Multi-GPU Support — Enumerate and select GPU devices
  • Command Buffer — Batched GPU command recording
  • Compute Pass — Structured compute pass dispatch
  • Descriptor Sets — Resource binding management
  • Render Pass — GPU render pass management
  • Fence Pool — GPU fence lifecycle management
  • Vertex Buffer — Vertex data management
  • Sampler — Texture sampler configuration
  • Profiling — GPU timer, stats, and profiler
  • Occupancy — Compute occupancy analysis
  • Workgroup — Automatic workgroup sizing and dispatch

Usage

Add to your Cargo.toml:

[dependencies]
oximedia-gpu = "0.1.7"
use oximedia_gpu::GpuContext;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let ctx = GpuContext::new()?;

    let input = vec![0u8; 1920 * 1080 * 4];
    let mut output = vec![0u8; 1920 * 1080 * 4];

    ctx.rgb_to_yuv(&input, &mut output)?;
    Ok(())
}

TexturePool — LRU eviction

TexturePool::new(max_gb) creates a pool bounded by a byte budget and a slot count. When both limits are exhausted, TexturePool::allocate_with_lru_eviction() evicts textures in a loop until enough capacity is reclaimed.

LRU order is tracked with a monotonic access_clock counter stored per slot. lru_handle() returns the slot with the smallest timestamp. Call TexturePool::touch(handle) after each access to refresh the timestamp.

Supported texture formats: Rgba8, Rgba16f, Rgb10A2, R8, Rg8, Yuv420, Nv12.

use oximedia_gpu::texture::{TexturePool, TextureDescriptor, TextureFormat};

let mut pool = TexturePool::new(2.0); // 2 GiB budget
let desc = TextureDescriptor {
    width: 1920,
    height: 1080,
    format: TextureFormat::Rgba8,
    label: Some("frame".to_string()),
};
if let Some(handle) = pool.allocate(&desc) {
    pool.touch(handle);
}

Shader cache

shader_cache::GpuShaderCache provides two caching levels:

In-memory cache:

  • Configurable eviction policy via EvictionPolicy: Lru, Lfu, or OldestFirst
  • Hit/miss counters accessible via GpuShaderCache::stats()
  • Cache key: ShaderVersion { source_hash: u64, backend: String, feature_flags: u32 }

Disk-persistent cache:

  • Compiled bytecode stored as <hex_hash>_<backend>_<flags>.shd
  • Metadata sidecar at <hex_hash>_<backend>_<flags>.meta
  • Cache is invalidated when any component of ShaderVersion changes

API Overview

Core types:

  • GpuContext — Main GPU context and entry point
  • GpuBuffer, GpuFence — GPU resource types

Device and backend:

  • device — GPU device enumeration and selection
  • backend — Backend type information (BackendType: Vulkan, Metal, DX12, CPU)
  • accelerator — High-level acceleration interface (WgpuAccelerator, CpuAccelerator)

Buffer and memory:

  • buffer, gpu_buffer — Buffer allocation and management
  • memory, memory_pool — GPU memory pool with SubAllocator defragmentation
  • vertex_buffer — Vertex buffer management
  • buffer_copy — Buffer copy operations
  • upload_queue — Staging buffer upload queue

Shader management:

  • shader, shader_cache, shader_params — Shader compilation and caching
  • compilerShaderCompiler with OptimizationLevel (None/Speed/Size)

Compute pipeline:

  • compute, compute_pass, compute_dispatch — Compute operations
  • pipelineGpuPipeline DAG; BarrierBatcher (Eager/Batched/Deferred)
  • kernels, kernel — Compute kernel definitions
  • descriptor_set — Resource descriptor binding
  • workgroupWorkgroupAutoTuner for optimal dispatch sizing

Ops (high-level media kernels):

  • ops::colorspaceColorSpaceConversion (BT601/709/2020)
  • ops::chromaChromaOps subsampling/upsampling
  • ops::scaleScaleOperation with ScaleFilter (Bilinear/Bicubic/Lanczos3)
  • ops::filterFilterOperation convolution kernels
  • ops::tonemapTonemapAlgorithm (Reinhard/Hable/ACES/Drago)
  • ops::denoiseDenoiseKernel
  • ops::histogram_eqHistogramEqualizer with ClaheConfig
  • ops::quality_metricscompute_psnr, compute_ssim, compute_ms_ssim
  • ops::transformTransformOperation DCT/FFT
  • ops::composite — Layer compositing

Texture and rendering:

  • textureTexturePool with LRU eviction, TextureFormat enum
  • render_pass — GPU render pass
  • sampler — Sampler configuration
  • viewport — Viewport configuration
  • texture_atlas, texture_cache, mipmap_gen — Texture utilities

Synchronization:

  • queue — Command queue management
  • sync, sync_primitive — Fence and semaphore synchronization
  • fence_pool — Fence lifecycle management

Video processing:

  • video_processVideoFrameProcessor frame pipeline
  • histogramImageHistogram / ChannelHistogram
  • motion_detectMotionDetector with Sensitivity levels
  • optical_flow — Dense optical flow

Profiling:

  • gpu_profiler — GPU profiling
  • gpu_timer — GPU timing queries
  • gpu_stats — GPU statistics collection
  • resource_manager — GPU resource lifecycle tracking
  • occupancy — Compute occupancy analysis

License

Apache-2.0 — Copyright 2024-2026 COOLJAPAN OU (Team Kitasan)