Crate oximedia_gpu

Cross-platform GPU compute pipeline for OxiMedia using WGPU.

This crate provides GPU-accelerated media processing via the wgpu portability layer, which selects the best available native backend at runtime — no compile-time feature flags are required:

Platform | Backend selected by wgpu
Linux    | Vulkan (preferred), then OpenGL ES
macOS    | Metal
Windows  | DirectX 12, then Vulkan
Web      | WebGPU
All      | CPU software fallback when no GPU adapter is found
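
The preference order in the table can be read as a first-match search with a CPU fallback. The sketch below only illustrates that reading; `Platform`, `preference_order`, and `select_backend` are hypothetical names, not part of oximedia_gpu or wgpu, and the real selection happens inside wgpu.

```rust
/// Hypothetical platform tag; not an oximedia_gpu or wgpu type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Platform {
    Linux,
    MacOs,
    Windows,
    Web,
}

/// Backend names in the preference order listed in the table.
pub fn preference_order(platform: Platform) -> &'static [&'static str] {
    match platform {
        Platform::Linux => &["Vulkan", "OpenGL ES"],
        Platform::MacOs => &["Metal"],
        Platform::Windows => &["DirectX 12", "Vulkan"],
        Platform::Web => &["WebGPU"],
    }
}

/// Returns the first preferred backend that is available,
/// or "CPU" when none of them is (the software fallback row).
pub fn select_backend(platform: Platform, available: &[&str]) -> &'static str {
    preference_order(platform)
        .iter()
        .copied()
        .find(|name| available.contains(name))
        .unwrap_or("CPU")
}
```

For example, `select_backend(Platform::Windows, &["Vulkan"])` yields `"Vulkan"` because DirectX 12 is not in the available list.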

§Compute kernels

Color space:

Geometry and scale:

Signal processing:

Quality metrics:

§TexturePool — LRU eviction

TexturePool maintains a byte-budget and slot-count capacity. When both limits are exhausted, TexturePool::allocate_with_lru_eviction evicts the least-recently-used texture in a loop until enough space is reclaimed. LRU order is tracked with a monotonic access_clock counter; the slot with the smallest timestamp is selected by TexturePool::lru_handle. Call TexturePool::touch after each use to update the timestamp.

Supported TextureFormats: Rgba8, Rgba16f, Rgb10A2, R8, Rg8, Yuv420, Nv12.
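
The eviction loop described above can be sketched with a plain `HashMap` and a counter. This is a minimal illustration, not the crate's `TexturePool`: the `LruPool` type, its fields, and its `u32` handles are assumptions, and it evicts whenever either the byte budget or the slot count would be exceeded, which may differ from the real policy.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a pooled texture slot.
struct Slot {
    bytes: usize,
    last_access: u64,
}

/// Minimal LRU pool sketch (assumed layout, not the crate's TexturePool).
pub struct LruPool {
    slots: HashMap<u32, Slot>,
    next_handle: u32,
    access_clock: u64,
    used_bytes: usize,
    byte_budget: usize,
    max_slots: usize,
}

impl LruPool {
    pub fn new(byte_budget: usize, max_slots: usize) -> Self {
        Self {
            slots: HashMap::new(),
            next_handle: 0,
            access_clock: 0,
            used_bytes: 0,
            byte_budget,
            max_slots,
        }
    }

    /// Advances the monotonic clock and returns the new timestamp.
    fn tick(&mut self) -> u64 {
        self.access_clock += 1;
        self.access_clock
    }

    /// Marks a handle as just used.
    pub fn touch(&mut self, handle: u32) {
        let now = self.tick();
        if let Some(slot) = self.slots.get_mut(&handle) {
            slot.last_access = now;
        }
    }

    /// Handle with the smallest timestamp, if any slot is live.
    pub fn lru_handle(&self) -> Option<u32> {
        self.slots
            .iter()
            .min_by_key(|(_, slot)| slot.last_access)
            .map(|(handle, _)| *handle)
    }

    /// Evicts least-recently-used slots until the request fits, then allocates.
    pub fn allocate_with_lru_eviction(&mut self, bytes: usize) -> Option<u32> {
        if bytes > self.byte_budget || self.max_slots == 0 {
            return None; // can never fit, even in an empty pool
        }
        while self.used_bytes + bytes > self.byte_budget
            || self.slots.len() >= self.max_slots
        {
            let victim = self.lru_handle()?;
            if let Some(slot) = self.slots.remove(&victim) {
                self.used_bytes -= slot.bytes;
            }
        }
        let handle = self.next_handle;
        self.next_handle += 1;
        let now = self.tick();
        self.slots.insert(handle, Slot { bytes, last_access: now });
        self.used_bytes += bytes;
        Some(handle)
    }

    pub fn contains(&self, handle: u32) -> bool {
        self.slots.contains_key(&handle)
    }
}
```

The linear scan in `lru_handle` keeps the sketch short; a pool with many slots would more likely keep an ordered structure keyed by timestamp.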

§Shader cache

shader_cache::GpuShaderCache maintains two levels of caching:

  • In-memory: LRU, LFU, or OldestFirst eviction (configurable via shader_cache::EvictionPolicy). Hit/miss counters are tracked.
  • Disk-persistent: Cache entries are stored as <hex_hash>_<backend>_<flags>.shd (compiled bytecode) plus a <hex_hash>_<backend>_<flags>.meta sidecar. The cache key is a shader_cache::ShaderVersion containing source_hash: u64, backend: String, and feature_flags: u32.
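
A rough sketch of how the on-disk names could be derived from the cache key follows. The struct mirrors the three fields listed above, but the methods are hypothetical, and the exact encoding (zero-padded lowercase hex for the hash, decimal for the flags) is an assumption rather than the crate's documented format.

```rust
/// Hypothetical mirror of the cache key fields described above.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct ShaderVersion {
    pub source_hash: u64,
    pub backend: String,
    pub feature_flags: u32,
}

impl ShaderVersion {
    /// Builds `<hex_hash>_<backend>_<flags>`.
    /// Assumption: 16-digit lowercase hex hash and decimal flags.
    pub fn file_stem(&self) -> String {
        format!(
            "{:016x}_{}_{}",
            self.source_hash, self.backend, self.feature_flags
        )
    }

    /// Name of the compiled bytecode file.
    pub fn bytecode_file(&self) -> String {
        format!("{}.shd", self.file_stem())
    }

    /// Name of the metadata sidecar file.
    pub fn meta_file(&self) -> String {
        format!("{}.meta", self.file_stem())
    }
}
```

Because all three fields appear in the name, the same source compiled for a different backend or with different feature flags lands in a separate pair of files.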

§Pipeline system

GpuPipeline is a DAG-based processing pipeline with built-in barrier management. Stages: Decode → Colorspace → Filter → Encode → Display. BarrierBatcher supports three strategies — Eager, Batched, and Deferred — to minimise synchronisation overhead.

BatchedComputePass and ComputeShaderSimulator provide structured compute dispatch with recorded DispatchCommand queues.
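
To make the three barrier strategies concrete, here is a small sketch under assumed semantics: Eager flushes on every barrier, Batched flushes once a pending threshold is reached, and Deferred flushes only on an explicit call. The `Strategy` and `Batcher` types and the `max_pending` parameter are hypothetical and do not reproduce the crate's `BarrierBatcher` API.

```rust
/// Hypothetical strategy enum; the semantics are assumptions.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Strategy {
    /// Flush after every recorded barrier.
    Eager,
    /// Flush once `max_pending` barriers have accumulated.
    Batched { max_pending: usize },
    /// Flush only when `flush` is called explicitly.
    Deferred,
}

/// Collects barrier labels and records the size of each flush.
pub struct Batcher {
    strategy: Strategy,
    pending: Vec<&'static str>,
    flushes: Vec<usize>,
}

impl Batcher {
    pub fn new(strategy: Strategy) -> Self {
        Self { strategy, pending: Vec::new(), flushes: Vec::new() }
    }

    /// Records a barrier and flushes if the strategy calls for it.
    pub fn push(&mut self, barrier: &'static str) {
        self.pending.push(barrier);
        let should_flush = match self.strategy {
            Strategy::Eager => true,
            Strategy::Batched { max_pending } => self.pending.len() >= max_pending,
            Strategy::Deferred => false,
        };
        if should_flush {
            self.flush();
        }
    }

    /// Submits all pending barriers as one synchronisation point.
    pub fn flush(&mut self) {
        if !self.pending.is_empty() {
            self.flushes.push(self.pending.len());
            self.pending.clear();
        }
    }

    /// Number of barriers submitted in each flush so far.
    pub fn flush_sizes(&self) -> &[usize] {
        &self.flushes
    }
}
```

Under these assumptions, three barriers cost three synchronisation points with Eager, two with `Batched { max_pending: 2 }` plus a final flush, and one with Deferred.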

§GPU buffer management

SubAllocator implements a bump-pointer sub-allocator with defragmentation for the GPU buffer pool. memory_pool::DefragResult reports how many bytes were compacted per defrag pass.
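
The idea of a bump-pointer allocator with a compaction pass can be sketched as follows. `BumpAllocator` and `Block` are hypothetical types, alignment is ignored, and the value returned by `defragment` (bytes reclaimed at the tail) is an illustrative stand-in for whatever `memory_pool::DefragResult` actually reports.

```rust
/// Hypothetical record of a live allocation.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Block {
    pub id: u32,
    pub offset: usize,
    pub size: usize,
}

/// Bump-pointer sub-allocator sketch (no alignment handling).
pub struct BumpAllocator {
    capacity: usize,
    cursor: usize,
    next_id: u32,
    live: Vec<Block>, // kept in ascending offset order
}

impl BumpAllocator {
    pub fn new(capacity: usize) -> Self {
        Self { capacity, cursor: 0, next_id: 0, live: Vec::new() }
    }

    /// Allocates at the cursor; freed holes are not reused until defragmentation.
    pub fn alloc(&mut self, size: usize) -> Option<u32> {
        if size == 0 || size > self.capacity - self.cursor {
            return None;
        }
        let id = self.next_id;
        self.next_id += 1;
        self.live.push(Block { id, offset: self.cursor, size });
        self.cursor += size;
        Some(id)
    }

    /// Drops a block, leaving a hole behind the cursor.
    pub fn free(&mut self, id: u32) {
        self.live.retain(|block| block.id != id);
    }

    /// Slides live blocks toward offset 0 and returns the bytes reclaimed.
    pub fn defragment(&mut self) -> usize {
        let mut next = 0;
        for block in &mut self.live {
            block.offset = next;
            next += block.size;
        }
        let reclaimed = self.cursor - next;
        self.cursor = next;
        reclaimed
    }

    pub fn offset_of(&self, id: u32) -> Option<usize> {
        self.live.iter().find(|b| b.id == id).map(|b| b.offset)
    }
}
```

A real GPU allocator would also have to copy the buffer contents when a block moves; the sketch only updates the bookkeeping.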

§Example

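use oximedia_gpu::GpuContext;

// Create a context; the backend is selected at runtime.
let ctx = GpuContext::new()?;

// One 1920x1080 frame at 4 bytes per pixel.
let input = vec![0u8; 1920 * 1080 * 4];
let mut output = vec![0u8; 1920 * 1080 * 4];

// `?` requires an enclosing function that returns a compatible `Result`.
ctx.rgb_to_yuv(&input, &mut output)?;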

Re-exports§

pub use accelerator::AcceleratorBuilder;
pub use accelerator::CpuAccelerator;
pub use accelerator::GpuAccelerator;
pub use accelerator::WgpuAccelerator;
pub use buffer::BufferType;
pub use buffer::GpuBuffer;
pub use device::GpuDevice;
pub use device::GpuDeviceInfo;
pub use ops::quality_metrics::compute_ms_ssim;
pub use ops::quality_metrics::compute_psnr;
pub use ops::quality_metrics::compute_ssim;
pub use ops::quality_metrics::MsSsimResult;
pub use ops::quality_metrics::PsnrResult;
pub use ops::quality_metrics::SsimResult;
pub use ops::ChromaOps;
pub use ops::ChromaSubsampling;
pub use ops::ColorSpaceConversion;
pub use ops::FilterOperation;
pub use ops::ScaleOperation;
pub use ops::TransformOperation;
pub use ops::YcbcrCoefficients;
pub use backend::Backend;
pub use backend::BackendCapabilities;
pub use backend::BackendType;
pub use backend::CpuBackend;
pub use backend::VulkanBackend;
pub use cache::CacheStats;
pub use cache::PipelineCache;
pub use cache::ShaderCache;
pub use compiler::CompilationError;
pub use compiler::CompilationOptions;
pub use compiler::OptimizationLevel;
pub use compiler::ShaderCompiler;
pub use compiler::ShaderPreprocessor;
pub use compute::ComputeExecutor;
pub use compute::ComputePassBuilder;
pub use compute::ComputePipelineHandle;
pub use compute::ComputePipelineManager;
pub use compute::DispatchHelper;
pub use kernels::ColorConversionKernel;
pub use kernels::ConvolutionKernel;
pub use kernels::FilterKernel;
pub use kernels::ReduceKernel;
pub use kernels::ReduceOp;
pub use kernels::ResizeFilter;
pub use kernels::ResizeKernel;
pub use kernels::TransformKernel;
pub use kernels::TransformType;
pub use memory::ManagedBuffer;
pub use memory::MemoryAllocator;
pub use memory::MemoryPool;
pub use memory::MemoryStats;
pub use queue::AsyncSubmission;
pub use queue::BatchSubmitter;
pub use queue::CommandBufferBuilder;
pub use queue::CommandQueue;
pub use queue::QueueManager;
pub use queue::QueueType;
pub use sync::Barrier;
pub use sync::Event;
pub use sync::Fence;
pub use sync::Semaphore;
pub use workgroup::DeviceLimits;
pub use workgroup::WorkgroupAutoTuner;
pub use memory_pool::CompactionPlan;
pub use memory_pool::DefragResult;
pub use memory_pool::MigrationEntry;
pub use buffer_pool::SubAllocator;
pub use compute_pass::BatchedComputePass;
pub use compute_pass::DispatchCommand;
pub use histogram::ChannelHistogram;
pub use histogram::ImageHistogram;
pub use motion_detect::MotionAnalysis;
pub use motion_detect::MotionDetector;
pub use motion_detect::MotionRegion;
pub use motion_detect::Sensitivity;
pub use pipeline::BarrierBatcher;
pub use pipeline::BarrierKind;
pub use pipeline::BarrierStrategy;
pub use pipeline::BufferBarrier;
pub use pipeline::FlushRecord;
pub use pipeline::GpuPipeline;
pub use pipeline::PipelineMetrics;
pub use pipeline::PipelineNode;
pub use pipeline::PipelineStage;
pub use texture::TextureDescriptor;
pub use texture::TextureFormat;
pub use texture::TexturePool;
pub use video_process::FrameProcessConfig;
pub use video_process::FrameProcessResult;
pub use video_process::VideoFrameProcessor;
pub use compute_shader::ComputeShaderSimulator;
pub use compute_shader::ShaderKernel;
pub use compute_shader::ThreadGroupContext;
pub use histogram_equalization::ClaheConfig;
pub use histogram_equalization::EqualizationStats;
pub use histogram_equalization::HistogramEqualizer;

Modules§

accelerator
GpuAccelerator trait and hardware acceleration abstraction.
async_compute
Async compute queue for overlapping compute and transfer operations.
backend
Backend implementations for GPU and CPU compute
barrier_manager
GPU barrier and synchronization management.
blend_kernel
GPU blend kernels (CPU simulation via Rayon).
buffer
GPU buffer management for staging and device memory
buffer_copy
GPU buffer copy and blit operations.
buffer_pool
Zero-copy buffer pool for GPU-style memory management.
cache
Pipeline cache management for faster startup and reduced compilation overhead
color_convert_kernel
GPU color space conversion kernels (CPU simulation).
command_buffer
GPU command buffer recording and submission.
compiler
Runtime shader compilation and management
compute
Compute pipeline management for GPU operations
compute_dispatch
Compute shader dispatch helpers.
compute_graph
GPU compute graph — a typed, topologically-ordered execution planner.
compute_kernels
Pure-Rust SIMD-optimised compute kernels for image processing.
compute_pass
GPU compute pass management — pass types, buffer bindings, and pass queues.
compute_shader
GPU compute shader simulator.
descriptor_set
Descriptor set and layout management for GPU pipeline bindings.
device
GPU device management and enumeration
double_buffer
Double-buffered GPU command submission.
fence_pool
GPU fence pool for efficient synchronization primitive reuse.
film_grain
GPU-accelerated film grain synthesis.
gpu_buffer
GPU buffer management for oximedia-gpu.
gpu_cpu_verify
GPU vs CPU output comparison and verification utilities.
gpu_fence
GPU fence (synchronization primitive) management for oximedia-gpu.
gpu_profiler
GPU profiling and timing utilities.
gpu_stats
GPU statistics collection and monitoring.
gpu_timer
GPU timing and profiling utilities.
histogram
Multi-channel image histogram analysis.
histogram_equalization
GPU-accelerated histogram equalization.
indirect_dispatch
Indirect dispatch support for GPU compute kernels.
kernel
GPU kernel management — kernel types, specs, and caching.
kernel_scheduler
GPU kernel scheduling simulation.
kernels
GPU compute kernels library
memory
GPU memory management and allocation tracking
memory_pool
GPU memory pool allocator.
mipmap_gen
Mipmap generation utilities for GPU textures.
motion_detect
GPU-accelerated motion detection.
motion_estimation
GPU-accelerated motion estimation for AV1 and VP9 video codecs.
multi_gpu
Multi-GPU load balancing with automatic frame distribution.
occupancy
GPU occupancy calculator and optimization.
ops
GPU compute operations
optical_flow
GPU-accelerated optical flow computation for motion interpolation.
perspective_transform
GPU-accelerated perspective transform and lens distortion correction.
pipeline
GPU processing pipeline management
pipeline_cache
Pipeline object cache for the GPU crate.
pipeline_stages
GPU-style image processing pipeline stage abstraction.
queue
Command queue management for GPU operations
readback
GPU readback utilities.
render_pass
Render pass configuration and builder for oximedia-gpu.
resource_manager
GPU resource allocation and lifetime tracking.
sampler
Texture sampler configuration and caching.
scale_kernel
GPU scaling kernels (CPU simulation via Rayon).
shader
Shader compilation and pipeline management
shader_cache
GPU shader cache management.
shader_params
Shader parameter management — param types, individual params, and uniform blocks.
sync
Synchronization primitives for GPU operations
sync_primitive
GPU synchronisation primitives (semaphores, fences, barriers).
texture
GPU texture management
texture_atlas
Texture atlas packing for GPU uploads.
texture_cache
2D tile-based texture cache simulation.
tone_curve
GPU-accelerated tone curve application.
upload_queue
Asynchronous upload queue for staging CPU data to GPU buffers.
vertex_buffer
Vertex buffer layout descriptions and management.
video_process
GPU-accelerated video frame processing.
viewport
Viewport and scissor-rectangle management for GPU render passes.
workgroup
GPU workgroup configuration and dispatch sizing.

Structs§

GpuContext
GPU context for compute operations

Enums§

GpuError
Error types for GPU operations

Type Aliases§

Result