Crate oxillama_gpu

§oxillama-gpu

Optional wgpu-based GPU compute backend for OxiLLaMa.

§Feature flags

| Feature | Description | Default |
|---------|-------------|---------|
| gpu | Enable wgpu device init, buffer helpers, and WGSL shaders | No |

When the gpu feature is disabled (the default), this crate still compiles and all public types remain available. In that configuration GpuContext::try_init returns None and GpuDispatcher::has_gpu returns false, so callers can always fall back to the CPU path.
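The fallback behaviour described above can be illustrated with a small, self-contained sketch. It does not use the real crate: `MockContext` and `MockDispatcher` are hypothetical stand-ins, and the way the `gpu` feature is checked here is an assumption about how such a gate is typically written, not a copy of the crate's implementation.

```rust
/// Hypothetical stand-in for a GPU context; NOT the crate's `GpuContext`.
#[derive(Debug)]
struct MockContext {
    adapter_name: String,
}

impl MockContext {
    /// Returns `None` when the (assumed) `gpu` Cargo feature is compiled out.
    #[allow(unexpected_cfgs)]
    fn try_init() -> Option<Self> {
        if cfg!(feature = "gpu") {
            // A real backend would request a wgpu adapter and device here.
            Some(MockContext {
                adapter_name: "mock-adapter".to_string(),
            })
        } else {
            None
        }
    }
}

/// Hypothetical stand-in for a dispatcher that owns an optional context.
struct MockDispatcher {
    context: Option<MockContext>,
}

impl MockDispatcher {
    fn new() -> Self {
        MockDispatcher {
            context: MockContext::try_init(),
        }
    }

    /// True only if a context was successfully created.
    fn has_gpu(&self) -> bool {
        self.context.is_some()
    }
}

fn main() {
    let dispatcher = MockDispatcher::new();
    match &dispatcher.context {
        Some(ctx) => println!("GPU path via {}", ctx.adapter_name),
        None => println!("CPU fallback, has_gpu = {}", dispatcher.has_gpu()),
    }
}
```

Because the public types exist in both configurations, calling code of this shape compiles unchanged whether or not the feature is enabled. Only the runtime answer from `has_gpu` differs.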

§Quick start

use oxillama_gpu::GpuDispatcher;

let dispatcher = GpuDispatcher::new();
if dispatcher.has_gpu() {
    println!("GPU available — will use hardware acceleration");
} else {
    println!("No GPU — CPU fallback active");
}

Re-exports§

pub use context::GpuContext;
pub use context::GpuDeviceInfo;
pub use error::GpuError;
pub use error::GpuResult;
pub use kernels::sampling::SamplingKernel;
pub use kernels::batched_gemv_f32;
pub use kernels::supports_f16;
pub use kernels::BatchedGemvConfig;
pub use kernels::BatchedGpuKernel;
pub use kernels::F16AccumulatorConfig;
pub use kernels::FusedAttentionKernel;
pub use kernels::GpuKernel;
pub use kernels::Iq1MGpuKernel;
pub use kernels::Iq1SGpuKernel;
pub use kernels::Iq2SGpuKernel;
pub use kernels::Iq2XsGpuKernel;
pub use kernels::Iq2XxsGpuKernel;
pub use kernels::Iq3SGpuKernel;
pub use kernels::Iq3XxsGpuKernel;
pub use kernels::Iq4NlGpuKernel;
pub use kernels::Iq4XsGpuKernel;
pub use kernels::Q1_0_G128GpuKernel;
pub use kernels::Q2_KGpuKernel;
pub use kernels::Q3_KGpuKernel;
pub use kernels::Q4_0GpuKernel;
pub use kernels::Q4_1GpuKernel;
pub use kernels::Q4_KGpuKernel;
pub use kernels::Q5_0GpuKernel;
pub use kernels::Q5_1GpuKernel;
pub use kernels::Q5_KGpuKernel;
pub use kernels::Q6_KGpuKernel;
pub use kernels::Q8_0GpuKernel;
pub use kernels::Q8_1GpuKernel;
pub use kernels::Q8_KGpuKernel;
pub use kernels::TiledGemmKernel;
pub use kernels::Tq1_0GpuKernel;
pub use kernels::Tq2_0GpuKernel;

Modules§

buffer
GPU buffer helpers — upload and download f32 arrays.
context
GPU device and queue initialisation.
error
Error types for the GPU compute backend.
kernels
Kernel registry — GPU-accelerated GEMV operations.

Structs§

GpuDispatcher
Central dispatcher that holds an optional GpuContext and vends GPU kernels for supported tensor types.
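As a rough illustration of the "vend a kernel for supported tensor types, otherwise fall back" idea, the sketch below uses only hypothetical names (`TensorType`, `Gemv`, `SketchDispatcher`, `MockGpuGemv`, `CpuGemv`). None of them come from this crate, and the real `GpuDispatcher` API may look quite different. It also shows what a GEMV computes: y = W * x for a row-major matrix W.

```rust
/// Hypothetical tensor-type tag; the crate's real type enum is not shown here.
#[derive(Clone, Copy, Debug, PartialEq)]
enum TensorType {
    F32,
    Q8,
    Unsupported,
}

/// Hypothetical kernel interface: y = W * x, with W row-major (rows x cols).
trait Gemv {
    fn gemv(&self, w: &[f32], x: &[f32], rows: usize, cols: usize) -> Vec<f32>;
}

/// Plain CPU reference implementation.
struct CpuGemv;

impl Gemv for CpuGemv {
    fn gemv(&self, w: &[f32], x: &[f32], rows: usize, cols: usize) -> Vec<f32> {
        assert_eq!(w.len(), rows * cols);
        assert_eq!(x.len(), cols);
        if cols == 0 {
            return vec![0.0; rows];
        }
        // One dot product per matrix row.
        w.chunks_exact(cols)
            .map(|row| row.iter().zip(x).map(|(a, b)| a * b).sum::<f32>())
            .collect()
    }
}

/// Stand-in for a GPU kernel; it reuses the CPU arithmetic so the sketch runs anywhere.
struct MockGpuGemv;

impl Gemv for MockGpuGemv {
    fn gemv(&self, w: &[f32], x: &[f32], rows: usize, cols: usize) -> Vec<f32> {
        CpuGemv.gemv(w, x, rows, cols)
    }
}

/// Hypothetical dispatcher: vends a kernel only if a GPU exists and the type is supported.
struct SketchDispatcher {
    gpu_available: bool,
}

impl SketchDispatcher {
    fn kernel_for(&self, ty: TensorType) -> Option<Box<dyn Gemv>> {
        match (self.gpu_available, ty) {
            (true, TensorType::F32) | (true, TensorType::Q8) => Some(Box::new(MockGpuGemv)),
            _ => None,
        }
    }
}

fn main() {
    let dispatcher = SketchDispatcher { gpu_available: false };
    for ty in [TensorType::F32, TensorType::Q8, TensorType::Unsupported] {
        println!("{:?}: gpu kernel = {}", ty, dispatcher.kernel_for(ty).is_some());
    }
    // Fall back to the CPU kernel when nothing is vended.
    let kernel = dispatcher
        .kernel_for(TensorType::F32)
        .unwrap_or_else(|| Box::new(CpuGemv) as Box<dyn Gemv>);
    let y = kernel.gemv(&[1.0, 2.0, 3.0, 4.0], &[1.0, 1.0], 2, 2);
    println!("{:?}", y); // prints [3.0, 7.0]
}
```

Returning an `Option` from the lookup keeps the fallback decision with the caller, which matches the "optional GpuContext" wording above. Whether the real dispatcher exposes its kernels this way is an assumption.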