Crate oxillama_gpu

§oxillama-gpu

Optional wgpu-based GPU compute backend for OxiLLaMa.

§Feature flags

| Feature | Description | Default |
|---------|-------------|---------|
| `gpu` | Enable wgpu device init, buffer helpers, and WGSL shaders | No |

When the `gpu` feature is disabled (the default), this crate still compiles and all public types remain available. In that configuration `GpuContext::try_init` returns `None` and `GpuDispatcher::has_gpu` returns `false`.
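The fallback behaviour described above can be illustrated with a small, self-contained sketch. It does not reproduce this crate's source: `DemoContext` and `DemoDispatcher` are hypothetical stand-ins for `GpuContext` and `GpuDispatcher`, and the use of `cfg!(feature = "gpu")` is an assumption about how such gating might be written.

```rust
/// Hypothetical stand-in for `GpuContext`; the real type's fields are not
/// shown in this documentation.
#[derive(Debug)]
pub struct DemoContext {
    pub adapter_name: String,
}

impl DemoContext {
    /// Returns `None` unless the `gpu` feature was compiled in.
    pub fn try_init() -> Option<Self> {
        if cfg!(feature = "gpu") {
            // A real backend would request a wgpu adapter and device here.
            Some(DemoContext {
                adapter_name: String::from("placeholder adapter"),
            })
        } else {
            None
        }
    }
}

/// Hypothetical stand-in for `GpuDispatcher`: it only stores the optional context.
pub struct DemoDispatcher {
    ctx: Option<DemoContext>,
}

impl DemoDispatcher {
    pub fn new() -> Self {
        DemoDispatcher {
            ctx: DemoContext::try_init(),
        }
    }

    /// True only when a context was successfully created.
    pub fn has_gpu(&self) -> bool {
        self.ctx.is_some()
    }
}
```

Because the public types exist in both configurations, calling code needs no `#[cfg]` attributes of its own; it only has to check `has_gpu()` at run time.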

§Quick start

use oxillama_gpu::GpuDispatcher;

// The dispatcher holds an optional GpuContext.
let dispatcher = GpuDispatcher::new();
if dispatcher.has_gpu() {
    println!("GPU available — will use hardware acceleration");
} else {
    println!("No GPU — CPU fallback active");
}
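To show what "CPU fallback" might look like for a single operation, here is a minimal sketch of a GEMV call that tries a GPU path first and otherwise computes the result on the CPU. The functions `gemv_cpu` and `gemv_with_fallback`, and the closure-based GPU path, are illustrative assumptions and not part of this crate's API.

```rust
/// CPU reference GEMV: y = A * x, with `a` stored row-major as rows x cols.
pub fn gemv_cpu(a: &[f32], rows: usize, cols: usize, x: &[f32]) -> Vec<f32> {
    assert_eq!(a.len(), rows * cols, "matrix length must equal rows * cols");
    assert_eq!(x.len(), cols, "vector length must equal cols");
    if cols == 0 {
        // Each row is an empty dot product.
        return vec![0.0; rows];
    }
    a.chunks_exact(cols)
        .map(|row| row.iter().zip(x).map(|(w, v)| w * v).sum::<f32>())
        .collect()
}

/// Hypothetical dispatch helper: use `gpu_gemv` when a GPU is reported and the
/// GPU path yields a result, otherwise fall back to `gemv_cpu`.
pub fn gemv_with_fallback<F>(
    has_gpu: bool,
    gpu_gemv: F,
    a: &[f32],
    rows: usize,
    cols: usize,
    x: &[f32],
) -> Vec<f32>
where
    F: Fn(&[f32], usize, usize, &[f32]) -> Option<Vec<f32>>,
{
    if has_gpu {
        if let Some(y) = gpu_gemv(a, rows, cols, x) {
            return y;
        }
    }
    gemv_cpu(a, rows, cols, x)
}
```

In this sketch the `has_gpu` flag would come from `dispatcher.has_gpu()`. Returning `Option` from the GPU path lets a failed dispatch degrade to the CPU result instead of aborting.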

§Re-exports

pub use context::GpuContext;
pub use context::GpuDeviceInfo;
pub use error::GpuError;
pub use error::GpuResult;
pub use kernels::sampling::SamplingKernel;
pub use kernels::batched_gemv_f32;
pub use kernels::supports_f16;
pub use kernels::BatchedGemvConfig;
pub use kernels::BatchedGpuKernel;
pub use kernels::F16AccumulatorConfig;
pub use kernels::FusedAttentionKernel;
pub use kernels::GpuKernel;
pub use kernels::Iq1MGpuKernel;
pub use kernels::Iq1SGpuKernel;
pub use kernels::Iq2SGpuKernel;
pub use kernels::Iq2XsGpuKernel;
pub use kernels::Iq2XxsGpuKernel;
pub use kernels::Iq3SGpuKernel;
pub use kernels::Iq3XxsGpuKernel;
pub use kernels::Iq4NlGpuKernel;
pub use kernels::Iq4XsGpuKernel;
pub use kernels::Q1_0_G128GpuKernel;
pub use kernels::Q2_KGpuKernel;
pub use kernels::Q3_KGpuKernel;
pub use kernels::Q4_0GpuKernel;
pub use kernels::Q4_1GpuKernel;
pub use kernels::Q4_KGpuKernel;
pub use kernels::Q5_0GpuKernel;
pub use kernels::Q5_1GpuKernel;
pub use kernels::Q5_KGpuKernel;
pub use kernels::Q6_KGpuKernel;
pub use kernels::Q8_0GpuKernel;
pub use kernels::Q8_1GpuKernel;
pub use kernels::Q8_KGpuKernel;
pub use kernels::TiledGemmKernel;
pub use kernels::Tq1_0GpuKernel;
pub use kernels::Tq2_0GpuKernel;

§Modules

buffer: GPU buffer helpers — upload and download f32 arrays.
context: GPU device and queue initialisation.
error: Error types for the GPU compute backend.
kernels: Kernel registry — GPU-accelerated GEMV operations.
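As an illustration of what uploading and downloading f32 arrays involves at the byte level, the sketch below converts between `&[f32]` and raw bytes using only the standard library. The function names are hypothetical, and the little-endian layout is an assumption; the actual helpers in the `buffer` module are not shown on this page and may work differently.

```rust
/// Serialise f32 values into little-endian bytes, as one might do before
/// copying them into a GPU buffer.
pub fn f32s_to_bytes(values: &[f32]) -> Vec<u8> {
    values.iter().flat_map(|v| v.to_le_bytes()).collect()
}

/// Reinterpret bytes read back from a buffer as f32 values.
/// Returns `None` when the byte count is not a multiple of four.
pub fn bytes_to_f32s(bytes: &[u8]) -> Option<Vec<f32>> {
    if bytes.len() % 4 != 0 {
        return None;
    }
    Some(
        bytes
            .chunks_exact(4)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}
```

A round trip through both functions returns the original values bit for bit, which makes the pair a convenient check when testing a real upload and download path.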

§Structs

GpuDispatcher: Central dispatcher that holds an optional GpuContext and vends GPU kernels for supported tensor types.
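The "vends GPU kernels for supported tensor types" pattern can be sketched as a lookup that returns a kernel only when a context exists and the tensor type is supported. Everything here is hypothetical: `SketchTensorType`, `SketchKernel`, `SketchContext`, `SketchDispatcher`, and `kernel_for` are illustrative names, not this crate's API.

```rust
/// Hypothetical tensor types; the real crate covers many more formats.
#[allow(non_camel_case_types)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SketchTensorType {
    Q4_0,
    Q8_0,
    Unsupported,
}

/// Hypothetical kernel interface, reduced to a name for illustration.
pub trait SketchKernel {
    fn name(&self) -> &'static str;
}

struct NamedKernel(&'static str);

impl SketchKernel for NamedKernel {
    fn name(&self) -> &'static str {
        self.0
    }
}

/// Hypothetical placeholder for an initialised GPU context.
pub struct SketchContext;

pub struct SketchDispatcher {
    ctx: Option<SketchContext>,
}

impl SketchDispatcher {
    pub fn with_context(ctx: Option<SketchContext>) -> Self {
        SketchDispatcher { ctx }
    }

    /// Returns a kernel only if a context exists and the type is supported.
    pub fn kernel_for(&self, ty: SketchTensorType) -> Option<Box<dyn SketchKernel>> {
        // Without a context there is nothing to run a kernel on.
        self.ctx.as_ref()?;
        match ty {
            SketchTensorType::Q4_0 => Some(Box::new(NamedKernel("q4_0"))),
            SketchTensorType::Q8_0 => Some(Box::new(NamedKernel("q8_0"))),
            SketchTensorType::Unsupported => None,
        }
    }
}
```

With this shape, a caller treats `None` as the signal to use its CPU implementation, whether the cause is a missing GPU or an unsupported tensor type.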
