pub struct LevelZeroBackend { /* private fields */ }
Intel Level Zero GPU compute backend.
On Linux and Windows this backend selects the first Intel GPU via the Level Zero
loader library (libze_loader.so / ze_loader.dll) and allocates device
memory through the Level Zero memory model.
On macOS every operation returns BackendError::DeviceError wrapping
crate::error::LevelZeroError::UnsupportedPlatform.
§Lifecycle
- LevelZeroBackend::new() — create an uninitialised backend.
- init() — load the Level Zero driver and select a GPU.
- Use alloc, copy_htod, compute ops, copy_dtoh, free.
- synchronize() — wait for all pending GPU work to finish.
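The call order above can be illustrated with a hypothetical in-memory mock; this is a sketch of the trait's lifecycle contract only, not the real backend, which drives a GPU through the Level Zero loader.

```rust
// Hypothetical mock illustrating the documented call order; the real
// LevelZeroBackend loads libze_loader and talks to an Intel GPU.
struct MockBackend {
    initialized: bool,
}

impl MockBackend {
    fn new() -> Self {
        MockBackend { initialized: false }
    }

    fn init(&mut self) -> Result<(), String> {
        // Real backend: load the Level Zero driver and select the first GPU.
        self.initialized = true;
        Ok(())
    }

    fn is_initialized(&self) -> bool {
        self.initialized
    }

    fn synchronize(&self) -> Result<(), String> {
        if self.initialized {
            Ok(()) // real backend: block until all queued GPU work finishes
        } else {
            Err("backend not initialized".into())
        }
    }
}

fn main() {
    let mut backend = MockBackend::new();
    assert!(!backend.is_initialized());
    backend.init().unwrap(); // load driver, select GPU
    // ... alloc, copy_htod, compute ops, copy_dtoh, free ...
    backend.synchronize().unwrap(); // wait for pending work
    assert!(backend.is_initialized());
}
```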
Implementations§
Trait Implementations§
impl ComputeBackend for LevelZeroBackend
fn init(&mut self) -> BackendResult<()>
Initialize the backend (select device, create context).
fn is_initialized(&self) -> bool
Returns true if the backend is ready for operations.
fn gemm(
&self,
_trans_a: BackendTranspose,
_trans_b: BackendTranspose,
m: usize,
n: usize,
k: usize,
alpha: f64,
a_ptr: u64,
_lda: usize,
b_ptr: u64,
_ldb: usize,
beta: f64,
c_ptr: u64,
_ldc: usize,
) -> BackendResult<()>
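The pointer arguments are raw device addresses; a hedged CPU sketch of the math this routine performs (assuming f32 elements, row-major layout, and no transpose) looks like:

```rust
// CPU reference for C = alpha * op(A) * op(B) + beta * C with op = identity,
// f32 elements, row-major storage. Illustration only; the backend runs this
// on the GPU against raw device pointers.
fn gemm_ref(
    m: usize, n: usize, k: usize,
    alpha: f32, a: &[f32], b: &[f32],
    beta: f32, c: &mut [f32],
) {
    for i in 0..m {
        for j in 0..n {
            let mut acc = 0.0f32;
            for p in 0..k {
                acc += a[i * k + p] * b[p * n + j]; // op(A) = A, op(B) = B
            }
            c[i * n + j] = alpha * acc + beta * c[i * n + j];
        }
    }
}

fn main() {
    // Identity times [[1, 2], [3, 4]] with beta = 0 leaves B in C.
    let a = [1.0, 0.0, 0.0, 1.0];
    let b = [1.0, 2.0, 3.0, 4.0];
    let mut c = [0.0f32; 4];
    gemm_ref(2, 2, 2, 1.0, &a, &b, 0.0, &mut c);
    assert_eq!(c, b);
}
```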
General matrix multiply: C = alpha * op(A) * op(B) + beta * C.
fn batched_gemm(
&self,
_trans_a: BackendTranspose,
_trans_b: BackendTranspose,
m: usize,
n: usize,
k: usize,
alpha: f64,
a_ptr: u64,
_lda: usize,
stride_a: usize,
b_ptr: u64,
_ldb: usize,
stride_b: usize,
beta: f64,
c_ptr: u64,
_ldc: usize,
stride_c: usize,
batch_count: usize,
) -> BackendResult<()>
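The stride parameters count f32 elements, so the per-batch address arithmetic (a hedged sketch of the offset rule described below) is:

```rust
// Device address of batch `b`'s matrix: strides count f32 elements,
// so multiply by 4 to get a byte offset from the base pointer.
fn batch_addr(base_ptr: u64, b: usize, stride_elems: usize) -> u64 {
    base_ptr + (b as u64) * (stride_elems as u64) * 4
}

fn main() {
    // Batch of contiguous 2x2 f32 matrices: stride = 4 elements = 16 bytes.
    let a_ptr = 0x1000u64;
    assert_eq!(batch_addr(a_ptr, 0, 4), 0x1000);
    assert_eq!(batch_addr(a_ptr, 2, 4), 0x1000 + 32); // 2 batches * 16 bytes
}
```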
Strided batched GEMM: for each batch b in 0..batch_count, compute
C_b = alpha * op(A_b) * op(B_b) + beta * C_b,
where A_b starts at a_ptr + b * stride_a * 4 bytes (f32 elements), and similarly for B_b and C_b.
fn conv2d_forward(
&self,
input_ptr: u64,
input_shape: &[usize],
filter_ptr: u64,
filter_shape: &[usize],
output_ptr: u64,
output_shape: &[usize],
stride: &[usize],
padding: &[usize],
) -> BackendResult<()>
2D convolution forward pass.
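The relationship between input_shape, filter_shape, stride, and padding is not spelled out here; assuming the usual convolution arithmetic, each spatial extent of output_shape would follow:

```rust
// Usual output-extent formula for one spatial dimension of a convolution:
// out = (in + 2 * pad - filter) / stride + 1 (integer division).
// An assumption about this backend's convention, not taken from its docs.
fn conv_out_dim(input: usize, filter: usize, stride: usize, pad: usize) -> usize {
    (input + 2 * pad - filter) / stride + 1
}

fn main() {
    // 32x32 input, 3x3 filter, stride 1, padding 1 -> 32x32 ("same" shape).
    assert_eq!(conv_out_dim(32, 3, 1, 1), 32);
    // Stride 2, no padding: 32 -> 15.
    assert_eq!(conv_out_dim(32, 3, 2, 0), 15);
}
```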
fn attention(
&self,
q_ptr: u64,
k_ptr: u64,
v_ptr: u64,
o_ptr: u64,
batch: usize,
heads: usize,
seq_q: usize,
seq_kv: usize,
head_dim: usize,
scale: f64,
causal: bool,
) -> BackendResult<()>
Scaled dot-product attention.
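A hedged single-head CPU reference for the computation, O = softmax(scale * Q K^T [+ causal mask]) V, with row-major f32 layouts assumed:

```rust
// Single-head scaled dot-product attention, CPU reference.
// Q: [seq_q, head_dim], K/V: [seq_kv, head_dim], output: [seq_q, head_dim].
fn attention_ref(
    q: &[f32], k: &[f32], v: &[f32],
    seq_q: usize, seq_kv: usize, head_dim: usize,
    scale: f32, causal: bool,
) -> Vec<f32> {
    let mut out = vec![0.0f32; seq_q * head_dim];
    for i in 0..seq_q {
        // Scores for query i against every key; masked entries stay -inf.
        let mut scores = vec![f32::NEG_INFINITY; seq_kv];
        for j in 0..seq_kv {
            if causal && j > i {
                continue; // mask out future positions
            }
            let mut dot = 0.0f32;
            for d in 0..head_dim {
                dot += q[i * head_dim + d] * k[j * head_dim + d];
            }
            scores[j] = dot * scale;
        }
        // Numerically stable softmax over the score row.
        let max = scores.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
        let mut denom = 0.0f32;
        for s in scores.iter_mut() {
            *s = (*s - max).exp(); // exp(-inf) = 0 for masked entries
            denom += *s;
        }
        // Weighted sum over the value rows.
        for j in 0..seq_kv {
            let w = scores[j] / denom;
            for d in 0..head_dim {
                out[i * head_dim + d] += w * v[j * head_dim + d];
            }
        }
    }
    out
}

fn main() {
    // One query, one key/value pair: the output is exactly V.
    let q = [1.0, 0.0];
    let k = [1.0, 0.0];
    let v = [0.5, -0.5];
    let o = attention_ref(&q, &k, &v, 1, 1, 2, 0.7071, false);
    assert_eq!(o, vec![0.5, -0.5]);
}
```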
fn reduce(
&self,
op: ReduceOp,
input_ptr: u64,
output_ptr: u64,
shape: &[usize],
axis: usize,
) -> BackendResult<()>
Reduction along an axis.
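A hedged CPU sketch of one such reduction, summing a 2-D tensor along a chosen axis (ReduceOp presumably covers other reductions like max or mean analogously):

```rust
// Sum-reduction along one axis of a row-major 2-D f32 tensor.
// axis 0 collapses rows (output length = cols); axis 1 collapses columns.
fn reduce_sum_2d(input: &[f32], shape: [usize; 2], axis: usize) -> Vec<f32> {
    let (rows, cols) = (shape[0], shape[1]);
    if axis == 0 {
        let mut out = vec![0.0f32; cols];
        for r in 0..rows {
            for c in 0..cols {
                out[c] += input[r * cols + c];
            }
        }
        out
    } else {
        let mut out = vec![0.0f32; rows];
        for r in 0..rows {
            for c in 0..cols {
                out[r] += input[r * cols + c];
            }
        }
        out
    }
}

fn main() {
    let x = [1.0, 2.0, 3.0, 4.0]; // [[1, 2], [3, 4]]
    assert_eq!(reduce_sum_2d(&x, [2, 2], 0), vec![4.0, 6.0]); // column sums
    assert_eq!(reduce_sum_2d(&x, [2, 2], 1), vec![3.0, 7.0]); // row sums
}
```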
fn unary(
&self,
op: UnaryOp,
input_ptr: u64,
output_ptr: u64,
n: usize,
) -> BackendResult<()>
Element-wise unary operation.
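A minimal CPU sketch of one possible UnaryOp (a ReLU-style max(x, 0), chosen here for illustration) applied over n contiguous f32 values:

```rust
// Element-wise unary op over n contiguous f32 values; here max(x, 0).
fn unary_relu(input: &[f32], output: &mut [f32], n: usize) {
    for i in 0..n {
        output[i] = input[i].max(0.0);
    }
}

fn main() {
    let x = [-1.0, 0.5, 2.0];
    let mut y = [0.0f32; 3];
    unary_relu(&x, &mut y, 3);
    assert_eq!(y, [0.0, 0.5, 2.0]);
}
```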
fn binary(
&self,
op: BinaryOp,
a_ptr: u64,
b_ptr: u64,
output_ptr: u64,
n: usize,
) -> BackendResult<()>
Element-wise binary operation.
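Likewise a minimal CPU sketch of one possible BinaryOp (addition, chosen for illustration) combining two buffers element by element:

```rust
// Element-wise binary op over n contiguous f32 values; here addition.
fn binary_add(a: &[f32], b: &[f32], out: &mut [f32], n: usize) {
    for i in 0..n {
        out[i] = a[i] + b[i];
    }
}

fn main() {
    let a = [1.0, 2.0];
    let b = [10.0, 20.0];
    let mut c = [0.0f32; 2];
    binary_add(&a, &b, &mut c, 2);
    assert_eq!(c, [11.0, 22.0]);
}
```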
fn synchronize(&self) -> BackendResult<()>
Synchronize all pending operations on this backend.
fn free(&self, ptr: u64) -> BackendResult<()>
Free device memory previously allocated with alloc.
impl Debug for LevelZeroBackend
Auto Trait Implementations§
impl Freeze for LevelZeroBackend
impl RefUnwindSafe for LevelZeroBackend
impl Send for LevelZeroBackend
impl Sync for LevelZeroBackend
impl Unpin for LevelZeroBackend
impl UnsafeUnpin for LevelZeroBackend
impl UnwindSafe for LevelZeroBackend
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,

fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.