pub struct Renderer {
    pub device: String,
    pub has_local: bool,
    pub has_shared: bool,
    pub has_threads: bool,
    pub shared_max: usize,
    pub global_max: Option<Vec<usize>>,
    pub local_max: Option<usize>,
    pub upcast_max: usize,
    pub buffer_max: Option<usize>,
    pub tensor_cores: Vec<TensorCore>,
}
Backend renderer capabilities.
Describes what features and optimizations a particular backend supports. Used by the optimizer to determine valid transformations and enforce device limits.
Fields§
device: String
Backend device identifier (e.g., “CUDA”, “Metal”, “CPU”).
has_local: bool
Whether the backend supports local/shared memory (GPU workgroups).
has_shared: bool
Whether the backend supports shared memory across threads in a workgroup.
has_threads: bool
Whether the backend supports CPU-style threading (not GPU threads).
shared_max: usize
Maximum shared memory size in bytes.
Used to validate GROUP/GROUPTOP optimizations that allocate shared memory. Typical values: 48KB-96KB for modern GPUs.
global_max: Option<Vec<usize>>
Maximum global work dimensions [x, y, z].
Maximum size for each global thread dimension. Used to validate thread count in THREAD optimization. None if unlimited or not applicable.
local_max: Option<usize>
Maximum local work group size.
Maximum number of threads in a workgroup (product of local dimensions). Typical values: 256-1024 for GPUs.
upcast_max: usize
Maximum vectorization width (upcast limit).
Maximum number of elements that can be processed as a vector. Typical values: 8-16 for SIMD, 4 for GPU float4.
buffer_max: Option<usize>
Maximum number of buffers/arguments per kernel.
Some backends have limits on kernel arguments. Metal: 31, WebGPU: 8, CUDA: typically unlimited.
tensor_cores: Vec<TensorCore>
Available tensor core configurations.
Hardware-accelerated matrix multiplication units with specific size constraints. Empty if tensor cores are not available.
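The limits described in these fields can be read as simple predicates on a candidate kernel configuration. The following is a minimal sketch of that idea, not this crate's optimizer: `Limits` is a local stand-in that mirrors a subset of the documented fields, `KernelShape`, `check`, and `example_limits` are hypothetical names, and the numbers are illustrative rather than the values any constructor returns.

```rust
/// Local stand-in mirroring a subset of the documented `Renderer` fields.
/// Assumption: kept separate from the real type so the sketch compiles on its own.
struct Limits {
    has_shared: bool,
    shared_max: usize,
    global_max: Option<Vec<usize>>,
    local_max: Option<usize>,
    upcast_max: usize,
    buffer_max: Option<usize>,
}

/// Hypothetical description of a candidate kernel (not a type from the crate).
struct KernelShape {
    global: Vec<usize>,
    local: Vec<usize>,
    shared_bytes: usize,
    upcast: usize,
    buffers: usize,
}

/// Returns a message naming the first violated limit, or `Ok(())` if the shape fits.
fn check(l: &Limits, k: &KernelShape) -> Result<(), String> {
    if k.shared_bytes > 0 && !l.has_shared {
        return Err("shared memory is not supported".to_string());
    }
    if k.shared_bytes > l.shared_max {
        return Err(format!("shared bytes {} > shared_max {}", k.shared_bytes, l.shared_max));
    }
    // `None` means unlimited or not applicable, so the check is skipped.
    if let Some(max) = &l.global_max {
        for (axis, (size, limit)) in k.global.iter().zip(max).enumerate() {
            if size > limit {
                return Err(format!("global[{}] = {} > global_max {}", axis, size, limit));
            }
        }
    }
    if let Some(max) = l.local_max {
        // The workgroup size is the product of the local dimensions.
        let threads: usize = k.local.iter().product();
        if threads > max {
            return Err(format!("workgroup size {} > local_max {}", threads, max));
        }
    }
    if k.upcast > l.upcast_max {
        return Err(format!("upcast {} > upcast_max {}", k.upcast, l.upcast_max));
    }
    if let Some(max) = l.buffer_max {
        if k.buffers > max {
            return Err(format!("{} buffers > buffer_max {}", k.buffers, max));
        }
    }
    Ok(())
}

/// Illustrative limits only; not taken from any real backend constructor.
fn example_limits() -> Limits {
    Limits {
        has_shared: true,
        shared_max: 48 * 1024,
        global_max: Some(vec![2_147_483_647, 65_535, 65_535]),
        local_max: Some(1024),
        upcast_max: 4,
        buffer_max: Some(31),
    }
}

fn main() {
    let limits = example_limits();
    let shape = KernelShape {
        global: vec![1024, 1, 1],
        local: vec![64, 32], // 2048 threads, above the illustrative local_max
        shared_bytes: 4096,
        upcast: 4,
        buffers: 3,
    };
    match check(&limits, &shape) {
        Ok(()) => println!("shape fits"),
        Err(why) => println!("rejected: {}", why),
    }
}
```

Treating `None` as "no check" follows the field descriptions, where `None` stands for unlimited or not applicable.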
Implementations§
impl Renderer
pub fn cuda() -> Self
Create a CUDA GPU renderer configuration (SM80/Ampere by default).
For specific architectures, use cuda_sm75(), cuda_sm80(), or cuda_sm89().
pub fn cuda_sm80(allow_tf32: bool) -> Self
Create a CUDA GPU renderer for SM80 (Ampere - A100, RTX 30xx).
pub fn cuda_sm89(allow_tf32: bool) -> Self
Create a CUDA GPU renderer for SM89 (Ada Lovelace - RTX 40xx).
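When the device's compute capability is known at runtime, choosing among these constructors is a small lookup. The sketch below is an assumption-laden illustration: `CudaPreset` and `preset_for` are hypothetical names that do not exist in the crate, and treating compute capability 8.6 as covered by the SM80 preset is a guess based on the RTX 30xx mention above.

```rust
/// Hypothetical selector naming which documented constructor to call.
/// This enum is not part of the crate; it only records the choice.
#[derive(Debug, PartialEq)]
enum CudaPreset {
    Sm75,
    Sm80 { allow_tf32: bool },
    Sm89 { allow_tf32: bool },
}

/// Maps a CUDA compute capability (major, minor) to a preset.
/// Returns `None` for capabilities this page does not mention.
fn preset_for(major: u32, minor: u32, allow_tf32: bool) -> Option<CudaPreset> {
    match (major, minor) {
        (7, 5) => Some(CudaPreset::Sm75),
        // Assumption: 8.6 parts (RTX 30xx) reuse the SM80 configuration.
        (8, 0) | (8, 6) => Some(CudaPreset::Sm80 { allow_tf32 }),
        (8, 9) => Some(CudaPreset::Sm89 { allow_tf32 }),
        _ => None,
    }
}

fn main() {
    // With the crate in scope, each variant would be matched to the
    // corresponding call, e.g. `Renderer::cuda_sm80(allow_tf32)`.
    println!("{:?}", preset_for(8, 0, true));
    println!("{:?}", preset_for(6, 1, false));
}
```

Returning `None` for unlisted capabilities leaves the fallback decision, such as calling `Renderer::cuda()`, to the caller.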
Trait Implementations§
Auto Trait Implementations§
impl Freeze for Renderer
impl RefUnwindSafe for Renderer
impl Send for Renderer
impl Sync for Renderer
impl Unpin for Renderer
impl UnsafeUnpin for Renderer
impl UnwindSafe for Renderer
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.