Crate cubecl

Re-exports

benchmark: Module for benchmark timings.
channel: Compute channel module.
client: Compute client module.
codegen
compute
config: CubeCL config module.
frontend: Cube Frontend Types.
future: Future utils with a compatible API for native, non-std and wasm environments.
io: Input Output utilities.
ir
post_processing
prelude
server: Compute server module.
tune: Autotune module.

comment: Insert a literal comment into the kernel source code.
comptime: Mark the contents of this macro as compile-time values, turning off all expansion for this code and using it verbatim.
comptime_type: Makes the function return a compile-time value. Useful in a cube trait to have part of the trait return comptime values.
debug_print: Print a formatted message using the target’s debug print facilities. The format string is target specific, but Vulkan and CUDA both use the C++ conventions. WGSL isn’t currently supported.
debug_print_expand: Print a formatted message using the target’s debug print facilities. The format string is target specific, but Vulkan and CUDA both use the C++ conventions. WGSL isn’t currently supported.
intrinsic: Mark the contents of this macro as an intrinsic, turning off all expansion for this code and calling it with the scope.
terminate: Terminate the execution of the kernel for the current unit.
unexpanded

BufferInfo: Information related to a buffer binding.
CubeDim
CubeTuneId: ID used to identify a Just-in-Time environment.
KernelExpansion: The information necessary to compile a kernel definition.
KernelIntegrator: The kernel integrator allows you to create a kernel definition based on kernel expansion and kernel settings.
KernelOptions
KernelSettings
MemoryUsage: Amount of memory in use by this allocator and statistics on how much memory is reserved and wasted in total.
Metadata: Helper to calculate metadata offsets based on buffer count and position.
MetadataBuilder: Builder for a serialized metadata struct.
ScalarInfo: Information related to a scalar input.
WgpuCompilationOptions
flex32: A floating point type with relaxed precision, minimum f16, maximum f32.
tf32: A 19-bit floating point type implementing the tfloat32 format.
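
The 19 bits of the tfloat32 format are 1 sign bit, 8 exponent bits (the same as f32) and 10 mantissa bits. The sketch below illustrates that layout using only the standard library: it reduces an f32 to tfloat32 precision by clearing the 13 low mantissa bits. The helper name `truncate_to_tf32` is hypothetical and is not part of cubecl, and truncation (rounding toward zero) is an assumption made for simplicity; the crate's `tf32` type and real hardware may round differently.

```rust
/// Explicit mantissa bits in an IEEE 754 binary32 (`f32`) value.
const F32_MANTISSA_BITS: u32 = 23;
/// Explicit mantissa bits kept by the tfloat32 format.
const TF32_MANTISSA_BITS: u32 = 10;

/// Hypothetical helper (not a cubecl API): keep the sign bit, the 8 exponent
/// bits and the top 10 mantissa bits of an `f32`, clearing the remaining 13.
/// This rounds toward zero. A NaN whose payload lives only in the cleared
/// bits would turn into an infinity; this sketch does not handle that case.
fn truncate_to_tf32(value: f32) -> f32 {
    let dropped_bits = F32_MANTISSA_BITS - TF32_MANTISSA_BITS; // 13
    let keep_mask = !((1u32 << dropped_bits) - 1); // 0xFFFF_E000
    f32::from_bits(value.to_bits() & keep_mask)
}
```

With this helper, a value such as `1.0 + 2^-10` survives unchanged because it fits in 10 mantissa bits, while `1.0 + 2^-11` collapses to `1.0`.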

Compiler: Compiles the representation into its own representation that can be formatted into tokens.
CubeElement: The base element trait for the jit backend.
Runtime: Runtime for CubeCL.

calculate_cube_count_elemwise: Calculate the number of cubes required to execute an operation where one cube unit is assigned to one element.
tensor_line_size
tensor_line_size_parallel: Find the maximum line size usable for parallel vectorization along the given axis from the supported line sizes or return 1 if vectorization is impossible.
tensor_line_size_perpendicular: Find the maximum line size usable for perpendicular vectorization along the given axis from the supported line sizes or return 1 if vectorization is impossible.
tensor_vectorization_factor
try_tensor_line_size_parallel: Like tensor_line_size_parallel but does not assume 1 is supported.
try_tensor_line_size_perpendicular: Like tensor_line_size_perpendicular but does not assume 1 is supported.
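
The arithmetic behind the cube-count and line-size helpers above can be sketched in plain Rust. This is an illustration under stated assumptions, not the crate's implementation: the function names, the `units_per_cube` parameter and the divisibility-only selection rule are all hypothetical. The real functions operate on tensor shapes, strides and an axis, which this sketch leaves out.

```rust
/// Hypothetical sketch: number of cubes needed when each unit handles one
/// element. `units_per_cube` stands for the total number of units in a cube.
fn cube_count_elemwise(num_elems: usize, units_per_cube: usize) -> usize {
    assert!(units_per_cube > 0, "a cube must contain at least one unit");
    // Ceiling division so a partially filled last cube is still counted.
    num_elems.div_ceil(units_per_cube)
}

/// Hypothetical sketch: largest supported line size that evenly divides the
/// length of the vectorized axis, or `None` if no supported size fits.
/// Assumption: divisibility is the only criterion checked here.
fn try_line_size(supported: &[u8], axis_len: usize) -> Option<u8> {
    supported
        .iter()
        .copied()
        .filter(|&size| size > 0 && axis_len % size as usize == 0)
        .max()
}

/// Same search, but falls back to 1 (no vectorization) when nothing fits.
fn line_size(supported: &[u8], axis_len: usize) -> u8 {
    try_line_size(supported, axis_len).unwrap_or(1)
}
```

The split between `try_line_size` and `line_size` mirrors the distinction drawn in the listing: the `try_` variant reports failure instead of assuming that a line size of 1 is always available.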

cube: Mark a cube function, trait or implementation for expansion.
derive_cube_comptime: Attribute macro to define a type that can be used as a kernel comptime argument. This derives Debug, Hash, PartialEq, Eq, Clone, and Copy.

AutotuneKey: Implements display and initialization for autotune keys.
CubeLaunch: Derive macro to define a cube type that is launched with a kernel.
CubeType: Derive macro to define a cube type that is not launched.