Crate cubecl_core

Re-exports:
pub use frontend::cmma;
pub use cubecl_ir as ir;
pub use codegen::*;

Modules:
backtrace: Backtrace module to build error reports.
benchmark: Module for benchmark timings.
bytes: Utilities module to manipulate bytes.
cache: Cache module for an efficient in-memory and persistent database.
client: Compute client module.
codegen
compute
device: Device module.
format: Format utilities.
frontend: Cube frontend types.
future: Future utilities with a compatible API for native, non-std, and wasm environments.
io: Input/output utilities.
map: Map utilities and implementations.
post_processing
prelude
profile: Module for profiling any executable part.
quant: Quantization primitives required outside of cubecl-quant.
rand: Types for random number generation in std and non-std environments.
reader: Useful when you need to read async data without having to decorate each function with async notation.
server: Compute server module.
stream_id: Stream-id related utilities.
stub: Stub types for std and non-std environments.
tune: Autotune module.

Macros:
comment: Insert a literal comment into the kernel source code.
comptime: Mark the contents of this macro as compile-time values, turning off all expansion for this code and using it verbatim.
comptime_type: Makes the function return a compile-time value. Useful in a cube trait to have part of the trait return comptime values.
debug_print: Print a formatted message using the target's debug print facilities. The format string is target specific; Vulkan and CUDA both use the C++ conventions. WGSL isn't currently supported.
debug_print_expand: Print a formatted message using the target's debug print facilities. The format string is target specific; Vulkan and CUDA both use the C++ conventions. WGSL isn't currently supported.
intrinsic: Mark the contents of this macro as an intrinsic, turning off all expansion for this code and calling it with the scope.
terminate: Terminate the execution of the kernel for the current unit.
unexpanded

Structs:
CubeDim: The number of units across all three axes, totaling the number of working units in a cube.
CubeTuneId: ID used to identify a just-in-time environment.
MemoryUsage: Amount of memory in use by this allocator, with statistics on how much memory is reserved and wasted in total.
e2m1: A 4-bit floating-point type with 2 exponent bits and 1 mantissa bit.
e2m3: A 6-bit floating-point type with 2 exponent bits and 3 mantissa bits.
e2m1x2: A 4-bit floating-point type with 2 exponent bits and 1 mantissa bit, packed with two elements per value to allow conversion to/from bytes. Care must be taken to ensure the shape is adjusted appropriately.
e3m2: A 6-bit floating-point type with 3 exponent bits and 2 mantissa bits.
e4m3: An 8-bit floating-point type with 4 exponent bits and 3 mantissa bits.
e5m2: An 8-bit floating-point type with 5 exponent bits and 2 mantissa bits.
flex32: A floating-point type with relaxed precision: minimum f16, maximum f32.
tf32: A 19-bit floating-point type implementing the tfloat32 format.
ue8m0: An 8-bit unsigned floating-point type with 8 exponent bits and no mantissa bits. Used for scaling factors.

Enums:
CompilationError: JIT compilation error.
CubeCount: Specifies the number of cubes to be dispatched for a kernel.
ExecutionMode: The kind of execution to be performed.
LineSizeError
MemoryConfiguration: High-level configuration of memory management.

Traits:
Compiler: Compiles the representation into its own representation that can be formatted into tokens.
CubeElement: The base element trait for the JIT backend.
CubeScalar
CubeTask: Kernel trait with the ComputeShader that will be compiled and cached based on the provided id.
Runtime: Runtime for CubeCL.

Functions:
calculate_cube_count_elemwise: Calculate the number of cubes required to execute an operation where one cube unit is assigned to one element.
tensor_line_size
tensor_line_size_parallel: Find the maximum line size usable for parallel vectorization along the given axis from the supported line sizes, or return 1 if vectorization is impossible.
tensor_line_size_perpendicular: Find the maximum line size usable for perpendicular vectorization along the given axis from the supported line sizes, or return 1 if vectorization is impossible.
tensor_vectorization_factor
try_tensor_line_size_parallel: Like tensor_line_size_parallel, but does not assume a line size of 1 is supported.
try_tensor_line_size_perpendicular: Like tensor_line_size_perpendicular, but does not assume a line size of 1 is supported.

Type Aliases:
RuntimeArg: Runtime arguments to launch a kernel.

Attribute Macros:
cube: Mark a cube function, trait, or implementation for expansion.
derive_cube_comptime: Define a type that can be used as a kernel comptime argument. This derives Debug, Hash, PartialEq, Eq, Clone, and Copy.

Derive Macros:
AutotuneKey: Implements display and initialization for autotune keys.
CubeLaunch: Derive macro to define a cube type that is launched with a kernel.
CubeType: Derive macro to define a cube type that is not launched.