amdgpu compiler intrinsics.
Intrinsics defined for the amdgpu LLVM backend. Availability of intrinsics varies depending on the target architecture.
Functions
- ballot - Returns a bitfield (`i32` or `i64`) containing the result of its i1 argument in all active lanes, and zero in all inactive lanes.
- dispatch_id - Returns the id of the dispatch that is currently executed.
- ds_bpermute ⚠ - Gather data across all lanes in a wavefront.
- ds_permute ⚠ - Scatter data across all lanes in a wavefront.
- endpgm - Stop execution of the wavefront.
- global_atomic_cond_sub ⚠ - Conditional atomic subtraction.
- global_atomic_csub ⚠ - Clamping atomic subtraction.
- groupstaticsize - Returns the number of LDS bytes statically allocated for this program.
- inverse_ballot - Indexes into `value` with the current lane id and returns, for each lane, whether the corresponding bit is set.
- mbcnt_hi - Masked bit count, high 32 lanes.
- mbcnt_lo - Masked bit count, low 32 lanes.
- perm ⚠ - Permute a 64-bit value.
- permlane16_swap ⚠ - Provides direct access to the `v_permlane16_swap_b32` instruction on supported targets.
- permlane16_u32 ⚠ - Performs an arbitrary gather-style operation within a row (16 contiguous lanes) of the second input operand.
- permlane16_var ⚠ - Performs an arbitrary gather-style operation within a row (16 contiguous lanes) of the second input operand.
- permlane32_swap ⚠ - Provides direct access to the `v_permlane32_swap_b32` instruction on supported targets.
- permlane64_u32 ⚠ - Swap `value` between the upper and lower 32 lanes in a wavefront.
- permlanex16_u32 ⚠ - Performs an arbitrary gather-style operation across two rows (16 contiguous lanes each) of the second input operand.
- permlanex16_var ⚠ - Performs an arbitrary gather-style operation across two rows (16 contiguous lanes each) of the second input operand.
- readfirstlane_u32 - Get `value` from the first active lane in the wavefront.
- readfirstlane_u64 - Get `value` from the first active lane in the wavefront.
- readlane_u32 ⚠ - Get `value` from the lane at index `lane` in the wavefront.
- readlane_u64 ⚠ - Get `value` from the lane at index `lane` in the wavefront.
- s_barrier - Synchronize all wavefronts in a workgroup.
- s_get_waveid_in_workgroup - Get the index of the current wavefront in the workgroup.
- s_memrealtime - Measures time based on a fixed frequency.
- s_sethalt - Stop execution of the kernel.
- s_sleep - Sleeps for approximately `count * 64` cycles.
- update_dpp ⚠ - The `update_dpp` intrinsic represents the `update.dpp` operation in AMDGPU. It takes an old value, a source operand, a DPP control operand, a row mask, a bank mask, and a bound control. This operation is equivalent to a sequence of `v_mov_b32` operations.
- wave_id - Get the index of the current wavefront in the workgroup.
- wavefrontsize - Returns the number of threads in a wavefront.
- workgroup_id_x - Returns the x coordinate of the workgroup index within the dispatch.
- workgroup_id_y - Returns the y coordinate of the workgroup index within the dispatch.
- workgroup_id_z - Returns the z coordinate of the workgroup index within the dispatch.
- workitem_id_x - Returns the x coordinate of the workitem index within the workgroup.
- workitem_id_y - Returns the y coordinate of the workitem index within the workgroup.
- workitem_id_z - Returns the z coordinate of the workitem index within the workgroup.
- writelane_u32 ⚠ - Return `value` for the lane at index `lane` in the wavefront. Return `default` for all other lanes.
- writelane_u64 ⚠ - Return `value` for the lane at index `lane` in the wavefront. Return `default` for all other lanes.
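
The short descriptions of `ballot`, `inverse_ballot`, `mbcnt_lo` and `mbcnt_hi` are easier to follow with a concrete lane-by-lane picture. The sketch below is a host-side model in plain Rust, not a call into this module: every `*_model` function, the `exec` mask parameter, the explicit `lane` argument, and the fixed 64-lane wavefront are assumptions made for illustration, and the real intrinsics' signatures may differ. It shows how a ballot mask combined with the two masked bit counts yields, for each lane, the number of lower-numbered lanes whose predicate was true.

```rust
// Hypothetical host-side model of a 64-lane wavefront (assumption:
// wavefront size 64). None of these functions are the real intrinsics.
const WAVE_SIZE: usize = 64;

/// Model of `ballot`: bit `i` is set iff lane `i` is active in `exec`
/// and its predicate is true; inactive lanes contribute zero.
fn ballot_model(exec: u64, pred: &[bool; WAVE_SIZE]) -> u64 {
    let mut mask = 0u64;
    for lane in 0..WAVE_SIZE {
        let active = (exec >> lane) & 1 == 1;
        if active && pred[lane] {
            mask |= 1u64 << lane;
        }
    }
    mask
}

/// Model of `inverse_ballot`: each lane reads its own bit of `value`.
fn inverse_ballot_model(value: u64, lane: usize) -> bool {
    (value >> lane) & 1 == 1
}

/// Bitmask selecting all lanes strictly below `lane` (requires lane < 64).
fn lanes_below(lane: usize) -> u64 {
    (1u64 << lane) - 1
}

/// Model of `mbcnt_lo`: count the set bits of `mask_lo` that belong to
/// lanes 0..32 below the current lane, then add `base`.
fn mbcnt_lo_model(mask_lo: u32, base: u32, lane: usize) -> u32 {
    let below = lanes_below(lane) as u32; // keeps the low 32 bits
    base.wrapping_add((mask_lo & below).count_ones())
}

/// Model of `mbcnt_hi`: same as above for lanes 32..64.
fn mbcnt_hi_model(mask_hi: u32, base: u32, lane: usize) -> u32 {
    let below = (lanes_below(lane) >> 32) as u32;
    base.wrapping_add((mask_hi & below).count_ones())
}

/// Chain the low and high counts to get the number of set bits in
/// `mask` that belong to lanes below `lane`.
fn prefix_rank(mask: u64, lane: usize) -> u32 {
    let lo = mbcnt_lo_model(mask as u32, 0, lane);
    mbcnt_hi_model((mask >> 32) as u32, lo, lane)
}

fn main() {
    // All lanes active; every third lane has a true predicate.
    let exec = u64::MAX;
    let mut pred = [false; WAVE_SIZE];
    for lane in (0..WAVE_SIZE).step_by(3) {
        pred[lane] = true;
    }

    let mask = ballot_model(exec, &pred);
    for lane in [0usize, 3, 33, 63] {
        println!(
            "lane {lane}: bit set = {}, lanes below with true predicate = {}",
            inverse_ballot_model(mask, lane),
            prefix_rank(mask, lane)
        );
    }
}
```

In the model, `prefix_rank` gives each lane with a true predicate a distinct, dense index, which is the usual reason to pair a ballot with the two masked bit counts (for example, to compact values into consecutive output slots).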