Elementwise GPU operations for OxiCUDA BLAS.
This module provides unary and binary elementwise operations over device buffers, including activation functions (ReLU, GELU, sigmoid, SiLU, tanh), scaling, and fused operations (add+relu, scale+add).
Each function generates PTX on the fly via oxicuda_ptx::templates::elementwise::ElementwiseTemplate from oxicuda-ptx, loads the resulting module, and launches the kernel on the handle’s stream.
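To make the fused variants concrete, the following host-side Rust sketch compares a two-pass add-then-ReLU with a single-pass version that computes max(0, a + b) directly, and shows the scale+add arithmetic alpha * a + beta * b. It does not call OxiCUDA or generate PTX; the function names (add_then_relu, fused_add_relu_ref, fused_scale_add_ref) are hypothetical and exist only for this illustration.

```rust
// Host-side illustration only; none of these functions are part of OxiCUDA.

/// Two-pass reference: materialize the sum, then clamp it at zero.
fn add_then_relu(a: &[f32], b: &[f32]) -> Vec<f32> {
    assert_eq!(a.len(), b.len(), "inputs must have equal length");
    let sum: Vec<f32> = a.iter().zip(b).map(|(x, y)| x + y).collect();
    sum.iter().map(|s| s.max(0.0)).collect()
}

/// Single-pass reference: c[i] = max(0, a[i] + b[i]), no intermediate buffer.
fn fused_add_relu_ref(a: &[f32], b: &[f32]) -> Vec<f32> {
    assert_eq!(a.len(), b.len(), "inputs must have equal length");
    a.iter().zip(b).map(|(x, y)| (x + y).max(0.0)).collect()
}

/// Single-pass reference: c[i] = alpha * a[i] + beta * b[i].
fn fused_scale_add_ref(alpha: f32, a: &[f32], beta: f32, b: &[f32]) -> Vec<f32> {
    assert_eq!(a.len(), b.len(), "inputs must have equal length");
    a.iter().zip(b).map(|(x, y)| alpha * x + beta * y).collect()
}

fn main() {
    let a = [1.0_f32, -2.0, 3.0];
    let b = [-4.0_f32, 1.0, 0.5];

    // Both paths compute the same values; the fused one reads and writes each element once.
    assert_eq!(add_then_relu(&a, &b), fused_add_relu_ref(&a, &b));

    println!("{:?}", fused_add_relu_ref(&a, &b));
    println!("{:?}", fused_scale_add_ref(2.0, &a, 0.5, &b));
}
```

On a GPU, the single-pass form avoids writing the intermediate sum to device memory and reading it back. That saving is the usual motivation for offering fused elementwise kernels.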
Enums
- ElementwiseOp - Elementwise operation types supported by the BLAS elementwise module.
Functions
- abs_val - Computes the absolute value element-wise: output[i] = |input[i]|.
- add - Element-wise addition: C[i] = A[i] + B[i].
- broadcast_axes - Broadcasts src (a reduced tensor) back to dst (the full original shape) by replicating values along every axis listed in reduced_axes.
- ceil - Computes the ceiling element-wise: output[i] = ceil(input[i]).
- cmp_eq - Comparison equal: C[i] = (A[i] == B[i]) ? 1.0 : 0.0.
- cmp_ge - Comparison greater-or-equal: C[i] = (A[i] >= B[i]) ? 1.0 : 0.0.
- cmp_gt - Comparison greater-than: C[i] = (A[i] > B[i]) ? 1.0 : 0.0.
- cmp_le - Comparison less-or-equal: C[i] = (A[i] <= B[i]) ? 1.0 : 0.0.
- cmp_lt - Comparison less-than: C[i] = (A[i] < B[i]) ? 1.0 : 0.0.
- cmp_ne - Comparison not-equal: C[i] = (A[i] != B[i]) ? 1.0 : 0.0.
- div - Element-wise division: C[i] = A[i] / B[i].
- exp - Computes the exponential element-wise: output[i] = exp(input[i]).
- fill - Fills every element of dst[0..n] with value on the GPU.
- floor - Computes the floor element-wise: output[i] = floor(input[i]).
- fused_add_relu - Fused Add + ReLU: C[i] = max(0, A[i] + B[i]).
- fused_scale_add - Fused Scale-Add: C[i] = alpha * A[i] + beta * B[i].
- gelu - Applies the GELU activation element-wise (tanh approximation).
- hard_sigmoid - Applies hard sigmoid element-wise: output[i] = max(0, min(1, 0.2*input[i] + 0.5)).
- hard_swish - Applies hard swish element-wise: output[i] = input[i] * max(0, min(6, input[i]+3)) / 6.
- leaky_relu - Applies leaky relu element-wise with alpha=0.01: output[i] = input[i] >= 0 ? input[i] : 0.01 * input[i].
- log - Computes the natural logarithm element-wise: output[i] = ln(input[i]).
- max - Element-wise maximum: C[i] = max(A[i], B[i]).
- min - Element-wise minimum: C[i] = min(A[i], B[i]).
- mul - Element-wise multiplication (Hadamard product): C[i] = A[i] * B[i].
- nand - Fuzzy NAND: C[i] = 1 - A[i]*B[i].
- neg - Negates every element: output[i] = -input[i].
- nor - Fuzzy NOR: C[i] = 1 - (A[i] + B[i] - A[i]*B[i]).
- one_minus - Applies one-minus element-wise: output[i] = 1 - input[i].
- or_max - Fuzzy OR via max: C[i] = max(A[i], B[i]).
- or_prob_sum - Probabilistic OR: C[i] = A[i] + B[i] - A[i]*B[i].
- pow - Element-wise power: C[i] = A[i]^B[i].
- relu - Applies the ReLU activation element-wise.
- rsqrt - Computes the reciprocal square root element-wise: output[i] = 1 / sqrt(input[i]).
- scale - Scales every element by a scalar: output[i] = alpha * input[i].
- sigmoid - Applies the sigmoid activation element-wise.
- silu - Applies the SiLU (Swish) activation element-wise.
- softplus - Applies softplus element-wise: output[i] = ln(1 + exp(input[i])).
- sqrt - Computes the square root element-wise: output[i] = sqrt(input[i]).
- sub - Element-wise subtraction: C[i] = A[i] - B[i].
- tanh_activation - Applies the hyperbolic tangent activation element-wise.
- xor - Fuzzy XOR: C[i] = A[i] + B[i] - 2*A[i]*B[i].
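Several entries above (relu, sigmoid, silu, gelu) are listed without a formula. The sketch below writes out scalar host-side reference versions in plain Rust, using the standard textbook definitions for those four and the formulas quoted in the list for the rest. This is an assumption-laden reference, not the OxiCUDA implementation: the function names ending in _ref are hypothetical, the GELU constant 0.044715 is the commonly used value for the tanh approximation and is assumed here, and the generated kernels may differ in rounding or evaluation order. Compare results with a tolerance rather than exact equality.

```rust
// Scalar host-side reference formulas; illustrative only, not part of OxiCUDA.

/// ReLU: max(0, x).
fn relu_ref(x: f32) -> f32 { x.max(0.0) }

/// Logistic sigmoid: 1 / (1 + exp(-x)).
fn sigmoid_ref(x: f32) -> f32 { 1.0 / (1.0 + (-x).exp()) }

/// SiLU (Swish): x * sigmoid(x).
fn silu_ref(x: f32) -> f32 { x * sigmoid_ref(x) }

/// GELU, tanh approximation (constant 0.044715 is assumed, see lead-in):
/// 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3))).
fn gelu_tanh_ref(x: f32) -> f32 {
    let k = (2.0 / std::f32::consts::PI).sqrt();
    0.5 * x * (1.0 + (k * (x + 0.044715 * x * x * x)).tanh())
}

/// Hard sigmoid: max(0, min(1, 0.2*x + 0.5)).
fn hard_sigmoid_ref(x: f32) -> f32 { (0.2 * x + 0.5).min(1.0).max(0.0) }

/// Hard swish: x * max(0, min(6, x + 3)) / 6.
fn hard_swish_ref(x: f32) -> f32 { x * (x + 3.0).min(6.0).max(0.0) / 6.0 }

/// Leaky ReLU with alpha = 0.01.
fn leaky_relu_ref(x: f32) -> f32 { if x >= 0.0 { x } else { 0.01 * x } }

/// Softplus: ln(1 + exp(x)).
fn softplus_ref(x: f32) -> f32 { (1.0 + x.exp()).ln() }

/// Fuzzy NAND: 1 - a*b.
fn nand_ref(a: f32, b: f32) -> f32 { 1.0 - a * b }

/// Probabilistic OR: a + b - a*b.
fn or_prob_sum_ref(a: f32, b: f32) -> f32 { a + b - a * b }

/// Fuzzy NOR: 1 - (a + b - a*b).
fn nor_ref(a: f32, b: f32) -> f32 { 1.0 - or_prob_sum_ref(a, b) }

/// Fuzzy XOR: a + b - 2*a*b.
fn xor_ref(a: f32, b: f32) -> f32 { a + b - 2.0 * a * b }

/// Comparison encoded as a float mask: 1.0 when a >= b, otherwise 0.0.
fn cmp_ge_ref(a: f32, b: f32) -> f32 { if a >= b { 1.0 } else { 0.0 } }

fn main() {
    // On crisp inputs (0.0 or 1.0) the fuzzy operators reduce to Boolean logic.
    assert_eq!(nand_ref(1.0, 1.0), 0.0);
    assert_eq!(nor_ref(0.0, 0.0), 1.0);
    assert_eq!(xor_ref(1.0, 0.0), 1.0);
    assert_eq!(xor_ref(1.0, 1.0), 0.0);

    // Activations evaluated at a few sample points.
    for x in [-3.0_f32, -1.0, 0.0, 1.0, 3.0] {
        println!(
            "x={x:5.1} relu={:.4} silu={:.4} gelu={:.4} hsig={:.4} hswish={:.4} leaky={:.4} softplus={:.4}",
            relu_ref(x), silu_ref(x), gelu_tanh_ref(x), hard_sigmoid_ref(x),
            hard_swish_ref(x), leaky_relu_ref(x), softplus_ref(x)
        );
    }
    println!("cmp_ge(2, 1) = {}", cmp_ge_ref(2.0, 1.0));
}
```

A reference like this is mainly useful for checking device output. Copy the result buffer back to the host and compare each element against the matching _ref function within a small absolute or relative tolerance.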