Enum UnaryKind

#[non_exhaustive]
#[repr(u16)]
pub enum UnaryKind {
    Neg = 0,
    Abs = 1,
    Sign = 2,
    Reciprocal = 3,
    Square = 4,
    Cube = 5,
    Sqrt = 10,
    Rsqrt = 11,
    Cbrt = 12,
    Exp = 20,
    Exp2 = 21,
    Expm1 = 22,
    Log = 23,
    Log2 = 24,
    Log10 = 25,
    Log1p = 26,
    Sin = 30,
    Cos = 31,
    Tan = 32,
    Asin = 33,
    Acos = 34,
    Atan = 35,
    Sinh = 40,
    Cosh = 41,
    Tanh = 42,
    Asinh = 43,
    Acosh = 44,
    Atanh = 45,
    Floor = 50,
    Ceil = 51,
    Round = 52,
    Trunc = 53,
    Frac = 54,
    Erf = 60,
    Erfc = 61,
    Erfinv = 62,
    Lgamma = 63,
    Digamma = 64,
    BitwiseNot = 70,
    Popcount = 71,
    Clz = 72,
    Ctz = 73,
    Relu = 100,
    Gelu = 101,
    GeluTanh = 102,
    Silu = 103,
    Mish = 104,
    Sigmoid = 105,
    Logit = 106,
    Softplus = 107,
    Softsign = 108,
    Tanhshrink = 109,
    Relu6 = 110,
    Hardswish = 111,
    Hardsigmoid = 112,
    Hardtanh = 113,
    Selu = 114,
    LeakyRelu = 115,
    Elu = 116,
    Hardshrink = 117,
    Softshrink = 118,
    Threshold = 119,
    PReLU = 120,
    PowI = 121,
    Step = 122,
    Cast = 130,
    Affine = 131,
}

Unary elementwise op discriminant.

Stored as u16 in crate::KernelSku::op when category == OpCategory::UnaryElementwise. Variants correspond to the union of PyTorch (torch.<op> / torch.Tensor.<op>) and JAX (jax.numpy.<op> / jax.lax.<op>) unary elementwise ops, plus the activation family from PyTorch nn.functional.
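As a minimal sketch of what "stored as u16" implies, the snippet below mirrors a handful of variants in a local stand-in enum and converts in both directions. The `unary_kind_from_u16` helper is hypothetical and not part of the documented API; only the variant names and discriminant values come from the listing above.

```rust
// Stand-in for illustration: mirrors a few variants of the documented enum.
#[non_exhaustive]
#[repr(u16)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum UnaryKind {
    Neg = 0,
    Abs = 1,
    Sqrt = 10,
    Relu = 100,
}

/// Hypothetical decoder: turn a raw u16 back into a variant.
/// Returns None for values that no mirrored variant uses.
pub fn unary_kind_from_u16(raw: u16) -> Option<UnaryKind> {
    match raw {
        0 => Some(UnaryKind::Neg),
        1 => Some(UnaryKind::Abs),
        10 => Some(UnaryKind::Sqrt),
        100 => Some(UnaryKind::Relu),
        // The numbering leaves gaps between families (for example 6..=9),
        // so an unknown value must be handled rather than assumed valid.
        _ => None,
    }
}

fn main() {
    // Encoding is a plain `as` cast because of #[repr(u16)].
    let raw = UnaryKind::Sqrt as u16;
    println!("Sqrt -> {raw}");
    println!("{raw} -> {:?}", unary_kind_from_u16(raw));
    println!("7 -> {:?}", unary_kind_from_u16(7));
}
```

Decoding through an explicit `match` avoids `transmute` on untrusted values, which would be undefined behavior for any u16 that falls into one of the numbering gaps.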

Self::Neg was the first variant wired, as the Phase 3 unary trailblazer SKU. The other variants were introduced as reserved discriminants for the fanout sessions that ship the math (abs / sqrt / exp / log / sin / …) and activation (relu / gelu / silu / …) families; the per-variant notes below state which of them are wired.

Ops that return a different dtype than the input (isnan, isinf, isfinite, logical_not) are reserved here but will route through a future UnaryToBoolPlan (or similar) with a distinct output type — not through this enum’s UnaryPlan<T, N>.

Parameterized activations (leaky_relu(α), elu(α), threshold(t, v), hardshrink(λ), softshrink(λ)) carry their parameters via a UnaryParams field on the descriptor. That field lands when the first parameterized op ships and is omitted for the trailblazer.

Variants (Non-exhaustive)

This enum is marked as non-exhaustive

Non-exhaustive enums could have additional variants added in future. Therefore, when matching against variants of non-exhaustive enums, an extra wildcard arm must be added to account for any future variants.
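The wildcard requirement can be sketched as follows. The enum here is a local stand-in with three mirrored variants, and `describe` is a hypothetical function used only for illustration. Inside the defining crate the compiler does not force the wildcard arm, but downstream crates cannot compile a match on a non-exhaustive enum without it.

```rust
// Stand-in for illustration: mirrors three variants of the documented enum.
#[non_exhaustive]
#[repr(u16)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum UnaryKind {
    Neg = 0,
    Abs = 1,
    Relu = 100,
}

/// Hypothetical helper: map a kind to a short formula string.
pub fn describe(kind: UnaryKind) -> &'static str {
    match kind {
        UnaryKind::Neg => "y = -x",
        UnaryKind::Abs => "y = |x|",
        // Catch-all for variants this code does not know about,
        // including any added in a later release.
        _ => "unhandled unary op",
    }
}

fn main() {
    println!("{}", describe(UnaryKind::Neg));
    println!("{}", describe(UnaryKind::Relu));
}
```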

Neg = 0

y = -x — elementwise negation. Trailblazer SKU.

Abs = 1

y = |x| — elementwise absolute value.

Sign = 2

y = sign(x) — -1 / 0 / +1 per the input’s sign.

Reciprocal = 3

y = 1 / x — elementwise reciprocal.

Square = 4

y = x * x — elementwise square.

Cube = 5

y = x * x * x — elementwise cube.

Sqrt = 10

y = sqrt(x).

Rsqrt = 11

y = 1 / sqrt(x) — reciprocal square root.

Cbrt = 12

y = cbrt(x) — cube root.

Exp = 20

y = exp(x).

Exp2 = 21

y = 2^x.

Expm1 = 22

y = exp(x) - 1.

Log = 23

y = ln(x) — natural log.

Log2 = 24

y = log_2(x).

Log10 = 25

y = log_10(x).

Log1p = 26

y = ln(1 + x).
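The source does not say why Log1p and Expm1 get their own discriminants, but a common motivation for such ops is accuracy near zero, where forming `1 + x` or `exp(x) - 1` directly loses most significant digits. The sketch below illustrates that effect with Rust's standard `ln_1p` and `exp_m1`; the `rel_err` helper is a hypothetical name, and the kernels' actual numerics are not described in the source.

```rust
/// Relative error of `approx` against a reference value.
pub fn rel_err(approx: f64, reference: f64) -> f64 {
    ((approx - reference) / reference).abs()
}

fn main() {
    // For tiny x, ln(1 + x) and exp(x) - 1 are both approximately x.
    let x = 1e-10_f64;

    // Naive forms: 1 + x is rounded to the nearest f64 before ln is applied,
    // and exp(x) is rounded before 1 is subtracted.
    let naive_log = (1.0 + x).ln();
    let naive_exp = x.exp() - 1.0;

    // Dedicated forms avoid the intermediate rounding.
    let stable_log = x.ln_1p();
    let stable_exp = x.exp_m1();

    println!("ln(1+x):  naive rel err = {:e}", rel_err(naive_log, x));
    println!("ln_1p(x): rel err       = {:e}", rel_err(stable_log, x));
    println!("exp(x)-1: naive rel err = {:e}", rel_err(naive_exp, x));
    println!("exp_m1(x): rel err      = {:e}", rel_err(stable_exp, x));
}
```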

Sin = 30

y = sin(x).

Cos = 31

y = cos(x).

Tan = 32

y = tan(x).

Asin = 33

y = asin(x).

Acos = 34

y = acos(x).

Atan = 35

y = atan(x).

Sinh = 40

y = sinh(x).

Cosh = 41

y = cosh(x).

Tanh = 42

y = tanh(x).

Asinh = 43

y = asinh(x).

Acosh = 44

y = acosh(x).

Atanh = 45

y = atanh(x).

Floor = 50

y = floor(x).

Ceil = 51

y = ceil(x).

Round = 52

y = round(x) — round-half-to-even (PyTorch convention).
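Round-half-to-even differs from Rust's `f64::round`, which rounds ties away from zero. A minimal reference sketch of the tie rule follows; `round_half_even` is a hypothetical helper written by hand so the logic is visible (recent Rust versions also provide `f64::round_ties_even`).

```rust
/// Round to the nearest integer, sending exact .5 ties to the even neighbour.
pub fn round_half_even(x: f64) -> f64 {
    if (x - x.trunc()).abs() == 0.5 {
        // Exact tie: halve, round to nearest, double. Halving moves the tie
        // off the .5 boundary, so the result is always an even integer.
        2.0 * (x / 2.0).round()
    } else {
        // Not a tie: ordinary nearest-integer rounding is unambiguous.
        x.round()
    }
}

fn main() {
    for x in [0.5, 1.5, 2.5, 3.5, -2.5, 2.4] {
        println!(
            "x = {x:>4}: half-even = {:>4}, f64::round = {:>4}",
            round_half_even(x),
            x.round()
        );
    }
}
```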

Trunc = 53

y = trunc(x) — truncate toward zero.

Frac = 54

y = x - trunc(x) — fractional part with sign of x.

Erf = 60

y = erf(x).

Erfc = 61

y = erfc(x) = 1 - erf(x).

Erfinv = 62

y = erfinv(x).

Lgamma = 63

y = lgamma(x) = ln(|Γ(x)|).

Digamma = 64

y = digamma(x) = Γ'(x) / Γ(x).

BitwiseNot = 70

y = ~x — bitwise NOT (integer dtypes).

Popcount = 71

y = popcount(x) — population count of set bits (integer).

Clz = 72

y = clz(x) — count leading zeros (integer).

Ctz = 73

y = ctz(x) — count trailing zeros (integer).
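For reference, the four integer ops (BitwiseNot, Popcount, Clz, Ctz) have direct counterparts in Rust's standard library. The sketch below uses `u32` as an assumed element type; the source does not say which integer dtypes the kernels support or what they return for a zero input.

```rust
pub fn bitwise_not(x: u32) -> u32 {
    !x
}

pub fn popcount(x: u32) -> u32 {
    x.count_ones()
}

pub fn clz(x: u32) -> u32 {
    // Rust returns 32 for x == 0; kernel behaviour at zero is an assumption.
    x.leading_zeros()
}

pub fn ctz(x: u32) -> u32 {
    // Rust returns 32 for x == 0; kernel behaviour at zero is an assumption.
    x.trailing_zeros()
}

fn main() {
    let x: u32 = 0b1011_0000;
    println!("x        = {x:#034b}");
    println!("~x       = {:#034b}", bitwise_not(x));
    println!("popcount = {}", popcount(x));
    println!("clz      = {}", clz(x));
    println!("ctz      = {}", ctz(x));
}
```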

Relu = 100

y = relu(x) = max(x, 0).

Gelu = 101

y = gelu(x) — ERF-EXACT Gaussian Error Linear Unit, 0.5·x·(1+erf(x/√2)) — NOT the tanh approximation (that’s Self::GeluTanh). The sys-level unary_gelu_erf_* symbols are a bit-identical alias of the unary_gelu_* symbols this variant dispatches to.

GeluTanh = 102

y = gelu_tanh(x) — tanh APPROXIMATION of gelu, 0.5·x·(1+tanh(√(2/π)·(x+0.044715·x³))). Diverges from the erf-exact Self::Gelu by up to ~1e-4.
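The two GELU formulas can be compared numerically with a small sketch. Stable Rust has no `erf` in the standard library, so this example substitutes the Abramowitz and Stegun 7.1.26 polynomial approximation (absolute error around 1.5e-7), which is an assumption of the sketch and not how the kernels compute erf. The function names are hypothetical.

```rust
use std::f64::consts::{PI, SQRT_2};

/// Approximate erf via Abramowitz and Stegun formula 7.1.26.
/// This stands in for a library erf; it is not bit-exact.
pub fn erf_approx(x: f64) -> f64 {
    const P: f64 = 0.3275911;
    const A1: f64 = 0.254829592;
    const A2: f64 = -0.284496736;
    const A3: f64 = 1.421413741;
    const A4: f64 = -1.453152027;
    const A5: f64 = 1.061405429;
    let sign = if x < 0.0 { -1.0 } else { 1.0 };
    let x = x.abs();
    let t = 1.0 / (1.0 + P * x);
    let poly = t * (A1 + t * (A2 + t * (A3 + t * (A4 + t * A5))));
    sign * (1.0 - poly * (-x * x).exp())
}

/// Erf-based GELU: 0.5 * x * (1 + erf(x / sqrt(2))).
pub fn gelu_erf(x: f64) -> f64 {
    0.5 * x * (1.0 + erf_approx(x / SQRT_2))
}

/// Tanh approximation: 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 x^3))).
pub fn gelu_tanh(x: f64) -> f64 {
    let c = (2.0 / PI).sqrt();
    0.5 * x * (1.0 + (c * (x + 0.044715 * x * x * x)).tanh())
}

/// Largest absolute gap between the two formulas on a grid over [-6, 6].
pub fn max_divergence() -> f64 {
    (-600_i32..=600)
        .map(|i| f64::from(i) * 0.01)
        .map(|x| (gelu_erf(x) - gelu_tanh(x)).abs())
        .fold(0.0, f64::max)
}

fn main() {
    for x in [-2.0, -0.5, 0.0, 1.0, 3.0] {
        println!("x = {x:>4}: erf = {:.6}, tanh = {:.6}", gelu_erf(x), gelu_tanh(x));
    }
    println!("max |gelu_erf - gelu_tanh| on grid = {:e}", max_divergence());
}
```

The gap is small but nonzero, which is why the two variants carry separate discriminants and should not be swapped when bit-level reproducibility matters.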

Silu = 103

y = silu(x) = x · sigmoid(x). Also known as Swish-1.

Mish = 104

y = mish(x) = x · tanh(softplus(x)).

Sigmoid = 105

y = sigmoid(x) = 1 / (1 + exp(-x)).

Logit = 106

y = logit(x) = log(x / (1 - x)). Inverse of sigmoid.

Softplus = 107

y = softplus(x) = ln(1 + exp(x)).

Softsign = 108

y = softsign(x) = x / (1 + |x|).

Tanhshrink = 109

y = tanhshrink(x) = x - tanh(x).

Relu6 = 110

y = relu6(x) = min(max(x, 0), 6).

Hardswish = 111

y = hardswish(x) — piecewise-linear approximation of swish.

Hardsigmoid = 112

y = hardsigmoid(x) — piecewise-linear approximation of sigmoid.

Hardtanh = 113

y = hardtanh(x, -1, +1) — piecewise-linear clamp.
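The entries for Hardswish and Hardsigmoid do not spell out their formulas. The sketch below uses the definitions from PyTorch's `nn.functional` (hardsigmoid(x) = relu6(x + 3) / 6 and hardswish(x) = x · relu6(x + 3) / 6), on the assumption that the kernels follow the same convention, alongside Relu6 and Hardtanh as given above.

```rust
/// relu6(x) = min(max(x, 0), 6)
pub fn relu6(x: f32) -> f32 {
    x.clamp(0.0, 6.0)
}

/// PyTorch-style hardsigmoid: relu6(x + 3) / 6 (assumed to match the kernel).
pub fn hardsigmoid(x: f32) -> f32 {
    relu6(x + 3.0) / 6.0
}

/// PyTorch-style hardswish: x * relu6(x + 3) / 6 (assumed to match the kernel).
pub fn hardswish(x: f32) -> f32 {
    x * relu6(x + 3.0) / 6.0
}

/// hardtanh with the fixed bounds -1 and +1.
pub fn hardtanh(x: f32) -> f32 {
    x.clamp(-1.0, 1.0)
}

fn main() {
    for x in [-4.0_f32, -3.0, 0.0, 1.0, 3.0, 7.0] {
        println!(
            "x = {x:>4}: relu6 = {}, hardsigmoid = {}, hardswish = {}, hardtanh = {}",
            relu6(x),
            hardsigmoid(x),
            hardswish(x),
            hardtanh(x)
        );
    }
}
```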

Selu = 114

y = selu(x) — scaled exponential linear unit.

LeakyRelu = 115

y = leaky_relu(x) = x if x > 0 else α·x. Hardcoded α = 0.01 in the current bespoke kernel; will re-emit as a fanout from a parameterized-unary plan once that infrastructure lands.

Elu = 116

y = elu(x) = x if x > 0 else α·(exp(x) - 1). Hardcoded α = 1.0 in the current bespoke kernel; same parameterization story as LeakyRelu.

Hardshrink = 117

y = hardshrink(x) = x if |x| > λ else 0. Hardcoded λ = 0.5 in the current bespoke kernel; same parameterization story as LeakyRelu.

Softshrink = 118

y = softshrink(x) = x - λ if x > λ; x + λ if x < -λ; else 0. Hardcoded λ = 0.5 in the current bespoke kernel; same parameterization story as LeakyRelu.
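The four activations with hardcoded parameters (LeakyRelu, Elu, Hardshrink, Softshrink) can be written as scalar reference functions. This is a sketch of the formulas and constants stated above, not the kernel code; the constant and function names are hypothetical.

```rust
// Constants as documented for the current bespoke kernels.
pub const LEAKY_RELU_ALPHA: f32 = 0.01;
pub const ELU_ALPHA: f32 = 1.0;
pub const SHRINK_LAMBDA: f32 = 0.5;

/// x if x > 0 else alpha * x
pub fn leaky_relu(x: f32) -> f32 {
    if x > 0.0 { x } else { LEAKY_RELU_ALPHA * x }
}

/// x if x > 0 else alpha * (exp(x) - 1)
pub fn elu(x: f32) -> f32 {
    if x > 0.0 { x } else { ELU_ALPHA * x.exp_m1() }
}

/// x if |x| > lambda else 0
pub fn hardshrink(x: f32) -> f32 {
    if x.abs() > SHRINK_LAMBDA { x } else { 0.0 }
}

/// x - lambda if x > lambda; x + lambda if x < -lambda; else 0
pub fn softshrink(x: f32) -> f32 {
    if x > SHRINK_LAMBDA {
        x - SHRINK_LAMBDA
    } else if x < -SHRINK_LAMBDA {
        x + SHRINK_LAMBDA
    } else {
        0.0
    }
}

fn main() {
    for x in [-2.0_f32, -0.5, 0.3, 0.5, 1.0] {
        println!(
            "x = {x:>4}: leaky_relu = {}, elu = {}, hardshrink = {}, softshrink = {}",
            leaky_relu(x),
            elu(x),
            hardshrink(x),
            softshrink(x)
        );
    }
}
```

Note the strict inequalities: an input exactly equal to λ maps to 0 under both shrink functions.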

Threshold = 119

Reserved — threshold(x; t, v) = x if x > t else v. Needs the parameterized-unary plan (two scalar parameters); not wired yet.

PReLU = 120

prelu(x; α) = x if x > 0 else α·x with per-channel learnable α vector (or single scalar α). Uses a distinct plan shape (PReluPlan / PReluBackwardPlan) because α is a tensor operand, not a scalar parameter. Wired in Milestone 5.3.

PowI = 121

powi(x; n) = x^n for a fixed runtime integer exponent n. Distinct from the generic BinaryKind::Pow (which takes an f32 exponent tensor) because the integer-only path can use power-by-squaring, which is faster than __expf(n · __logf(x)) and is also well-defined for negative x (real pow(-1.5, 2) = 2.25, no NaN). The exponent is threaded via slot 0 of params: [f32; 2] with a host-side cast (n as f32); slot 1 is unused. Exponents with |n| ≤ 2^24 round-trip through f32 exactly. Phase 12.1 wires {f32, f16, bf16, f64} through UnaryParamPlan.
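A minimal host-side sketch of the ideas in this entry: power-by-squaring, the failure of the exp/log route for negative bases, and the f32 round-trip of the exponent. All function names are hypothetical, and the handling of negative n (taking the reciprocal) is an assumption, since the source does not say how the kernel treats it.

```rust
/// x^n by repeated squaring: O(log |n|) multiplications.
pub fn powi_by_squaring(x: f64, n: i32) -> f64 {
    let mut base = x;
    let mut e = n.unsigned_abs();
    let mut acc = 1.0;
    while e > 0 {
        if e & 1 == 1 {
            acc *= base; // this bit of the exponent is set
        }
        base *= base; // advance to the next power of two
        e >>= 1;
    }
    // Assumption: negative exponents are handled as 1 / x^|n|.
    if n < 0 { 1.0 / acc } else { acc }
}

/// The exp/log formulation, which breaks for negative x because ln(x) is NaN.
pub fn exp_log_pow(x: f64, n: i32) -> f64 {
    (f64::from(n) * x.ln()).exp()
}

/// Pack the exponent into slot 0 of a two-slot f32 parameter array.
pub fn pack_exponent(n: i32) -> [f32; 2] {
    [n as f32, 0.0]
}

/// Recover the exponent from slot 0.
pub fn unpack_exponent(params: [f32; 2]) -> i32 {
    params[0] as i32
}

fn main() {
    println!("powi(-1.5, 2)    = {}", powi_by_squaring(-1.5, 2));
    println!("exp/log(-1.5, 2) = {}", exp_log_pow(-1.5, 2));
    println!("2^24 round-trip     = {}", unpack_exponent(pack_exponent(1 << 24)));
    println!("2^24 + 1 round-trip = {}", unpack_exponent(pack_exponent((1 << 24) + 1)));
}
```

The last line shows where exactness ends: 2^24 + 1 is not representable in f32, so it comes back as a different integer.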

Step = 122

y = step(x) = 1 if x > 0 else 0 — Heaviside step function. step(0) = 0 and step(-0.0) = 0 (x > 0 is false at both zeros); NaN → 0 (NaN > 0 is false), matching PyTorch’s heaviside(x, values=0) for the > branch. Wires the Phase 31 unary_step_* kernels.
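The edge cases listed for Step all follow from a single strict comparison, as this scalar sketch shows. The function is an illustration of the stated semantics, not the kernel source.

```rust
/// Heaviside step with step(0) = 0: returns 1 only when x > 0 is true.
pub fn step(x: f32) -> f32 {
    // Comparisons involving NaN are false, so NaN falls through to 0.
    // Both +0.0 and -0.0 fail the strict `> 0.0` test as well.
    if x > 0.0 { 1.0 } else { 0.0 }
}

fn main() {
    for x in [-1.0_f32, -0.0, 0.0, f32::MIN_POSITIVE, 2.0, f32::NAN] {
        println!("step({x}) = {}", step(x));
    }
}
```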

Cast = 130

y = (TOut) x — dtype conversion. Heterogeneous input / output element types, so it goes through its own CastPlan (not the same-dtype UnaryPlan<T, N>). The discriminant lives here for telemetry / SKU-tagging consistency with the rest of the unary family. Wired from fuel-cuda-kernels/cast.cu.

Affine = 131

y = a * x + b — fused affine (multiply-add) with scalar parameters a / b. Same-dtype input/output but carries two scalar parameters, so it gets its own AffinePlan (the unified UnaryPlan<T, N> doesn’t carry kernel parameters). Wired from fuel-cuda-kernels/affine.cu.
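As a host-side reference for the Affine semantics, the sketch below applies y = a * x + b elementwise over a slice. The `affine` function is hypothetical; it uses a separate multiply and add, whereas a GPU kernel may contract them into a single fused multiply-add with one rounding step, so the last bit can differ.

```rust
/// Elementwise y = a * x + b with scalar parameters a and b.
pub fn affine(xs: &[f32], a: f32, b: f32) -> Vec<f32> {
    xs.iter().map(|&x| a * x + b).collect()
}

fn main() {
    let xs = [1.0_f32, 2.0, 3.0];
    println!("{:?}", affine(&xs, 2.0, 1.0));
}
```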