//! Activation dtypes and the typed device-pointer handle.
/// Element type of a GEMM operand.
/// Raw CUDA device pointer tagged with its element dtype.
///
/// The engine never owns the memory it computes on. Callers pass device
/// pointers allocated on the SAME stream the engine was built with, or
/// allocated elsewhere and properly ordered against that stream.
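// Minimal sketch only, NOT the crate's actual definitions: `DevicePtr`,
// `DType`, the variant list, and every method below are hypothetical names
// chosen for illustration. It shows one way a borrowed device address could be
// tagged with its element type so that byte sizes and alignment can be checked
// before a GEMM launch. The handle is `Copy` and has no `Drop` impl, so it
// never frees the memory it points at.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct DevicePtr {
    /// Raw device address (assumed here to fit in 64 bits).
    addr: u64,
    /// Element type of the buffer behind `addr`.
    dtype: DType,
}

/// Hypothetical set of element types; the real enum may differ.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum DType {
    F32,
    F16,
    BF16,
    I8,
}

impl DType {
    /// Size of one element in bytes.
    pub const fn size_in_bytes(self) -> usize {
        match self {
            DType::F32 => 4,
            DType::F16 | DType::BF16 => 2,
            DType::I8 => 1,
        }
    }
}

impl DevicePtr {
    /// Wraps a caller-owned device address. Nothing is allocated or freed.
    pub const fn new(addr: u64, dtype: DType) -> Self {
        Self { addr, dtype }
    }

    /// Raw device address.
    pub const fn addr(self) -> u64 {
        self.addr
    }

    /// Element type the pointer was tagged with.
    pub const fn dtype(self) -> DType {
        self.dtype
    }

    /// Bytes spanned by `len` elements, or `None` if the product overflows.
    pub fn byte_len(self, len: usize) -> Option<usize> {
        len.checked_mul(self.dtype.size_in_bytes())
    }

    /// True when the address is a multiple of the element size.
    pub fn is_aligned(self) -> bool {
        self.addr % self.dtype.size_in_bytes() as u64 == 0
    }
}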