hpt-cudakernels 0.1.3

A library that implements CUDA kernels for hpt
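#pragma once

// Headers that declare the types aliased below.
#include <cstdint>     // int8_t ... uint64_t
#include <cuda_bf16.h> // __nv_bfloat16, __nv_bfloat162
#include <cuda_fp16.h> // __half, __half2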

// Floating-point scalar aliases.
#define f32 float
#define f64 double
#define bf16 __nv_bfloat16
#define f16 __half

// Fixed-width signed and unsigned integer aliases.
#define i8 int8_t
#define i16 int16_t
#define i32 int32_t
#define i64 int64_t
#define u8 uint8_t
#define u16 uint16_t
#define u32 uint32_t
#define u64 uint64_t

// Packed two-element half-precision vector aliases.
#define bf162 __nv_bfloat162
#define f162 __half2
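
The aliases above can be exercised on the host without the CUDA toolkit. The following is a minimal sketch, not code from hpt-cudakernels: it repeats a subset of the macros (leaving out `bf16`, `f16`, `bf162`, and `f162`, which need the CUDA headers), and `sum_as` is a hypothetical helper introduced only for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Subset of the aliases, repeated here so the sketch is self-contained.
#define f32 float
#define f64 double
#define i32 int32_t
#define u8 uint8_t

// The aliases are plain text substitutions, so they have the widths of the
// underlying types on common platforms.
static_assert(sizeof(f32) == 4, "f32 is expected to be a 32-bit float");
static_assert(sizeof(f64) == 8, "f64 is expected to be a 64-bit double");
static_assert(sizeof(i32) == 4, "i32 expands to int32_t");
static_assert(sizeof(u8) == 1, "u8 expands to uint8_t");

// Hypothetical helper (an assumption, not part of the library): sums n
// elements of type In into an accumulator of the wider type Acc.
template <typename In, typename Acc>
Acc sum_as(const In *data, std::size_t n) {
    Acc acc = 0;
    for (std::size_t i = 0; i < n; ++i) {
        acc += static_cast<Acc>(data[i]);
    }
    return acc;
}
```

A call such as `sum_as<u8, i32>(bytes, 4)` accumulates bytes into a 32-bit integer so the sum does not wrap at 255. Because the aliases are preprocessor macros rather than `typedef` or `using` declarations, any identifier spelled `f32`, `i8`, and so on is replaced after the header is included, regardless of scope or namespace.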