tl_cuda 0.4.0

CUDA GPU tensor library for TL