cu-embed 0.1.1

Compile CUDA kernels with nvcc, embed cubin/PTX artifacts, and load the best module at runtime.
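
The description does not say how "best" is decided. One plausible reading, sketched below, follows CUDA's general compatibility rules: a cubin runs on devices of the same major compute capability with an equal or higher minor version, while PTX can be JIT-compiled for any device at or above its target capability. The names `Artifact` and `pick_best` are hypothetical and are not part of cu-embed's API. The sketch also ignores architecture-specific targets such as `sm_90a`.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Arch = Tuple[int, int]  # (major, minor) compute capability, e.g. (8, 6)


@dataclass(frozen=True)
class Artifact:
    """Hypothetical record for one embedded module (not cu-embed's type)."""
    kind: str   # "cubin" or "ptx"
    arch: Arch  # capability the artifact was compiled for


def pick_best(artifacts: Sequence[Artifact], device: Arch) -> Optional[Artifact]:
    """Pick the most specific artifact that can run on `device`.

    Assumed policy: prefer a native cubin, then fall back to PTX.
    """
    # A cubin is usable when the major version matches and its minor
    # version does not exceed the device's minor version.
    cubins = [
        a for a in artifacts
        if a.kind == "cubin"
        and a.arch[0] == device[0]
        and a.arch[1] <= device[1]
    ]
    if cubins:
        return max(cubins, key=lambda a: a.arch)

    # PTX is forward compatible: any PTX targeting a capability at or
    # below the device's can be JIT-compiled by the driver.
    ptx = [a for a in artifacts if a.kind == "ptx" and a.arch <= device]
    if ptx:
        return max(ptx, key=lambda a: a.arch)

    return None  # nothing embedded can run on this device


if __name__ == "__main__":
    embedded = [
        Artifact("cubin", (7, 0)),
        Artifact("cubin", (8, 0)),
        Artifact("ptx", (7, 0)),
    ]
    print(pick_best(embedded, (8, 6)))  # native cubin for the 8.x family
    print(pick_best(embedded, (9, 0)))  # no 9.x cubin, so PTX is used
```

Preferring a cubin avoids JIT compilation at load time. The PTX fallback keeps the binary usable on GPUs newer than any architecture it was built for.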