Structs

  • | @brief A templated class to allow one to wrap | a CPU operator as a CUDA operator. | | This class can be used when one does not have | the CUDA implementation ready yet for an | operator. Essentially, what this op does is to | automatically deal with data copy for | you. Plausibly, this causes a lot of overhead | and is not optimal, so you should use this | operator mostly for quick prototyping purpose. | | All the input and output of the original | operator should be TensorCPU. | | Example usage: if you have a class MyMagicOp | that is CPU based, and you use the registration | code | | REGISTER_CPU_OPERATOR(MyMagic, MyMagicOp); | | to register the CPU side, you can create its | corresponding GPU operator (with performance | hits of course) via | | REGISTER_CUDA_OPERATOR(MyMagic, | GPUFallbackOp); | | Note that you will need to make sure that the | operators actually share the same name. | | Advanced usage: if you want to have some | specific outputs never copied, you can use the | SkipOutputCopy template argument to do that. | | For example, if MyMagic produces two outputs | and the first output is always going to live on | the CPU, you can do | | REGISTER_CUDA_OPERATOR(MyMagic, | GPUFallbackOpEx<SkipIndices<0>>);

Type Definitions