Macro base_do

macro_rules! base_do {
    ($funcname:ident, $($arg:ident),*) => { ... };
}

Common utilities for writing performance kernels and easy dispatching of different backends.

The general workflow shall be as follows, say we want to implement a functionality called void foo(int a, float b).

In foo.h, do:

    void foo(int a, float b);

In foo_avx512.cc, do:

    void foo__avx512(int a, float b) {
      [actual avx512 implementation]
    }

In foo_avx2.cc, do:

    void foo__avx2(int a, float b) {
      [actual avx2 implementation]
    }

In foo_avx.cc, do:

    void foo__avx(int a, float b) {
      [actual avx implementation]
    }

In foo.cc, do:

    // The base implementation should always be provided.
    void foo__base(int a, float b) {
      [base, possibly slow implementation]
    }
    decltype(foo__base) foo__avx512;
    decltype(foo__base) foo__avx2;
    decltype(foo__base) foo__avx;
    void foo(int a, float b) {
      // You should always order things by their preference, faster
      // implementations earlier in the function.
      AVX512_DO(foo, a, b);
      AVX2_DO(foo, a, b);
      AVX_DO(foo, a, b);
      BASE_DO(foo, a, b);
    }
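The workflow above is written in terms of C++ files. As a rough illustration of the same ordering idea in Rust, here is a minimal sketch that does not use this crate's macros at all. The names `foo_base`, `foo_avx2`, `foo_avx`, and `foo` are hypothetical, the "specialized" kernels are placeholders that compute the same value as the base version, and the run-time check uses the standard library's `is_x86_feature_detected!` rather than cpuinfo.

```rust
// Hypothetical Rust analogue of the foo() workflow; every name here is
// illustrative and not part of this crate.

/// Base implementation: always provided, possibly slow.
fn foo_base(a: i32, b: f32) -> f32 {
    a as f32 * b
}

/// Placeholder for an AVX2-specialized kernel. A real kernel would use
/// AVX2 intrinsics; this one only reuses the scalar formula.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx2")]
unsafe fn foo_avx2(a: i32, b: f32) -> f32 {
    a as f32 * b
}

/// Placeholder for an AVX-specialized kernel.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx")]
unsafe fn foo_avx(a: i32, b: f32) -> f32 {
    a as f32 * b
}

/// Entry function: candidates are tried in order of preference, faster
/// implementations first, and the base implementation is the fallback.
fn foo(a: i32, b: f32) -> f32 {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was verified at run time just above.
            return unsafe { foo_avx2(a, b) };
        }
        if is_x86_feature_detected!("avx") {
            // SAFETY: AVX support was verified at run time just above.
            return unsafe { foo_avx(a, b) };
        }
    }
    foo_base(a, b)
}
```

Calling `foo(3, 2.0)` returns the same value as `foo_base(3, 2.0)` on any CPU, since every placeholder kernel computes the same product; only the code path taken differs.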

Details: this functionality basically covers the cases for both build time and run time architecture support.

During build time:

The build system should provide the flags CAFFE2_PERF_WITH_AVX512, CAFFE2_PERF_WITH_AVX2, and CAFFE2_PERF_WITH_AVX, which correspond to the __AVX512F__, __AVX512DQ__, __AVX512VL__, __AVX2__, and __AVX__ flags the compiler provides. Note that we do not use the compiler flags but rely on the build system flags, because the common files (like foo.cc above) will always be built without __AVX512F__, __AVX512DQ__, __AVX512VL__, __AVX2__, and __AVX__.

During run time:

We use cpuinfo to identify CPU support and run the proper functions.

DO macros: these should be used in your entry function, similar to foo() above, that routes implementations based on CPU capability.
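To make the interaction between the two conditions concrete, the sketch below models them as plain data: a backend is chosen only if it was both enabled at build time and reported as supported by the CPU at run time. `Backend`, `BuildFlags`, `CpuSupport`, `detect_cpu`, and `select_backend` are hypothetical names introduced for illustration. The detection step assumes the standard library's `is_x86_feature_detected!` as a stand-in for cpuinfo, and the boolean fields stand in for the CAFFE2_PERF_WITH_* build flags.

```rust
/// Backends, listed from most to least preferred. Illustrative only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Backend {
    Avx512,
    Avx2,
    Avx,
    Base,
}

/// Hypothetical stand-in for the build-system flags
/// (CAFFE2_PERF_WITH_AVX512, CAFFE2_PERF_WITH_AVX2, CAFFE2_PERF_WITH_AVX).
struct BuildFlags {
    with_avx512: bool,
    with_avx2: bool,
    with_avx: bool,
}

/// CPU capabilities as reported at run time.
struct CpuSupport {
    avx512: bool,
    avx2: bool,
    avx: bool,
}

/// Query the running CPU. Assumption: std's detection macro is used here
/// in place of cpuinfo. On non-x86 targets nothing is reported.
fn detect_cpu() -> CpuSupport {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    let support = CpuSupport {
        avx512: is_x86_feature_detected!("avx512f")
            && is_x86_feature_detected!("avx512dq")
            && is_x86_feature_detected!("avx512vl"),
        avx2: is_x86_feature_detected!("avx2"),
        avx: is_x86_feature_detected!("avx"),
    };
    #[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
    let support = CpuSupport { avx512: false, avx2: false, avx: false };
    support
}

/// A backend is eligible only when it was compiled in AND the CPU
/// supports it; the first eligible backend in preference order wins.
fn select_backend(build: &BuildFlags, cpu: &CpuSupport) -> Backend {
    if build.with_avx512 && cpu.avx512 {
        return Backend::Avx512;
    }
    if build.with_avx2 && cpu.avx2 {
        return Backend::Avx2;
    }
    if build.with_avx && cpu.avx {
        return Backend::Avx;
    }
    Backend::Base
}
```

For example, a build with every flag enabled that runs on a CPU reporting only AVX2 and AVX would select `Backend::Avx2`, while a build with no flags enabled always falls back to `Backend::Base` regardless of what `detect_cpu()` reports.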