CubeK Reduce
Implements a wide variety of reduction algorithms across multiple instruction sets and hardware targets for efficient tensor reduction.
Running Tests
Important Environment Variables
Test behavior is controlled by the shared CUBE_TEST_MODE env var (see cubek-test-utils).
CUBE_TEST_MODE=Correct (default): numerical errors fail the test; compilation / hardware-incompatibility errors are accepted.
CUBE_TEST_MODE=Strict: both numerical and compilation errors fail the test. Useful to surface tests that are silently skipped on your hardware.
CUBE_TEST_MODE=PrintAll[:<filter>] / PrintFail[:<filter>]: print tensor elements; see cubek-test-utils docs.
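To make the accepted values concrete, here is a minimal sketch of how a CUBE_TEST_MODE string could be interpreted. The `TestMode` enum and `parse_mode` function are hypothetical names used only for illustration; they are not the cubek-test-utils API, and the handling of unrecognized values is an assumption.

```rust
/// Hypothetical mirror of the documented modes; NOT the real cubek-test-utils type.
#[derive(Debug, PartialEq)]
enum TestMode {
    Correct,
    Strict,
    PrintAll(Option<String>),
    PrintFail(Option<String>),
}

/// Interpret a raw CUBE_TEST_MODE value (hypothetical helper).
/// An unset variable maps to the documented default, `Correct`.
/// Rejecting unknown values is an assumption made for this sketch.
fn parse_mode(raw: Option<&str>) -> Result<TestMode, String> {
    let raw = match raw {
        None => return Ok(TestMode::Correct),
        Some(r) => r,
    };
    // Split an optional `:<filter>` suffix off the mode name.
    let (name, filter) = match raw.split_once(':') {
        Some((n, f)) => (n, Some(f.to_string())),
        None => (raw, None),
    };
    match (name, filter) {
        ("Correct", None) => Ok(TestMode::Correct),
        ("Strict", None) => Ok(TestMode::Strict),
        ("PrintAll", f) => Ok(TestMode::PrintAll(f)),
        ("PrintFail", f) => Ok(TestMode::PrintFail(f)),
        _ => Err(format!("unrecognized CUBE_TEST_MODE: {raw}")),
    }
}

fn main() {
    // Read the variable from the environment; `ok()` treats "unset" as None.
    let raw = std::env::var("CUBE_TEST_MODE").ok();
    println!("{:?}", parse_mode(raw.as_deref()));
}
```

The filter is kept as an opaque string here because its syntax is defined by cubek-test-utils, not by this sketch.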
Important Feature Flags
extended: enables the Cube reduction-routine strategy tests. These are slow on CPU and can stall CI, so they're opt-in.
full: alias for extended (room for future growth).
Examples
# Default (fast) test run on CUDA
# Run the full suite, including Cube-strategy tests
# Fail on any silently-skipped tests
CUBE_TEST_MODE=Strict