cubek-test-utils
Shared building blocks for kernel tests in CubeK: test-tensor builders, host-side reference comparisons, and a unified renderer that pretty-prints tensors (or diffs them) under a single config.
Configuration: cubek.toml
There is one place to configure test behavior: a cubek.toml file at
the workspace root. The file is read once at process start; the loader
walks up from the current working directory until it finds it.
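The walk-up lookup can be pictured with a minimal sketch using only the standard library. The names `find_config` and `config_path` are hypothetical and not part of this crate; the real loader may differ in details such as error handling.

```rust
use std::path::{Path, PathBuf};
use std::sync::OnceLock;

/// Hypothetical helper: walk from `start` towards the filesystem root and
/// return the first `cubek.toml` found, or `None` if no ancestor has one.
fn find_config(start: &Path) -> Option<PathBuf> {
    start
        .ancestors() // yields `start`, then its parent, grandparent, ...
        .map(|dir| dir.join("cubek.toml"))
        .find(|candidate| candidate.is_file())
}

/// Hypothetical helper: resolve the path once per process and cache it,
/// mirroring the "read once at process start" behavior described above.
fn config_path() -> Option<&'static PathBuf> {
    static PATH: OnceLock<Option<PathBuf>> = OnceLock::new();
    PATH.get_or_init(|| {
        let cwd = std::env::current_dir().ok()?;
        find_config(&cwd)
    })
    .as_ref()
}
```

Because the search starts at the current working directory, a test launched from any crate inside the workspace would resolve to the same root-level file under this scheme.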
The shipped cubek.toml documents every field with comments. Two sections,
nothing more:
[test]
policy = "correct" # "correct" | "strict" | "fail-if-run"
[print]
enabled = false # toggle all printing
view = "table" # "table" | "lines"
force-fail = true # fail every test that prints, so cargo shows stdout
fail-only = false # diff: only render cells where Δ > ε
show-expected = false # diff: render `got/expected` per cell (else just `got`)
filter = "" # per-axis filter, same DSL as the slice helper
The whole pipeline obeys one rule: if enabled = false, nothing prints.
Set it to true, run a test, watch your tensors render. That's it.
[test] policy
| Policy | No error | Numerical error | Compilation error |
|---|---|---|---|
| `correct` | accept | fail | accept |
| `strict` | accept | fail | fail |
| `fail-if-run` | fail | accept | accept |
If [print] force-fail = true, every passing test that printed is
additionally rejected — useful so cargo surfaces the dump (it otherwise
swallows stdout from passing tests). Compile errors are also rejected when
force-fail = true, regardless of policy.
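The policy table and the `force-fail` override can be summarized as a small decision function. This is a sketch only: the `Policy` and `Outcome` enums and `is_accepted` are hypothetical names, and it assumes "passing test that printed" means any test the policy would otherwise accept.

```rust
/// Hypothetical mirror of the `[test] policy` values.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Policy {
    Correct,
    Strict,
    FailIfRun,
}

/// Hypothetical classification of what happened when the test ran.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Outcome {
    NoError,
    NumericalError,
    CompilationError,
}

/// Returns `true` when the test is accepted.
fn is_accepted(policy: Policy, outcome: Outcome, force_fail: bool, printed: bool) -> bool {
    // force-fail rejects compile errors and anything that printed,
    // regardless of the policy (assumption stated in the lead-in).
    if force_fail && (printed || outcome == Outcome::CompilationError) {
        return false;
    }
    // Otherwise the verdict is a direct lookup in the policy table.
    match (policy, outcome) {
        (Policy::Correct, Outcome::NoError) => true,
        (Policy::Correct, Outcome::NumericalError) => false,
        (Policy::Correct, Outcome::CompilationError) => true,
        (Policy::Strict, Outcome::NoError) => true,
        (Policy::Strict, Outcome::NumericalError) => false,
        (Policy::Strict, Outcome::CompilationError) => false,
        (Policy::FailIfRun, Outcome::NoError) => false,
        (Policy::FailIfRun, Outcome::NumericalError) => true,
        (Policy::FailIfRun, Outcome::CompilationError) => true,
    }
}
```

For example, `is_accepted(Policy::Correct, Outcome::CompilationError, false, false)` is `true`, while the same call with `force_fail = true` is `false`.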
Rendering: one path for everything
Both assert_equals_approx(actual, expected, ε) and the free
print_tensors(label, &[&a, &b], Some(ε)) go through the same renderer.
There is no "diff path" vs "pretty-print path"; comparing actual-vs-expected
and pretty-printing two unrelated same-shape tensors are literally the same
call.
Rules:
- One tensor → just values, no color.
- Two tensors of the same rank and shape → cells colored green (Δ ≤ ε) or red (Δ > ε). With `show-expected = true` the cell shows `got/expected`; otherwise just `got`.
- Two tensors of different rank or shape → silently skipped. The renderer never panics on bad input.
- Filter rank ≠ tensor rank → silently skipped.
use print_tensors;
// Single tensor — table or lines per [print] view, no color.
print_tensors("a", &[&a], None);
// Two tensors — colored diff. Same path used by assert_equals_approx.
print_tensors("diff", &[&actual, &expected], Some(epsilon));
The table view never shows Δ/ε numbers (cell color carries the info). The lines view always shows them.
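The coloring rule for two tensors can be sketched as follows. `CellStatus` and `classify_cells` are hypothetical names used for illustration; the sketch works on flat `f32` buffers plus shapes and is not the crate's renderer.

```rust
/// Hypothetical per-cell verdict: green when within tolerance, red otherwise.
#[derive(Clone, Copy, Debug, PartialEq)]
enum CellStatus {
    Green,
    Red,
}

/// Classify every cell of `got` against `expected`.
/// Returns `None` (the "silently skipped" case) when ranks or shapes differ.
fn classify_cells(
    got_shape: &[usize],
    got: &[f32],
    expected_shape: &[usize],
    expected: &[f32],
    epsilon: f32,
) -> Option<Vec<CellStatus>> {
    if got_shape != expected_shape || got.len() != expected.len() {
        return None;
    }
    Some(
        got.iter()
            .zip(expected)
            .map(|(g, e)| {
                // Δ ≤ ε is green; anything else (including NaN) is red.
                if (g - e).abs() <= epsilon {
                    CellStatus::Green
                } else {
                    CellStatus::Red
                }
            })
            .collect(),
    )
}
```

Returning `None` instead of panicking matches the rule that mismatched inputs are skipped rather than treated as errors.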
Table view example (with show-expected = true)
=== diff shape=[2, 3] ===
| 0 1 2
----+------------------------------------------------------
0 | 0.000000/0.000000 1.000000/1.000000 2.000000/2.000000 ← green
1 | 4.000000/3.000000 5.000000/4.000000 6.000000/5.000000 ← red
Table view + fail-only = true
=== diff shape=[2, 3] ===
| 0 1 2
----+---------------------------
0 | ← matching cells blanked out
1 | 4.000000 5.000000 6.000000 ← red
Lines view + fail-only = true
=== diff shape=[2, 3] ===
index | got | expected | Δ | ε | status
-----------------------------------------------------------
[1, 0] | 4.000000 | 3.000000 | 1.000000 | 0.003000 | FAIL ← red
[1, 1] | 5.000000 | 4.000000 | 1.000000 | 0.004000 | FAIL ← red
[1, 2] | 6.000000 | 5.000000 | 1.000000 | 0.005000 | FAIL ← red
Filter syntax
Used by both [print] filter and assert_equals_approx_in_slice. A
comma-separated list of dim entries:
- `.`: wildcard (any index along that dim)
- `N`: a single index
- `M-K`: inclusive range
Example for a 4-D tensor: .,.,10-20,30 selects all elements where
dim 2 is in 10..=20 and dim 3 is exactly 30. Filter rank must equal
tensor rank.
From Rust:
use ;
// Vec<Range<usize>> works (half-open, like Rust slices).
assert_equals_approx_in_slice;
// Or build the canonical TensorFilter explicitly.
let filter = vec!;
assert_equals_approx_in_slice;
parse_tensor_filter("0,0-2") parses the string DSL into a TensorFilter.
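To make the filter grammar concrete, here is a minimal standalone parser and matcher. It is a sketch, not the crate's `parse_tensor_filter`: the names `DimSel`, `parse_filter`, and `filter_matches` are hypothetical, and the empty string is not given any special meaning here.

```rust
/// Hypothetical per-dimension selector for the filter DSL.
#[derive(Clone, Copy, Debug, PartialEq)]
enum DimSel {
    Any,                 // `.`
    Exact(usize),        // `N`
    Range(usize, usize), // `M-K`, inclusive on both ends
}

/// Parse a comma-separated filter string; `None` on any malformed entry.
fn parse_filter(s: &str) -> Option<Vec<DimSel>> {
    s.split(',')
        .map(|entry| {
            let entry = entry.trim();
            if entry == "." {
                Some(DimSel::Any)
            } else if let Some((lo, hi)) = entry.split_once('-') {
                let lo: usize = lo.trim().parse().ok()?;
                let hi: usize = hi.trim().parse().ok()?;
                (lo <= hi).then_some(DimSel::Range(lo, hi))
            } else {
                entry.parse().ok().map(DimSel::Exact)
            }
        })
        .collect()
}

/// True when `index` is selected. A rank mismatch never matches.
fn filter_matches(filter: &[DimSel], index: &[usize]) -> bool {
    filter.len() == index.len()
        && filter.iter().zip(index).all(|(sel, &i)| match *sel {
            DimSel::Any => true,
            DimSel::Exact(n) => i == n,
            DimSel::Range(lo, hi) => (lo..=hi).contains(&i),
        })
}
```

With this sketch, `.,.,10-20,30` parses to two wildcards, an inclusive range, and an exact index, so `[0, 5, 15, 30]` is selected and `[0, 5, 21, 30]` is not.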
Failure messages
assert_equals_approx collects up to 8 mismatches plus aggregate
stats and reports them in the test panic message:
Test failed: Got incorrect results: 17/4096 elements mismatched
(max |Δ|=0.014648, mean |Δ|=0.004112, worst at [3, 12]) — shape=[16, 256]
First mismatches:
[0, 5]: got 1.234, expected 1.220, |Δ|=0.014 > ε=0.001
...
... and 9 more
When printing is enabled the per-element output goes to stdout; the panic message keeps only the aggregate header so it doesn't duplicate the dump.
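The aggregate header can be computed in a single pass over the data. The sketch below uses hypothetical names (`Summary`, `summarize`, `MAX_REPORTED`), reports flat indices rather than multi-dimensional ones, and assumes the mean |Δ| is taken over mismatching elements only; the crate may define it differently.

```rust
/// Cap on individually reported mismatches (8, as described above).
const MAX_REPORTED: usize = 8;

/// Hypothetical aggregate of a comparison.
#[derive(Debug)]
struct Summary {
    mismatched: usize,
    total: usize,
    max_delta: f32,
    mean_delta: f32,               // assumption: mean over mismatches only
    worst: Option<usize>,          // flat index of the largest |Δ|
    first: Vec<(usize, f32, f32)>, // (flat index, got, expected)
}

/// Compare two equally sized buffers and collect mismatch statistics.
fn summarize(got: &[f32], expected: &[f32], epsilon: f32) -> Summary {
    assert_eq!(got.len(), expected.len(), "buffers must have equal length");
    let mut s = Summary {
        mismatched: 0,
        total: got.len(),
        max_delta: 0.0,
        mean_delta: 0.0,
        worst: None,
        first: Vec::new(),
    };
    let mut sum = 0.0f32;
    for (i, (&g, &e)) in got.iter().zip(expected).enumerate() {
        let delta = (g - e).abs();
        if delta <= epsilon {
            continue; // within tolerance
        }
        s.mismatched += 1;
        sum += delta;
        if s.worst.is_none() || delta > s.max_delta {
            s.max_delta = delta;
            s.worst = Some(i);
        }
        if s.first.len() < MAX_REPORTED {
            s.first.push((i, g, e));
        }
    }
    if s.mismatched > 0 {
        s.mean_delta = sum / s.mismatched as f32;
    }
    s
}
```

The trailing "... and 9 more" line in the sample message corresponds to `mismatched - first.len()` in this sketch (17 − 8 = 9).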
Test suites
Four suites are available:
- Light — tractable subset that runs on CI.
- Basic — smoke tests including heavy-tagged variants; may hang on CI (slow on CPU).
- Extended — auto-generated combinatorial tests, kept tractable.
- Full — all generable combinations, may not fit.
# Replace <runtime> with cpu, cuda, rocm, wgpu, vulkan or metal.
Building test inputs
Two equivalent ways to construct a test tensor:
use ;
// Long-form constructor.
let = new
.generate_with_f32_host_data;
// Fluent builder — `dtype` defaults to f32, `stride` defaults to RowMajor.
let = builder
.uniform
.generate_with_f32_host_data;
Builder setters (all optional):
| Setter | Default | Effect |
|---|---|---|
| `.dtype(d)` | `f32` | Override the input dtype. |
| `.stride(spec)` | `StrideSpec::RowMajor` | Override the stride layout. |
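As a reminder of what a row-major default implies, the sketch below computes contiguous row-major strides (in elements) for a shape. It assumes `StrideSpec::RowMajor` means the usual C-order layout; `row_major_strides` is a hypothetical helper, not a function from this crate.

```rust
/// Contiguous row-major strides, in elements: the last dimension has
/// stride 1 and each earlier stride is the product of all later extents.
fn row_major_strides(shape: &[usize]) -> Vec<usize> {
    let mut strides = vec![1; shape.len()];
    // Walk from the second-to-last dimension back to the first.
    for i in (0..shape.len().saturating_sub(1)).rev() {
        strides[i] = strides[i + 1] * shape[i + 1];
    }
    strides
}
```

For a `[2, 3, 4]` tensor this gives `[12, 4, 1]`, so element `[i, j, k]` lives at flat offset `12*i + 4*j + k`.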
Builder finalizers (each returns a TestInput ready to generate):
| Finalizer | Equivalent DataKind |
|---|---|
| `.arange()` | `Arange { scale: None }` |
| `.arange_scaled(s)` | `Arange { scale: Some(s) }` |
| `.eye()` | `Eye` |
| `.zeros()` | `Zeros` |
| `.uniform(seed, lo, hi)` | `Random { Uniform(lo, hi) }` |
| `.bernoulli(seed, p)` | `Random { Bernoulli(p) }` |
| `.normal(seed, mean, std)` | `Random { Normal { mean, std } }` |
| `.random(seed, dist)` | `Random { dist }` |
| `.linspace(start, end)` | `Custom { data }` with N evenly-spaced values from `start..=end` |
| `.custom(data)` | `Custom { data }` |
After a finalizer, call any of: .generate(), .generate_with_f32_host_data(),
.generate_with_bool_host_data(), .generate_test_tensor(),
.f32_host_data(), .bool_host_data().
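For intuition about the data the `.linspace(start, end)` finalizer describes, here is a standalone sketch that produces N evenly spaced values over an inclusive range. The free function `linspace` is hypothetical, and taking N as an explicit argument is an assumption; in the builder, N is presumably derived from the tensor shape.

```rust
/// N evenly spaced values from `start` to `end`, both endpoints included.
fn linspace(start: f32, end: f32, n: usize) -> Vec<f32> {
    match n {
        0 => Vec::new(),
        1 => vec![start], // a single value cannot span a range
        _ => {
            let step = (end - start) / (n - 1) as f32;
            (0..n).map(|i| start + step * i as f32).collect()
        }
    }
}
```

For instance, `linspace(0.0, 1.0, 5)` yields `[0.0, 0.25, 0.5, 0.75, 1.0]`.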