N-dimensional tensor contractions — Einstein summation via TTGT.
This crate provides a dense tensor type and Einstein summation (einsum).
Tensor contractions are reduced to matrix multiplication, following the TTGT
(Transpose-Transpose-GEMM-Transpose) strategy, for parity with cuTENSOR.
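
As a rough illustration of the TTGT idea, the sketch below contracts `ikj,lk->lij` (summing over `k`) in four steps: permute each operand so the contracted index is contiguous, multiply the resulting matrices, and permute the result into the requested output order. The `Tensor` struct and the `strides`, `permute`, `gemm`, and `contract_ikj_lk_to_lij` functions are hypothetical names chosen for this example and are not taken from the crate's API. The naive `gemm` stands in for the BLAS call a real implementation would presumably use.

```rust
/// Hypothetical dense row-major tensor; an assumption for this sketch,
/// not the crate's actual type.
#[derive(Debug, Clone, PartialEq)]
struct Tensor {
    shape: Vec<usize>,
    data: Vec<f64>,
}

/// Row-major strides for a shape (last axis varies fastest).
fn strides(shape: &[usize]) -> Vec<usize> {
    let mut s = vec![1; shape.len()];
    for d in (0..shape.len().saturating_sub(1)).rev() {
        s[d] = s[d + 1] * shape[d + 1];
    }
    s
}

/// Copy `t` with its axes reordered: output axis `d` is input axis `perm[d]`.
fn permute(t: &Tensor, perm: &[usize]) -> Tensor {
    let in_strides = strides(&t.shape);
    let out_shape: Vec<usize> = perm.iter().map(|&p| t.shape[p]).collect();
    let out_strides = strides(&out_shape);
    let mut data = vec![0.0; t.data.len()];
    for (out_idx, slot) in data.iter_mut().enumerate() {
        // Decode the output multi-index and map it to the input offset.
        let mut rem = out_idx;
        let mut in_idx = 0;
        for d in 0..perm.len() {
            let coord = rem / out_strides[d];
            rem %= out_strides[d];
            in_idx += coord * in_strides[perm[d]];
        }
        *slot = t.data[in_idx];
    }
    Tensor { shape: out_shape, data }
}

/// Naive row-major GEMM: (m x k) * (k x n) -> (m x n).
fn gemm(a: &[f64], b: &[f64], m: usize, k: usize, n: usize) -> Vec<f64> {
    let mut c = vec![0.0; m * n];
    for i in 0..m {
        for p in 0..k {
            let a_ip = a[i * k + p];
            for j in 0..n {
                c[i * n + j] += a_ip * b[p * n + j];
            }
        }
    }
    c
}

/// TTGT for the fixed contraction `ikj,lk->lij` (sum over k).
fn contract_ikj_lk_to_lij(a: &Tensor, b: &Tensor) -> Tensor {
    let (i, k, j) = (a.shape[0], a.shape[1], a.shape[2]);
    let l = b.shape[0];
    assert_eq!(b.shape[1], k, "contracted dimension must match");
    // Transpose: A[i,k,j] -> A[i,j,k], viewed as an (i*j) x k matrix.
    let a_mat = permute(a, &[0, 2, 1]);
    // Transpose: B[l,k] -> B[k,l], viewed as a k x l matrix.
    let b_mat = permute(b, &[1, 0]);
    // GEMM: (i*j) x k times k x l gives (i*j) x l, i.e. C[i,j,l].
    let c = gemm(&a_mat.data, &b_mat.data, i * j, k, l);
    let c_ijl = Tensor { shape: vec![i, j, l], data: c };
    // Transpose: C[i,j,l] -> C[l,i,j], the requested output order.
    permute(&c_ijl, &[2, 0, 1])
}
```

Calling `contract_ikj_lk_to_lij(&a, &b)` with `a` of shape `[2, 3, 2]` and `b` of shape `[2, 3]` yields a tensor of shape `[2, 2, 2]`. The permutations cost extra memory traffic, which is the usual trade-off of TTGT: the copies buy access to a well-tuned matrix multiply.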