use *;
use crateScalar;
use crate;
use crateRawTensor;
/// Backend for tensor operations.
///
/// - `Simulation`: interprets operations via mapping expressions on the host CPU (default).
/// - `Emulation`: host-side `BufRawTensor` storage for the future Cpu+Buffer interpreter; today
///   every value-producing method is a `todo!()` placeholder.
/// - `Npu`: host-side code owns native staging buffers and performs DMA in `to_hbm` / `from_hbm`.
///   It does not interpret tensor math on the host; storage-level methods (on `BufRawTensor`)
///   are `todo!()` placeholders shared with the `Emulation` backend.
/// - `Typecheck`: shape/mapping validation only, with no value-level loops on the host
///   (`--cfg backend="typecheck"`). Returns empty (phantom) tensors throughout.
///
/// `Backend` selects the concrete storage type (`Self::RawTensor<D>`) and supplies only the
/// genuinely cross-tensor, backend-specific protocols: DMA (`to_hbm` / `from_hbm`). Per-operation
/// methods (`zip_with`, `write_scatter`, `map`, `reduce`, `apply_branch_operands`, …) live on
/// [`crate::tensor::raw::RawTensor`] because they only need `RawTensor`-level primitives
/// (`read_index` / `write_index` / `uninit_from_axes`), not `Backend` state.