pub trait Backend: Send + Sync {
type Error: Error + Display + Send + Sync + From<BackendWalkError> + 'static;
type Tensor: Clone + Send + Sync + 'static + Storage;
// Provided methods
fn add(
&self,
a: &Self::Tensor,
b: &Self::Tensor,
) -> Result<Self::Tensor, Self::Error> { ... }
fn sub(
&self,
a: &Self::Tensor,
b: &Self::Tensor,
) -> Result<Self::Tensor, Self::Error> { ... }
fn mul(
&self,
a: &Self::Tensor,
b: &Self::Tensor,
) -> Result<Self::Tensor, Self::Error> { ... }
fn div(
&self,
a: &Self::Tensor,
b: &Self::Tensor,
) -> Result<Self::Tensor, Self::Error> { ... }
fn neg(&self, a: &Self::Tensor) -> Result<Self::Tensor, Self::Error> { ... }
fn abs(&self, a: &Self::Tensor) -> Result<Self::Tensor, Self::Error> { ... }
fn sqrt(&self, a: &Self::Tensor) -> Result<Self::Tensor, Self::Error> { ... }
fn pow(
&self,
a: &Self::Tensor,
b: &Self::Tensor,
) -> Result<Self::Tensor, Self::Error> { ... }
fn exp(&self, a: &Self::Tensor) -> Result<Self::Tensor, Self::Error> { ... }
fn log(&self, a: &Self::Tensor) -> Result<Self::Tensor, Self::Error> { ... }
fn matmul(
&self,
a: &Self::Tensor,
b: &Self::Tensor,
) -> Result<Self::Tensor, Self::Error> { ... }
fn reduce_sum(
&self,
a: &Self::Tensor,
axes: &[i64],
keepdims: bool,
) -> Result<Self::Tensor, Self::Error> { ... }
fn reduce_mean(
&self,
a: &Self::Tensor,
axes: &[i64],
keepdims: bool,
) -> Result<Self::Tensor, Self::Error> { ... }
fn reduce_max(
&self,
a: &Self::Tensor,
axes: &[i64],
keepdims: bool,
) -> Result<Self::Tensor, Self::Error> { ... }
fn reduce_min(
&self,
a: &Self::Tensor,
axes: &[i64],
keepdims: bool,
) -> Result<Self::Tensor, Self::Error> { ... }
fn reshape(
&self,
a: &Self::Tensor,
shape: &[i64],
) -> Result<Self::Tensor, Self::Error> { ... }
fn transpose(
&self,
a: &Self::Tensor,
perm: &[i64],
) -> Result<Self::Tensor, Self::Error> { ... }
fn concat(
&self,
inputs: &[&Self::Tensor],
axis: i64,
) -> Result<Self::Tensor, Self::Error> { ... }
fn slice(
&self,
a: &Self::Tensor,
starts: &[i64],
ends: &[i64],
axes: &[i64],
steps: &[i64],
) -> Result<Self::Tensor, Self::Error> { ... }
fn split(
&self,
a: &Self::Tensor,
axis: i64,
sizes: &[i64],
) -> Result<Vec<Self::Tensor>, Self::Error> { ... }
fn squeeze(
&self,
a: &Self::Tensor,
axes: &[i64],
) -> Result<Self::Tensor, Self::Error> { ... }
fn unsqueeze(
&self,
a: &Self::Tensor,
axes: &[i64],
) -> Result<Self::Tensor, Self::Error> { ... }
fn identity(&self, a: &Self::Tensor) -> Result<Self::Tensor, Self::Error> { ... }
fn cast(
&self,
a: &Self::Tensor,
dtype: i32,
) -> Result<Self::Tensor, Self::Error> { ... }
fn equal(
&self,
a: &Self::Tensor,
b: &Self::Tensor,
) -> Result<Self::Tensor, Self::Error> { ... }
fn greater(
&self,
a: &Self::Tensor,
b: &Self::Tensor,
) -> Result<Self::Tensor, Self::Error> { ... }
fn less(
&self,
a: &Self::Tensor,
b: &Self::Tensor,
) -> Result<Self::Tensor, Self::Error> { ... }
fn r#where(
&self,
cond: &Self::Tensor,
t: &Self::Tensor,
f: &Self::Tensor,
) -> Result<Self::Tensor, Self::Error> { ... }
fn constant(&self, value: TensorProto) -> Result<Self::Tensor, Self::Error> { ... }
fn gather(
&self,
data: &Self::Tensor,
indices: &Self::Tensor,
axis: i64,
) -> Result<Self::Tensor, Self::Error> { ... }
fn execute(
&self,
graph: &GraphProto,
inputs: HashMap<String, Self::Tensor>,
_attrs: BackendAttrs<'_>,
) -> Result<HashMap<String, Self::Tensor>, Self::Error> { ... }
fn dispatch(
&self,
graph: &GraphProto,
inputs: HashMap<String, Self::Tensor>,
attrs: BackendAttrs<'_>,
completion: CompletionHandle<HashMap<String, Self::Tensor>, Self::Error>,
) -> ContractResponse<HashMap<String, Self::Tensor>, Self::Error> { ... }
fn materialize_from_wire(
&self,
type_hash: u64,
bytes: Vec<u8>,
) -> Result<Self::Tensor, Self::Error> { ... }
}
User-facing Contract trait for a tensor compute backend.
The Tensor associated type lets backends dispatch over their
native storage (Dense<f32>, an ndarray::ArrayD<f32>, an
opaque GPU handle, …); the framework round-trips through the
producer/consumer SlotValue carriers via the derive bridge
in crate::roles::BackendRuntime.
Self::Tensor: Clone is required because the per-op default
impls clone tensors into a temporary HashMap<String, _> to
feed Backend::execute. Backends that override the per-op
methods directly never invoke this clone; backends that override
only execute pay the clone on every call to a default per-op
method. How cheap that is depends on the tensor type: cloning an
owned ndarray ArrayD<f32> copies the element buffer, whereas a
reference-counted representation such as ndarray's ArcArray
clones the shape and bumps a refcount without a memcpy.
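The routing between per-op defaults and execute can be sketched with a stripped-down stand-in trait. Everything below (MiniBackend, Op, VecBackend, and the "a" / "b" / "out" slot names) is hypothetical and only illustrates where the Clone bound is exercised; it is not the crate's actual default implementation.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a one-node graph; the real trait takes a GraphProto.
#[derive(Clone, Copy)]
pub enum Op {
    Add,
}

pub trait MiniBackend {
    type Tensor: Clone;

    /// Whole-graph entry point that a backend overrides natively.
    fn execute(&self, op: Op, inputs: HashMap<String, Self::Tensor>) -> HashMap<String, Self::Tensor>;

    /// Default per-op method: clones both operands into a temporary map
    /// and defers to `execute`. This is where the `Clone` bound is used.
    fn add(&self, a: &Self::Tensor, b: &Self::Tensor) -> Self::Tensor {
        let mut env = HashMap::new();
        env.insert("a".to_string(), a.clone());
        env.insert("b".to_string(), b.clone());
        self.execute(Op::Add, env)
            .remove("out")
            .expect("execute must produce `out`")
    }
}

/// Toy backend whose tensors are plain vectors.
pub struct VecBackend;

impl MiniBackend for VecBackend {
    type Tensor = Vec<f32>;

    fn execute(&self, op: Op, mut inputs: HashMap<String, Vec<f32>>) -> HashMap<String, Vec<f32>> {
        let a = inputs.remove("a").unwrap_or_default();
        let b = inputs.remove("b").unwrap_or_default();
        let out: Vec<f32> = match op {
            Op::Add => a.iter().zip(&b).map(|(x, y)| x + y).collect(),
        };
        HashMap::from([("out".to_string(), out)])
    }
}
```

A backend that overrides add directly would skip the two clone calls entirely, which is the trade-off the paragraph above describes.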
Required Associated Types
type Error: Error + Display + Send + Sync + From<BackendWalkError> + 'static
Library-maker-defined error type. The
From<BackendWalkError> bound lets the default per-op /
execute_graph_via_per_op walker surface graph-validation
failures as typed errors instead of panicking on
peer-supplied or malformed GraphProto bodies.
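As a rough illustration of why the From bound matters, the sketch below uses a hypothetical WalkError in place of BackendWalkError (whose real variants are not shown on this page) and a hypothetical MyError as the library-maker-defined type. The ? operator relies on the From impl to convert a validation failure into the backend's own error instead of panicking.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical stand-in for the framework's `BackendWalkError`.
#[derive(Debug, PartialEq)]
pub enum WalkError {
    UnknownOp(String),
}

/// A library-maker-defined backend error (names are illustrative).
#[derive(Debug, PartialEq)]
pub enum MyError {
    Walk(WalkError),
    Device(String),
}

impl fmt::Display for MyError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            MyError::Walk(WalkError::UnknownOp(op)) => write!(f, "unknown op: {op}"),
            MyError::Device(msg) => write!(f, "device error: {msg}"),
        }
    }
}

impl Error for MyError {}

impl From<WalkError> for MyError {
    fn from(e: WalkError) -> Self {
        MyError::Walk(e)
    }
}

/// Stand-in for one walker validation step: rejects ops it does not know.
fn check_op(op: &str) -> Result<(), WalkError> {
    match op {
        "Add" | "Mul" => Ok(()),
        other => Err(WalkError::UnknownOp(other.to_string())),
    }
}

/// `?` converts the walker's error into the backend's own error type.
pub fn walk(ops: &[&str]) -> Result<usize, MyError> {
    for op in ops {
        check_op(op)?;
    }
    Ok(ops.len())
}
```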
Provided Methods
fn add(&self, a: &Self::Tensor, b: &Self::Tensor) -> Result<Self::Tensor, Self::Error>
Element-wise a + b with NumPy broadcasting.
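The NumPy broadcasting rule shared by the element-wise binary ops can be sketched as a shape computation. The helper name broadcast_shape is hypothetical; the rule itself (align trailing dimensions, treat missing leading dimensions as 1, and require each pair to be equal or contain a 1) is the standard NumPy one.

```rust
/// Compute the NumPy broadcast of two shapes, or `None` if they are incompatible.
pub fn broadcast_shape(a: &[usize], b: &[usize]) -> Option<Vec<usize>> {
    let rank = a.len().max(b.len());
    let mut out = vec![0; rank];
    for i in 0..rank {
        // Align from the trailing dimension; missing leading dims count as 1.
        let da = if i < a.len() { a[a.len() - 1 - i] } else { 1 };
        let db = if i < b.len() { b[b.len() - 1 - i] } else { 1 };
        out[rank - 1 - i] = match (da, db) {
            (x, y) if x == y => x,
            (1, y) => y,
            (x, 1) => x,
            _ => return None,
        };
    }
    Some(out)
}
```

For example, shapes [2, 3] and [3] broadcast to [2, 3], while [2, 3] and [4] are rejected.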
fn sub(&self, a: &Self::Tensor, b: &Self::Tensor) -> Result<Self::Tensor, Self::Error>
Element-wise a - b with NumPy broadcasting.
fn mul(&self, a: &Self::Tensor, b: &Self::Tensor) -> Result<Self::Tensor, Self::Error>
Element-wise a * b with NumPy broadcasting.
fn div(&self, a: &Self::Tensor, b: &Self::Tensor) -> Result<Self::Tensor, Self::Error>
Element-wise a / b with NumPy broadcasting.
fn neg(&self, a: &Self::Tensor) -> Result<Self::Tensor, Self::Error>
Element-wise unary negation.
fn abs(&self, a: &Self::Tensor) -> Result<Self::Tensor, Self::Error>
Element-wise absolute value.
fn sqrt(&self, a: &Self::Tensor) -> Result<Self::Tensor, Self::Error>
Element-wise square root.
fn pow(&self, a: &Self::Tensor, b: &Self::Tensor) -> Result<Self::Tensor, Self::Error>
Element-wise a ** b with NumPy broadcasting.
fn exp(&self, a: &Self::Tensor) -> Result<Self::Tensor, Self::Error>
Element-wise natural exponential.
fn log(&self, a: &Self::Tensor) -> Result<Self::Tensor, Self::Error>
Element-wise natural logarithm.
fn matmul(&self, a: &Self::Tensor, b: &Self::Tensor) -> Result<Self::Tensor, Self::Error>
Matrix multiplication (NumPy semantics: 2-D × 2-D + batched higher-rank broadcasting).
fn reduce_sum(&self, a: &Self::Tensor, axes: &[i64], keepdims: bool) -> Result<Self::Tensor, Self::Error>
Sum-reduce a along axes. keepdims = true preserves
the reduced dims as length-1.
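The effect of keepdims on the result shape can be sketched as follows. The helper reduced_shape is hypothetical, and two behaviours are assumptions of this sketch rather than documented guarantees: negative axes count from the end (as in NumPy and ONNX), and an empty axes slice reduces nothing.

```rust
/// Output shape of a reduction over `axes`.
/// Returns `None` when an axis is out of range.
pub fn reduced_shape(shape: &[usize], axes: &[i64], keepdims: bool) -> Option<Vec<usize>> {
    let rank = shape.len() as i64;
    let mut reduced = vec![false; shape.len()];
    for &ax in axes {
        // Assumption: negative axes are relative to the last dimension.
        let ax = if ax < 0 { ax + rank } else { ax };
        if ax < 0 || ax >= rank {
            return None;
        }
        reduced[ax as usize] = true;
    }
    let mut out = Vec::new();
    for (dim, &is_reduced) in shape.iter().zip(&reduced) {
        match (is_reduced, keepdims) {
            (false, _) => out.push(*dim), // untouched dimension
            (true, true) => out.push(1),  // reduced but kept as length 1
            (true, false) => {}           // reduced and dropped
        }
    }
    Some(out)
}
```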
fn reduce_mean(&self, a: &Self::Tensor, axes: &[i64], keepdims: bool) -> Result<Self::Tensor, Self::Error>
Mean-reduce a along axes.
fn reduce_max(&self, a: &Self::Tensor, axes: &[i64], keepdims: bool) -> Result<Self::Tensor, Self::Error>
Max-reduce a along axes.
fn reduce_min(&self, a: &Self::Tensor, axes: &[i64], keepdims: bool) -> Result<Self::Tensor, Self::Error>
Min-reduce a along axes.
fn reshape(&self, a: &Self::Tensor, shape: &[i64]) -> Result<Self::Tensor, Self::Error>
Reshape a to the given dims. Total element count must
match.
fn transpose(&self, a: &Self::Tensor, perm: &[i64]) -> Result<Self::Tensor, Self::Error>
Transpose axes. Empty perm reverses all dims.
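A minimal sketch of the shape side of transpose, including the empty-perm rule stated above. The helper transposed_shape is hypothetical, and rejecting duplicate or out-of-range entries is an assumption about validation, not documented behaviour.

```rust
/// Shape after transposing with `perm`. An empty `perm` reverses all dims.
/// Returns `None` if `perm` is not a permutation of `0..rank`.
pub fn transposed_shape(shape: &[usize], perm: &[i64]) -> Option<Vec<usize>> {
    if perm.is_empty() {
        return Some(shape.iter().rev().copied().collect());
    }
    if perm.len() != shape.len() {
        return None;
    }
    let mut seen = vec![false; shape.len()];
    let mut out = Vec::with_capacity(shape.len());
    for &p in perm {
        if p < 0 {
            return None;
        }
        let p = p as usize;
        // Each source axis must be in range and used exactly once.
        if p >= shape.len() || seen[p] {
            return None;
        }
        seen[p] = true;
        out.push(shape[p]);
    }
    Some(out)
}
```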
fn concat(&self, inputs: &[&Self::Tensor], axis: i64) -> Result<Self::Tensor, Self::Error>
Concatenate inputs along axis.
fn slice(&self, a: &Self::Tensor, starts: &[i64], ends: &[i64], axes: &[i64], steps: &[i64]) -> Result<Self::Tensor, Self::Error>
NumPy-style slice. Empty axes defaults to all dims;
empty steps defaults to 1 per axis.
fn split(&self, a: &Self::Tensor, axis: i64, sizes: &[i64]) -> Result<Vec<Self::Tensor>, Self::Error>
Split a along axis into parts of the given sizes.
Empty sizes means equal-sized splits (count comes from
the consumer side downstream).
fn squeeze(&self, a: &Self::Tensor, axes: &[i64]) -> Result<Self::Tensor, Self::Error>
Remove dimensions of size 1. Empty axes removes all
size-1 dims.
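The two squeeze modes can be sketched on shapes alone. The helper squeezed_shape is hypothetical; treating a listed axis whose size is not 1 as an error, and accepting negative axes, follow ONNX conventions and are assumptions here.

```rust
/// Shape after squeezing. Empty `axes` removes every size-1 dimension;
/// otherwise only the listed axes are removed, and each must have size 1.
pub fn squeezed_shape(shape: &[usize], axes: &[i64]) -> Option<Vec<usize>> {
    if axes.is_empty() {
        return Some(shape.iter().copied().filter(|&d| d != 1).collect());
    }
    let rank = shape.len() as i64;
    let mut remove = vec![false; shape.len()];
    for &ax in axes {
        // Assumption: negative axes are relative to the last dimension.
        let ax = if ax < 0 { ax + rank } else { ax };
        if ax < 0 || ax >= rank || shape[ax as usize] != 1 {
            return None;
        }
        remove[ax as usize] = true;
    }
    let mut out = Vec::new();
    for (i, &d) in shape.iter().enumerate() {
        if !remove[i] {
            out.push(d);
        }
    }
    Some(out)
}
```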
fn unsqueeze(&self, a: &Self::Tensor, axes: &[i64]) -> Result<Self::Tensor, Self::Error>
Insert dimensions of size 1 at the given axes.
fn identity(&self, a: &Self::Tensor) -> Result<Self::Tensor, Self::Error>
Identity / clone — pass-through useful for graph rewrites.
fn cast(&self, a: &Self::Tensor, dtype: i32) -> Result<Self::Tensor, Self::Error>
Cast to the given ONNX DataType enum value (matches
bb_ir::proto::onnx::tensor_proto::DataType).
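Because dtype arrives as a raw i32, a backend typically validates it before casting. The sketch below maps a few codes from the ONNX TensorProto.DataType enumeration (FLOAT = 1, INT32 = 6, INT64 = 7, BOOL = 9, DOUBLE = 11) onto a hypothetical local DType enum; a real backend would more likely use the generated bb_ir type directly.

```rust
/// Hypothetical local view of the dtypes a backend supports.
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum DType {
    Float,
    Int32,
    Int64,
    Bool,
    Double,
}

/// Map an ONNX `DataType` code to a supported dtype, or `None` if unsupported.
pub fn dtype_from_onnx(code: i32) -> Option<DType> {
    match code {
        1 => Some(DType::Float),   // FLOAT
        6 => Some(DType::Int32),   // INT32
        7 => Some(DType::Int64),   // INT64
        9 => Some(DType::Bool),    // BOOL
        11 => Some(DType::Double), // DOUBLE
        _ => None,                 // includes UNDEFINED (0)
    }
}
```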
fn equal(&self, a: &Self::Tensor, b: &Self::Tensor) -> Result<Self::Tensor, Self::Error>
Element-wise a == b. Result is boolean-typed.
fn greater(&self, a: &Self::Tensor, b: &Self::Tensor) -> Result<Self::Tensor, Self::Error>
Element-wise a > b. Result is boolean-typed.
fn less(&self, a: &Self::Tensor, b: &Self::Tensor) -> Result<Self::Tensor, Self::Error>
Element-wise a < b. Result is boolean-typed.
fn r#where(&self, cond: &Self::Tensor, t: &Self::Tensor, f: &Self::Tensor) -> Result<Self::Tensor, Self::Error>
Element-wise ternary select: takes t where cond is true and f otherwise.
Named r#where because where is a reserved Rust keyword.
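A small sketch of the select semantics and of the raw-identifier syntax, using plain slices instead of Self::Tensor. The select helper and the Cpu type are hypothetical, and the sketch requires equal lengths rather than implementing broadcasting.

```rust
/// Element-wise select: `t[i]` where `cond[i]` is true, else `f[i]`.
/// Returns `None` if the three inputs differ in length.
pub fn select(cond: &[bool], t: &[f32], f: &[f32]) -> Option<Vec<f32>> {
    if cond.len() != t.len() || t.len() != f.len() {
        return None;
    }
    Some(
        cond.iter()
            .zip(t.iter().zip(f))
            .map(|(&c, (&x, &y))| if c { x } else { y })
            .collect(),
    )
}

/// Hypothetical backend type used only to show the call syntax.
pub struct Cpu;

impl Cpu {
    /// A method named after a keyword needs the `r#` prefix at both the
    /// definition and every call site.
    pub fn r#where(&self, cond: &[bool], t: &[f32], f: &[f32]) -> Option<Vec<f32>> {
        select(cond, t, f)
    }
}
```

Callers write backend.r#where(cond, t, f); omitting the prefix is a syntax error.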
fn constant(&self, value: TensorProto) -> Result<Self::Tensor, Self::Error>
Materialize a constant from an ONNX TensorProto. The
value attribute on the ONNX Constant op carries the
data; rank, dtype, raw bytes all come from the proto.
fn gather(&self, data: &Self::Tensor, indices: &Self::Tensor, axis: i64) -> Result<Self::Tensor, Self::Error>
Gather slices of data along axis indexed by indices.
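For intuition, here is gather restricted to axis 0 of a row-major 2-D buffer. The helper gather_rows is hypothetical; support for negative indices follows the ONNX Gather convention and is an assumption about this trait.

```rust
/// Gather rows (axis 0) from a row-major 2-D buffer with `cols` columns.
/// Returns `None` on a malformed buffer or an out-of-range index.
pub fn gather_rows(data: &[f32], cols: usize, indices: &[i64]) -> Option<Vec<f32>> {
    if cols == 0 || data.len() % cols != 0 {
        return None;
    }
    let rows = (data.len() / cols) as i64;
    let mut out = Vec::with_capacity(indices.len() * cols);
    for &idx in indices {
        // Assumption: negative indices count from the last row.
        let i = if idx < 0 { idx + rows } else { idx };
        if i < 0 || i >= rows {
            return None;
        }
        let start = i as usize * cols;
        out.extend_from_slice(&data[start..start + cols]);
    }
    Some(out)
}
```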
fn execute(&self, graph: &GraphProto, inputs: HashMap<String, Self::Tensor>, _attrs: BackendAttrs<'_>) -> Result<HashMap<String, Self::Tensor>, Self::Error>
Execute every NodeProto in graph.node against the value
env inputs. Returns the subset of values named in
graph.output.
graph.node is topologically ordered per the ONNX spec,
so the default walker (a linear scan) suffices for any
GraphProto whose ops are all in
bb_ir::tensor_primitives::TENSOR_PRIMITIVES_OPS. A
backend overriding this method may detect fused patterns,
compile to GPU, or any other strategy.
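The linear-scan idea can be sketched without the ONNX types. Node below is a hypothetical stand-in for NodeProto, tensors are plain Vec<f32>, errors are strings, and only two ops are handled; the point is that a topologically ordered node list needs just one pass over a name-to-value environment.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an ONNX NodeProto.
pub struct Node {
    pub op: &'static str,
    pub inputs: Vec<&'static str>,
    pub output: &'static str,
}

/// Walk `nodes` in order, then return only the values named in `outputs`.
pub fn run_graph(
    nodes: &[Node],
    outputs: &[&str],
    mut env: HashMap<String, Vec<f32>>,
) -> Result<HashMap<String, Vec<f32>>, String> {
    for node in nodes {
        let value: Vec<f32> = {
            // Topological order means every input is already in `env`.
            let mut args = Vec::with_capacity(node.inputs.len());
            for name in &node.inputs {
                args.push(env.get(*name).ok_or_else(|| format!("missing value `{name}`"))?);
            }
            match (node.op, args.as_slice()) {
                ("Add", [a, b]) => a.iter().zip(b.iter()).map(|(x, y)| x + y).collect(),
                ("Neg", [a]) => a.iter().map(|x| -x).collect(),
                (op, _) => return Err(format!("unsupported op or arity: {op}")),
            }
        };
        env.insert(node.output.to_string(), value);
    }
    let mut result = HashMap::new();
    for name in outputs {
        let v = env.remove(*name).ok_or_else(|| format!("missing output `{name}`"))?;
        result.insert(name.to_string(), v);
    }
    Ok(result)
}
```

An overriding backend could replace the match with pattern detection or a compiled kernel while keeping the same inputs-in, named-outputs-out contract.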
fn dispatch(&self, graph: &GraphProto, inputs: HashMap<String, Self::Tensor>, attrs: BackendAttrs<'_>, completion: CompletionHandle<HashMap<String, Self::Tensor>, Self::Error>) -> ContractResponse<HashMap<String, Self::Tensor>, Self::Error>
Dispatch a BackendSubgraph carrier — the engine-facing entry
point for whole-subgraph execution.
The default falls through to Self::execute synchronously,
keeping existing backends’ behaviour identical. Backends with
per-subgraph caching, JIT compilation, or async device execution
override this to:
- Cache the compiled subgraph by identity (e.g. graph name or hash).
- Return ContractResponse::Later and retain completion while the
  device runs. The engine schedules other work; the backend completes
  the handle from whatever runtime it uses: std::thread, a tokio task,
  a custom event loop, or a single-thread no-std loop.
- Fall through to Self::execute on compile failure or unsupported op.
The completion parameter in the default impl is intentionally
discarded (let _ = completion) because ContractResponse::Now
does not retain the handle. This is correct — only overriders
that return ContractResponse::Later must hold it.
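The Now / Later split can be sketched with standard-library pieces. Response and Completion below are hypothetical stand-ins for ContractResponse and CompletionHandle (whose real APIs are not shown on this page); the completion is modelled as a one-shot channel sender, and the "device" is a std::thread that doubles each element.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

/// Hypothetical stand-in for `ContractResponse`.
pub enum Response<T> {
    Now(T),
    Later,
}

/// Hypothetical stand-in for `CompletionHandle`: consumed on completion.
pub struct Completion<T>(Sender<T>);

impl<T> Completion<T> {
    pub fn complete(self, value: T) {
        // The receiver may already be gone; this sketch ignores that case.
        let _ = self.0.send(value);
    }
}

pub fn completion_pair<T>() -> (Completion<T>, Receiver<T>) {
    let (tx, rx) = channel();
    (Completion(tx), rx)
}

/// Synchronous path: compute immediately; the handle is not retained.
pub fn dispatch_now(input: Vec<f32>, completion: Completion<Vec<f32>>) -> Response<Vec<f32>> {
    let _ = completion;
    Response::Now(input.iter().map(|x| x * 2.0).collect())
}

/// Asynchronous path: move the handle into a worker and return `Later`.
pub fn dispatch_later(input: Vec<f32>, completion: Completion<Vec<f32>>) -> Response<Vec<f32>> {
    thread::spawn(move || {
        let out: Vec<f32> = input.iter().map(|x| x * 2.0).collect();
        completion.complete(out);
    });
    Response::Later
}
```

In the Later case the caller learns the result only through the completion, which is why an overrider must keep the handle alive until the work finishes.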
fn materialize_from_wire(&self, type_hash: u64, bytes: Vec<u8>) -> Result<Self::Tensor, Self::Error>
Materialise an inbound tensor SlotFill into this backend’s
native tensor representation.
The framework has already (a) capped bytes.len() against the
envelope’s EnvelopeCaps::max_per_fill_bytes, (b) charged the
length against NodeConfig::ingress_byte_budget, and (c) moved
ownership of the wire bytes into this call. The backend may
adopt the Vec<u8> directly (zero-copy via
ArrayD::from_shape_vec when alignment permits), pull a buffer
from a pool and copy in, or allocate fresh. The framework will
not touch bytes after this call returns.
The default delegates to the global wire-decoder registry: it
looks up the decoder for type_hash, runs it on the bytes,
then downcasts the resulting boxed SlotValue to Self::Tensor
via the registry’s Box<dyn Any> repackaging. Backends that
have not implemented tensor pooling continue to work through
this path; backends that override pay the registry hop only at
override time.
On Err, the engine drops the fill, releases the byte charge,
and emits WireReceiveError { kind: BackendMaterializeFailed }.
Ownership note: bytes is taken by value as Vec<u8> (not &[u8] or
Cow). This is the framework→backend handoff, NOT an external
boundary — the backend lives inside the framework ecosystem
and plays by the runtime contract. Principle 1a (ephemeral
borrowed slices at external boundaries) does not apply here:
the framework copied or owned the bytes already, and a backend
that wants to adopt them (zero-copy) needs ownership.
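As a sketch of the "allocate fresh" option, the function below consumes the owned byte vector and decodes it into f32 values. The little-endian layout, the WireError type, and the function name are all assumptions of this example; the real wire format is defined by the registry's decoder for the given type_hash.

```rust
/// Hypothetical error for a malformed payload.
#[derive(Debug, PartialEq)]
pub enum WireError {
    BadLength(usize),
}

/// Decode little-endian f32 wire bytes into a freshly allocated buffer.
/// Takes `bytes` by value, mirroring the ownership handoff described above.
pub fn f32s_from_wire(bytes: Vec<u8>) -> Result<Vec<f32>, WireError> {
    if bytes.len() % 4 != 0 {
        return Err(WireError::BadLength(bytes.len()));
    }
    // Copying path: valid regardless of how the byte buffer is aligned.
    Ok(bytes
        .chunks_exact(4)
        .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
        .collect())
}
```

A backend whose native tensor is itself byte-backed could instead store the Vec<u8> directly and avoid the copy, which is only possible because the vector is passed by value.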
Dyn Compatibility
This trait is dyn compatible.
In older versions of Rust, dyn compatibility was called "object safety".