Trait Backend 

pub trait Backend: Send + Sync {
    type Error: Error + Display + Send + Sync + From<BackendWalkError> + 'static;
    type Tensor: Clone + Send + Sync + 'static + Storage;

    // Provided methods
    fn add(&self, a: &Self::Tensor, b: &Self::Tensor) -> Result<Self::Tensor, Self::Error> { ... }
    fn sub(&self, a: &Self::Tensor, b: &Self::Tensor) -> Result<Self::Tensor, Self::Error> { ... }
    fn mul(&self, a: &Self::Tensor, b: &Self::Tensor) -> Result<Self::Tensor, Self::Error> { ... }
    fn div(&self, a: &Self::Tensor, b: &Self::Tensor) -> Result<Self::Tensor, Self::Error> { ... }
    fn neg(&self, a: &Self::Tensor) -> Result<Self::Tensor, Self::Error> { ... }
    fn abs(&self, a: &Self::Tensor) -> Result<Self::Tensor, Self::Error> { ... }
    fn sqrt(&self, a: &Self::Tensor) -> Result<Self::Tensor, Self::Error> { ... }
    fn pow(&self, a: &Self::Tensor, b: &Self::Tensor) -> Result<Self::Tensor, Self::Error> { ... }
    fn exp(&self, a: &Self::Tensor) -> Result<Self::Tensor, Self::Error> { ... }
    fn log(&self, a: &Self::Tensor) -> Result<Self::Tensor, Self::Error> { ... }
    fn matmul(&self, a: &Self::Tensor, b: &Self::Tensor) -> Result<Self::Tensor, Self::Error> { ... }
    fn reduce_sum(&self, a: &Self::Tensor, axes: &[i64], keepdims: bool) -> Result<Self::Tensor, Self::Error> { ... }
    fn reduce_mean(&self, a: &Self::Tensor, axes: &[i64], keepdims: bool) -> Result<Self::Tensor, Self::Error> { ... }
    fn reduce_max(&self, a: &Self::Tensor, axes: &[i64], keepdims: bool) -> Result<Self::Tensor, Self::Error> { ... }
    fn reduce_min(&self, a: &Self::Tensor, axes: &[i64], keepdims: bool) -> Result<Self::Tensor, Self::Error> { ... }
    fn reshape(&self, a: &Self::Tensor, shape: &[i64]) -> Result<Self::Tensor, Self::Error> { ... }
    fn transpose(&self, a: &Self::Tensor, perm: &[i64]) -> Result<Self::Tensor, Self::Error> { ... }
    fn concat(&self, inputs: &[&Self::Tensor], axis: i64) -> Result<Self::Tensor, Self::Error> { ... }
    fn slice(
        &self,
        a: &Self::Tensor,
        starts: &[i64],
        ends: &[i64],
        axes: &[i64],
        steps: &[i64],
    ) -> Result<Self::Tensor, Self::Error> { ... }
    fn split(&self, a: &Self::Tensor, axis: i64, sizes: &[i64]) -> Result<Vec<Self::Tensor>, Self::Error> { ... }
    fn squeeze(&self, a: &Self::Tensor, axes: &[i64]) -> Result<Self::Tensor, Self::Error> { ... }
    fn unsqueeze(&self, a: &Self::Tensor, axes: &[i64]) -> Result<Self::Tensor, Self::Error> { ... }
    fn identity(&self, a: &Self::Tensor) -> Result<Self::Tensor, Self::Error> { ... }
    fn cast(&self, a: &Self::Tensor, dtype: i32) -> Result<Self::Tensor, Self::Error> { ... }
    fn equal(&self, a: &Self::Tensor, b: &Self::Tensor) -> Result<Self::Tensor, Self::Error> { ... }
    fn greater(&self, a: &Self::Tensor, b: &Self::Tensor) -> Result<Self::Tensor, Self::Error> { ... }
    fn less(&self, a: &Self::Tensor, b: &Self::Tensor) -> Result<Self::Tensor, Self::Error> { ... }
    fn r#where(
        &self,
        cond: &Self::Tensor,
        t: &Self::Tensor,
        f: &Self::Tensor,
    ) -> Result<Self::Tensor, Self::Error> { ... }
    fn constant(&self, value: TensorProto) -> Result<Self::Tensor, Self::Error> { ... }
    fn gather(&self, data: &Self::Tensor, indices: &Self::Tensor, axis: i64) -> Result<Self::Tensor, Self::Error> { ... }
    fn execute(
        &self,
        graph: &GraphProto,
        inputs: HashMap<String, Self::Tensor>,
        _attrs: BackendAttrs<'_>,
    ) -> Result<HashMap<String, Self::Tensor>, Self::Error> { ... }
    fn dispatch(
        &self,
        graph: &GraphProto,
        inputs: HashMap<String, Self::Tensor>,
        attrs: BackendAttrs<'_>,
        completion: CompletionHandle<HashMap<String, Self::Tensor>, Self::Error>,
    ) -> ContractResponse<HashMap<String, Self::Tensor>, Self::Error> { ... }
    fn materialize_from_wire(&self, type_hash: u64, bytes: Vec<u8>) -> Result<Self::Tensor, Self::Error> { ... }
}
User-facing Contract trait for a tensor compute backend.

The Tensor associated type lets backends dispatch over their native storage (Dense<f32>, an ndarray::ArrayD<f32>, an opaque GPU handle, …); the framework round-trips through the producer/consumer SlotValue carriers via the derive bridge in crate::roles::BackendRuntime.

Self::Tensor: Clone is required because the per-op default impls clone tensors into a temporary HashMap<String, _> to feed Backend::execute. Backends that override the per-op methods directly never invoke this clone; backends that override only execute pay the clone on every per-op call. The cost depends on the tensor type: a reference-counted representation such as ndarray’s ArcArray clones the shape and bumps an internal refcount, which is a few-hundred-nanosecond cost rather than a memcpy, whereas an owned ndarray::ArrayD<f32> copies its element buffer on clone.
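The clone cost discussed above depends on whether the tensor shares or owns its buffer. A minimal std-only sketch of the two cases follows; `SharedTensor`, `OwnedTensor`, and `shares_buffer` are hypothetical names for illustration and are not part of this crate.

```rust
use std::sync::Arc;

/// Hypothetical tensor whose buffer is shared: `clone` copies the shape
/// vector and bumps the `Arc` refcount, leaving the elements in place.
#[derive(Clone)]
pub struct SharedTensor {
    pub shape: Vec<usize>,
    pub data: Arc<[f32]>,
}

/// Hypothetical tensor that owns its buffer: `clone` copies every element.
#[derive(Clone)]
pub struct OwnedTensor {
    pub shape: Vec<usize>,
    pub data: Vec<f32>,
}

/// True when two shared tensors point at the same allocation.
pub fn shares_buffer(a: &SharedTensor, b: &SharedTensor) -> bool {
    Arc::ptr_eq(&a.data, &b.data)
}
```

A backend whose tensors are large and frequently routed through the default per-op path would likely prefer the shared layout.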

Required Associated Types

type Error: Error + Display + Send + Sync + From<BackendWalkError> + 'static

Library-maker-defined error type. The From<BackendWalkError> bound lets the default per-op / execute_graph_via_per_op walker surface graph-validation failures as typed errors instead of panicking on peer-supplied or malformed GraphProto bodies.
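A minimal sketch of an error type that would satisfy these bounds. `BackendWalkError` here is a local stand-in, because the real type's fields are not shown on this page; `MyBackendError`, `validate`, `check_op`, and `assert_error_bounds` are hypothetical names.

```rust
use std::error::Error;
use std::fmt;

/// Stand-in for the crate's `BackendWalkError`; its real shape is an
/// assumption of this sketch.
#[derive(Debug)]
pub struct BackendWalkError(pub String);

/// Hypothetical backend error type.
#[derive(Debug)]
pub enum MyBackendError {
    Walk(BackendWalkError),
    Device(String),
}

impl fmt::Display for MyBackendError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            MyBackendError::Walk(e) => write!(f, "graph walk failed: {}", e.0),
            MyBackendError::Device(msg) => write!(f, "device error: {}", msg),
        }
    }
}

impl Error for MyBackendError {}

impl From<BackendWalkError> for MyBackendError {
    fn from(e: BackendWalkError) -> Self {
        MyBackendError::Walk(e)
    }
}

/// Compiles only for types that meet the bound listed on `Backend::Error`.
pub fn assert_error_bounds<E>()
where
    E: Error + fmt::Display + Send + Sync + From<BackendWalkError> + 'static,
{
}

/// Hypothetical validation step that reports a walk error.
fn validate(op_type: &str) -> Result<(), BackendWalkError> {
    if op_type.is_empty() {
        return Err(BackendWalkError("node has an empty op_type".to_string()));
    }
    Ok(())
}

/// `?` converts the walk error into the backend error through `From`,
/// so a malformed graph surfaces as a typed error rather than a panic.
pub fn check_op(op_type: &str) -> Result<(), MyBackendError> {
    validate(op_type)?;
    Ok(())
}
```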

type Tensor: Clone + Send + Sync + 'static + Storage

Native tensor representation.

Provided Methods

fn add(&self, a: &Self::Tensor, b: &Self::Tensor) -> Result<Self::Tensor, Self::Error>

Element-wise a + b with NumPy broadcasting.
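The NumPy broadcasting rule shared by the binary element-wise ops can be sketched as a shape computation. `broadcast_shape` is a hypothetical helper, not a function of this crate.

```rust
/// Computes the broadcast shape of two operands under NumPy rules:
/// shapes are aligned at the trailing dimension, and each pair of dims
/// must be equal or one of them must be 1. Returns `None` on a mismatch.
pub fn broadcast_shape(a: &[usize], b: &[usize]) -> Option<Vec<usize>> {
    let rank = a.len().max(b.len());
    let mut out = vec![0; rank];
    for i in 0..rank {
        // Walk from the trailing dimension; missing leading dims act as 1.
        let da = if i < a.len() { a[a.len() - 1 - i] } else { 1 };
        let db = if i < b.len() { b[b.len() - 1 - i] } else { 1 };
        out[rank - 1 - i] = match (da, db) {
            (x, y) if x == y => x,
            (1, y) => y,
            (x, 1) => x,
            _ => return None,
        };
    }
    Some(out)
}
```

For example, shapes `[4, 1, 3]` and `[5, 1]` broadcast to `[4, 5, 3]`, while `[2, 3]` and `[4]` are incompatible.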

fn sub(&self, a: &Self::Tensor, b: &Self::Tensor) -> Result<Self::Tensor, Self::Error>

Element-wise a - b with NumPy broadcasting.

fn mul(&self, a: &Self::Tensor, b: &Self::Tensor) -> Result<Self::Tensor, Self::Error>

Element-wise a * b with NumPy broadcasting.

fn div(&self, a: &Self::Tensor, b: &Self::Tensor) -> Result<Self::Tensor, Self::Error>

Element-wise a / b with NumPy broadcasting.

fn neg(&self, a: &Self::Tensor) -> Result<Self::Tensor, Self::Error>

Element-wise unary negation.

fn abs(&self, a: &Self::Tensor) -> Result<Self::Tensor, Self::Error>

Element-wise absolute value.

fn sqrt(&self, a: &Self::Tensor) -> Result<Self::Tensor, Self::Error>

Element-wise square root.

fn pow(&self, a: &Self::Tensor, b: &Self::Tensor) -> Result<Self::Tensor, Self::Error>

Element-wise a ** b with NumPy broadcasting.

fn exp(&self, a: &Self::Tensor) -> Result<Self::Tensor, Self::Error>

Element-wise natural exponential.

fn log(&self, a: &Self::Tensor) -> Result<Self::Tensor, Self::Error>

Element-wise natural logarithm.

fn matmul(&self, a: &Self::Tensor, b: &Self::Tensor) -> Result<Self::Tensor, Self::Error>

Matrix multiplication with NumPy semantics: 2-D × 2-D, plus batched broadcasting for higher-rank inputs.

fn reduce_sum(&self, a: &Self::Tensor, axes: &[i64], keepdims: bool) -> Result<Self::Tensor, Self::Error>

Sum-reduce a along axes. keepdims = true preserves the reduced dims as length-1.
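The effect of keepdims on the output shape can be sketched as follows. `reduced_shape` is a hypothetical helper; treating negative axes as counting from the end is an assumption borrowed from ONNX and NumPy, not something this page states.

```rust
/// Output shape of a reduction over `axes`.
/// With `keepdims`, each reduced dim stays as length 1; otherwise it is dropped.
pub fn reduced_shape(shape: &[usize], axes: &[i64], keepdims: bool) -> Vec<usize> {
    let rank = shape.len() as i64;
    // Assumption: negative axes count from the end.
    let reduced: Vec<usize> = axes
        .iter()
        .map(|&ax| (if ax < 0 { ax + rank } else { ax }) as usize)
        .collect();
    let mut out = Vec::new();
    for (i, &dim) in shape.iter().enumerate() {
        if !reduced.contains(&i) {
            out.push(dim);
        } else if keepdims {
            out.push(1);
        }
    }
    out
}
```

Reducing a `[2, 3, 4]` tensor over axis 1 gives `[2, 1, 4]` with keepdims and `[2, 4]` without.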

fn reduce_mean(&self, a: &Self::Tensor, axes: &[i64], keepdims: bool) -> Result<Self::Tensor, Self::Error>

Mean-reduce a along axes.

fn reduce_max(&self, a: &Self::Tensor, axes: &[i64], keepdims: bool) -> Result<Self::Tensor, Self::Error>

Max-reduce a along axes.

fn reduce_min(&self, a: &Self::Tensor, axes: &[i64], keepdims: bool) -> Result<Self::Tensor, Self::Error>

Min-reduce a along axes.

fn reshape(&self, a: &Self::Tensor, shape: &[i64]) -> Result<Self::Tensor, Self::Error>

Reshape a to the given dims. Total element count must match.

fn transpose(&self, a: &Self::Tensor, perm: &[i64]) -> Result<Self::Tensor, Self::Error>

Transpose axes. Empty perm reverses all dims.
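The empty-perm rule can be sketched as a shape computation. `transposed_shape` is a hypothetical helper; rejecting out-of-range or repeated entries is an assumption about validation that this page does not spell out.

```rust
use std::convert::TryFrom;

/// Output shape of a transpose. An empty `perm` reverses all dims;
/// otherwise `perm` must be a permutation of `0..rank`.
pub fn transposed_shape(shape: &[usize], perm: &[i64]) -> Option<Vec<usize>> {
    if perm.is_empty() {
        return Some(shape.iter().rev().copied().collect());
    }
    if perm.len() != shape.len() {
        return None;
    }
    let mut seen = vec![false; shape.len()];
    let mut out = Vec::with_capacity(shape.len());
    for &p in perm {
        // Reject negative or out-of-range entries.
        let idx = usize::try_from(p).ok().filter(|&i| i < shape.len())?;
        // Reject repeated entries: each source axis is used exactly once.
        if seen[idx] {
            return None;
        }
        seen[idx] = true;
        out.push(shape[idx]);
    }
    Some(out)
}
```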

fn concat(&self, inputs: &[&Self::Tensor], axis: i64) -> Result<Self::Tensor, Self::Error>

Concatenate inputs along axis.

fn slice(&self, a: &Self::Tensor, starts: &[i64], ends: &[i64], axes: &[i64], steps: &[i64]) -> Result<Self::Tensor, Self::Error>

NumPy-style slice. Empty axes defaults to all dims; empty steps defaults to 1 per axis.
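The two defaulting rules can be sketched as a small normalisation step that an implementation might run before slicing. `resolve_slice_defaults` is a hypothetical helper, not part of this crate.

```rust
/// Fills in the defaults described above: empty `axes` means every dim
/// of a rank-`rank` tensor, and empty `steps` means a step of 1 per axis.
pub fn resolve_slice_defaults(rank: usize, axes: &[i64], steps: &[i64]) -> (Vec<i64>, Vec<i64>) {
    let axes: Vec<i64> = if axes.is_empty() {
        (0..rank as i64).collect()
    } else {
        axes.to_vec()
    };
    let steps: Vec<i64> = if steps.is_empty() {
        vec![1; axes.len()]
    } else {
        steps.to_vec()
    };
    (axes, steps)
}
```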

fn split(&self, a: &Self::Tensor, axis: i64, sizes: &[i64]) -> Result<Vec<Self::Tensor>, Self::Error>

Split a along axis into parts of the given sizes. Empty sizes means equal-sized splits (count comes from the consumer side downstream).

fn squeeze(&self, a: &Self::Tensor, axes: &[i64]) -> Result<Self::Tensor, Self::Error>

Remove dimensions of size 1. Empty axes removes all size-1 dims.
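A shape-level sketch of the squeeze rule. `squeezed_shape` is a hypothetical helper; rejecting a listed axis whose size is not 1, and accepting negative axes, are assumptions borrowed from ONNX.

```rust
/// Output shape of a squeeze. Empty `axes` drops every size-1 dim;
/// otherwise only the listed dims are dropped, and each must have size 1.
pub fn squeezed_shape(shape: &[usize], axes: &[i64]) -> Option<Vec<usize>> {
    let rank = shape.len() as i64;
    let mut targets = Vec::with_capacity(axes.len());
    for &ax in axes {
        // Assumption: negative axes count from the end.
        let idx = if ax < 0 { ax + rank } else { ax };
        if idx < 0 || idx >= rank || shape[idx as usize] != 1 {
            return None;
        }
        targets.push(idx as usize);
    }
    let mut out = Vec::new();
    for (i, &dim) in shape.iter().enumerate() {
        let drop_dim = if axes.is_empty() { dim == 1 } else { targets.contains(&i) };
        if !drop_dim {
            out.push(dim);
        }
    }
    Some(out)
}
```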

fn unsqueeze(&self, a: &Self::Tensor, axes: &[i64]) -> Result<Self::Tensor, Self::Error>

Insert dimensions of size 1 at the given axes.

fn identity(&self, a: &Self::Tensor) -> Result<Self::Tensor, Self::Error>

Identity / clone — pass-through useful for graph rewrites.

fn cast(&self, a: &Self::Tensor, dtype: i32) -> Result<Self::Tensor, Self::Error>

Cast to the given ONNX DataType enum value (matches bb_ir::proto::onnx::tensor_proto::DataType).

fn equal(&self, a: &Self::Tensor, b: &Self::Tensor) -> Result<Self::Tensor, Self::Error>

Element-wise a == b. Result is boolean-typed.

fn greater(&self, a: &Self::Tensor, b: &Self::Tensor) -> Result<Self::Tensor, Self::Error>

Element-wise a > b. Result is boolean-typed.

fn less(&self, a: &Self::Tensor, b: &Self::Tensor) -> Result<Self::Tensor, Self::Error>

Element-wise a < b. Result is boolean-typed.

fn r#where(&self, cond: &Self::Tensor, t: &Self::Tensor, f: &Self::Tensor) -> Result<Self::Tensor, Self::Error>

Element-wise ternary: each output element is taken from t where cond is true and from f otherwise. Named r#where to dodge the reserved Rust keyword.

fn constant(&self, value: TensorProto) -> Result<Self::Tensor, Self::Error>

Materialize a constant from an ONNX TensorProto. The value attribute on the ONNX Constant op carries the data; rank, dtype, raw bytes all come from the proto.

fn gather(&self, data: &Self::Tensor, indices: &Self::Tensor, axis: i64) -> Result<Self::Tensor, Self::Error>

Gather slices of data along axis indexed by indices.
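For the common case of a row-major 2-D tensor gathered along axis 0, the operation amounts to copying whole rows. `gather_rows` is a hypothetical helper; accepting negative indices is an assumption borrowed from the ONNX Gather op.

```rust
/// Gathers rows of a row-major matrix with `cols` columns.
/// Returns `None` when the buffer is not rectangular or an index is out of range.
pub fn gather_rows(data: &[f32], cols: usize, indices: &[i64]) -> Option<Vec<f32>> {
    if cols == 0 || data.len() % cols != 0 {
        return None;
    }
    let rows = (data.len() / cols) as i64;
    let mut out = Vec::with_capacity(indices.len() * cols);
    for &idx in indices {
        // Assumption: negative indices count from the last row.
        let r = if idx < 0 { idx + rows } else { idx };
        if r < 0 || r >= rows {
            return None;
        }
        let start = r as usize * cols;
        out.extend_from_slice(&data[start..start + cols]);
    }
    Some(out)
}
```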

fn execute(&self, graph: &GraphProto, inputs: HashMap<String, Self::Tensor>, _attrs: BackendAttrs<'_>) -> Result<HashMap<String, Self::Tensor>, Self::Error>

Execute every NodeProto in graph.node against the value env inputs. Returns the subset of values named in graph.output.

graph.node is topologically ordered per the ONNX spec, so the default walker (a linear scan) suffices for any GraphProto whose ops are all in bb_ir::tensor_primitives::TENSOR_PRIMITIVES_OPS. A backend overriding this method may detect fused patterns, compile to GPU, or any other strategy.
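The linear-scan idea can be sketched without the real protobuf types. This is not the crate's walker: `Node` is a simplified stand-in for NodeProto, scalars stand in for tensors, and only three ops are handled.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a `NodeProto`: one op, named inputs, one output.
pub struct Node {
    pub op_type: &'static str,
    pub inputs: Vec<&'static str>,
    pub output: &'static str,
}

/// Walks topologically ordered nodes once, front to back.
/// Returns only the values named in `outputs`.
pub fn walk(
    nodes: &[Node],
    mut env: HashMap<String, f64>,
    outputs: &[&str],
) -> Result<HashMap<String, f64>, String> {
    for node in nodes {
        // Thanks to the ordering, every input is already in the env.
        let mut args = Vec::with_capacity(node.inputs.len());
        for name in &node.inputs {
            let value = env
                .get(*name)
                .copied()
                .ok_or_else(|| format!("missing value `{}`", name))?;
            args.push(value);
        }
        let result = match (node.op_type, args.as_slice()) {
            ("Add", [a, b]) => a + b,
            ("Mul", [a, b]) => a * b,
            ("Neg", [a]) => -a,
            (op, _) => return Err(format!("unsupported op or arity: {}", op)),
        };
        env.insert(node.output.to_string(), result);
    }
    // Keep only the graph outputs; fail if one was never produced.
    outputs
        .iter()
        .map(|name| {
            env.get(*name)
                .map(|v| (name.to_string(), *v))
                .ok_or_else(|| format!("missing output `{}`", name))
        })
        .collect()
}
```

A single pass is enough only because of the topological ordering; a graph whose nodes were shuffled would hit the missing-value error.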

fn dispatch(&self, graph: &GraphProto, inputs: HashMap<String, Self::Tensor>, attrs: BackendAttrs<'_>, completion: CompletionHandle<HashMap<String, Self::Tensor>, Self::Error>) -> ContractResponse<HashMap<String, Self::Tensor>, Self::Error>

Dispatch a BackendSubgraph carrier — the engine-facing entry point for whole-subgraph execution.

The default falls through to Self::execute synchronously, keeping existing backends’ behaviour identical. Backends with per-subgraph caching, JIT compilation, or async device execution override this to:

  • Cache the compiled subgraph by identity (e.g. graph name or hash).
  • Return ContractResponse::Later and retain completion while the device runs. The engine schedules other work; the backend completes the handle from whatever runtime it uses: std::thread, a tokio task, a custom event loop, or a single-threaded no-std loop.
  • Fall through to Self::execute on compile failure or unsupported op.

The completion parameter in the default impl is intentionally discarded (let _ = completion) because ContractResponse::Now does not retain the handle. This is correct — only overriders that return ContractResponse::Later must hold it.
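The two response paths can be sketched with std only. `Response` and `Completion` are stand-ins for ContractResponse and CompletionHandle: the variant names Now and Later come from the text above, while the payloads, the channel-based handle, and both function names are assumptions made for illustration.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

/// Stand-in for `ContractResponse`.
pub enum Response<T, E> {
    Now(Result<T, E>),
    Later,
}

/// Stand-in for `CompletionHandle`: a one-shot sender the backend keeps.
pub struct Completion<T, E>(Sender<Result<T, E>>);

impl<T, E> Completion<T, E> {
    /// Creates a handle plus the receiving end the engine would poll.
    pub fn pair() -> (Self, Receiver<Result<T, E>>) {
        let (tx, rx) = channel();
        (Completion(tx), rx)
    }

    /// Consumes the handle and delivers the result.
    pub fn complete(self, result: Result<T, E>) {
        // The receiver may already be gone; that is ignored in this sketch.
        let _ = self.0.send(result);
    }
}

/// Synchronous path: the result is ready, so the handle is not retained.
pub fn dispatch_now(x: i32, completion: Completion<i32, String>) -> Response<i32, String> {
    let _ = completion;
    Response::Now(Ok(x * 2))
}

/// Deferred path: the handle moves into a worker that completes it later.
pub fn dispatch_later(x: i32, completion: Completion<i32, String>) -> Response<i32, String> {
    thread::spawn(move || completion.complete(Ok(x * 2)));
    Response::Later
}
```

The deferred variant returns immediately; the caller learns the result only through the retained handle, which is why an overrider returning Later must not drop it.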

fn materialize_from_wire(&self, type_hash: u64, bytes: Vec<u8>) -> Result<Self::Tensor, Self::Error>

Materialise an inbound tensor SlotFill into this backend’s native tensor representation.

The framework has already (a) capped bytes.len() against the envelope’s EnvelopeCaps::max_per_fill_bytes, (b) charged the length against NodeConfig::ingress_byte_budget, and (c) moved ownership of the wire bytes into this call. The backend may adopt the Vec<u8> directly (zero-copy via ArrayD::from_shape_vec when alignment permits), pull a buffer from a pool and copy in, or allocate fresh. The framework will not touch bytes after this call returns.

The default delegates to the global wire-decoder registry: it looks up the decoder for type_hash, runs it on the bytes, then downcasts the resulting boxed SlotValue to Self::Tensor via the registry’s Box<dyn Any> repackaging. Backends that have not implemented tensor pooling continue to work through this path; backends that override pay the registry hop only at override time.

On Err, the engine drops the fill, releases the byte charge, and emits WireReceiveError { kind: BackendMaterializeFailed }.

Ownership note: bytes is taken as Vec<u8> by value (not &[u8] or Cow). This is the framework→backend handoff, not an external boundary: the backend lives inside the framework ecosystem and follows the runtime contract. Principle 1a (ephemeral borrowed slices at external boundaries) does not apply here, because the framework has already copied or taken ownership of the bytes, and a backend that wants to adopt them zero-copy needs ownership.
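Why by-value ownership matters can be sketched with a byte-backed tensor that adopts the wire buffer as-is. `RawTensor` is hypothetical, and the little-endian f32 encoding is an assumption of this sketch rather than the crate's wire format.

```rust
use std::convert::TryInto;

/// Hypothetical tensor that keeps the wire bytes and decodes on access.
pub struct RawTensor {
    bytes: Vec<u8>,
}

impl RawTensor {
    /// Takes ownership of `bytes` without copying any element.
    /// Fails when the length is not a whole number of f32 values.
    pub fn adopt(bytes: Vec<u8>) -> Result<Self, String> {
        if bytes.len() % 4 != 0 {
            return Err(format!("{} bytes is not a multiple of 4", bytes.len()));
        }
        Ok(RawTensor { bytes })
    }

    /// Address of the underlying buffer, exposed to show that no copy happened.
    pub fn as_ptr(&self) -> *const u8 {
        self.bytes.as_ptr()
    }

    /// Number of f32 elements.
    pub fn len(&self) -> usize {
        self.bytes.len() / 4
    }

    /// Decodes element `i` (assumed little-endian), or `None` when out of range.
    pub fn get(&self, i: usize) -> Option<f32> {
        let start = i.checked_mul(4)?;
        let chunk = self.bytes.get(start..start.checked_add(4)?)?;
        Some(f32::from_le_bytes(chunk.try_into().ok()?))
    }
}
```

With a `&[u8]` parameter the same constructor would have to copy the slice into a new allocation before it could store it.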

Dyn Compatibility

This trait is dyn compatible.

In older versions of Rust, dyn compatibility was called "object safety".

Implementors

impl Backend for CpuBackend

bb::Backend Contract impl. Overrides execute to run through graph_walker::execute_graph rather than the default per-op walker.