# Roles — the Contract reference
This document describes the role traits that library authors implement
when shipping a concrete component for the `bytesandbrains` framework.
The **user-facing surface** is the Contract trait family in
[`bb-runtime/src/contracts/`](../bb-runtime/src/contracts/):
`bb::Aggregator`, `bb::Backend`, `bb::Codec`, `bb::DataSource`,
`bb::Index`, `bb::Model`, `bb::PeerSelector`, plus the `Protocol` slot.
Authors implement Contract methods on their concrete struct and pair
the impl with `#[derive(bb::<Role>)]` (or, for protocols, the
`register_protocol!{}` declarative macro). The framework-internal
`<Role>Runtime` traits in `bb-runtime/src/roles/` are bridges the
derive emits — they are not part of the public authoring surface.
## Part 1 — Overview
Each Contract is a small trait carrying a `type Error`, one method
per operation the role surfaces, and (for `bb::Aggregator`) an
associated `type Metadata` that travels alongside the tensor. Every
method takes the same trio (`Backend` excepted — see Part 4):
1. `ctx: &mut RuntimeResourceRef<'_>` — the engine's per-dispatch
runtime surface. Impls reach declared `#[depends(<role> =
"<slot>")]` siblings through `ctx.dependency::<T>("<slot>")`;
`PeerSelector` additionally walks the local `AddressBook` via
`ctx.peers.addresses` and plans delayed work via `ctx.time`.
2. The op's typed inputs.
3. `completion: CompletionHandle<R, Self::Error>`.
The method returns `ContractResponse<R, Self::Error>` (inline
`Now` vs deferred `Later`).
Every tensor-carrying Contract — `Index`, `Aggregator`, `Model`,
`DataSource`, `Backend`, `Codec` — carries a `Storage`-bound
associated type that declares where in the tensor-type tree this
concrete sits (`TYPE_TENSOR_F32`, `TYPE_TENSOR_U8`, the generic
`TYPE_TENSOR` root, or any user-registered leaf). The compiler reads
`Storage::TYPE` at bind time to stamp `value_info` denotations; the
type solver walks the graph and refuses unbridged mismatches at
compile time. See [`docs/TYPES.md`](TYPES.md) for the `Storage`
trait and `AnyTensor` definitions.
Component authors do not emit `NodeProto`s directly. The DSL surface
that records role calls into a `GraphProto` lives in the **typed
placeholders** under
[`bb-ops/src/placeholders/mod.rs`](../bb-ops/src/placeholders/mod.rs):
the `Backend`, `Model`, `Index`, `Aggregator`, `Codec`,
`DataLoader`, `PeerSelector`, and `Protocol` unit structs. Each
placeholder method records a `NodeProto` under the role's opset
domain (`ai.bytesandbrains.role.<name>`, or `ai.onnx` for Backend)
and stamps `(required_trait, slot_id)` metadata so the compiler can
bind the call to the right concrete impl. Library authors that ship
their own concrete type expose its DSL methods as inherent methods
on the type; those follow the same recording shape.
## Part 2 — The Common Contract Shape
Every Contract method follows the same skeleton:
```rust
fn op(
&mut self, // (or &self for read-only ops)
ctx: &mut RuntimeResourceRef<'_>, // per-dispatch runtime surface
args: …, // typed args
completion: CompletionHandle<R, Self::Error>,
) -> ContractResponse<R, Self::Error>;
```
- `ctx` exposes `ctx.dependency::<T>("<slot>")` for declared
`#[depends(...)]` siblings, `ctx.open_completion::<R, E>()` for
minting fresh completion handles, plus the address book / scheduler
/ bus surface used by selectors and protocols.
- `ContractResponse::Now(Ok(value))`: the result is ready inline.
  The framework returns
  `DispatchResult::Immediate(vec![(port, Box::new(value) as Box<dyn SlotValue>)])`,
  skips the park / ingress-drain cycle, and ignores `completion`.
  `value` is dropped into the slot table as `Box<dyn SlotValue>` with
  no serialization at this boundary; downstream ops downcast via
  `as_any`.
- `ContractResponse::Now(Err(e))` — the call failed synchronously.
The error is propagated as the dispatch error.
- `ContractResponse::Later` — the impl retained `completion` (handed
it to a worker thread, spawned a task, queued a remote RPC). The
framework returns `DispatchResult::Async(handle.cmd_id())` and
parks the dispatched op until `completion.complete(result)` arrives
off-thread.
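The inline and deferred paths can be sketched with standard-library channels. This is a minimal illustration, not the framework's implementation: the `ContractResponse` and `CompletionHandle` types below are simplified stand-ins, and `open_completion`, `double_now`, `double_later`, and `resolve` are hypothetical helpers introduced only for this example.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

// Hypothetical stand-ins; the real types live in bb-runtime and carry more state.
pub enum ContractResponse<R, E> {
    Now(Result<R, E>),
    Later,
}

pub struct CompletionHandle<R, E> {
    tx: Sender<Result<R, E>>,
}

impl<R, E> CompletionHandle<R, E> {
    pub fn complete(self, result: Result<R, E>) {
        // A closed receiver is ignored in this sketch.
        let _ = self.tx.send(result);
    }
}

pub fn open_completion<R, E>() -> (CompletionHandle<R, E>, Receiver<Result<R, E>>) {
    let (tx, rx) = channel();
    (CompletionHandle { tx }, rx)
}

/// Inline path: the value is ready, so `completion` is simply dropped.
pub fn double_now(x: u32, _completion: CompletionHandle<u32, String>) -> ContractResponse<u32, String> {
    ContractResponse::Now(Ok(x * 2))
}

/// Deferred path: the handle moves to a worker thread and is completed there.
pub fn double_later(x: u32, completion: CompletionHandle<u32, String>) -> ContractResponse<u32, String> {
    thread::spawn(move || completion.complete(Ok(x * 2)));
    ContractResponse::Later
}

/// Caller side: use the inline value, or wait for the deferred completion.
pub fn resolve(resp: ContractResponse<u32, String>, rx: Receiver<Result<u32, String>>) -> Result<u32, String> {
    match resp {
        ContractResponse::Now(r) => r,
        ContractResponse::Later => rx.recv().map_err(|e| e.to_string())?,
    }
}
```

Both paths yield the same value to the caller; only the moment at which the result becomes available differs.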
The bridge generated by `#[derive(bb::<Role>)]` wires Contract
methods into the engine's `<Role>Runtime::dispatch_atomic` entry
point — the per-Node dispatch table routes
`(domain, op_type, instance)` tuples through the bridge to the right
Contract method on the bound concrete, forwarding the bridge's `ctx`
parameter into each Contract call.
Contract methods invoked from a `Module::bootstrap` recording run
through the identical dispatch path as body-phase calls — the engine
seeds bootstrap function bodies under a fresh `ExecId`, and the
per-component `is_op_locked` gate
(`bb-runtime/src/engine/core.rs:1762-1806`) parks body-phase ops
that touch any in-flight bootstrap's `ComponentRef` touch set.
Disjoint components keep firing alongside the bootstrap. See
[ENGINE.md §6.8](ENGINE.md#68-host-driven-bootstrap-entry).
`Backend` is the lone exception: its per-op surface, `execute`, and
`dispatch` all stay `ctx`-free for borrow-checker reasons (Part 4).
### Role bindings shared across install targets
`bb::install(.., targets: &[&str], ..)` constructs **one
`ComponentRef` per slot** even when several targets declare the
same slot (`src/install.rs:524-571`). The install path walks every
target's binding entries, groups by slot name, and asserts every
contributor agrees on the same `(TYPE_NAME, role)` pair; a
disagreement surfaces `InstallError::SlotBindingConflict { slot,
conflicts }` enumerating every contributor in call order
(`src/install.rs:142-148`). Concretely:
- A federated peer hosting both `Client` and `Server` partitions
binds `backend = "compute"` on each target. The compiler stamps
the same `Backend|CpuBackend|<slot_id>` value under
`binding.Client.compute` and `binding.Server.compute`; install
constructs one `CpuBackend`, registers one `ComponentRef`, and
both partitions' role-op dispatches route through the same
instance.
- The same applies to `Aggregator`, `Index`, `Codec`,
`DataSource`, `Model`, `PeerSelector`, and `Protocol` slots —
any slot a Contract impl reaches through
`ctx.dependency::<T>("<slot>")` is shared across every target
declaring the slot. State mutations a Contract method makes
(an Aggregator's contribution buffer, an Index's storage, a
Model's optimizer state) are observable to every other target
sharing the slot, by design.
- Targets that bind the same slot name to different concretes
fail at install time, not at dispatch — the
`SlotBindingConflict` walk runs before the install path
allocates any `ComponentRef`, so a misconfigured deployment
surfaces a typed error before any concrete is instantiated.
Authors who need a per-target slot identity wire two distinct slot
names: `bind_aggregator::<FedAvg>("client_aggregator")` for the
client partition's aggregator slot and
`bind_aggregator::<FedAvg>("server_aggregator")` for the server
partition's. Two slots, two `ComponentRef`s, no sharing — the
binding table addresses the slot, the dispatch table routes by the
addressed slot's `ComponentRef`.
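The group-by-slot agreement check can be sketched as follows. This is an illustration under simplifying assumptions: bindings are plain string tuples, `resolve_slots` is a hypothetical helper, and the `SlotBindingConflict` struct here only mirrors the shape of the error variant named above.

```rust
use std::collections::BTreeMap;

/// One binding entry: (target, slot, concrete type name, role).
/// A simplified stand-in for the compiler-stamped binding metadata.
pub type Binding<'a> = (&'a str, &'a str, &'a str, &'a str);

#[derive(Debug, PartialEq)]
pub struct SlotBindingConflict {
    pub slot: String,
    /// Every contributor in call order, as (target, type name, role).
    pub conflicts: Vec<(String, String, String)>,
}

/// Groups bindings by slot and returns one (type name, role) per slot,
/// or the first conflicting slot in slot-name order.
pub fn resolve_slots(
    bindings: &[Binding<'_>],
) -> Result<BTreeMap<String, (String, String)>, SlotBindingConflict> {
    let mut by_slot: BTreeMap<&str, Vec<(&str, &str, &str)>> = BTreeMap::new();
    for &(target, slot, type_name, role) in bindings {
        by_slot.entry(slot).or_default().push((target, type_name, role));
    }
    let mut resolved = BTreeMap::new();
    for (slot, contributors) in by_slot {
        let (_, first_type, first_role) = contributors[0];
        let agree = contributors
            .iter()
            .all(|&(_, t, r)| t == first_type && r == first_role);
        if !agree {
            // Nothing has been constructed yet, so the error surfaces early.
            return Err(SlotBindingConflict {
                slot: slot.to_string(),
                conflicts: contributors
                    .iter()
                    .map(|&(tg, t, r)| (tg.to_string(), t.to_string(), r.to_string()))
                    .collect(),
            });
        }
        resolved.insert(slot.to_string(), (first_type.to_string(), first_role.to_string()));
    }
    Ok(resolved)
}
```

Two targets that bind `compute` to the same concrete collapse into one entry, while two distinct slot names always produce two entries.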
## Part 3 — `bb::Aggregator`
A federated/decentralized aggregator. `contribute` writes one peer's
update into an in-progress buffer; `aggregate` reduces the buffer and
emits the result paired with typed metadata.
```rust
pub trait Aggregator: Send + Sync {
/// Storage element type. Most f32-native aggregators declare
/// `type Element = [f32]`. Use `AnyTensor` for a dtype-agnostic
/// aggregator that delegates numeric ops to a bound `Backend`.
type Element: ?Sized + bb_ir::types::Storage;
type Error: std::error::Error + std::fmt::Display + Send + Sync + 'static;
type Metadata: Clone
+ Default
+ serde::Serialize
+ for<'de> serde::Deserialize<'de>
+ Send + Sync + 'static;
fn contribute(
&mut self,
ctx: &mut RuntimeResourceRef<'_>,
src: PeerId,
tensor: &Self::Element,
metadata: Self::Metadata,
completion: CompletionHandle<(), Self::Error>,
) -> ContractResponse<(), Self::Error>;
fn aggregate(
&mut self,
ctx: &mut RuntimeResourceRef<'_>,
completion: CompletionHandle<(Box<Self::Element>, Self::Metadata), Self::Error>,
) -> ContractResponse<(Box<Self::Element>, Self::Metadata), Self::Error>;
}
```
`type Element` is the storage position. `type Metadata` is the typed
channel hierarchical aggregation rides on: a child `FedAvg`'s
`aggregate` emits `(params, FedAvgMeta { num_samples })`; a parent
layer's `contribute` receives the metadata and uses `num_samples` to
weight the child's contribution. Metadata moves through the slot table
as a typed Rust value; serde fires only when it crosses a wire
boundary. Impls with no metadata channel set `type Metadata = ();`.
The `ctx` parameter is the canonical hook for reaching a bound
`Backend` via `ctx.dependency::<B>("backend")` — that's how
`FedAvg<B>::aggregate` composes its weighted sum from the
backend's `Mul` + `Add` primitives without hardcoding a backend.
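To make the metadata channel concrete, here is a sample-weighted average written as a sketch. It deliberately drops `ctx`, the completion handle, and the bound `Backend`, and computes on plain `Vec<f32>`; `WeightedAvg` is a hypothetical type, and `FedAvgMeta` only mirrors the `{ num_samples }` shape mentioned above.

```rust
/// Metadata a child layer emits alongside its parameters.
#[derive(Clone, Default, Debug, PartialEq)]
pub struct FedAvgMeta {
    pub num_samples: u64,
}

/// Simplified aggregator: plain `Vec<f32>` tensors, synchronous results.
#[derive(Default)]
pub struct WeightedAvg {
    weighted_sum: Vec<f32>,
    total_samples: u64,
}

impl WeightedAvg {
    /// Adds one contribution, weighted by the sample count in its metadata.
    pub fn contribute(&mut self, tensor: &[f32], meta: FedAvgMeta) -> Result<(), String> {
        if self.weighted_sum.is_empty() {
            self.weighted_sum = vec![0.0; tensor.len()];
        } else if self.weighted_sum.len() != tensor.len() {
            return Err("shape mismatch".to_string());
        }
        let w = meta.num_samples as f32;
        for (acc, x) in self.weighted_sum.iter_mut().zip(tensor) {
            *acc += w * x;
        }
        self.total_samples += meta.num_samples;
        Ok(())
    }

    /// Reduces the buffer and emits `(params, metadata)`. The metadata carries
    /// the combined sample count so a parent layer can weight this result in turn.
    pub fn aggregate(&mut self) -> Result<(Vec<f32>, FedAvgMeta), String> {
        if self.total_samples == 0 {
            return Err("no contributions".to_string());
        }
        let n = self.total_samples as f32;
        let params: Vec<f32> = self.weighted_sum.iter().map(|s| s / n).collect();
        let meta = FedAvgMeta { num_samples: self.total_samples };
        self.weighted_sum.clear();
        self.total_samples = 0;
        Ok((params, meta))
    }
}
```

Feeding the `(params, meta)` pair of one `WeightedAvg` into the `contribute` of another gives the hierarchical behaviour described above: the parent weights each child by the number of samples it summarises.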
**Implementing it.** Pair the Contract impl with
`#[derive(bb::Aggregator)]`. The derive emits the
`AggregatorRuntime` bridge plus the `ConcreteComponent` and
`inventory::submit!` registration; the framework wires the bridge
into the engine's dispatch table.
**DSL surface.** The generic placeholder is
`bb_ops::placeholders::Aggregator`. Its methods record under
`ai.bytesandbrains.role.aggregator`:
- `contribute(g, contribution, metadata) -> Output`
- `aggregate(g, trigger) -> (Output, Output)` — emits
`(params, metadata)`.
A library author shipping a concrete aggregator exposes equivalent
DSL methods as inherent methods on the type.
## Part 4 — `bb::Backend`
A tensor compute backend. The Contract has **two surfaces**, exposed
side-by-side, and a backend overrides whichever side is natural for
its target. Backend's user-facing methods are the **only** Contract
methods that do not take `ctx` — see *Backend ctx exemption* below.
1. **One typed method per mandatory primitive**, covering the 30 entries in
   `bb_ir::tensor_primitives::TENSOR_PRIMITIVES_OPS`:
`add`, `sub`, `mul`, `div`, `neg`, `abs`, `sqrt`, `pow`, `exp`,
`log`, `matmul`, `reduce_sum`, `reduce_mean`, `reduce_max`,
`reduce_min`, `reshape`, `transpose`, `concat`, `slice`, `split`,
`squeeze`, `unsqueeze`, `identity`, `cast`, `equal`, `greater`,
`less`, `r#where`, `constant`, `gather`.
2. **One method to execute a subgraph** —
`execute(&GraphProto, HashMap<String, Tensor>, BackendAttrs<'_>)
-> Result<HashMap<String, Tensor>, Self::Error>`.
```rust
pub trait Backend: Send + Sync {
type Error: std::error::Error
+ std::fmt::Display
+ Send + Sync
+ From<backend_default_walk::BackendWalkError>
+ 'static;
type Tensor: Clone + Send + Sync + 'static + bb_ir::types::Storage;
fn add(&self, a: &Self::Tensor, b: &Self::Tensor) -> Result<Self::Tensor, Self::Error> { … }
fn matmul(&self, a: &Self::Tensor, b: &Self::Tensor) -> Result<Self::Tensor, Self::Error> { … }
// … 28 more per-op methods, each with a default that wraps into `execute`.
fn execute(
&self,
graph: &GraphProto,
inputs: HashMap<String, Self::Tensor>,
attrs: BackendAttrs<'_>,
) -> Result<HashMap<String, Self::Tensor>, Self::Error> {
backend_default_walk::execute_graph_via_per_op(self, graph, inputs)
}
}
```
Defaults in [`bb-runtime/src/contracts/backend_default_walk.rs`](../bb-runtime/src/contracts/backend_default_walk.rs)
bridge the two sides so the author overrides only one:
- A CPU backend overrides the 30 per-op methods (`add` runs ndarray's
`Add`, `matmul` runs `dot`, …). `execute`'s default walker calls
the per-op overrides.
- A whole-graph backend (Burn, ORT, Candle, tch) overrides `execute`
natively (compile the `GraphProto` to native IR, run once). Per-op
defaults wrap a one-node `GraphProto` and call `execute`.
A backend overriding **neither** side stack-overflows: every per-op
default delegates to `execute`, whose default walks back into per-op.
Backends MUST override at least one side.
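The mutual-default arrangement is easier to see in miniature. The trait below is a hypothetical two-op reduction of the idea (scalar `f32` values, a one-node `Op` in place of a `GraphProto`); it is not the real `Backend` trait.

```rust
/// A one-node "graph": enough structure to show the delegation, nothing more.
pub enum Op {
    Add,
    Mul,
}

/// Hypothetical miniature of the two-surface shape: each default calls the other side.
pub trait MiniBackend {
    // Per-op surface: defaults wrap a one-node graph and call `execute`.
    fn add(&self, a: f32, b: f32) -> f32 {
        self.execute(&Op::Add, a, b)
    }
    fn mul(&self, a: f32, b: f32) -> f32 {
        self.execute(&Op::Mul, a, b)
    }
    // Graph surface: the default walks the graph and calls the per-op methods.
    fn execute(&self, op: &Op, a: f32, b: f32) -> f32 {
        match op {
            Op::Add => self.add(a, b),
            Op::Mul => self.mul(a, b),
        }
    }
}

/// Overrides the per-op side; the inherited `execute` reaches these overrides.
pub struct PerOpBackend;
impl MiniBackend for PerOpBackend {
    fn add(&self, a: f32, b: f32) -> f32 { a + b }
    fn mul(&self, a: f32, b: f32) -> f32 { a * b }
}

/// Overrides the graph side; the inherited `add` / `mul` reach this override.
pub struct GraphBackend;
impl MiniBackend for GraphBackend {
    fn execute(&self, op: &Op, a: f32, b: f32) -> f32 {
        match op {
            Op::Add => a + b,
            Op::Mul => a * b,
        }
    }
}

// A type with an empty `impl MiniBackend for T {}` would compile, and any call
// on it would bounce between the two defaults until the stack overflows.
```

Either override is enough to break the cycle, which is why the rule is "at least one side" rather than "both".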
Activation functions, pooling, normalization, Conv, and other
extension ops are not on the Contract surface. A backend can declare
extension opsets and handle them in its `execute` override, or a
future lowering pass decomposes them into primitives.
**Backend ctx exemption.** Every other Contract method takes
`ctx: &mut RuntimeResourceRef<'_>` as its first positional
parameter; Backend's user-facing methods do not. The canonical
pattern that motivates `ctx` —
`let backend = ctx.dependency::<B>("backend")?; backend.mul(&a,
&b)?;` — borrows `&B` from `ctx`, and a `mul` signature that took
`&mut ctx` would re-borrow it while the previous borrow is still
live (E0502). Backend is the terminal dependency in the injection
chain (a leaf, not a composition seam), so the exemption costs
nothing: kernels stay pure tensor functions. The derive-emitted
`BackendRuntime::dispatch_atomic` bridge still receives `ctx` and
threads `current_node_attributes` + `current_node_metadata` into a
`BackendAttrs<'_>` for the `execute` override.
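The borrow pattern behind the exemption can be reproduced with stand-in types. In this sketch `Ctx` and `ToyBackend` are hypothetical, and `dependency` is a non-generic simplification of the lookup described above; the point is only that a ctx-free kernel can be called while the shared borrow obtained from `ctx` is still live.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a backend whose kernels take no `ctx`.
pub struct ToyBackend;

impl ToyBackend {
    pub fn mul(&self, a: &[f32], b: &[f32]) -> Vec<f32> {
        a.iter().zip(b).map(|(x, y)| x * y).collect()
    }
}

/// Hypothetical stand-in for the per-dispatch runtime surface.
pub struct Ctx {
    deps: HashMap<&'static str, ToyBackend>,
}

impl Ctx {
    pub fn with_backend(slot: &'static str) -> Self {
        let mut deps = HashMap::new();
        deps.insert(slot, ToyBackend);
        Ctx { deps }
    }

    /// Returns a shared borrow tied to `&self`, like a dependency lookup would.
    pub fn dependency(&self, slot: &str) -> Result<&ToyBackend, String> {
        self.deps
            .get(slot)
            .ok_or_else(|| format!("no dependency bound to slot `{}`", slot))
    }
}

/// A role method: receives `&mut Ctx`, looks up the backend, calls a ctx-free kernel.
pub fn scale(ctx: &mut Ctx, a: &[f32], b: &[f32]) -> Result<Vec<f32>, String> {
    let backend = ctx.dependency("backend")?; // shared borrow of `*ctx` starts here
    // If `mul` also took `&mut Ctx`, passing `ctx` here would require a mutable
    // borrow while `backend` is still live, which the borrow checker rejects.
    Ok(backend.mul(a, b))
}
```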
**Implementing it.** Pair the Contract impl with
`#[derive(bb::Backend)]`. The derive emits the `BackendRuntime`
bridge; `dispatch_atomic` routes each `ai.onnx::*` call through
`Backend::execute` on a single-node `GraphProto` so per-op overrides
on the Contract receive the dispatch automatically.
**DSL surface.** `bb_ops::placeholders::Backend` ships the full
`ai.onnx v1` DSL catalog (~48 methods including the 30 primitives,
common activations, normalization, conv/pool, gather/scatter, and
`If`/`Loop` subgraph carriers). Each method records an `ai.onnx::*`
`NodeProto`; the compiler's subgraph-collapsing pass fuses
contiguous `ai.onnx` runs into `BackendSubgraph` calls that the
bound backend's `execute` consumes.
### Backend-owned tensor memory
Tensors flowing through a `Backend`-bound slot are **thin
Arc-style handles** around backend-managed buffers (think
`std::shared_ptr`). The backend allocates, owns, and is free to
pool / reuse / free the underlying buffer; the framework holds
the handle. `Backend::Tensor: Clone + Send + Sync + 'static`
makes the handle cheap to copy across slots
(`Clone` becomes `Arc::clone` for pooling-friendly backends).
`CpuBackend` is the canonical in-tree implementer:
```rust
pub struct CpuTensor(pub(crate) Arc<CpuBackendBuffer>);
pub(crate) struct CpuBackendBuffer {
pub(crate) data: ArrayD<f32>,
pub(crate) dims_i64: Vec<i64>,
pub(crate) charged_bytes: usize,
}
```
(`bb-ops/src/backends/cpu/tensor.rs:44-65`.) `Clone` is `Arc::clone`
(O(1) refcount bump); FedAvg's per-peer `tensor.clone()`
(`bb-ops/src/aggregators/fedavg/mod.rs`) costs one atomic
increment, not a `Vec<f32>` deep copy.
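The handle semantics can be checked with a few lines of standard-library code. `Handle` and `Buffer` below are hypothetical simplifications (a `Vec<f32>` in place of the `ArrayD<f32>` buffer); the behaviour shown is that of `std::sync::Arc` itself.

```rust
use std::sync::Arc;

/// Hypothetical buffer type standing in for a backend-managed allocation.
pub struct Buffer {
    pub data: Vec<f32>,
}

/// Thin handle: cloning bumps the refcount and shares the same allocation.
#[derive(Clone)]
pub struct Handle(Arc<Buffer>);

impl Handle {
    pub fn new(data: Vec<f32>) -> Self {
        Handle(Arc::new(Buffer { data }))
    }

    pub fn data(&self) -> &[f32] {
        &self.0.data
    }

    /// True when both handles point at the same underlying buffer.
    pub fn shares_buffer_with(&self, other: &Handle) -> bool {
        Arc::ptr_eq(&self.0, &other.0)
    }

    /// Number of live handles to this buffer.
    pub fn handle_count(&self) -> usize {
        Arc::strong_count(&self.0)
    }
}
```

A clone never copies `data`; the buffer is freed only when the last handle is dropped.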
### `materialize_from_wire` Contract method
```rust
fn materialize_from_wire(
&self,
type_hash: u64,
bytes: Vec<u8>,
) -> Result<Self::Tensor, Self::Error>;
```
(`bb-runtime/src/contracts/backend.rs:497-522`.) The framework
calls this when a tensor `SlotFill` arrives at a slot whose
binding is a `Backend` role. Lifecycle:
1. Wire-decode runs `EnvelopeCaps::max_per_fill_bytes` cap +
`Engine::ingress_byte_budget` `try_charge` before the
backend sees anything (Principle 1).
2. The framework `mem::take`s `fill.payload` (already
framework-owned from envelope decode) and hands the
`Vec<u8>` to `materialize_from_wire` **by value** —
ownership transfer, not a borrow.
3. The backend may adopt the bytes zero-copy
(`ArrayD::from_shape_vec` when alignment permits), pull a
buffer from a pool and copy in, or fresh-allocate. The
framework will not touch `bytes` after the call returns.
4. On `Ok(tensor)` the engine wraps the result in
`BackendTensorCarrier`
(`bb-runtime/src/slot_value.rs:43-174`) and stamps the
accounting fields (`charged_bytes`, `backend_ref`).
5. On `Err` the engine releases the byte charge, drops the
fill, and emits
`InfraEvent::WireReceiveError::BackendMaterializeFailed`.
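The charge, take, hand-off, and release-on-error sequence can be sketched as below. Everything here is a hypothetical simplification: `Fill`, `Budget`, `materialize`, and `ingest` are illustrative names, the decode step copies little-endian `f32`s rather than adopting the buffer, and the charge happens inside `ingest` instead of at wire-decode time.

```rust
use std::mem;

/// Hypothetical inbound fill; the real `SlotFill` carries more fields.
pub struct Fill {
    pub payload: Vec<u8>,
}

/// Hypothetical byte budget with charge / release accounting.
pub struct Budget {
    pub remaining: usize,
}

impl Budget {
    pub fn try_charge(&mut self, n: usize) -> bool {
        if n > self.remaining {
            return false;
        }
        self.remaining -= n;
        true
    }

    pub fn release(&mut self, n: usize) {
        self.remaining += n;
    }
}

/// Stand-in for the backend's materialise step: takes the bytes by value.
pub fn materialize(bytes: Vec<u8>) -> Result<Vec<f32>, String> {
    if bytes.len() % 4 != 0 {
        return Err(format!("payload length {} is not a multiple of 4", bytes.len()));
    }
    Ok(bytes
        .chunks_exact(4)
        .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
        .collect())
}

/// Charge first, then move the payload out of the fill and hand it over by value.
/// On failure the charge is released; the emptied fill is left for the caller to drop.
pub fn ingest(fill: &mut Fill, budget: &mut Budget) -> Result<Vec<f32>, String> {
    let charged = fill.payload.len();
    if !budget.try_charge(charged) {
        return Err("ingress byte budget exceeded".to_string());
    }
    let bytes = mem::take(&mut fill.payload); // `fill.payload` is now an empty Vec
    materialize(bytes).map_err(|e| {
        budget.release(charged);
        e
    })
}
```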
**Ownership rationale.** The method takes `Vec<u8>` by value (not
`&[u8]` or `Cow`) because this is the framework-to-backend handoff,
not an external boundary. Principle 1a (ephemeral borrowed slices)
applies to transport ingress, not to framework-internal handoffs.
The backend lives inside the framework ecosystem and plays by
the runtime contract.
**Default impl.** The trait provides a default that delegates to
the global `wire_decoder_registry()`: look up the decoder for
`type_hash`, run it on the bytes, downcast the resulting
`Box<dyn SlotValue>` to `Self::Tensor`. Backends without tensor
pooling work through this default; backends that override pay
the registry hop only at override time. The derive bridge in
`#[derive(bb::Backend)]` (`bb-derive/src/roles.rs:368-389`)
generates the `BackendRuntime::materialize_from_wire` forwarding
shim automatically.
**`bb::Backend` is the only role with a dedicated wire-materialise
hook.** Other roles (`Aggregator`, `Index`, `Model`, …) receive
their tensors through `RuntimeResourceRef::dependency::<B>()`
already materialised by the bound backend; they never see a
`Vec<u8>` from the wire directly.
## Part 5 — `bb::Codec`
A typed in/out storage bridge — quantizers (affine int8, PQ), dtype
lifts (f32 ↔ f16), opaque-bytes compressors (zstd). `Codec` is the
only Contract with two `Storage`-bound associated types because it
bridges two positions in the tensor-type tree. Authors wire it
explicitly when an upstream output type doesn't unify with a
downstream port type; the compiler reports the mismatch and the
author chooses the appropriate `Codec` impl.
```rust
pub trait Codec: Send + Sync {
/// Input storage position.
type In: ?Sized + bb_ir::types::Storage;
/// Output storage position. Different position from `In`
/// (an identity bridge carries no value — remove it instead).
type Out: ?Sized + bb_ir::types::Storage;
type Error: std::error::Error + std::fmt::Display + Send + Sync + 'static;
/// Optional training pass (calibration for quantizers, k-means
/// for PQ codebooks). Plain dtype casts skip this.
/// Default returns `Now(Ok(()))`.
fn train(
&mut self,
ctx: &mut RuntimeResourceRef<'_>,
samples: &[&Self::In],
completion: CompletionHandle<(), Self::Error>,
) -> ContractResponse<(), Self::Error> { ContractResponse::Now(Ok(())) }
/// `In → Out`.
fn encode(
&self,
ctx: &mut RuntimeResourceRef<'_>,
input: &Self::In,
completion: CompletionHandle<Box<Self::Out>, Self::Error>,
) -> ContractResponse<Box<Self::Out>, Self::Error>;
/// `Out → In`. Lossy codecs implement the best-effort inverse.
fn decode(
&self,
ctx: &mut RuntimeResourceRef<'_>,
encoded: &Self::Out,
completion: CompletionHandle<Box<Self::In>, Self::Error>,
) -> ContractResponse<Box<Self::In>, Self::Error>;
}
```
Example — f32 → u8 affine quantizer:
```rust
#[derive(bb::Concrete, bb::Codec)]
struct Int8AffineQuantizer { scale: f32, zero_point: i32 }
impl Codec for Int8AffineQuantizer {
type In = [f32]; // TYPE_TENSOR_F32
type Out = [u8]; // TYPE_TENSOR_U8
type Error = QuantizeError;
fn train(&mut self, ctx, samples, …) { /* compute scale + zero_point */ }
fn encode(&self, ctx, x, …) { /* affine quantize */ }
fn decode(&self, ctx, y, …) { /* affine dequantize */ }
}
```
A codec that materializes calibration tensors on-device reaches the
bound Backend through `ctx.dependency::<MyBackend>("backend")` —
the same dep-injection chain every non-Backend role uses.
`train(samples)` runs once per codec instance to fit the bridge.
Affine int8 quantizers compute `(scale, zero_point)` from the
sample slice; PQ codecs run k-means per sub-vector to build the
codebooks; plain dtype casts (`f32 → f16`, `bf16 → f32`) skip the
call. The same bootstrap-vs-barrier ordering options apply as
with `Index::train`: record the call inside `Module::bootstrap`
to gate body-phase `encode` / `decode` ops on training
completion (the `is_op_locked` gate parks every body op touching
the bound Codec — see [Part 11](#part-11--bbbootstrap)), or wire
the trigger through a `bb.barrier`.
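For the affine case, the arithmetic behind `train`, `encode`, and `decode` might look like the sketch below. It is written as free functions on plain slices rather than as a `Codec` impl, and the specific choices (mapping the observed range onto `0..=255`, widening the range to include zero) are assumptions of this example, not something the framework prescribes.

```rust
/// Affine f32 <-> u8 quantizer parameters.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Affine {
    pub scale: f32,
    pub zero_point: i32,
}

/// "Training": fit `(scale, zero_point)` so the observed range maps onto 0..=255.
pub fn fit(samples: &[&[f32]]) -> Option<Affine> {
    let mut min = f32::INFINITY;
    let mut max = f32::NEG_INFINITY;
    for &x in samples.iter().flat_map(|s| s.iter()) {
        min = min.min(x);
        max = max.max(x);
    }
    if !min.is_finite() || !max.is_finite() {
        return None; // no samples, or infinite values in the data
    }
    // Include 0.0 in the range so that zero is exactly representable.
    let (min, max) = (min.min(0.0), max.max(0.0));
    let scale = if max > min { (max - min) / 255.0 } else { 1.0 };
    let zero_point = (-min / scale).round() as i32;
    Some(Affine { scale, zero_point })
}

/// `In -> Out`: quantize, clamping values outside the fitted range.
pub fn encode(q: Affine, input: &[f32]) -> Vec<u8> {
    input
        .iter()
        .map(|&x| ((x / q.scale).round() as i32 + q.zero_point).clamp(0, 255) as u8)
        .collect()
}

/// `Out -> In`: best-effort inverse; the rounding error is not recoverable.
pub fn decode(q: Affine, encoded: &[u8]) -> Vec<f32> {
    encoded
        .iter()
        .map(|&b| (b as i32 - q.zero_point) as f32 * q.scale)
        .collect()
}
```

Calling `encode` before `fit` has no meaningful parameters to work with, which is the ordering problem the bootstrap and barrier options above address.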
**Implementing it.** Pair with `#[derive(bb::Codec)]`.
**DSL surface.** `bb_ops::placeholders::CodecSlot` records under
`ai.bytesandbrains.role.codec`:
- `encode(g, input) -> Output`
- `decode(g, encoded) -> Output`
- `train(g, samples) -> Output` (`TYPE_TRIGGER`)
## Part 6 — `bb::DataSource`
A data source / data loader. Produces batches into the Module.
```rust
pub trait DataSource: Send + Sync {
/// Sample storage type. Covers both the batch tensor and the
/// optional labels tensor. Implement as `[f32]` for flat f32
/// sample batches.
type Sample: ?Sized + bb_ir::types::Storage;
type Error: std::error::Error + std::fmt::Display + Send + Sync + 'static;
fn next_batch(
&mut self,
ctx: &mut RuntimeResourceRef<'_>,
completion: CompletionHandle<(Box<Self::Sample>, Box<Self::Sample>), Self::Error>,
) -> ContractResponse<(Box<Self::Sample>, Box<Self::Sample>), Self::Error>;
fn reset(
&mut self,
ctx: &mut RuntimeResourceRef<'_>,
completion: CompletionHandle<(), Self::Error>,
) -> ContractResponse<(), Self::Error>;
fn on_data_loaded(
&mut self,
ctx: &mut RuntimeResourceRef<'_>,
completion: CompletionHandle<(), Self::Error>,
) -> ContractResponse<(), Self::Error>;
}
```
`next_batch` returns `(batch, labels)` as boxed `Self::Sample` slices;
the second slot is zero-length for unsupervised sources.
`on_data_loaded` is a one-shot notification a source fires once its
data is ready to read (e.g. dataset download complete). A source
that lands its batch tensors on a device-resident backend reaches
the bound concrete via `ctx.dependency::<MyBackend>("backend")`.
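The `(batch, labels)` convention, including the zero-length labels slot, can be illustrated with an in-memory source. This sketch is synchronous and ctx-free; `InMemorySource` is a hypothetical type, it treats each `f32` as one sample, and it signals exhaustion with `None`, which is a choice of this example rather than documented framework behaviour.

```rust
/// In-memory source over a flat f32 buffer; labels are optional.
pub struct InMemorySource {
    samples: Vec<f32>,
    labels: Option<Vec<f32>>,
    batch: usize,
    cursor: usize,
}

impl InMemorySource {
    pub fn new(samples: Vec<f32>, labels: Option<Vec<f32>>, batch: usize) -> Self {
        InMemorySource { samples, labels, batch: batch.max(1), cursor: 0 }
    }

    /// Returns `(batch, labels)`; `labels` is zero-length for unsupervised data.
    /// `None` means the source is exhausted until `reset` is called.
    pub fn next_batch(&mut self) -> Option<(Box<[f32]>, Box<[f32]>)> {
        if self.cursor >= self.samples.len() {
            return None;
        }
        let end = (self.cursor + self.batch).min(self.samples.len());
        let batch: Box<[f32]> = Box::from(&self.samples[self.cursor..end]);
        let labels: Box<[f32]> = match &self.labels {
            Some(l) => Box::from(&l[self.cursor.min(l.len())..end.min(l.len())]),
            None => Vec::new().into_boxed_slice(),
        };
        self.cursor = end;
        Some((batch, labels))
    }

    /// Rewinds the cursor so the next call starts from the first sample.
    pub fn reset(&mut self) {
        self.cursor = 0;
    }
}
```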
**Implementing it.** Pair with `#[derive(bb::DataSource)]`.
**DSL surface.** `bb_ops::placeholders::DataLoader` records under
`ai.bytesandbrains.role.data_source`:
- `next_batch(g) -> (Output, Output)` — `(batch, labels)`.
- `reset(g, trigger) -> Output`
- `on_data_loaded(g) -> Output`
## Part 7 — `bb::Index`
A vector index. Wraps an in-process structure (FAISS, ScaNN), a
database (SQLite + extensions, pgvector), or a custom impl.
```rust
pub trait Index: Send + Sync {
/// Vector storage. Pick the position in the type tree:
/// `[f32]` for an f32-native index, `AnyTensor` for an
/// algorithm-class index that outsources distance math to
/// a bound `Backend`, a custom type for specialized dtypes.
type Vector: ?Sized + bb_ir::types::Storage;
type Error: std::error::Error + std::fmt::Display + Send + Sync + 'static;
fn add(
&mut self,
ctx: &mut RuntimeResourceRef<'_>,
vec: &Self::Vector,
completion: CompletionHandle<u64, Self::Error>,
) -> ContractResponse<u64, Self::Error>;
fn search(
&self,
ctx: &mut RuntimeResourceRef<'_>,
query: &Self::Vector,
k: u32,
completion: CompletionHandle<Vec<(u64, f32)>, Self::Error>,
) -> ContractResponse<Vec<(u64, f32)>, Self::Error>;
fn remove(
&mut self,
ctx: &mut RuntimeResourceRef<'_>,
id: u64,
completion: CompletionHandle<(), Self::Error>,
) -> ContractResponse<(), Self::Error>;
/// Optional training pass. IVF needs centroid k-means; Product
/// Quantization (PQ) needs sub-vector codebook learning; flat
/// and hand-tuned indexes skip it. Default returns
/// `Now(Ok(()))` so impls that do not train pay zero cost.
fn train(
&mut self,
ctx: &mut RuntimeResourceRef<'_>,
samples: &[&Self::Vector],
completion: CompletionHandle<(), Self::Error>,
) -> ContractResponse<(), Self::Error> { ContractResponse::Now(Ok(())) }
}
```
An algorithm-class index (e.g. an HNSW shell that delegates
distance math) declares `#[depends(backend = "<slot>")]` and
reaches the bound backend through `ctx.dependency::<B>("<slot>")`
inside `search`. See `examples/component_with_dependency.rs` for
the worked pattern.
`train(samples)` runs once per index instance ahead of `add`
traffic. IVF impls compute centroids over the sample slice and
keep them as the coarse quantizer; PQ impls learn one codebook
per sub-vector and keep them as the encoder. Authors gate body
`add` / `search` ops on training completion either by recording
the call inside `Module::bootstrap` (the per-component
`is_op_locked` gate parks `add` / `search` ops touching the
bound Index until the bootstrap drains — see
[Part 11](#part-11--bbbootstrap)) or by wiring the returned
trigger into a `bb.barrier`.
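As a reference point for the `add` / `search` / `remove` shapes, here is a brute-force flat index, the kind that skips `train` entirely. It is a sketch without `ctx` or completion handles; `FlatIndex` is a hypothetical type, and squared L2 is an arbitrary choice of distance for the example.

```rust
/// Brute-force flat index over f32 vectors.
#[derive(Default)]
pub struct FlatIndex {
    next_id: u64,
    rows: Vec<(u64, Vec<f32>)>,
}

impl FlatIndex {
    /// Stores a vector and returns its freshly assigned id.
    pub fn add(&mut self, v: &[f32]) -> u64 {
        let id = self.next_id;
        self.next_id += 1;
        self.rows.push((id, v.to_vec()));
        id
    }

    /// Returns up to `k` `(id, squared L2 distance)` pairs, nearest first.
    /// Rows whose dimension differs from the query are skipped.
    pub fn search(&self, query: &[f32], k: u32) -> Vec<(u64, f32)> {
        let mut hits: Vec<(u64, f32)> = self
            .rows
            .iter()
            .filter(|row| row.1.len() == query.len())
            .map(|(id, v)| {
                let d: f32 = v.iter().zip(query).map(|(a, b)| (a - b) * (a - b)).sum();
                (*id, d)
            })
            .collect();
        hits.sort_by(|a, b| a.1.total_cmp(&b.1));
        hits.truncate(k as usize);
        hits
    }

    /// Removes a vector by id and reports whether it was present.
    pub fn remove(&mut self, id: u64) -> bool {
        let before = self.rows.len();
        self.rows.retain(|(row_id, _)| *row_id != id);
        self.rows.len() != before
    }
}
```

An IVF or PQ index would add a `train` step that builds centroids or codebooks before the first `add`; the flat index has nothing to fit, so the default no-op suffices.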
**Implementing it.** Pair with `#[derive(bb::Index)]`.
**DSL surface.** `bb_ops::placeholders::IndexSlot` records under
`ai.bytesandbrains.role.index`:
- `add(g, vec) -> Output`
- `search(g, query, k) -> Output`
- `remove(g, id) -> Output`
- `train(g, samples) -> Output` (`TYPE_TRIGGER`)
## Part 8 — `bb::Model`
An ML model. Forward / backward / optimizer step / parameter
snapshot.
```rust
pub trait Model: Send + Sync {
/// Tensor storage type. One associated type covers
/// input, output, params, grad, and delta.
/// Implement as `[f32]` for flat f32 tensors.
/// Mixed-precision models wire `Codec` nodes around the model
/// rather than multiplying associated types per port.
type Tensor: ?Sized + bb_ir::types::Storage;
type Error: std::error::Error + std::fmt::Display + Send + Sync + 'static;
fn forward(
&mut self,
ctx: &mut RuntimeResourceRef<'_>,
input: &Self::Tensor,
completion: CompletionHandle<Box<Self::Tensor>, Self::Error>,
) -> ContractResponse<Box<Self::Tensor>, Self::Error>;
fn load_parameters(
&mut self,
ctx: &mut RuntimeResourceRef<'_>,
params: &Self::Tensor,
completion: CompletionHandle<(), Self::Error>,
) -> ContractResponse<(), Self::Error>;
fn backward(
&mut self,
ctx: &mut RuntimeResourceRef<'_>,
grad: &Self::Tensor,
completion: CompletionHandle<(), Self::Error>,
) -> ContractResponse<(), Self::Error>;
fn apply_delta(
&mut self,
ctx: &mut RuntimeResourceRef<'_>,
delta: &Self::Tensor,
completion: CompletionHandle<(), Self::Error>,
) -> ContractResponse<(), Self::Error>;
/// Loss is always a framework-fixed `f32` scalar regardless of
/// the tensor element type.
fn compute_loss(
&mut self,
ctx: &mut RuntimeResourceRef<'_>,
input: &Self::Tensor,
target: &Self::Tensor,
completion: CompletionHandle<f32, Self::Error>,
) -> ContractResponse<f32, Self::Error>;
fn params(
&self,
ctx: &mut RuntimeResourceRef<'_>,
completion: CompletionHandle<Box<Self::Tensor>, Self::Error>,
) -> ContractResponse<Box<Self::Tensor>, Self::Error>;
}
```
`params` returns an owned `Box<Self::Tensor>` snapshot — async
serialization needs owned values. A model whose forward pass runs
on a bound `Backend` reaches the backend through
`ctx.dependency::<B>("<slot>")` and composes the per-op surface
inside `forward` / `backward`.
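A toy model shows how one tensor type can cover input, output, params, and delta. The sketch below is synchronous, omits `ctx`, completions, and `backward`, and uses a hypothetical `Linear` type whose forward pass is a single dot product.

```rust
/// Linear model with one weight per feature: forward computes the dot product.
pub struct Linear {
    weights: Vec<f32>,
}

impl Linear {
    pub fn new(dim: usize) -> Self {
        Linear { weights: vec![0.0; dim] }
    }

    fn check(&self, t: &[f32]) -> Result<(), String> {
        if t.len() == self.weights.len() {
            Ok(())
        } else {
            Err(format!("expected {} elements, got {}", self.weights.len(), t.len()))
        }
    }

    /// Replaces the parameters wholesale.
    pub fn load_parameters(&mut self, params: &[f32]) -> Result<(), String> {
        self.check(params)?;
        self.weights.copy_from_slice(params);
        Ok(())
    }

    /// Returns a one-element tensor holding the prediction.
    pub fn forward(&self, input: &[f32]) -> Result<Box<[f32]>, String> {
        self.check(input)?;
        let y: f32 = self.weights.iter().zip(input).map(|(w, x)| w * x).sum();
        Ok(vec![y].into_boxed_slice())
    }

    /// Squared error against a one-element target; always an `f32` scalar.
    pub fn compute_loss(&self, input: &[f32], target: &[f32]) -> Result<f32, String> {
        let y = self.forward(input)?[0];
        let t = *target.first().ok_or("empty target")?;
        Ok((y - t) * (y - t))
    }

    /// Adds `delta` element-wise to the parameters.
    pub fn apply_delta(&mut self, delta: &[f32]) -> Result<(), String> {
        self.check(delta)?;
        for (w, d) in self.weights.iter_mut().zip(delta) {
            *w += d;
        }
        Ok(())
    }

    /// Owned snapshot of the current parameters.
    pub fn params(&self) -> Box<[f32]> {
        self.weights.clone().into_boxed_slice()
    }
}
```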
**Implementing it.** Pair with `#[derive(bb::Model)]`.
**DSL surface.** `bb_ops::placeholders::Model` records under
`ai.bytesandbrains.role.model`:
- `forward(g, input) -> Output`
- `load_parameters(g, params) -> Output`
- `backward(g, grad) -> Output`
- `apply_delta(g, delta) -> Output`
- `compute_loss(g, input, target) -> Output`
- `params(g) -> Output`
## Part 9 — `bb::PeerSelector`
A peer-selection protocol. The framework's gossip overlay provides
one impl; users needing a custom view (constant, weighted,
geographic) write their own.
```rust
pub trait PeerSelector: Send + Sync {
type Error: std::error::Error + std::fmt::Display + Send + Sync + 'static;
fn select(
&mut self,
ctx: &mut RuntimeResourceRef<'_>,
params: SelectParams,
completion: CompletionHandle<Vec<PeerId>, Self::Error>,
) -> ContractResponse<Vec<PeerId>, Self::Error>;
fn sample(
&mut self,
ctx: &mut RuntimeResourceRef<'_>,
n: u32,
completion: CompletionHandle<Vec<PeerId>, Self::Error>,
) -> ContractResponse<Vec<PeerId>, Self::Error> {
self.select(ctx, SelectParams::Random { n }, completion)
}
fn current_view(
&mut self,
ctx: &mut RuntimeResourceRef<'_>,
completion: CompletionHandle<Vec<PeerId>, Self::Error>,
) -> ContractResponse<Vec<PeerId>, Self::Error>;
}
pub enum SelectParams {
Random { n: u32 },
NearKey { key: Vec<u8>, n: u32 },
All,
}
```
Selector impls read `ctx.peers.addresses` to walk the local
`AddressBook`, write through it for membership updates from the
`dispatch_atomic` arm (`Announce` / `Forget`), and reach the
scheduler via `ctx.time` when planning a delayed probe. Declared
dependencies are reached via `ctx.dependency::<T>("<slot>")` —
the same surface every other non-Backend Contract uses.
`SelectParams` is an open enum — new variants are additive. Concrete
impls handle the variants they support and surface
`ContractResponse::Now(Err(_))` for unsupported variants (e.g. a DHT
view handles `NearKey`; a fixed-list view handles only `All`).
`sample` defaults to `select(SelectParams::Random { n })`; impls may
override it for an optimized fast path.
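The variant-handling rule and the `sample` default can be sketched with a fixed-list view. The `Selector` trait below is a hypothetical, synchronous reduction of the Contract (no `ctx`, no completion handle), and `PeerId` is a plain integer for the purposes of the example.

```rust
/// Peer ids are plain integers in this sketch; the real `PeerId` is a framework type.
pub type PeerId = u64;

pub enum SelectParams {
    Random { n: u32 },
    NearKey { key: Vec<u8>, n: u32 },
    All,
}

/// Simplified selector trait with synchronous results.
pub trait Selector {
    fn select(&mut self, params: SelectParams) -> Result<Vec<PeerId>, String>;

    /// Default mirrors the delegation described above; impls may override it.
    fn sample(&mut self, n: u32) -> Result<Vec<PeerId>, String> {
        self.select(SelectParams::Random { n })
    }
}

/// A fixed-list view: supports `All` only and rejects everything else.
pub struct FixedList(pub Vec<PeerId>);

impl Selector for FixedList {
    fn select(&mut self, params: SelectParams) -> Result<Vec<PeerId>, String> {
        match params {
            SelectParams::All => Ok(self.0.clone()),
            SelectParams::Random { .. } => Err("Random is unsupported by FixedList".to_string()),
            SelectParams::NearKey { .. } => Err("NearKey is unsupported by FixedList".to_string()),
        }
    }
}
```

Because `sample` routes through `select`, a view that rejects `Random` also rejects `sample` unless it overrides the default.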
**Implementing it.** Pair with `#[derive(bb::PeerSelector)]`.
**DSL surface.** `bb_ops::placeholders::PeerSelector` records under
`ai.bytesandbrains.role.peer_selector`:
- `sample(g, n) -> Output<PeerId>`
- `current_view(g) -> Output<PeerId>`
The placeholder carries a `class: &'static str` that tags every
emitted `Output<PeerId>` with the peer class it samples from. The
compiler's class-inference pass reads the tag so downstream
`wire.send`s flow to the right destination class — that's how a
gossip self-send partitions correctly. Construct as
`PeerSelector::of_class("gossip_peer")` to retarget; `Default`
selects `bb_ir::peer_class::SELF_CLASS`.
## Part 10 — The Protocol Slot
The `Protocol` role hosts bring-your-own-protocol implementations
(Kademlia, Chord, custom overlays). Unlike the other seven roles,
protocols do not share a fixed verb catalog — each protocol declares
its own atomic opset (`<crate>.<ProtocolName>.atomic v<n>`) and the
op-types in that opset are protocol-specific.
For this reason there is **no `bb::Protocol` Contract trait** and no
`#[derive(bb::Protocol)]`. The user-facing authoring path is the
declarative macro
[`bb::register_protocol!{}`](../bb-derive/src/lib.rs), which writes the
protocol struct's serde impls, `ConcreteComponent` impl,
`AnyComponent` impl, framework-internal `ProtocolRuntime` impl,
`atomic_opset` declaration, `dispatch_atomic` body, and inventory
submission in one block:
```rust
bb::register_protocol! {
struct Kademlia { routing_table: Vec<u64>, k: usize }
domain: "bb-kademlia.kademlia.atomic"
version: 1
ops {
FindNode,
Ping,
}
}
```
The `ProtocolRuntime` impl the macro generates carries the same
pair every other role has: `atomic_opset()` declaring the
protocol's op set, and `dispatch_atomic(op_type, inputs, ctx)`
routing op types to Rust bodies. For inbound envelopes the framework
synthesizes the dispatch inputs from the wire envelope (peer id,
raw payload bytes, correlation handle); for user-graph DSL ops the
inputs come from upstream slot values exactly like any role op.
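The opset-plus-dispatch pairing can be pictured as a string-keyed router. The sketch below is hand-written and hypothetical: it is not the code `register_protocol!{}` emits, it omits `ctx`, and it uses bare `u64` inputs and outputs with an XOR-distance `FindNode` body chosen purely for illustration.

```rust
/// Hypothetical protocol state, matching the fields in the macro example above.
pub struct Kademlia {
    pub routing_table: Vec<u64>,
    pub k: usize,
}

impl Kademlia {
    /// Op types this sketch handles (stand-in for an opset declaration).
    pub fn atomic_opset() -> &'static [&'static str] {
        &["FindNode", "Ping"]
    }

    /// Routes an op type to a Rust body.
    pub fn dispatch_atomic(&mut self, op_type: &str, inputs: &[u64]) -> Result<Vec<u64>, String> {
        match op_type {
            // Return the `k` known ids closest to the target by XOR distance.
            "FindNode" => {
                let target = *inputs.first().ok_or("FindNode needs a target id")?;
                let mut ids = self.routing_table.clone();
                ids.sort_by_key(|id| id ^ target);
                ids.truncate(self.k);
                Ok(ids)
            }
            // Echo the inputs back so the caller can correlate the reply.
            "Ping" => Ok(inputs.to_vec()),
            other => Err(format!("unknown op type `{}`", other)),
        }
    }
}
```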
The DSL placeholder for the slot is
`bb_ops::placeholders::Protocol`. It carries no DSL methods of its
own — protocols surface their per-op DSL methods on the concrete
struct, emitted by `register_protocol!{}` alongside the runtime
bridge. The placeholder exists solely so Modules can declare a
generic Protocol slot the compiler chain binds at compile time.
See [WIRE.md](WIRE.md) for the wire envelope shape and a worked
Gossip-protocol example.
## Part 11 — `bb::Bootstrap`
The optional Component initialization phase. Every Component
(every `#[derive(bb::Concrete)]` type) participates implicitly —
the derive emits a default no-op `impl Bootstrap`
(`bb-derive/src/roles.rs:46-79`,
`bb-runtime/src/contracts/bootstrap.rs:54-67`) so most concretes
need zero boilerplate. Authors **override** when a Component
needs to allocate resources, mmap state, prime a calibration
buffer, or otherwise stage work before any of its other Contract
methods runs.
```rust
pub trait Bootstrap {
type Error: std::error::Error + Send + Sync + 'static;
fn bootstrap(&mut self, _ctx: &mut BootstrapCtx)
-> Result<(), Self::Error>
{
Ok(())
}
}
```
The host fires Component bootstraps explicitly via
`Node::run_bootstrap(BootstrapTarget::Slots(&[slot, ...]))`
(`bb-runtime/src/node/mod.rs`). The engine resolves
`slot → ComponentRef`, allocates a fresh `ExecId`, locks the
`{cref}` touch set on `bootstrap.in_flight`, and invokes the
override through the per-T dispatcher registry the derive
registered. Disjoint Component bootstraps fire concurrently —
the `is_op_locked` gate parks only the touched components, so
body ops on disjoint slots keep firing during a Component
bootstrap.
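The parking rule can be modelled as a set intersection over in-flight touch sets. This is a sketch of the idea only: `BootstrapGate`, `begin`, and `retire` are hypothetical names, and `ExecId` / `ComponentRef` are plain integers here.

```rust
use std::collections::{HashMap, HashSet};

pub type ExecId = u64;
pub type ComponentRef = u32;

/// Hypothetical miniature of the bootstrap bookkeeping: each in-flight
/// bootstrap locks the set of components it touches.
#[derive(Default)]
pub struct BootstrapGate {
    next_exec: ExecId,
    in_flight: HashMap<ExecId, HashSet<ComponentRef>>,
}

impl BootstrapGate {
    /// Starts a bootstrap over `touched` and returns its fresh `ExecId`.
    pub fn begin(&mut self, touched: &[ComponentRef]) -> ExecId {
        let id = self.next_exec;
        self.next_exec += 1;
        self.in_flight.insert(id, touched.iter().copied().collect());
        id
    }

    /// Retires a finished bootstrap, unlocking its components.
    pub fn retire(&mut self, id: ExecId) {
        self.in_flight.remove(&id);
    }

    /// A body op is parked while any in-flight bootstrap touches one of its components.
    pub fn is_op_locked(&self, op_touches: &[ComponentRef]) -> bool {
        self.in_flight
            .values()
            .any(|locked| op_touches.iter().any(|c| locked.contains(c)))
    }
}
```

Ops whose components are disjoint from every in-flight touch set are never parked, which matches the concurrency behaviour described above.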
`DispatchResult::Immediate(_)` retires the in-flight entry
synchronously. `DispatchResult::Async(cmd_id)` parks the body
`ExecId` on `pending_async`; the impl's later
`ctx.complete_command(cmd_id, ...)` drives the drain through
the regular `handle_completion` path.
### When a Concrete should override `Bootstrap`
Override when:
- **Backend pools.** The Backend allocates pinned host buffers,
GPU streams, or a kernel cache before body ops issue tensor
work.
- **Index mmap / file-backed state.** The Index opens its
on-disk store, validates the header, and primes any in-memory
caches before `add` / `search` ops fire.
- **Codec calibration.** A quantization codec pulls a calibration
sample from its bound `DataSource` and computes
`(scale, zero_point)` before `encode` / `decode` ops fire.
- **Protocol Kademlia bootstrap.** A protocol Component contacts
its seed peers to populate the routing table before the body
phase emits `FindNode` traffic.
- **Async one-shot setup.** Any setup that returns
`ContractResponse::Later` so the engine can park the body
phase while the work completes off-thread.
Skip the override when the Component is purely reactive — a
stateless Aggregator, a `Backend` whose tensor pool lazy-allocates
on first kernel call, a `DataSource` that loads from an in-memory
buffer constructed at install. The default no-op runs through
the dispatcher just like any other Contract method; the
`is_op_locked` gate clears immediately so body ops fire as soon
as the host kicks the queue.
**DSL note.** A Component-level `Bootstrap` override is **not**
the same as a `Module::bootstrap` recording. The former runs
Rust code once when the host fires the slot (no DSL recording,
no FunctionProto); the latter records a `__bootstrap`
FunctionProto whose body ops dispatch as normal Contract methods.
Modules that need *graph-expressed* one-shot setup (e.g.
`Index::train(g, samples)`) record it inside `Module::bootstrap`;
Components that need *Rust-expressed* one-shot setup (e.g. mmap
a file) implement the `Bootstrap` Contract. The two paths
coexist — a Module bootstrap that calls `Index::train` dispatches
through the Index's Contract methods, which run only after the
Index's `Bootstrap` override completes (the seed order is
host-driven).
See [ENGINE.md §6.8](ENGINE.md#68-host-driven-bootstrap-entry)
for the engine plumbing,
[CONTRACT_DISPATCH.md](CONTRACT_DISPATCH.md#bootstrap-is-just-another-contract-method)
for the derive bridge, and
[AUTHORING_COMPONENTS.md](AUTHORING_COMPONENTS.md#authoring-a-component-level-bootstrap)
for an authoring walkthrough.
## Cross-references
- [API_DESIGN.md](API_DESIGN.md) — Module → Compiler → Node
three-phase construction.
- [AUTHORING_COMPONENTS.md](AUTHORING_COMPONENTS.md) — long-form
walkthrough of writing a concrete component.
- [COMPILER.md](COMPILER.md) — compilation pipeline (18 passes) that
binds recorded role-op `NodeProto`s to the concrete impls bound
via the compiler chain.
- [CONTRACT_DISPATCH.md](CONTRACT_DISPATCH.md) — Contract-method
dispatch and the `dispatch_atomic` bridge design.
- [IR_AND_DSL.md](IR_AND_DSL.md) — DSL → ONNX `ModelProto`, role
opset catalog, and the per-op IO contracts.
- [WIRE.md](WIRE.md) — wire envelope and protocol authoring.