SamplingCoordinator — coalesces N concurrent
SamplingClient::create_message calls into ⌈N/M⌉ calls within a
configurable time window or batch-size limit.
Per the v0.9.0 design (docs/dev-log/0098-v0.9.0-implementation-plan.md
§4 P4 / Risk #7 / MAJOR 3 batching resolution):
§Why batch?
Each sampling/createMessage call surfaces ONE approval prompt
in the user’s MCP client (Claude Desktop / Claude Code / future
clients). When the daemon-side consolidate-timer fires
solo_storage::triples_batch::run_triples_batch_tick, it
can produce N per-cluster sampling calls in quick succession —
N separate approval prompts spam the user.
SamplingCoordinator collapses N calls within a coalesce window
into ONE coalesced peer.create_message call. The user sees ONE
approval per coalesce window; the per-cluster results are
demultiplexed back to the individual callers via their oneshot
reply channels.
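The demultiplexing step can be sketched as follows, with standard-library channels standing in for the oneshot reply channels. `Pending`, `flush_batch`, and the `call` closure are hypothetical names chosen for illustration, not the crate's API; the real coordinator is presumably async and issues the call through peer.create_message.

```rust
use std::sync::mpsc;

/// Hypothetical pending request: the prompt plus the caller's reply channel.
struct Pending {
    prompt: String,
    reply: mpsc::Sender<String>,
}

/// Issues ONE coalesced call for the whole batch, then routes each result
/// back to the caller that submitted the matching request.
/// `call` stands in for the single wire-level request (an assumption).
/// Returns how many callers received a reply.
fn flush_batch<F>(batch: Vec<Pending>, call: F) -> usize
where
    F: FnOnce(&[String]) -> Vec<String>,
{
    let prompts: Vec<String> = batch.iter().map(|p| p.prompt.clone()).collect();
    let results = call(&prompts);

    let mut delivered = 0;
    // Results are assumed to come back in submission order.
    for (pending, result) in batch.into_iter().zip(results) {
        // A send error means the caller dropped its receiver; skip it.
        if pending.reply.send(result).is_ok() {
            delivered += 1;
        }
    }
    delivered
}
```

In this sketch a caller whose result is missing (the response list is shorter than the batch) simply sees its channel close; how the real coordinator reports that case is not covered here.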
§When NOT to batch
The coordinator is bypassed for non-sampling backends (Anthropic /
Ollama / None) — those don’t surface approval prompts and have
their own rate-limiting concerns. The coordinator inserts itself
ONLY when wrapping [PeerSamplingClient] / a fake equivalent.
§Coalesce strategy
- Single-request batch: passes through as a normal
  create_message call, with no prompt rewriting. Zero behaviour
  change from the v0.9.0 P2 path.
- Multi-request batch (N > 1): wraps each request in a
  numbered JSON object, asks the LLM for a JSON array of
  responses, parses the array, demultiplexes per-task. The
  prompt template is documented in [build_coalesced_request].
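A rough sketch of the two paths, using only the standard library. The wording of the instruction line, the `id` / `prompt` field names, and the helper names `json_escape` and `render_batch` are all assumptions made for illustration; the actual template is the one documented in [build_coalesced_request].

```rust
/// Minimal JSON string escaping for the sketch
/// (quotes, backslashes, and control characters).
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Hypothetical renderer: a single request passes through unchanged;
/// several requests become numbered JSON objects plus an instruction
/// asking for a JSON array of responses.
fn render_batch(prompts: &[&str]) -> String {
    if let [only] = prompts {
        // Single-request batch: no prompt rewriting.
        return only.to_string();
    }
    let items: Vec<String> = prompts
        .iter()
        .enumerate()
        .map(|(i, p)| format!("{{\"id\":{},\"prompt\":\"{}\"}}", i + 1, json_escape(p)))
        .collect();
    format!(
        "Answer each task. Reply with a JSON array of {} strings, in id order.\n[{}]",
        prompts.len(),
        items.join(",")
    )
}
```

Numbering the tasks gives the parser a stable key for matching array entries back to callers, which is what makes the per-task demultiplexing possible.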
§Privacy invariant
The audit emit per logical request (one
AuditOperation::LlmSamplingCall row per submitted
[SamplingLlmClient::complete]) STAYS — the coordinator is an
optimisation on the wire, NOT a change to the audit shape. See
plan §11 Risk #8 — operators MUST be able to count per-logical-call
audit rows, not per-coalesce.
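The invariant can be illustrated with a small counting sketch. `Counters` and `submit_all` are hypothetical stand-ins, not the crate's types: audit rows track logical requests, while wire calls track coalesced batches, and this sketch assumes all requests arrive within one coalesce window.

```rust
/// Counters for the sketch: audit rows vs. wire-level calls.
#[derive(Default, Debug, PartialEq)]
struct Counters {
    audit_rows: usize,
    wire_calls: usize,
}

/// Simulates submitting `requests` through a coordinator that batches
/// at most `max_batch` logical requests per coalesced call.
fn submit_all(requests: &[&str], max_batch: usize) -> Counters {
    assert!(max_batch > 0, "max_batch must be non-zero");
    let mut counters = Counters::default();
    for chunk in requests.chunks(max_batch) {
        // One coalesced call on the wire per batch...
        counters.wire_calls += 1;
        // ...but one audit row per logical request inside it.
        for _request in chunk {
            counters.audit_rows += 1;
        }
    }
    counters
}
```

With 25 requests and a batch limit of 10, this yields 3 wire calls (⌈25/10⌉) and 25 audit rows, so an operator counting audit rows still sees every logical call.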
Structs§
- SamplingCoordinator - Wrapper around a SamplingClient that coalesces concurrent create_message calls into batched create_message calls (within a configurable time window OR batch-size limit).
Constants§
- DEFAULT_COALESCE_MAX_BATCH - Default max-batch size: 10 logical requests per coalesced create_message. Plan §4 P4c default; caps the rendered prompt size and prevents one slow batch from holding the worker indefinitely.
- DEFAULT_COALESCE_WINDOW - Default coalesce window: 5 seconds. Plan §4 P4c default, chosen so the user’s approval-prompt latency stays under typical MCP-session “I’m doing work” tolerance.
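How the two defaults might interact can be sketched as a flush predicate. The constant values come from the descriptions above, but their types (`usize` and `Duration`) and the `should_flush` helper are assumptions for illustration, not the crate's definitions.

```rust
use std::time::Duration;

// Values from the documented defaults; the types are assumed.
const DEFAULT_COALESCE_MAX_BATCH: usize = 10;
const DEFAULT_COALESCE_WINDOW: Duration = Duration::from_secs(5);

/// Hypothetical flush rule: send the pending batch once it is full OR
/// once the coalesce window has elapsed, whichever comes first.
/// An empty queue never flushes.
fn should_flush(pending: usize, elapsed: Duration) -> bool {
    pending > 0
        && (pending >= DEFAULT_COALESCE_MAX_BATCH || elapsed >= DEFAULT_COALESCE_WINDOW)
}
```

Under this rule a burst of 10 requests goes out immediately, while a lone request waits at most the 5-second window before being sent as a single-request batch.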