pub struct BcpEncoder { /* private fields */ }
BCP encoder — constructs a binary payload from structured blocks.
The encoder is the tool-facing API that allows agents, MCP servers,
and other producers to build BCP payloads. It follows the builder
pattern defined in RFC §5.6: methods like add_code,
add_conversation, etc. append typed blocks
to an internal list, and chainable modifiers like
with_summary and
with_priority annotate the most recently
added block.
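The "add a block, then annotate the most recent one" shape described above can be sketched with a toy builder. This is only an illustration of the pattern, not the crate's implementation: `SketchEncoder`, `SketchBlock`, `SketchError`, and `add_block` are hypothetical names introduced for this example.

```rust
// Toy model of the builder pattern: adders append, modifiers touch the last block.
// All type and method names here are hypothetical, not the bcp_encoder API.

#[derive(Debug, PartialEq)]
enum SketchError {
    /// Mirrors the documented "no block to annotate" condition.
    NoBlockTarget,
}

#[derive(Debug, Default)]
struct SketchBlock {
    body: Vec<u8>,
    summary: Option<String>,
}

#[derive(Debug, Default)]
struct SketchEncoder {
    blocks: Vec<SketchBlock>,
}

impl SketchEncoder {
    fn new() -> Self {
        Self::default()
    }

    /// Appending is infallible, so the adder returns `&mut Self` directly.
    fn add_block(&mut self, body: &[u8]) -> &mut Self {
        self.blocks.push(SketchBlock { body: body.to_vec(), summary: None });
        self
    }

    /// Annotates the most recently added block; fails if there is none.
    fn with_summary(&mut self, summary: &str) -> Result<&mut Self, SketchError> {
        let last = self.blocks.last_mut().ok_or(SketchError::NoBlockTarget)?;
        last.summary = Some(summary.to_owned());
        Ok(self)
    }
}

fn main() -> Result<(), SketchError> {
    let mut enc = SketchEncoder::new();
    enc.add_block(b"fn main() {}")
        .with_summary("Entry point.")?
        .add_block(b"second block");

    assert_eq!(enc.blocks.len(), 2);
    assert_eq!(enc.blocks[0].body.len(), 12);
    assert_eq!(enc.blocks[0].summary.as_deref(), Some("Entry point."));
    assert!(enc.blocks[1].summary.is_none());

    // A modifier with no preceding block is rejected.
    assert_eq!(
        SketchEncoder::new().with_summary("x").err(),
        Some(SketchError::NoBlockTarget)
    );
    Ok(())
}
```

Returning `Result<&mut Self, _>` from modifiers is what lets `?` sit in the middle of a chain while the infallible adders stay chainable without it.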
§Compression (RFC §4.6)
Two compression modes are supported, both opt-in:
- Per-block: call with_compression after adding a block, or compress_blocks to enable compression for all blocks. Each block body is independently zstd-compressed if it exceeds COMPRESSION_THRESHOLD bytes and compression yields a size reduction. The block’s COMPRESSED flag (bit 1) is set when compression is applied.
- Whole-payload: call compress_payload to zstd-compress all bytes after the 8-byte header. When enabled, per-block compression is skipped (whole-payload subsumes it). The header’s COMPRESSED flag (bit 0) is set.
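The per-block rule (compress only above a threshold, and only when it saves bytes) can be sketched as follows. This is a dependency-free illustration under stated assumptions: `SKETCH_THRESHOLD` is a made-up value because the real `COMPRESSION_THRESHOLD` is not given here, and a simple run-length encoder stands in for zstd.

```rust
/// Hypothetical threshold; the real COMPRESSION_THRESHOLD value is not stated in these docs.
const SKETCH_THRESHOLD: usize = 64;
/// Block flag bit 1, as described in the text above.
const FLAG_COMPRESSED: u8 = 1 << 1;

/// Stand-in compressor (run-length encoding). NOT zstd; used only to avoid dependencies.
fn rle_compress(input: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < input.len() {
        let byte = input[i];
        let mut run = 1usize;
        while i + run < input.len() && input[i + run] == byte && run < 255 {
            run += 1;
        }
        out.push(run as u8);
        out.push(byte);
        i += run;
    }
    out
}

/// Returns the body to write and the flags to set for one block.
fn maybe_compress(body: &[u8], flags: u8) -> (Vec<u8>, u8) {
    // Guard 1: bodies at or below the threshold are left alone.
    if body.len() <= SKETCH_THRESHOLD {
        return (body.to_vec(), flags);
    }
    let compressed = rle_compress(body);
    // Guard 2: keep the compressed form only if it is strictly smaller.
    if compressed.len() < body.len() {
        (compressed, flags | FLAG_COMPRESSED)
    } else {
        (body.to_vec(), flags)
    }
}

fn main() {
    // Large and repetitive: compressed, flag set.
    let (body, flags) = maybe_compress(&[b'a'; 200], 0);
    assert_eq!((body.len(), flags), (2, FLAG_COMPRESSED));

    // Below the threshold: untouched.
    let (body, flags) = maybe_compress(b"tiny", 0);
    assert_eq!((body.as_slice(), flags), (&b"tiny"[..], 0));

    // Large but incompressible for this stand-in: stored as-is, flag not set.
    let noisy: Vec<u8> = (0u8..200).collect();
    let (body, flags) = maybe_compress(&noisy, 0);
    assert_eq!((body.len(), flags), (200, 0));
}
```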
§Content Addressing (RFC §4.7)
When a ContentStore is configured via
set_content_store, blocks can be stored
by their BLAKE3 hash rather than inline:
- Per-block: call with_content_addressing after adding a block. The body is hashed, stored in the content store, and replaced with the 32-byte hash on the wire. The block’s IS_REFERENCE flag (bit 2) is set.
- Auto-dedup: call auto_dedup to automatically content-address any block whose body has been seen before. First occurrence is stored inline and registered in the store; subsequent identical blocks become references.
Content addressing runs before compression — a 32-byte hash reference is always below the compression threshold, so reference blocks are never compressed.
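The auto-dedup rule can be sketched with an in-memory map. Everything here is an assumption made for illustration: `MemoryStore` and `dedup_block` are hypothetical, the real `ContentStore` trait is used through `put`/`get` on a shared `Arc`, and `stand_in_hash` is a placeholder digest built from the standard library, not BLAKE3.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Block flag bit 2, as described in the text above.
const FLAG_IS_REFERENCE: u8 = 1 << 2;

/// Placeholder 32-byte digest. NOT BLAKE3 and not collision-resistant;
/// it only keeps this sketch free of external crates.
fn stand_in_hash(body: &[u8]) -> [u8; 32] {
    let mut out = [0u8; 32];
    for (lane, chunk) in out.chunks_mut(8).enumerate() {
        let mut hasher = DefaultHasher::new();
        (lane as u64).hash(&mut hasher);
        body.hash(&mut hasher);
        chunk.copy_from_slice(&hasher.finish().to_le_bytes());
    }
    out
}

/// Hypothetical in-memory store keyed by digest.
#[derive(Default)]
struct MemoryStore {
    blobs: HashMap<[u8; 32], Vec<u8>>,
}

/// First occurrence stays inline and is registered; repeats become 32-byte references.
fn dedup_block(store: &mut MemoryStore, body: &[u8]) -> (Vec<u8>, u8) {
    let hash = stand_in_hash(body);
    if store.blobs.contains_key(&hash) {
        (hash.to_vec(), FLAG_IS_REFERENCE)
    } else {
        store.blobs.insert(hash, body.to_vec());
        (body.to_vec(), 0)
    }
}

fn main() {
    let mut store = MemoryStore::default();

    let (first, first_flags) = dedup_block(&mut store, b"same body");
    assert_eq!(first, b"same body".to_vec());
    assert_eq!(first_flags, 0);

    let (second, second_flags) = dedup_block(&mut store, b"same body");
    assert_eq!(second.len(), 32);
    assert_eq!(second_flags, FLAG_IS_REFERENCE);

    // A decoder holding the same store could resolve the reference back to the body.
    let key = stand_in_hash(b"same body");
    assert_eq!(store.blobs.get(&key), Some(&b"same body".to_vec()));
}
```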
§Usage
use bcp_encoder::BcpEncoder;
use bcp_types::enums::{Lang, Role, Status, Priority};

let payload = BcpEncoder::new()
    .add_code(Lang::Rust, "src/main.rs", b"fn main() {}")
    .with_summary("Entry point: CLI setup and server startup.")?
    .with_priority(Priority::High)?
    .add_conversation(Role::User, b"Fix the timeout bug.")
    .add_conversation(Role::Assistant, b"I'll examine the pool config...")
    .add_tool_result("ripgrep", Status::Ok, b"3 matches found.")
    .encode()?;
§Output layout
The .encode() method serializes all accumulated blocks into a
self-contained byte sequence:
┌──────────────┬──────────────────────────────────────────┐
│ [8 bytes]    │ File header (magic, version, flags, rsv) │
│ [N bytes]    │ Block 0 frame (type + flags + len + body)│
│ [N bytes]    │ Block 1 frame ...                        │
│ ...          │                                          │
│ [2-3 bytes]  │ END sentinel (type=0xFF, flags=0, len=0) │
└──────────────┴──────────────────────────────────────────┘
When whole-payload compression is enabled, the layout becomes:
┌──────────────┬──────────────────────────────────────────┐
│ [8 bytes]    │ Header (flags bit 0 = COMPRESSED)        │
│ [N bytes]    │ zstd(Block 0 + Block 1 + ... + END)      │
└──────────────┴──────────────────────────────────────────┘
The payload is ready for storage or transmission; no further framing is required.
§Implementations
impl BcpEncoder
pub fn new() -> Self
Create a new encoder with default settings (version 1.0, no flags).
The encoder starts with an empty block list, no compression, and
no content store. At least one block must be added before calling
.encode(), otherwise it returns EncodeError::EmptyPayload.
pub fn add_code(&mut self, lang: Lang, path: &str, content: &[u8]) -> &mut Self
Add a CODE block.
Encodes a source code file or fragment. The lang enum identifies
the programming language (used by the decoder for syntax-aware
rendering), path is the file path (UTF-8), and content is the
raw source bytes.
For partial files, use add_code_range
to include line range metadata.
pub fn add_code_range(
    &mut self,
    lang: Lang,
    path: &str,
    content: &[u8],
    line_start: u32,
    line_end: u32,
) -> &mut Self
Add a CODE block with a line range.
Same as add_code but includes line_start and
line_end metadata (1-based, inclusive). The decoder can use this
to display line numbers or to correlate with diagnostics.
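A caller will often need to cut the fragment out of a larger file before passing it along with its range. The helper below is a hypothetical sketch of that step using the 1-based, inclusive convention stated above; `slice_lines` is not part of the crate, and whether the encoder expects pre-sliced content is an assumption.

```rust
/// Hypothetical helper: returns lines `line_start..=line_end` (1-based, inclusive),
/// or `None` if the range is empty, starts at 0, or runs past the end of the source.
fn slice_lines(source: &str, line_start: u32, line_end: u32) -> Option<String> {
    if line_start == 0 || line_end < line_start {
        return None;
    }
    let lines: Vec<&str> = source.lines().collect();
    let start = (line_start - 1) as usize; // convert to a 0-based index
    let end = line_end as usize; // exclusive upper bound for the slice
    if end > lines.len() {
        return None;
    }
    Some(lines[start..end].join("\n"))
}

fn main() {
    let source = "a\nb\nc\nd";
    assert_eq!(slice_lines(source, 2, 3), Some("b\nc".to_string()));
    assert_eq!(slice_lines(source, 1, 1), Some("a".to_string()));
    assert_eq!(slice_lines(source, 0, 1), None);
    assert_eq!(slice_lines(source, 3, 9), None);
}
```

The resulting string's bytes, together with the same `line_start` and `line_end`, would then be the natural arguments for a range-carrying CODE block.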
pub fn add_conversation(&mut self, role: Role, content: &[u8]) -> &mut Self
Add a CONVERSATION block.
Represents a single chat turn. The role identifies the speaker
(system, user, assistant, or tool) and content is the message
body as raw bytes.
pub fn add_conversation_tool(
    &mut self,
    role: Role,
    content: &[u8],
    tool_call_id: &str,
) -> &mut Self
Add a CONVERSATION block with a tool call ID.
Used for tool-role messages that reference a specific tool
invocation. The tool_call_id links this response back to the
tool call that produced it.
pub fn add_file_tree(
    &mut self,
    root: &str,
    entries: Vec<FileEntry>,
) -> &mut Self
Add a FILE_TREE block.
Represents a directory structure rooted at root. Each entry
contains a name, kind (file or directory), size, and optional
nested children for recursive directory trees.
pub fn add_tool_result(
    &mut self,
    name: &str,
    status: Status,
    content: &[u8],
) -> &mut Self
Add a TOOL_RESULT block.
Captures the output of an external tool invocation (e.g. ripgrep,
LSP diagnostics, test runner). The status indicates whether the
tool succeeded, failed, or timed out.
pub fn add_document(
    &mut self,
    title: &str,
    content: &[u8],
    format_hint: FormatHint,
) -> &mut Self
Add a DOCUMENT block.
Represents prose content — README files, documentation, wiki pages.
The format_hint tells the decoder how to render the body
(markdown, plain text, or HTML).
pub fn add_structured_data(
    &mut self,
    format: DataFormat,
    content: &[u8],
) -> &mut Self
Add a STRUCTURED_DATA block.
Encodes tabular or structured content — JSON configs, YAML
manifests, TOML files, CSV data. The format identifies the
serialization format so the decoder can syntax-highlight or
parse appropriately.
pub fn add_diff(&mut self, path: &str, hunks: Vec<DiffHunk>) -> &mut Self
Add a DIFF block.
Represents code changes for a single file — from git diffs, editor changes, or patch files. Each hunk captures a contiguous range of modifications in unified diff format.
pub fn add_annotation(
    &mut self,
    target_block_id: u32,
    kind: AnnotationKind,
    value: &[u8],
) -> &mut Self
Add an ANNOTATION block.
Annotations are metadata overlays that target another block by its
zero-based index in the stream. The kind determines how the
value payload is interpreted (priority level, summary text, or
tag label).
For the common case of attaching a priority to the most recent
block, prefer with_priority.
pub fn add_embedding_ref(
    &mut self,
    vector_id: &[u8],
    source_hash: &[u8],
    model: &str,
) -> &mut Self
Add an EMBEDDING_REF block.
Points to a pre-computed vector embedding stored externally (e.g.
in a vector database). The vector_id is an opaque byte identifier
for the vector in the external store, source_hash is the BLAKE3
hash of the content that was embedded (32 bytes), and model is
the name of the embedding model (e.g. "text-embedding-3-small").
§Wire type
Block type 0x09 (EMBEDDING_REF). See RFC §4.4.
pub fn add_image(
    &mut self,
    media_type: MediaType,
    alt_text: &str,
    data: &[u8],
) -> &mut Self
Add an IMAGE block.
Encodes an image as inline binary data. The media_type identifies
the image format (PNG, JPEG, etc.), alt_text provides a textual
description for accessibility, and data is the raw image bytes.
pub fn add_extension(
    &mut self,
    namespace: &str,
    type_name: &str,
    content: &[u8],
) -> &mut Self
Add an EXTENSION block.
User-defined block type for custom payloads. The namespace and
type_name together form a unique identifier for the extension
type, preventing collisions across different tools and vendors.
pub fn with_summary(&mut self, summary: &str) -> Result<&mut Self, EncodeError>
Attach a summary to the most recently added block.
Sets the HAS_SUMMARY flag on the block and prepends the summary
sub-block to the body during serialization. The summary is a
compact UTF-8 description that the token budget engine can use as
a stand-in when the full block content would exceed the budget.
§Errors
Returns EncodeError::NoBlockTarget if no blocks have been
added yet. Use this immediately after an .add_*() call.
pub fn with_priority(
    &mut self,
    priority: Priority,
) -> Result<&mut Self, EncodeError>
Attach a priority annotation to the most recently added block.
This is a convenience method that appends an ANNOTATION block
with kind=Priority targeting the last added block’s index.
The annotation’s value is the priority byte (e.g. 0x02 for
Priority::High).
§Errors
Returns EncodeError::NoBlockTarget if no blocks have been
added yet.
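The "append an annotation that targets the last block's index" behaviour can be sketched on a plain list. This is a hypothetical toy model, not the crate's internals: `SketchBlock`, `NoBlockTarget`, and the free function `with_priority` are invented for the example, and how the real encoder picks the target when the last entry is itself an annotation is not specified here (the sketch simply uses the final index).

```rust
// Hypothetical toy model of blocks; not the crate's internal representation.
#[derive(Debug, PartialEq)]
enum SketchBlock {
    Content(Vec<u8>),
    /// Priority annotation pointing at another block by zero-based index.
    PriorityAnnotation { target_block_id: u32, value: u8 },
}

#[derive(Debug, PartialEq)]
struct NoBlockTarget;

/// Appends an annotation that targets the block currently at the end of the list.
fn with_priority(blocks: &mut Vec<SketchBlock>, value: u8) -> Result<(), NoBlockTarget> {
    // `checked_sub` turns the empty-list case into an error instead of an underflow.
    let last_index = blocks.len().checked_sub(1).ok_or(NoBlockTarget)?;
    blocks.push(SketchBlock::PriorityAnnotation {
        target_block_id: last_index as u32,
        value,
    });
    Ok(())
}

fn main() {
    let mut blocks = vec![
        SketchBlock::Content(b"first".to_vec()),
        SketchBlock::Content(b"second".to_vec()),
    ];
    // 0x02 is the byte the text above gives for Priority::High.
    with_priority(&mut blocks, 0x02).unwrap();
    assert_eq!(
        blocks[2],
        SketchBlock::PriorityAnnotation { target_block_id: 1, value: 0x02 }
    );

    // With no blocks there is nothing to target.
    assert_eq!(with_priority(&mut Vec::new(), 0x02), Err(NoBlockTarget));
}
```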
pub fn with_compression(&mut self) -> Result<&mut Self, EncodeError>
Enable zstd compression for the most recently added block.
During .encode(), the block body is compressed with zstd if it
exceeds COMPRESSION_THRESHOLD bytes and compression yields a
size reduction. If compression doesn’t help (output >= input), the
body is stored uncompressed and the COMPRESSED flag is not set.
Has no effect if compress_payload is
also enabled — whole-payload compression takes precedence.
§Errors
Returns EncodeError::NoBlockTarget if no blocks have been
added yet.
pub fn compress_blocks(&mut self) -> &mut Self
Enable zstd compression for all blocks added so far and all future blocks.
Equivalent to calling with_compression
on every block. Individual blocks still respect the size threshold
and no-savings guard.
pub fn compress_payload(&mut self) -> &mut Self
Enable whole-payload zstd compression.
When set, the entire block stream (all frames + END sentinel) is
compressed as a single zstd frame. The 8-byte header is written
uncompressed with HeaderFlags::COMPRESSED set so the decoder
can detect compression before reading further.
When whole-payload compression is enabled, per-block compression is skipped — compressing within a compressed stream adds overhead without benefit.
If compression doesn’t reduce the total size, the payload is stored uncompressed and the header flag is not set.
Tradeoff: Whole-payload compression disables incremental
streaming in StreamingDecoder — the decoder must buffer and
decompress the entire payload before yielding any blocks. If
streaming is important, use compress_blocks
instead.
pub fn set_content_store(&mut self, store: Arc<dyn ContentStore>) -> &mut Self
Set the content store used for BLAKE3 content addressing.
The store is shared via Arc so the same store can be passed to
both the encoder and decoder for roundtrip workflows. The encoder
calls store.put() for each content-addressed block; the decoder
calls store.get() to resolve references.
Must be called before .encode() if any block has content
addressing enabled or if auto_dedup is set.
pub fn with_content_addressing(&mut self) -> Result<&mut Self, EncodeError>
Enable content addressing for the most recently added block.
During .encode(), the block body is hashed with BLAKE3,
stored in the content store, and replaced with the 32-byte hash
on the wire. The block’s IS_REFERENCE flag (bit 2) is set.
Requires a content store — call
set_content_store before .encode().
Content addressing runs before compression. Since a 32-byte
hash reference is always below COMPRESSION_THRESHOLD,
reference blocks are never per-block compressed.
§Errors
Returns EncodeError::NoBlockTarget if no blocks have been
added yet.
pub fn auto_dedup(&mut self) -> &mut Self
Enable automatic deduplication across all blocks.
When set, the encoder hashes every block body with BLAKE3 during
.encode(). If the hash already exists in the content store
(i.e. a previous block in this or a prior encoding had the same
content), the block is automatically replaced with a hash
reference. First-occurrence blocks are stored inline and
registered in the store for future dedup.
Requires a content store — call
set_content_store before .encode().
pub fn encode(&self) -> Result<Vec<u8>, EncodeError>
Serialize all accumulated blocks into a complete BCP payload.
The encode pipeline processes each PendingBlock through up to
three stages:
- Serialize: calls BlockContent::encode_body to get the TLV-encoded body bytes. If a summary is present, it is prepended and the HAS_SUMMARY flag is set.
- Content address (optional): if the block has content_address = true or auto-dedup detects a duplicate, the body is hashed with BLAKE3, stored in the content store, and replaced with the 32-byte hash. The IS_REFERENCE flag (bit 2) is set.
- Per-block compress (optional): if compression is enabled for this block, whole-payload compression is NOT active, and the body is not a reference, the body is zstd-compressed if it exceeds COMPRESSION_THRESHOLD and compression yields savings. The COMPRESSED flag (bit 1) is set.
After all blocks, the END sentinel is appended. If whole-payload
compression is enabled, everything after the 8-byte header is
compressed as a single zstd frame and the header’s COMPRESSED
flag is set.
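The ordering of the optional stages, and the conditions under which per-block compression is skipped, can be sketched as a single function. This is a simplified illustration, not the crate's pipeline: `BlockOptions`, `process_block`, `fake_hash`, and `fake_compress` are hypothetical, `SKETCH_THRESHOLD` is a made-up value, the serialize stage and the store write are omitted, and the hash and compressor are trivial stand-ins for BLAKE3 and zstd.

```rust
const FLAG_COMPRESSED: u8 = 1 << 1; // block flag bit 1
const FLAG_IS_REFERENCE: u8 = 1 << 2; // block flag bit 2
/// Hypothetical threshold; the real COMPRESSION_THRESHOLD value is not stated here.
const SKETCH_THRESHOLD: usize = 64;

/// Hypothetical per-block switches.
struct BlockOptions {
    content_address: bool,
    compress: bool,
}

/// Stand-in for BLAKE3: returns a fixed 32-byte value.
fn fake_hash(_body: &[u8]) -> [u8; 32] {
    [7u8; 32]
}

/// Stand-in for zstd: keeps the first half of the input.
fn fake_compress(body: &[u8]) -> Vec<u8> {
    body[..body.len() / 2].to_vec()
}

/// Applies the content-address and per-block-compress stages to an already serialized body.
fn process_block(
    mut body: Vec<u8>,
    opts: &BlockOptions,
    whole_payload: bool,
    hash: impl Fn(&[u8]) -> [u8; 32],
    compress: impl Fn(&[u8]) -> Vec<u8>,
) -> (Vec<u8>, u8) {
    let mut flags = 0u8;

    // Content addressing runs first; a real encoder would also store the body here.
    if opts.content_address {
        body = hash(&body).to_vec();
        flags |= FLAG_IS_REFERENCE;
    }

    // Per-block compression is skipped for references and under whole-payload mode.
    let is_reference = (flags & FLAG_IS_REFERENCE) != 0;
    if opts.compress && !whole_payload && !is_reference && body.len() > SKETCH_THRESHOLD {
        let packed = compress(&body);
        if packed.len() < body.len() {
            body = packed;
            flags |= FLAG_COMPRESSED;
        }
    }
    (body, flags)
}

fn main() {
    let both = BlockOptions { content_address: true, compress: true };
    let (body, flags) = process_block(vec![0; 100], &both, false, fake_hash, fake_compress);
    assert_eq!((body.len(), flags), (32, FLAG_IS_REFERENCE));

    let only_compress = BlockOptions { content_address: false, compress: true };
    let (body, flags) = process_block(vec![0; 100], &only_compress, false, fake_hash, fake_compress);
    assert_eq!((body.len(), flags), (50, FLAG_COMPRESSED));

    // Whole-payload mode leaves individual blocks untouched.
    let (body, flags) = process_block(vec![0; 100], &only_compress, true, fake_hash, fake_compress);
    assert_eq!((body.len(), flags), (100, 0));
}
```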
§Errors
- EncodeError::EmptyPayload if no blocks have been added.
- EncodeError::BlockTooLarge if any block body exceeds 16 MiB.
- EncodeError::MissingContentStore if content addressing is requested but no store has been configured.
- EncodeError::Wire if the underlying wire serialization fails.
- EncodeError::Io if writing to the output buffer fails.