Translator — cross-vocab token-stream pipe.
Take Agent A’s token IDs in vocab V_A, produce Agent B’s token IDs
in vocab V_B, with no text ever leaving the process. Internally:
ids_A → Detokenizer(V_A) → utf8 → BPETokenizer(V_B) → ids_B
The text intermediate is purely local; agent-to-agent traffic still
carries only token IDs on the wire. Mirrors the TS Translator class
from @codecai/web and the Python Translator from codecai — same
word-boundary buffering rules.
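The detokenize-then-retokenize pipeline can be sketched with toy vocabularies. Everything in this block is hypothetical: `detokenize`, `tokenize_greedy`, and `translate` are illustrative names rather than this crate's API, and greedy longest-match stands in for real BPE merging so the example stays self-contained.

```rust
/// Render token IDs to text by concatenating each ID's vocab entry.
/// Returns None if an ID is out of range for the vocab.
fn detokenize(vocab: &[&str], ids: &[u32]) -> Option<String> {
    let mut text = String::new();
    for &id in ids {
        text.push_str(vocab.get(id as usize)?);
    }
    Some(text)
}

/// Toy stand-in for a BPE tokenizer (assumption: not real BPE).
/// At each position, take the longest vocab entry that matches.
/// Returns None if no entry matches at some position.
fn tokenize_greedy(vocab: &[&str], text: &str) -> Option<Vec<u32>> {
    let mut ids = Vec::new();
    let mut rest = text;
    while !rest.is_empty() {
        let (id, piece) = vocab
            .iter()
            .enumerate()
            .filter(|(_, p)| !p.is_empty() && rest.starts_with(**p))
            .max_by_key(|(_, p)| p.len())?;
        ids.push(id as u32);
        rest = &rest[piece.len()..];
    }
    Some(ids)
}

/// ids_A -> text -> ids_B. The intermediate String never leaves this function.
fn translate(vocab_a: &[&str], vocab_b: &[&str], ids_a: &[u32]) -> Option<Vec<u32>> {
    let text = detokenize(vocab_a, ids_a)?;
    tokenize_greedy(vocab_b, &text)
}
```

The two vocabularies split the same text at different points, which is why the ID sequences cannot be mapped one-to-one without going through text.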
Streaming caveat: BPE merges depend on context, so tokenizing a
partial word mid-stream can produce different IDs than tokenizing
the complete word. The Translator buffers text until a safe boundary
(whitespace) before flushing it through BPE. Pass partial=true for
incoming chunks and partial=false (or call Translator::finish)
on the last chunk so the buffer drains.
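The buffering rule can be illustrated with a small stand-alone sketch. `BoundaryBuffer` and its `push` method are hypothetical and not this crate's API. The sketch also assumes that the split happens just before the last whitespace character, so that a leading space stays attached to the word that follows it; the real Translator may choose its boundary differently.

```rust
/// Hypothetical helper: holds back text that is not yet safe to tokenize.
struct BoundaryBuffer {
    pending: String,
}

impl BoundaryBuffer {
    fn new() -> Self {
        Self { pending: String::new() }
    }

    /// Append a chunk and return the text that is safe to tokenize now.
    /// With `partial = true`, everything from the last whitespace onward
    /// is held back, because the trailing word may still grow.
    /// With `partial = false`, the whole buffer is drained.
    fn push(&mut self, chunk: &str, partial: bool) -> String {
        self.pending.push_str(chunk);
        if !partial {
            return std::mem::take(&mut self.pending);
        }
        match self.pending.rfind(char::is_whitespace) {
            Some(idx) => {
                // `tail` starts at the last whitespace; keep it for later.
                let tail = self.pending.split_off(idx);
                std::mem::replace(&mut self.pending, tail)
            }
            // No boundary seen yet: nothing is safe to flush.
            None => String::new(),
        }
    }
}
```

Feeding the chunks "hel", "lo wor", and "ld" (the last with `partial = false`) yields "", "hello", and " world": each word reaches the tokenizer whole, regardless of where the chunk boundaries fell.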
Structs§
- Translator
- Cross-vocab agent-handoff pipe.
Functions§
- static_translation_table
- Build a static V_A → V_B[] translation table by rendering each V_A vocab entry to text and re-tokenizing through V_B.
- translate_one_shot
- One-shot translator for non-streaming uses where all IDs are in hand.
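A static V_A → V_B[] table of the kind described for static_translation_table could be built roughly as follows. This is a sketch under stated assumptions, not the crate's implementation: `build_static_table` and `translate_with_table` are hypothetical names, the V_A vocab is modeled as a slice of strings indexed by token ID, and the V_B tokenizer is passed in as a closure.

```rust
/// Hypothetical sketch: `table[a]` holds the V_B IDs obtained by
/// rendering V_A entry `a` to text and encoding that text with V_B.
fn build_static_table<F>(vocab_a: &[&str], encode_b: F) -> Vec<Vec<u32>>
where
    F: Fn(&str) -> Vec<u32>,
{
    vocab_a.iter().map(|&piece| encode_b(piece)).collect()
}

/// Translate by table lookup, one V_A token at a time.
/// Returns None if an ID has no table entry.
fn translate_with_table(table: &[Vec<u32>], ids_a: &[u32]) -> Option<Vec<u32>> {
    let mut out = Vec::new();
    for &id in ids_a {
        out.extend_from_slice(table.get(id as usize)?);
    }
    Some(out)
}
```

Because each V_A entry is encoded in isolation, a lookup of this kind cannot apply merges that span two adjacent V_A tokens, so its output may differ from re-tokenizing the full text in one pass.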