§codec-rs
Rust port of the Codec binary
transport protocol — the functional twin of Codec.Net,
@codecai/web, and codecai.
Codec carries uint32 token IDs on the wire instead of UTF-8 / JSON,
deferring text decoding to the presentation layer. This crate lets a
Rust client decode/encode Codec frames, parse tokenizer maps,
detokenize/tokenize, watch for tool calls, translate cross-vocab, and
load maps over HTTP — all with full sha256 verification.
§Quick start
use codec_rs::{TokenizerMap, Detokenizer, DetokenizeOptions};

// `json` (the tokenizer map document) and `frames` (the decoded Codec
// frames) are supplied by the caller.
let map = TokenizerMap::from_json(json).unwrap();
let mut detok = Detokenizer::new(&map);
for frame in frames {
    // Render in partial mode until the final frame arrives.
    let opts = DetokenizeOptions { partial: !frame.done, render_special: false };
    let text = detok.render(&frame.ids, opts);
    print!("{text}");
}

See module-level docs and the project README for the full surface.
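The quick start passes `partial: !frame.done` on every frame but the last. The sketch below illustrates one plausible reason a streaming detokenizer needs such a flag: a multi-byte UTF-8 character can be split across frames, so an incomplete tail has to be held back until the rest arrives. This is an assumption about the flag's purpose, not a description of the crate's internals. `Utf8Stream` and its `push` method are hypothetical names that do not exist in codec-rs, and only the standard library is used.

```rust
/// Hypothetical helper (not part of codec-rs): buffers raw bytes and emits
/// text only once it forms complete UTF-8 sequences.
struct Utf8Stream {
    pending: Vec<u8>,
}

impl Utf8Stream {
    fn new() -> Self {
        Self { pending: Vec::new() }
    }

    /// Appends `bytes` and returns whatever can be decoded so far.
    /// With `partial == true`, an incomplete trailing sequence is kept for
    /// the next call; with `partial == false`, the buffer is flushed and any
    /// leftover bytes are decoded lossily.
    fn push(&mut self, bytes: &[u8], partial: bool) -> String {
        self.pending.extend_from_slice(bytes);
        if !partial {
            let out = String::from_utf8_lossy(&self.pending).into_owned();
            self.pending.clear();
            return out;
        }
        match std::str::from_utf8(&self.pending) {
            Ok(s) => {
                let out = s.to_owned();
                self.pending.clear();
                out
            }
            // `error_len() == None` means the input ended mid-character:
            // emit the valid prefix and keep the unfinished tail buffered.
            Err(e) if e.error_len().is_none() => {
                let valid = e.valid_up_to();
                let out = String::from_utf8_lossy(&self.pending[..valid]).into_owned();
                self.pending.drain(..valid);
                out
            }
            // Genuinely invalid bytes: decode lossily rather than stall.
            Err(_) => {
                let out = String::from_utf8_lossy(&self.pending).into_owned();
                self.pending.clear();
                out
            }
        }
    }
}
```

Under these assumptions, pushing `[b'h', 0xC3]` in partial mode returns `"h"` and keeps `0xC3` buffered. Pushing `[0xA9]` afterwards returns `"é"`.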
Re-exports§
pub use version_signaling::parse_version_policy_document;
pub use version_signaling::parse_version_required;
pub use version_signaling::well_known_version_policy_url;
pub use version_signaling::CodecVersionPolicyDocument;
pub use version_signaling::CodecVersionRequiredBody;
pub use version_signaling::HttpStatus;
pub use version_signaling::VersionSignalingError;
pub use version_signaling::CODEC_CLIENT_VERSION;
pub use version_signaling::CODEC_CLIENT_VERSION_HEADER;
pub use version_signaling::CODEC_MIN_VERSION_HEADER;
pub use version_signaling::CODEC_REQUIRED_FEATURES_HEADER;
pub use version_signaling::discover_version_policy_blocking;
pub use byte_encoder::decode_byte_level_token;
pub use byte_encoder::encode_byte_level_chars;
pub use byte_encoder::METASPACE;
pub use compression::hash_zstd_dict;
pub use compression::select_zstd_dict_for_response;
pub use compression::well_known_dict_url;
pub use compression::CodecZstdDictError;
pub use compression::ZstdDictDiscoveryError;
pub use compression::discover_zstd_dict;
pub use compression::discover_zstd_dict_blocking;
pub use detokenize::Detokenizer;
pub use detokenize::DetokenizeOptions;
pub use frame::CodecFrame;
pub use frame::IMapCache;
pub use frame::MapCache;
pub use frame::MemoryMapCache;
pub use longest_match::LongestMatchTokenizer;
pub use longest_match::Tokenize;
pub use map::TokenizerMap;
pub use map::TokenizerMapError;
pub use map::ToolCallingArgsFormat;
pub use map::ToolCallingBlock;
pub use map::ToolCallingConvention;
pub use map::ToolCallingMarkers;
pub use map::ToolCallingResultFormat;
pub use map_loader::LoadError;
pub use map_loader::LoadOptions;
pub use map_loader::MapLoader;
pub use map_loader::TokenizerMapHashMismatchError;
pub use safety_policy::Category as SafetyCategory;
pub use safety_policy::CategoryAction;
pub use safety_policy::ClassifierBlock as SafetyClassifierBlock;
pub use safety_policy::ClassifierHost;
pub use safety_policy::ClientHooksBlock as SafetyClientHooksBlock;
pub use safety_policy::EngineFeature;
pub use safety_policy::PublisherBlock as SafetyPublisherBlock;
pub use safety_policy::RulesSummary as SafetyRulesSummary;
pub use safety_policy::SafetyPolicyDescriptor;
pub use safety_policy::SafetyPolicyError;
pub use safety_policy::SafetyPolicyPointer;
pub use safety_policy::POLICY_WELL_KNOWN_BASE;
pub use safety_policy::discover_safety_policy;
pub use safety_policy::load_safety_policy;
pub use stream::decode_msgpack_stream;
pub use stream::decode_protobuf_frame;
pub use stream::decode_protobuf_stream;
pub use stream::MsgpackFrameIter;
pub use stream::ProtobufFrameIter;
pub use stream::StreamError;
pub use pretok_program::run_pretok_program;
pub use pretok_program::PreTokOp;
pub use pretok_program::PreTokProgram;
pub use tokenize::BPETokenizer;
pub use tokenize::ITokenizer;
pub use tool_watcher::ToolWatcher;
pub use tool_watcher::ToolWatcherError;
pub use tool_watcher::WatcherEvent;
pub use tool_watcher::WatcherEventKind;
pub use translator::static_translation_table;
pub use translator::translate_one_shot;
pub use translator::Translator;
Modules§
- byte_encoder - GPT-2 byte ↔ unicode mapping table and helpers shared by the Detokenizer and BPE encoder.
- compression - Client-side helpers for the Codec compression contract.
- detokenize - Stateful detokenizer: token IDs → text.
- frame - CodecFrame and the pluggable map cache.
- longest_match - Vocab-only longest-prefix-match tokenizer.
- map - TokenizerMap, the per-model dialect record. Maps are content-addressed (sha256) and immutable.
- map_loader - Fetch, verify, and cache tokenizer maps.
- pretok_program - Pre-tokenizer program interpreter.
- safety_policy - Safety-policy descriptor loading, validation, and discovery.
- stream - Stream decoders for the two Codec wire formats.
- tokenize - Pure-Rust BPE encoder. Text → token IDs.
- tool_watcher - Tool-call / region watcher.
- translator - Translator, a cross-vocab token-stream pipe.
- version_signaling - Codec v0.4 version negotiation: client-side primitives.
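
To make the "vocab-only longest-prefix-match" idea from the longest_match entry concrete, here is a minimal sketch of greedy longest-match tokenization over a plain string-to-ID vocabulary. It is an illustration of the general technique only. `GreedyTokenizer`, its constructor, and its `tokenize` method are hypothetical and are not the crate's `LongestMatchTokenizer` API, whose signatures are not shown on this page.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a vocab-only tokenizer (not codec-rs API).
struct GreedyTokenizer {
    vocab: HashMap<String, u32>,
    /// Length in bytes of the longest vocabulary entry.
    max_len: usize,
}

impl GreedyTokenizer {
    fn new(entries: &[(&str, u32)]) -> Self {
        let vocab: HashMap<String, u32> =
            entries.iter().map(|(s, id)| (s.to_string(), *id)).collect();
        let max_len = vocab.keys().map(|k| k.len()).max().unwrap_or(0);
        Self { vocab, max_len }
    }

    /// At each position, takes the longest vocabulary entry that prefixes
    /// the remaining text. Returns `None` if some position has no match.
    fn tokenize(&self, text: &str) -> Option<Vec<u32>> {
        let mut ids = Vec::new();
        let mut pos = 0;
        while pos < text.len() {
            let rest = &text[pos..];
            let limit = rest.len().min(self.max_len);
            // Try candidate lengths from longest to shortest, skipping
            // byte offsets that would split a UTF-8 character.
            let hit = (1..=limit)
                .rev()
                .filter(|&n| rest.is_char_boundary(n))
                .find_map(|n| self.vocab.get(&rest[..n]).map(|&id| (n, id)));
            let (n, id) = hit?;
            ids.push(id);
            pos += n;
        }
        Some(ids)
    }
}
```

With the toy vocabulary `a=1, ab=2, abc=3, c=4`, the input `"abcab"` tokenizes to `[3, 2]`, because `abc` wins over `ab` and `a` at the first position. A production tokenizer would more likely use a trie than repeated hash lookups. The hash map keeps the sketch short.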