Crate cbor2

This crate provides an implementation of RFC 8949 — the Concise Binary Object Representation (CBOR) — built on serde.

CBOR adopts and modestly extends the data model used by JSON, but encodes it in binary. Its design goals are a small implementation, compact messages and extensibility.

§Quick start

Use to_vec/to_writer to encode any serde::Serialize type and from_slice/from_reader to decode any serde::Deserialize type:

use serde::{Deserialize, Serialize};

#[derive(Debug, PartialEq, Deserialize, Serialize)]
struct Photo {
    title: String,
    pixels: (u32, u32),
    tags: Vec<String>,
}

let photo = Photo {
    title: "Sunrise".into(),
    pixels: (1920, 1080),
    tags: vec!["morning".into(), "gradient".into()],
};

let bytes = cbor2::to_vec(&photo).unwrap();
let back: Photo = cbor2::from_slice(&bytes).unwrap();
assert_eq!(photo, back);

from_slice and from_reader deserialize one leading CBOR item. Use validate first when a byte buffer must contain exactly one item, or use de::Deserializer::into_iter for a CBOR sequence.

§Command line tool

The workspace also publishes cbor2-cli, which installs the cbor command for converting CBOR to and from JSON and for rendering diagnostic notation:

brew install ldclabs/tap/cbor2-cli   # Homebrew, installs cbor
cargo install cbor2-cli              # Cargo, installs cbor

§Byte strings and serde_bytes

Serde’s default data model treats Vec<u8> and &[u8] as sequences, so they serialize as CBOR arrays, not byte strings. Use serde_bytes when the wire type should be major type 2.

let bytes = vec![1u8, 2, 3, 4];

// Bare Vec<u8>: [1, 2, 3, 4]
assert_eq!(hex::encode(cbor2::to_vec(&bytes).unwrap()), "8401020304");

// serde_bytes::ByteBuf: h'01020304'
let bytes = serde_bytes::ByteBuf::from(bytes);
assert_eq!(hex::encode(cbor2::to_vec(&bytes).unwrap()), "4401020304");
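
The two encodings above differ only in their first byte, the CBOR initial byte, which carries the major type in its top three bits and, for short items, the length in its low five. The following is a hand-rolled wire-level sketch that does not reflect this crate's internals. `encode_as_array` and `encode_as_bytes` are hypothetical helper names, and both assume fewer than 24 elements so that the length fits in the initial byte.

```rust
/// Major type 4 (array): one head byte, then every element as its own item.
/// Sketch only: assumes fewer than 24 elements.
fn encode_as_array(data: &[u8]) -> Vec<u8> {
    assert!(data.len() < 24, "sketch handles short inputs only");
    let mut out = vec![(4 << 5) | data.len() as u8];
    for &b in data {
        if b < 24 {
            // Small unsigned integers fit in the initial byte itself.
            out.push(b);
        } else {
            // Larger u8 values need a 0x18 head followed by the value.
            out.extend_from_slice(&[0x18, b]);
        }
    }
    out
}

/// Major type 2 (byte string): one head byte, then the raw bytes.
/// Sketch only: assumes fewer than 24 bytes.
fn encode_as_bytes(data: &[u8]) -> Vec<u8> {
    assert!(data.len() < 24, "sketch handles short inputs only");
    let mut out = vec![(2 << 5) | data.len() as u8];
    out.extend_from_slice(data);
    out
}
```

For element values of 24 or more the array form spends two bytes per element, so `[0xde, 0xad]` takes five bytes as an array and three as a byte string.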

For struct fields, use serde’s field adapter:

use serde::{Deserialize, Serialize};

#[derive(Debug, PartialEq, Deserialize, Serialize)]
struct Packet {
    #[serde(with = "serde_bytes")]
    payload: Vec<u8>,
}

let packet = Packet { payload: vec![0xde, 0xad, 0xbe, 0xef] };
assert_eq!(
    hex::encode(cbor2::to_vec(&packet).unwrap()),
    "a1677061796c6f616444deadbeef"
);

When building dynamic data directly, Value::Bytes already represents a CBOR byte string:

let value = cbor2::Value::Bytes(vec![0xde, 0xad]);
assert_eq!(hex::encode(cbor2::to_vec(&value).unwrap()), "42dead");

§Dynamic values

When the shape of the data is not known in advance, decode into a Value, the CBOR equivalent of serde_json::Value. The cbor! macro builds Values with a JSON-like syntax:

use cbor2::{cbor, Value};

let value = cbor!({
    "code": 415,
    "message": null,
    "tags": ["legacy", 1.5],
}).unwrap();

let bytes = cbor2::to_vec(&value).unwrap();
let back: Value = cbor2::from_slice(&bytes).unwrap();
assert_eq!(value, back);

Value::serialized and Value::deserialized convert between Value and any type implementing the serde traits.

use serde::{Deserialize, Serialize};

#[derive(Debug, PartialEq, Deserialize, Serialize)]
struct Point {
    x: i64,
    y: i64,
}

let value = cbor2::Value::serialized(&Point { x: -2, y: 5 }).unwrap();
assert_eq!(value.to_string(), r#"{"x": -2, "y": 5}"#);

let point: Point = value.deserialized().unwrap();
assert_eq!(point, Point { x: -2, y: 5 });

§Raw values

A RawValue keeps one CBOR item as its raw encoded bytes — validated for well-formedness, but never decoded. Serializing splices the bytes into the stream untouched and deserializing captures them byte for byte, which preserves the exact wire encoding for signature payloads, pass-through items and deferred decoding:

use serde::{Deserialize, Serialize};

#[derive(Debug, PartialEq, Deserialize, Serialize)]
struct Signed {
    #[serde(with = "serde_bytes")]
    signature: Vec<u8>,
    payload: cbor2::RawValue,
}

let bytes = cbor2::to_vec(&Signed {
    signature: vec![0xde, 0xad],
    payload: cbor2::RawValue::serialized(&("untouched", 42)).unwrap(),
}).unwrap();

let signed: Signed = cbor2::from_slice(&bytes).unwrap();
// Verify `signed.signature` over `signed.payload.as_bytes()`, then:
let (text, n): (String, u8) = signed.payload.deserialized().unwrap();
assert_eq!((text.as_str(), n), ("untouched", 42));

TryFrom converts in both directions between RawValue and Value: decoding one way, encoding the other.

§CBOR sequences

CBOR sequences (RFC 8742) are streams of adjacent complete CBOR items. Write them by calling to_writer repeatedly, and read them with de::Deserializer::into_iter:

let mut stream = Vec::new();
cbor2::to_writer(&"hello", &mut stream).unwrap();
cbor2::to_writer(&42u64, &mut stream).unwrap();

let items: Vec<cbor2::Value> = cbor2::de::Deserializer::from_reader(&stream[..])
    .into_iter()
    .collect::<Result<_, _>>()
    .unwrap();

assert_eq!(items, vec![cbor2::Value::from("hello"), cbor2::Value::from(42)]);
assert!(cbor2::validate(&stream[..]).is_err()); // not exactly one item
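
To make the distinction between "one leading item", "exactly one item" and "a sequence of items" concrete, here is a minimal wire-level sketch that measures items without decoding them. It is not this crate's algorithm: `item_len`, `is_exactly_one_item` and `split_sequence` are hypothetical names, the sketch rejects indefinite-length items, it does not check UTF-8, and it has no recursion limit.

```rust
/// Returns the encoded length of the first CBOR item in `input`, or `None`
/// if it is truncated or uses a form this sketch does not handle
/// (indefinite lengths, reserved additional-info values).
fn item_len(input: &[u8]) -> Option<usize> {
    let first = *input.first()?;
    let major = first >> 5;
    let info = first & 0x1f;
    // Number of argument bytes that follow the initial byte.
    let arg_len = match info {
        0..=23 => 0,
        24 => 1,
        25 => 2,
        26 => 4,
        27 => 8,
        _ => return None, // 28..=30 reserved, 31 indefinite: out of scope here
    };
    let arg_bytes = input.get(1..1 + arg_len)?;
    let arg = if arg_len == 0 {
        u64::from(info)
    } else {
        arg_bytes.iter().fold(0u64, |acc, &b| (acc << 8) | u64::from(b))
    };
    let head = 1 + arg_len;
    match major {
        // Integers, simple values and floats are complete after the head.
        0 | 1 | 7 => Some(head),
        // Byte and text strings: `arg` payload bytes follow the head.
        2 | 3 => {
            let end = (head as u64).checked_add(arg)?;
            if end <= input.len() as u64 { Some(end as usize) } else { None }
        }
        // Arrays hold `arg` items, maps hold `arg` key/value pairs.
        4 | 5 => {
            let count = if major == 5 { arg.checked_mul(2)? } else { arg };
            let mut pos = head;
            for _ in 0..count {
                pos += item_len(input.get(pos..)?)?;
            }
            Some(pos)
        }
        // Major type 6: a tag wraps exactly one following item.
        _ => Some(head + item_len(input.get(head..)?)?),
    }
}

/// True when `input` holds one item and nothing else.
fn is_exactly_one_item(input: &[u8]) -> bool {
    item_len(input) == Some(input.len())
}

/// Splits a CBOR sequence into the byte ranges of its items.
fn split_sequence(mut input: &[u8]) -> Option<Vec<&[u8]>> {
    let mut items = Vec::new();
    while !input.is_empty() {
        let (item, rest) = input.split_at(item_len(input)?);
        items.push(item);
        input = rest;
    }
    Some(items)
}
```

Applied to the eight-byte stream built above (`65 68656c6c6f` followed by `18 2a`), `item_len` reports 6 for the leading item, so the exactly-one check fails while `split_sequence` yields two items.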

§Tags

CBOR data items can be wrapped in semantic tags (RFC 8949 §3.4). The wrapper types in the tag module capture and emit tags through serde:

use cbor2::tag::RequireExact;

// Tag 32: a URI.
type Uri = RequireExact<String, 32>;

let uri: Uri = RequireExact("https://example.com".into());
let bytes = cbor2::to_vec(&uri).unwrap();
assert_eq!(bytes[0], 0xd8); // tag(32)
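
The `0xd8` checked above is a major type 6 head whose tag number follows in the next byte. As an illustration of that layout only, the sketch below peels a tag head off raw bytes. `split_tag` is a hypothetical helper, not part of this crate, and it does not check that a well-formed item actually follows the tag.

```rust
/// Splits a leading tag head off `input`, returning the tag number and the
/// bytes that follow it. Returns `None` if `input` does not start with
/// major type 6 or the head is truncated.
fn split_tag(input: &[u8]) -> Option<(u64, &[u8])> {
    let first = *input.first()?;
    if first >> 5 != 6 {
        return None;
    }
    // Tag numbers below 24 live in the initial byte; larger ones follow it
    // as a 1-, 2-, 4- or 8-byte big-endian argument.
    let arg_len = match first & 0x1f {
        n @ 0..=23 => return Some((u64::from(n), &input[1..])),
        24 => 1,
        25 => 2,
        26 => 4,
        27 => 8,
        _ => return None, // reserved or break: not a tag head
    };
    let arg = input.get(1..1 + arg_len)?;
    let tag = arg.iter().fold(0u64, |acc, &b| (acc << 8) | u64::from(b));
    Some((tag, &input[1 + arg_len..]))
}
```

For the URI example, the first two bytes `d8 20` split into tag number 32 and a remainder that starts with the text string head.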

§Integer map keys and tags (COSE)

Protocols like COSE (RFC 9052) key their maps with integers and wrap their messages in tags, which serde’s data model cannot express. With the derive feature, #[derive(Cbor)] declares both — a textual #[serde(rename = "1")] stays a text key, so there is no ambiguity between the two. The derive generates the Serialize and Deserialize impls itself, so serde’s derives must not be repeated alongside it:

use cbor2::Cbor;

#[derive(Debug, PartialEq, Cbor)]
#[cbor(tag = 98)]
struct CoseSign {
    #[cbor(key = 1)]
    kty: u8,
    #[cbor(key = 3)]
    alg: i8,
}

let key = CoseSign { kty: 2, alg: -7 };
let bytes = cbor2::to_vec(&key).unwrap();
assert_eq!(hex::encode(&bytes), "d862a201020326"); // 98({1: 2, 3: -7})
assert_eq!(cbor2::from_slice::<CoseSign>(&bytes).unwrap(), key);

The tag is optional, and the serde attributes (alias, default, skip, with, …) work as usual; map types like HashMap<String, _> are unaffected. The declared keys and tag stay inspectable at runtime through the Cbor trait, which the derive implements alongside the serde traits.

The derive touches neither the field names nor the type name — the protocol details ride along on a hidden shadow type (see ser::STRUCT_MARKER) recognized only by this crate’s serializers — so the same type still serializes naturally everywhere else. JSON, for example, just works, with the original field names and no tag:

let json = serde_json::to_string(&key).unwrap();
assert_eq!(json, r#"{"kty":2,"alg":-7}"#);
assert_eq!(serde_json::from_str::<CoseSign>(&json).unwrap(), key);

§Allocation-free helpers

Three helpers work without touching the heap: validate checks that an input is exactly one well-formed CBOR item (including text UTF-8 validity), serialized_size computes the exact encoded size of any serializable value, and to_slice encodes into a caller-provided buffer.

let value = ("hello", vec![1u8, 2, 3]);
let bytes = cbor2::to_vec(&value).unwrap();

assert_eq!(cbor2::serialized_size(&value).unwrap(), bytes.len() as u64);
assert!(cbor2::validate(&bytes[..]).is_ok());
assert!(cbor2::validate(&bytes[..bytes.len() - 1]).is_err()); // truncated

let mut buffer = [0u8; 16];
assert_eq!(cbor2::to_slice(&value, &mut buffer).unwrap(), &bytes[..]);
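
An exact size can be computed without a buffer because every CBOR head has a length that depends only on its argument. The sketch below shows that arithmetic for the example value, assuming the tuple is written as a two-element array and the bare `Vec<u8>` as an array of small integers (as described under byte strings). The helper names are hypothetical and this is not the crate's implementation of serialized_size.

```rust
/// Number of bytes in the shortest CBOR head for the argument `arg`.
fn head_len(arg: u64) -> u64 {
    match arg {
        0..=23 => 1,                 // argument lives in the initial byte
        24..=0xff => 2,              // initial byte + 1 byte
        0x100..=0xffff => 3,         // initial byte + 2 bytes
        0x1_0000..=0xffff_ffff => 5, // initial byte + 4 bytes
        _ => 9,                      // initial byte + 8 bytes
    }
}

/// Size of a text string: head carrying the byte length, then the UTF-8 bytes.
fn text_size(s: &str) -> u64 {
    head_len(s.len() as u64) + s.len() as u64
}

/// Size of an array of unsigned integers: head carrying the element count,
/// then one integer head per element.
fn uint_array_size(items: &[u8]) -> u64 {
    head_len(items.len() as u64)
        + items.iter().map(|&b| head_len(u64::from(b))).sum::<u64>()
}

/// Size of `("hello", vec![1u8, 2, 3])` under the assumptions stated above.
fn example_size() -> u64 {
    head_len(2) + text_size("hello") + uint_array_size(&[1, 2, 3])
}
```

Under those assumptions the example value needs 1 + 6 + 4 = 11 bytes, which fits the 16-byte buffer used above.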

§Crate features

  • std (default) — implements the io traits for every std::io::Read/std::io::Write and adds the HashMap conversions. Implies alloc.
  • alloc — everything that needs a heap, without std: Value, to_vec/from_slice/from_reader, RawValue, diagnostic/diagnostic_pretty, the deterministic encoders and the cbor! macro. Readers and writers are byte slices, Vec<u8>, or custom io trait implementations.
  • neither — a #![no_std] core for constrained targets: streaming serialization with to_writer/to_slice/serialized_size, validate, the tag wrappers and the core header codec. Deserializing through serde requires alloc.
  • derive — the Cbor derive macro; works in all three modes (deserialization again requiring alloc).

§Diagnostic notation

diagnostic renders raw CBOR as the compact human-readable text form of RFC 8949 §8; diagnostic_pretty does the same with two-space indentation. Both operate directly on the encoded bytes, so they preserve what a Value cannot represent: indefinite-length markers, undefined, and unassigned simple values. Value implements Display with the same compact notation, and Debug pretty-prints it with indentation.

let bytes = hex::decode("bf61610161629f0203ffff").unwrap();
assert_eq!(
    cbor2::diagnostic(&bytes[..]).unwrap(),
    r#"{_ "a": 1, "b": [_ 2, 3]}"#
);
assert_eq!(
    cbor2::diagnostic_pretty(&bytes[..]).unwrap(),
    "{_\n  \"a\": 1,\n  \"b\": [_\n    2,\n    3\n  ]\n}"
);

let value = cbor2::cbor!({ "k": [1, -2.5, null] }).unwrap();
assert_eq!(value.to_string(), r#"{"k": [1, -2.5, null]}"#);

§Low-level headers

The core module exposes the pull/push header codec for applications that need to preserve wire structure such as indefinite-length strings:

use cbor2::core::{Decoder, Encoder, Header};

let mut bytes = Vec::new();
let mut enc = Encoder::from(&mut bytes);
enc.push(Header::Array(None)).unwrap();
enc.text("chunked").unwrap();
enc.bytes(&[0xde, 0xad]).unwrap();
enc.push(Header::Break).unwrap();

let mut dec = Decoder::from(&bytes[..]);
assert_eq!(dec.pull().unwrap(), Header::Array(None));

let Header::Text(len) = dec.pull().unwrap() else { unreachable!() };
let mut text = String::new();
dec.text_body(len, &mut text).unwrap();
assert_eq!(text, "chunked");

let Header::Bytes(len) = dec.pull().unwrap() else { unreachable!() };
let mut body = Vec::new();
dec.bytes_body(len, &mut body).unwrap();
assert_eq!(body, vec![0xde, 0xad]);
assert_eq!(dec.pull().unwrap(), Header::Break);
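
For reference, RFC 8949 lays out an indefinite-length array as the initial byte `0x9f`, the items back to back, and a closing `0xff` break. The sketch below assembles that framing by hand from already-encoded items. `indefinite_array` is a hypothetical helper shown only to illustrate the wire structure, not a function of the core module.

```rust
/// Wraps already-encoded CBOR items in indefinite-length array framing:
/// 0x9f (major type 4, additional info 31), the items, then the 0xff break.
fn indefinite_array(items: &[&[u8]]) -> Vec<u8> {
    let mut out = vec![0x9f];
    for item in items {
        out.extend_from_slice(item);
    }
    out.push(0xff);
    out
}
```

With the text string `67 6368756e6b6564` ("chunked") and the byte string `42 dead` as items, the result is 13 bytes long and no element count appears anywhere in it.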

§Deterministic encoding

to_canonical_vec/to_canonical_writer produce output satisfying the core deterministic encoding requirements of RFC 8949 §4.2.1: preferred (smallest) serializations, definite lengths only, and map keys sorted in the bytewise lexicographic order of their encodings. Value::canonicalize applies the same normalization to a Value in place.

use std::collections::HashMap;

// HashMap iteration order is random, but the encoding is stable.
let map: HashMap<&str, i32> = [("z", 1), ("aa", 2), ("b", 3)].into();

let bytes = cbor2::to_canonical_vec(&map).unwrap();
assert_eq!(bytes, cbor2::to_canonical_vec(&map).unwrap());
assert_eq!(hex::encode(&bytes), "a3 616203 617a01 62616102".replace(' ', ""));

Many existing protocols instead use the older “Canonical CBOR” key order of RFC 7049 §3.9 (kept as RFC 8949 §4.2.3), where shorter encoded keys sort first. Pass KeyOrder::LengthFirst to the *_with variants for that:

use cbor2::KeyOrder;

let map: std::collections::HashMap<i64, bool> = [(100, true), (-1, false)].into();

// Bytewise (RFC 8949 §4.2.1): 100 (0x1864) sorts before -1 (0x20).
let core = cbor2::to_canonical_vec(&map).unwrap();
assert_eq!(hex::encode(&core), "a2 1864f5 20f4".replace(' ', ""));

// Length-first (RFC 7049 §3.9): -1 sorts before 100.
let legacy = cbor2::to_canonical_vec_with(&map, KeyOrder::LengthFirst).unwrap();
assert_eq!(hex::encode(&legacy), "a2 20f4 1864f5".replace(' ', ""));
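
Both orders are comparisons over the encoded key bytes, and the only difference is whether length is compared first. The sketch below expresses the two rules with the standard library. `bytewise`, `length_first` and `sort_keys` are hypothetical names, and the keys are supplied already encoded, so nothing here depends on this crate.

```rust
use std::cmp::Ordering;

/// RFC 8949 §4.2.1: plain lexicographic comparison of the encoded keys.
fn bytewise(a: &[u8], b: &[u8]) -> Ordering {
    a.cmp(b)
}

/// RFC 8949 §4.2.3 (the RFC 7049 rule): shorter encodings first,
/// ties broken by lexicographic comparison.
fn length_first(a: &[u8], b: &[u8]) -> Ordering {
    a.len().cmp(&b.len()).then_with(|| a.cmp(b))
}

/// Sorts already-encoded map keys with the chosen rule.
fn sort_keys(mut keys: Vec<Vec<u8>>, order: fn(&[u8], &[u8]) -> Ordering) -> Vec<Vec<u8>> {
    keys.sort_by(|a, b| order(a.as_slice(), b.as_slice()));
    keys
}
```

For the keys in the example, `18 64` (100) and `20` (-1), `bytewise` puts 100 first because `0x18 < 0x20`, while `length_first` puts -1 first because its encoding is one byte shorter.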

§Design decisions

This implementation is wire-compatible with ciborium, whose design it follows:

  • Numbers are always encoded in their smallest lossless form, as deterministic encoding (RFC 8949 §4.2.1) requires. Integer width in Rust is treated as an in-memory detail, not a wire property: 1u64 encodes as one byte, and that byte happily decodes into a u128 or an i8.
  • u128/i128 values outside the 64-bit range are encoded as bignums (tags 2 and 3), and bignums small enough to fit are accepted for any integer type.
  • Maps are represented as Vec<(Value, Value)> in Value, preserving wire order and arbitrary (even duplicate) keys.
  • Be liberal in what you accept: decoding handles indefinite-length items, segmented strings, half-precision floats, leading zeros in bignums and unknown tags in most positions, even though encoding never produces most of those forms.
  • Deeply nested input fails with RecursionLimitExceeded instead of exhausting the stack; see de::Deserializer::with_recursion_limit.
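
The "smallest lossless form" rule for integers in the first bullet can be sketched as follows. This is an illustration of the RFC 8949 preferred serialization, not this crate's source. `encode_head` and `encode_int` are hypothetical names, and the sketch covers only the 64-bit range (no bignums).

```rust
/// Encodes a CBOR head (major type plus argument) in its shortest form.
fn encode_head(major: u8, arg: u64) -> Vec<u8> {
    debug_assert!(major < 8, "CBOR has major types 0 through 7");
    let m = major << 5;
    let mut out = Vec::with_capacity(9);
    match arg {
        // 0..=23 fits in the low five bits of the initial byte.
        0..=23 => out.push(m | arg as u8),
        24..=0xff => {
            out.push(m | 24);
            out.push(arg as u8);
        }
        0x100..=0xffff => {
            out.push(m | 25);
            out.extend_from_slice(&(arg as u16).to_be_bytes());
        }
        0x1_0000..=0xffff_ffff => {
            out.push(m | 26);
            out.extend_from_slice(&(arg as u32).to_be_bytes());
        }
        _ => {
            out.push(m | 27);
            out.extend_from_slice(&arg.to_be_bytes());
        }
    }
    out
}

/// Encodes an i64 as major type 0 (non-negative) or major type 1 (negative).
/// The Rust integer width plays no role; only the numeric value does.
fn encode_int(value: i64) -> Vec<u8> {
    if value >= 0 {
        encode_head(0, value as u64)
    } else {
        // A negative integer n is stored as the argument -1 - n,
        // which is the bitwise complement in two's complement.
        encode_head(1, !(value as u64))
    }
}
```

Here `encode_int(1)` is the single byte `01`, `encode_int(100)` is `18 64` and `encode_int(-7)` is `26`, matching the hex shown in the earlier examples.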

§History

This crate descends from cbor by Andrew Gallant, whose 0.4 and earlier releases were built on the long-deprecated rustc-serialize framework and predate both serde 1.0 and RFC 8949. Version 0.5 was a from-scratch rewrite published under the cbor2 name — the original crates.io name stays with the legacy release — and 1.0 stabilizes it; none of the old API survives.

Modules§

core
Low-level CBOR encoding and decoding.
de
Serde deserialization support for CBOR.
io
The reader and writer traits used by the encoder and decoder.
ser
Serde serialization support for CBOR.
tag
Helper types for capturing and emitting CBOR tags (RFC 8949 §3.4).
value
A dynamic CBOR value.

Macros§

cbor
Builds a Value from JSON-like syntax.

Structs§

RawValue
A valid CBOR item, kept as its raw encoded bytes.

Enums§

KeyOrder
The map key ordering used by deterministic encoding.
Value
A representation of any CBOR item that can be inspected and manipulated dynamically.

Traits§

Cbor
The CBOR protocol details a #[derive(Cbor)] type declares: its integer map keys and its tag.

Functions§

diagnostic
Renders one CBOR item from a reader in diagnostic notation (RFC 8949 §8).
diagnostic_pretty
Like diagnostic, but pretty-prints arrays and maps with two-space indentation, one element per line.
from_reader
Deserializes a value from CBOR read out of a Read.
from_slice
Deserializes a value from a byte slice of CBOR.
serialized_size
Computes the exact number of bytes that to_writer would produce for a value, without writing or buffering anything.
to_canonical_vec
Serializes a value as deterministically encoded CBOR into a new Vec<u8>, satisfying the core deterministic encoding requirements of RFC 8949 §4.2.1.
to_canonical_vec_with
Serializes a value as deterministically encoded CBOR into a new Vec<u8>, sorting map keys in the given KeyOrder.
to_canonical_writer
Serializes a value as deterministically encoded CBOR into a Write, satisfying the core deterministic encoding requirements of RFC 8949 §4.2.1.
to_canonical_writer_with
Serializes a value as deterministically encoded CBOR into a Write, sorting map keys in the given KeyOrder.
to_slice
Serializes a value as CBOR into the front of buffer, returning the written prefix.
to_vec
Serializes a value as CBOR into a new Vec<u8>.
to_writer
Serializes a value as CBOR into a Write.
validate
Checks that the input contains exactly one well-formed CBOR item.

Derive Macros§

Cbor
Derives serde::Serialize and serde::Deserialize with CBOR protocol details: integer map keys and a CBOR tag (COSE, RFC 9052).