# cbor-ld
CBOR-LD 1.0 processor for Rust, built on top of
[`cbor2`](https://github.com/ldclabs/cbor2).
`cbor-ld` converts JSON-LD documents to and from the current W3C CBOR-LD 1.0
wire shape:
```text
51997([registryEntryId, payload])
```
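At the byte level, the envelope above can be sketched with the standard library alone. This is an illustration, not the crate's encoder: `push_head` and `wrap_envelope` are hypothetical helper names, and the sketch assumes the registry entry ID is carried as a plain CBOR unsigned integer and that the payload is already encoded CBOR.

```rust
/// Append a CBOR head (major type plus unsigned argument) in its shortest form.
fn push_head(out: &mut Vec<u8>, major: u8, arg: u64) {
    let m = major << 5;
    if arg < 24 {
        out.push(m | arg as u8);
    } else if arg <= 0xff {
        out.push(m | 24);
        out.push(arg as u8);
    } else if arg <= 0xffff {
        out.push(m | 25);
        out.extend_from_slice(&(arg as u16).to_be_bytes());
    } else if arg <= 0xffff_ffff {
        out.push(m | 26);
        out.extend_from_slice(&(arg as u32).to_be_bytes());
    } else {
        out.push(m | 27);
        out.extend_from_slice(&arg.to_be_bytes());
    }
}

/// Hypothetical helper: wrap an already-encoded CBOR payload in
/// `51997([registryEntryId, payload])`.
fn wrap_envelope(registry_entry_id: u64, payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(payload.len() + 12);
    push_head(&mut out, 6, 51997); // major type 6: tag 0xcb1d
    push_head(&mut out, 4, 2); // major type 4: array of two items
    push_head(&mut out, 0, registry_entry_id); // major type 0: unsigned integer
    out.extend_from_slice(payload);
    out
}
```

With registry entry ID `0`, the output starts with the five bytes `d9 cb 1d 82 00`, followed by the payload bytes.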
Registry entry ID `0` stores an uncompressed payload. Other registry IDs use
the default CBOR-LD semantic compression model: JSON-LD context processing,
dynamic term IDs, registry type tables, and value codecs for URLs, multibase
values, `xsd:date`, and `xsd:dateTime`.
The crate intentionally targets CBOR-LD 1.0 only. The legacy singleton and
range tags used by pre-1.0 JavaScript releases are not implemented.
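To make "deterministic dynamic term IDs" concrete, here is a minimal sketch of one way such IDs can be assigned. It assumes, following the JavaScript reference implementation as far as I recall it, that context-defined terms are sorted, numbered from 100, and spaced two apart so the odd neighbor can mark array-valued use. The constant, the function name, and the single-pass shape are all assumptions; the crate's real algorithm walks contexts in processing order and may differ.

```rust
use std::collections::BTreeMap;

/// Assumed first ID available to context-defined terms.
const FIRST_CUSTOM_TERM_ID: u64 = 100;

/// Hypothetical helper: give each distinct term an even ID in sorted order,
/// so the same set of terms always yields the same mapping.
fn assign_term_ids(terms: &[&str]) -> BTreeMap<String, u64> {
    let mut sorted: Vec<&str> = terms.to_vec();
    sorted.sort_unstable();
    sorted.dedup();

    let mut ids = BTreeMap::new();
    let mut next = FIRST_CUSTOM_TERM_ID;
    for term in sorted {
        ids.insert(term.to_string(), next);
        next += 2; // skip the odd ID, assumed reserved for the array form
    }
    ids
}
```

Because the terms are sorted before numbering, encoder and decoder arrive at the same table from the same contexts without transmitting it.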
## Why cbor-ld
| Feature | Description |
| --- | --- |
| CBOR-LD 1.0 envelope | Emits and validates CBOR tag `0xcb1d` / `51997` with `[registryEntryId, payload]`. |
| Uncompressed payloads | Registry entry ID `0` stores the JSON-LD document directly. |
| Semantic compression | Compresses JSON-LD terms through active contexts and deterministic dynamic term IDs. |
| Context processing | Handles embedded, remote, imported, type-scoped, and property-scoped contexts through a caller-provided loader. |
| Registry dictionaries | Uses `TypeTable` for `context`, `url`, `none`, and datatype-specific tables. |
| Value codecs | Compresses URL schemes, UUID URNs, data URLs, DID key/nym base58 parts, multibase values, dates, and datetimes. |
| Stable protocol bytes | Uses `cbor2::to_canonical_vec` by default for deterministic CBOR output. |
| Safe decoding | Calls `cbor2::validate` before decoding so trailing or malformed CBOR is rejected. |
| Dynamic data model | Exposes documents as `cbor2::Value`, preserving integer map keys, byte strings, arrays, maps, and tags. |

Licensed under the MIT License.
## Quick Start
```toml
[dependencies]
cbor-ld = "0.1"
```
When developing this repository alongside `cbor2`, use the local path
dependency:
```toml
[dependencies]
cbor-ld = { path = "../cbor-ld" }
cbor2 = { path = "../cbor2", version = "1" }
```
Encode and decode an uncompressed CBOR-LD document:
```rust
use cbor2::Value;
use cbor_ld::{decode, encode, DecodeOptions, EncodeOptions};
let document = Value::Map(vec![(
    Value::Text("hello".into()),
    Value::Text("world".into()),
)]);
let bytes = encode(&document, EncodeOptions::uncompressed())?;
let decoded = decode(&bytes, DecodeOptions::default())?;
assert_eq!(decoded, document);
# Ok::<(), cbor_ld::Error>(())
```
Use compression with inline contexts:
```rust
use cbor2::Value;
use cbor_ld::{decode, encode, DecodeOptions, EncodeOptions, TypeTable};
let document = Value::Map(vec![
    (
        Value::Text("@context".into()),
        Value::Map(vec![(
            Value::Text("name".into()),
            Value::Text("https://schema.org/name".into()),
        )]),
    ),
    (Value::Text("name".into()), Value::Text("Alice".into())),
]);
let table = TypeTable::new();
let bytes = encode(&document, EncodeOptions::compressed(1, &table))?;
let decoded = decode(&bytes, DecodeOptions::default())?;
assert_eq!(
    cbor2::to_canonical_vec(&decoded)?,
    cbor2::to_canonical_vec(&document)?
);
# Ok::<(), Box<dyn std::error::Error>>(())
```
Use `encode_with_loader` and `decode_with_loader` when a document contains
remote context URLs. The loader receives the URL and returns the loaded JSON-LD
document as a `cbor2::Value` with an `@context` entry.
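A common pattern for such a loader is a static allow-list that serves only contexts registered ahead of time and refuses every other URL. The sketch below shows that idea with the standard library only. `StaticContextLoader` is a hypothetical type, and it returns the stored document as text rather than as a `cbor2::Value`; the exact callback signature expected by `encode_with_loader` and `decode_with_loader` is not shown in this README, so adapt it to the crate's actual API.

```rust
use std::collections::HashMap;

/// Hypothetical static loader: serves only pre-registered context documents.
struct StaticContextLoader {
    contexts: HashMap<String, String>,
}

impl StaticContextLoader {
    fn new() -> Self {
        Self { contexts: HashMap::new() }
    }

    /// Register the JSON-LD document (with its `@context` entry) for a URL.
    fn register(&mut self, url: &str, document: &str) {
        self.contexts.insert(url.to_string(), document.to_string());
    }

    /// Return the stored document, or an error naming the rejected URL.
    fn load(&self, url: &str) -> Result<&str, String> {
        self.contexts
            .get(url)
            .map(String::as_str)
            .ok_or_else(|| format!("context URL not registered: {}", url))
    }
}
```

Keeping the loader offline in this way avoids network access during decoding and keeps the context set, and therefore the compressed term IDs, stable.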
## Type Tables
`TypeTable` maps original values to compressed integer IDs. Every table is
normalized before use so the core `context`, `url`, and `none` subtables are
present.
```rust
use cbor_ld::TypeTable;
let mut table = TypeTable::new();
table.insert("context", "https://example.com/context/v1", 0x8000);
table.insert("https://example.com/type#status", "active", 0x8001);
```
`TypeTable::with_common_tables()` includes the common string and cryptosuite
tables mirrored from the JavaScript reference implementation.
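Conceptually, a type table is a two-level map: subtable name, then original value, then integer ID. The following std-only sketch models that shape and the normalization step described above. `SketchTypeTable` and its methods are hypothetical stand-ins and do not reflect how `TypeTable` is implemented internally.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a type table: subtable name -> value -> ID.
#[derive(Default)]
struct SketchTypeTable {
    tables: HashMap<String, HashMap<String, u64>>,
}

impl SketchTypeTable {
    /// Ensure the core subtables exist, even if they are empty.
    fn normalize(&mut self) {
        for name in ["context", "url", "none"] {
            self.tables.entry(name.to_string()).or_default();
        }
    }

    fn insert(&mut self, table: &str, value: &str, id: u64) {
        self.tables
            .entry(table.to_string())
            .or_default()
            .insert(value.to_string(), id);
    }

    /// Encoder direction: original value to compressed ID.
    fn id_for(&self, table: &str, value: &str) -> Option<u64> {
        self.tables.get(table)?.get(value).copied()
    }

    /// Decoder direction: compressed ID back to the original value.
    fn value_for(&self, table: &str, id: u64) -> Option<&str> {
        self.tables
            .get(table)?
            .iter()
            .find(|(_, v)| **v == id)
            .map(|(k, _)| k.as_str())
    }
}
```

The reverse lookup here is a linear scan for brevity; a real table would more likely keep a second map for decoding.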
## cbor2 Integration
The implementation uses `cbor2` directly rather than building a parallel CBOR
layer:
- `Value::Tag(0xcb1d, ...)` represents the CBOR-LD envelope.
- `Value::Map` stores compressed term maps with integer keys.
- `Value::Bytes` stores registry-table byte encodings and binary URL/multibase
  codec outputs.
- `to_canonical_vec` is the default encoder for deterministic protocol bytes.
- `validate` checks that input is exactly one well-formed CBOR item before
  decode.
This keeps CBOR-LD-specific logic focused on JSON-LD semantics while reusing
`cbor2` for RFC 8949 encoding, tags, validation, canonical ordering, and the
dynamic value model.
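To illustrate what "exactly one well-formed CBOR item" means, here is a simplified, std-only walker. It is not `cbor2::validate`: the function names are hypothetical, it only handles definite-length items, it rejects indefinite-length encodings outright, and it does not check UTF-8 in text strings.

```rust
/// Length in bytes of the CBOR item at the start of `buf`, or `None` if the
/// item is truncated or malformed. Sketch: definite lengths only.
fn item_len(buf: &[u8], depth: usize) -> Option<usize> {
    if depth > 64 {
        return None; // guard against deeply nested input
    }
    let first = *buf.first()?;
    let (major, info) = (first >> 5, first & 0x1f);
    // Decode the argument that follows the initial byte.
    let (arg, mut pos) = match info {
        0..=23 => (u64::from(info), 1),
        24..=27 => {
            let n = 1usize << (info - 24); // 1, 2, 4, or 8 argument bytes
            let bytes = buf.get(1..1 + n)?;
            let arg = bytes.iter().fold(0u64, |acc, b| (acc << 8) | u64::from(*b));
            (arg, 1 + n)
        }
        _ => return None, // reserved values and indefinite lengths
    };
    match major {
        0 | 1 => Some(pos), // integers
        2 | 3 => {
            // Byte and text strings: the argument is the content length.
            if arg > (buf.len() - pos) as u64 {
                return None;
            }
            Some(pos + arg as usize)
        }
        4 | 5 => {
            // Arrays hold `arg` items; maps hold `arg` key/value pairs.
            let count = if major == 5 { arg.checked_mul(2)? } else { arg };
            for _ in 0..count {
                pos += item_len(buf.get(pos..)?, depth + 1)?;
            }
            Some(pos)
        }
        6 => {
            // A tag wraps exactly one nested item.
            pos += item_len(buf.get(pos..)?, depth + 1)?;
            Some(pos)
        }
        // Simple values and floats; two-byte simple values below 32 are invalid.
        _ => if info == 24 && arg < 32 { None } else { Some(pos) },
    }
}

/// True when `buf` is one complete item with no trailing bytes.
fn is_single_item(buf: &[u8]) -> bool {
    item_len(buf, 0) == Some(buf.len())
}
```

For example, `d9 cb 1d 82 00 a0` (an envelope around an empty map) passes, while the same bytes with one extra trailing byte, or with the final byte removed, are rejected.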
## Current Scope
Implemented:
- CBOR-LD 1.0 tag `0xcb1d`.
- Registry entry ID `0` uncompressed payloads.
- Default semantic compression for nonzero registry entry IDs.
- Embedded, remote, imported, type-scoped, and property-scoped JSON-LD
  contexts.
- URL, UUID URN, data URL, DID key/nym, multibase, `xsd:date`, and
  `xsd:dateTime` value codecs.
Not implemented:
- Pre-1.0 legacy singleton tags `0x0500..0x0501`.
- Pre-1.0 legacy range tags `0x0600..0x06ff`.
## Verification
Useful checks while developing:
```bash
cargo fmt --all --check
cargo test
cargo clippy --all-targets -- -D warnings
git diff --check
```