cbor-ld 0.1.0

CBOR-LD 1.0 processor built on cbor2 with semantic compression, JSON-LD context processing, type tables and deterministic CBOR output.
# cbor-ld

CBOR-LD 1.0 processor for Rust, built on top of
[`cbor2`](https://github.com/ldclabs/cbor2).

`cbor-ld` converts JSON-LD documents to and from the current W3C CBOR-LD 1.0
wire shape:

```text
51997([registryEntryId, payload])
```

Registry entry ID `0` stores an uncompressed payload. Other registry IDs use
the default CBOR-LD semantic compression model: JSON-LD context processing,
dynamic term IDs, registry type tables, and value codecs for URLs, multibase
values, `xsd:date`, and `xsd:dateTime`.
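
As a rough illustration of the envelope above, the following std-only sketch writes the bytes that precede the payload using the RFC 8949 head encoding. It does not use this crate or `cbor2`; `write_head` and `envelope_prefix` are hypothetical helper names introduced only for this example.

```rust
/// Append a CBOR head (major type plus unsigned argument) as defined in RFC 8949.
fn write_head(out: &mut Vec<u8>, major: u8, value: u64) {
    let mt = major << 5;
    if value < 24 {
        // Small arguments are stored directly in the initial byte.
        out.push(mt | value as u8);
    } else if value <= 0xff {
        out.push(mt | 24);
        out.push(value as u8);
    } else if value <= 0xffff {
        out.push(mt | 25);
        out.extend_from_slice(&(value as u16).to_be_bytes());
    } else if value <= 0xffff_ffff {
        out.push(mt | 26);
        out.extend_from_slice(&(value as u32).to_be_bytes());
    } else {
        out.push(mt | 27);
        out.extend_from_slice(&value.to_be_bytes());
    }
}

/// Bytes that precede the payload: tag 51997, a 2-element array head,
/// then the registry entry ID as an unsigned integer.
fn envelope_prefix(registry_entry_id: u64) -> Vec<u8> {
    let mut out = Vec::new();
    write_head(&mut out, 6, 51997); // major type 6: tag
    write_head(&mut out, 4, 2); // major type 4: array of length 2
    write_head(&mut out, 0, registry_entry_id); // major type 0: unsigned integer
    out
}
```

For registry entry ID `0`, the prefix is `d9 cb 1d 82 00`, and the uncompressed payload follows immediately.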

The crate intentionally targets only CBOR-LD 1.0. The legacy singleton and
range tags used by pre-1.0 JavaScript releases are not implemented.

## Why cbor-ld

| Need | Built in |
| --- | --- |
| CBOR-LD 1.0 envelope | Emits and validates CBOR tag `0xcb1d` / `51997` with `[registryEntryId, payload]`. |
| Uncompressed payloads | Registry entry ID `0` stores the JSON-LD document directly. |
| Semantic compression | Compresses JSON-LD terms through active contexts and deterministic dynamic term IDs. |
| Context processing | Handles embedded, remote, imported, type-scoped, and property-scoped contexts through a caller-provided loader. |
| Registry dictionaries | Uses `TypeTable` for `context`, `url`, `none`, and datatype-specific tables. |
| Value codecs | Compresses URL schemes, UUID URNs, data URLs, DID key/nym base58 parts, multibase values, dates, and datetimes. |
| Stable protocol bytes | Uses `cbor2::to_canonical_vec` by default for deterministic CBOR output. |
| Safe decoding | Calls `cbor2::validate` before decoding so trailing or malformed CBOR is rejected. |
| Dynamic data model | Exposes documents as `cbor2::Value`, preserving integer map keys, byte strings, arrays, maps, and tags. |
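
The "deterministic dynamic term IDs" row can be pictured with a small std-only sketch. It assumes the scheme used by the JavaScript reference implementation, where context-defined terms are sorted and numbered from 100 in steps of 2; the constant and the `assign_term_ids` function are illustrative assumptions, not this crate's API.

```rust
use std::collections::BTreeMap;

// Assumption: first ID available to context-defined terms (not taken from this crate).
const FIRST_CUSTOM_TERM_ID: u64 = 100;

/// Assign IDs to terms in sorted order so that the encoder and the decoder,
/// which both process the same context, arrive at the same mapping.
fn assign_term_ids(terms: &[&str]) -> BTreeMap<String, u64> {
    let mut sorted: Vec<&str> = terms.to_vec();
    sorted.sort_unstable();
    sorted.dedup();

    let mut ids = BTreeMap::new();
    let mut next = FIRST_CUSTOM_TERM_ID;
    for term in sorted {
        ids.insert(term.to_string(), next);
        // Assumption: the odd ID next to each term is reserved for array values.
        next += 2;
    }
    ids
}
```

Because the IDs depend only on the sorted term set, no term dictionary has to travel with the payload.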

Licensed under the MIT License.

## Quick Start

```toml
[dependencies]
cbor-ld = "0.1"
```

When developing this repository alongside `cbor2`, use the local path
dependency:

```toml
[dependencies]
cbor-ld = { path = "../cbor-ld" }
cbor2 = { path = "../cbor2", version = "1" }
```

Encode and decode an uncompressed CBOR-LD document:

```rust
use cbor2::Value;
use cbor_ld::{decode, encode, DecodeOptions, EncodeOptions};

let document = Value::Map(vec![(
    Value::Text("hello".into()),
    Value::Text("world".into()),
)]);

let bytes = encode(&document, EncodeOptions::uncompressed())?;
let decoded = decode(&bytes, DecodeOptions::default())?;

assert_eq!(decoded, document);
# Ok::<(), cbor_ld::Error>(())
```

Use compression with inline contexts:

```rust
use cbor2::Value;
use cbor_ld::{decode, encode, DecodeOptions, EncodeOptions, TypeTable};

let document = Value::Map(vec![
    (
        Value::Text("@context".into()),
        Value::Map(vec![(
            Value::Text("name".into()),
            Value::Text("https://schema.org/name".into()),
        )]),
    ),
    (Value::Text("name".into()), Value::Text("Alice".into())),
]);

let table = TypeTable::new();
let bytes = encode(&document, EncodeOptions::compressed(1, &table))?;
let decoded = decode(&bytes, DecodeOptions::default())?;

// Compare canonical encodings so that map entry order does not affect the result.
assert_eq!(
    cbor2::to_canonical_vec(&decoded)?,
    cbor2::to_canonical_vec(&document)?
);
# Ok::<(), Box<dyn std::error::Error>>(())
```

Use `encode_with_loader` and `decode_with_loader` when a document contains
remote context URLs. The loader receives the URL and returns the loaded JSON-LD
document as a `cbor2::Value` with an `@context` entry.
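
One common loader design is an offline allow-list that serves only pre-registered contexts and never touches the network. The sketch below shows that idea with std types only. `StaticLoader` is hypothetical, and it stores documents as JSON text rather than `cbor2::Value`, because the exact loader signature expected by `encode_with_loader` and `decode_with_loader` is not shown here.

```rust
use std::collections::HashMap;

/// Hypothetical offline loader: serves pre-registered context documents
/// and refuses every other URL.
struct StaticLoader {
    contexts: HashMap<String, String>,
}

impl StaticLoader {
    fn new() -> Self {
        StaticLoader { contexts: HashMap::new() }
    }

    /// Register the JSON-LD document (as JSON text) served for `url`.
    fn register(&mut self, url: &str, document_json: &str) {
        self.contexts.insert(url.to_string(), document_json.to_string());
    }

    /// Look up a context by URL; unknown URLs are an error, not a fetch.
    fn load(&self, url: &str) -> Result<&str, String> {
        self.contexts
            .get(url)
            .map(String::as_str)
            .ok_or_else(|| format!("context not registered: {url}"))
    }
}
```

Rejecting unknown URLs keeps encoding and decoding reproducible, since both sides must resolve each context to the same document.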

## Type Tables

`TypeTable` maps original values to compressed integer IDs. Every table is
normalized before use so the core `context`, `url`, and `none` subtables are
present.

```rust
use cbor_ld::TypeTable;

let mut table = TypeTable::new();
table.insert("context", "https://example.com/context/v1", 0x8000);
table.insert("https://example.com/type#status", "active", 0x8001);
```

`TypeTable::with_common_tables()` includes the common string and cryptosuite
tables mirrored from the JavaScript reference implementation.
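
Conceptually, a type table is a two-level map: subtable name, then original value, then integer ID. The std-only model below illustrates that shape and the normalization step described above. `TableModel` and its methods are hypothetical and are not the crate's `TypeTable` implementation.

```rust
use std::collections::HashMap;

/// Hypothetical model of a type table: subtable name -> (original value -> ID).
struct TableModel {
    tables: HashMap<String, HashMap<String, u64>>,
}

impl TableModel {
    /// Start with the core subtables present, mirroring the normalization step.
    fn new() -> Self {
        let mut tables = HashMap::new();
        for core in ["context", "url", "none"] {
            tables.insert(core.to_string(), HashMap::new());
        }
        TableModel { tables }
    }

    fn insert(&mut self, table: &str, value: &str, id: u64) {
        self.tables
            .entry(table.to_string())
            .or_default()
            .insert(value.to_string(), id);
    }

    /// Encoder direction: original value to compressed ID.
    fn id_of(&self, table: &str, value: &str) -> Option<u64> {
        self.tables.get(table)?.get(value).copied()
    }

    /// Decoder direction: compressed ID back to the original value.
    fn value_of(&self, table: &str, id: u64) -> Option<&str> {
        self.tables
            .get(table)?
            .iter()
            .find(|(_, v)| **v == id)
            .map(|(k, _)| k.as_str())
    }
}
```

A production table would likely keep a reverse index rather than scanning on decode; the linear scan here only keeps the sketch short.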

## cbor2 Integration

The implementation uses `cbor2` directly rather than building a parallel CBOR
layer:

- `Value::Tag(0xcb1d, ...)` represents the CBOR-LD envelope.
- `Value::Map` stores compressed term maps with integer keys.
- `Value::Bytes` stores registry-table byte encodings and binary URL/multibase
  codec outputs.
- `to_canonical_vec` is the default encoder for deterministic protocol bytes.
- `validate` checks that input is exactly one well-formed CBOR item before
  decode.

This keeps CBOR-LD-specific logic focused on JSON-LD semantics while reusing
`cbor2` for RFC 8949 encoding, tags, validation, canonical ordering, and the
dynamic value model.
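
To make the envelope check concrete, here is a minimal std-only sketch that reads the leading CBOR heads and confirms the `0xcb1d` tag, the 2-element array, and the unsigned registry entry ID. It is not how this crate or `cbor2` is implemented, it does not validate the payload itself, and `read_head` and `parse_envelope` are hypothetical names.

```rust
/// Read one CBOR head; returns (major type, argument, bytes consumed).
fn read_head(input: &[u8]) -> Option<(u8, u64, usize)> {
    let first = *input.first()?;
    let (major, info) = (first >> 5, first & 0x1f);
    let extra = match info {
        0..=23 => return Some((major, info as u64, 1)),
        24 => 1,
        25 => 2,
        26 => 4,
        27 => 8,
        _ => return None, // reserved or indefinite-length heads are out of scope here
    };
    // Fails with None when the input is truncated.
    let bytes = input.get(1..1 + extra)?;
    let value = bytes.iter().fold(0u64, |acc, b| (acc << 8) | *b as u64);
    Some((major, value, 1 + extra))
}

/// Return the registry entry ID and the payload offset,
/// or None when the input does not start with a CBOR-LD 1.0 envelope.
fn parse_envelope(input: &[u8]) -> Option<(u64, usize)> {
    let (major, tag, n1) = read_head(input)?;
    if major != 6 || tag != 0xcb1d {
        return None;
    }
    let (major, len, n2) = read_head(&input[n1..])?;
    if major != 4 || len != 2 {
        return None;
    }
    let (major, id, n3) = read_head(&input[n1 + n2..])?;
    if major != 0 {
        return None;
    }
    Some((id, n1 + n2 + n3))
}
```

A full validator must also confirm that the payload is a single well-formed item with no trailing bytes, which is the part delegated to `cbor2::validate`.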

## Current Scope

Implemented:

- CBOR-LD 1.0 tag `0xcb1d`.
- Registry entry ID `0` uncompressed payloads.
- Default semantic compression for nonzero registry entry IDs.
- Embedded, remote, imported, type-scoped, and property-scoped JSON-LD
  contexts.
- URL, UUID URN, data URL, DID key/nym, multibase, `xsd:date`, and
  `xsd:dateTime` value codecs.

Not implemented:

- Pre-1.0 legacy singleton tags `0x0500` and `0x0501`.
- Pre-1.0 legacy range tags `0x0600` through `0x06ff`.
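
The scope split can be summarized as a tag classifier. This sketch is illustrative only: `TagKind` and `classify_tag` are hypothetical names, and both legacy ranges are read as inclusive.

```rust
#[derive(Debug, PartialEq)]
enum TagKind {
    /// CBOR-LD 1.0 envelope tag; the only one this crate handles.
    Current,
    /// Pre-1.0 singleton tags; not implemented.
    LegacySingleton,
    /// Pre-1.0 range tags; not implemented.
    LegacyRange,
    /// Not a CBOR-LD tag at all.
    Unknown,
}

fn classify_tag(tag: u64) -> TagKind {
    match tag {
        0xcb1d => TagKind::Current,
        0x0500 | 0x0501 => TagKind::LegacySingleton,
        0x0600..=0x06ff => TagKind::LegacyRange,
        _ => TagKind::Unknown,
    }
}
```

A caller that must interoperate with older producers could use a check like this to report a clear "legacy format" error instead of a generic decode failure.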

## Verification

Useful checks while developing:

```bash
cargo fmt --all --check
cargo test
cargo clippy --all-targets -- -D warnings
git diff --check
```