§Processing CBOR Diagnostic Notation (EDN)
This crate provides tools to interconvert CBOR data between its binary and its diagnostic form, and can manipulate the diagnostic representation.
§What this works with
CBOR is a self-describing data format that is compact and efficient to use; think JSON but binary. As a binary format, it is not human readable, but there exists a Diagnostic Notation for it called EDN (which is currently being revised). CBOR can express all information of JSON and more, and the diagnostic notation extends JSON. For example, the compact binary data below (shown in hex on the first line) is equivalent to the diagnostic notation on the second line:
83 01 02 62 68 69
[1, 2, "hi"]
§API Overview
The main entry points to this crate are:
- StandaloneItem can parse both CBOR and diagnostic notation.
- Sequence can parse multiple concatenated CBOR items (called CBOR sequences).
In all cases, the library preserves loaded data through serialization back into the original format:
- Choices that are exclusive to EDN are preserved when saving as EDN. This includes whether a byte string is shown as ASCII or hexadecimal, comments, optional commas, and even spaces.
- Choices that mainly exist in CBOR are preserved when saving as CBOR. This includes whether a list is encoded in definite length or indefinite length, and in how many bytes short numbers are encoded.
Converting between the formats does, of course, preserve the CBOR information content, but generally loses aspects such as comments or trailing zeros in decimal numbers.
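To make the CBOR-side encoding choices concrete, here is a minimal standard-library sketch that does not use this crate at all. The helper names `encode_uint`, `definite_array` and `indefinite_array` are hypothetical and exist only for illustration; they show how the same value can be written with different argument widths, and how the same list can be written with definite or indefinite length.

```rust
/// Encode an unsigned integer (CBOR major type 0) with a chosen argument width.
/// `width` is the number of bytes that follow the initial byte: 0, 1, 2, 4 or 8.
/// Returns None if the value does not fit the requested width.
/// (Hypothetical helper for illustration, not part of this crate.)
fn encode_uint(value: u64, width: usize) -> Option<Vec<u8>> {
    match width {
        // Values below 24 fit directly into the initial byte.
        0 if value < 24 => Some(vec![value as u8]),
        1 if value <= u8::MAX as u64 => Some(vec![0x18, value as u8]),
        2 if value <= u16::MAX as u64 => {
            let mut out = vec![0x19];
            out.extend_from_slice(&(value as u16).to_be_bytes());
            Some(out)
        }
        4 if value <= u32::MAX as u64 => {
            let mut out = vec![0x1a];
            out.extend_from_slice(&(value as u32).to_be_bytes());
            Some(out)
        }
        8 => {
            let mut out = vec![0x1b];
            out.extend_from_slice(&value.to_be_bytes());
            Some(out)
        }
        _ => None,
    }
}

/// Wrap already-encoded items in a definite-length array.
/// Only handles fewer than 24 items, which keeps the head to a single byte.
fn definite_array(items: &[Vec<u8>]) -> Option<Vec<u8>> {
    if items.len() >= 24 {
        return None;
    }
    let mut out = vec![0x80 | items.len() as u8];
    for item in items {
        out.extend_from_slice(item);
    }
    Some(out)
}

/// Wrap already-encoded items in an indefinite-length array:
/// a 0x9f start byte, the items, then the 0xff "break" byte.
fn indefinite_array(items: &[Vec<u8>]) -> Vec<u8> {
    let mut out = vec![0x9f];
    for item in items {
        out.extend_from_slice(item);
    }
    out.push(0xff);
    out
}
```

With these helpers, the number 1 can be encoded as `01`, `18 01` or `19 00 01`, and the list of 1 and 2 as either `82 01 02` or `9f 01 02 ff`. All variants carry the same information content, which is why such choices count as representation details that are kept when saving as CBOR.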
Beyond converting CBOR to EDN and vice versa, this crate can also be used to manipulate CBOR,
e.g. to provide explanatory comments around items or to use more specialized representations.
Handling those usually involves a single Item, which is distinct from a StandaloneItem
in that any space or comments around it are part of the surrounding structure (which means that
it is not sensible to parse EDN into a single Item because even an innocent line break at
the end of the EDN would throw the parser off).
§Example usage
This example shows how to convert EDN into CBOR and back to EDN.
use cbor_edn::StandaloneItem;

// Ingest CBOR Diagnostic Notation.
let input: &str = r#"[1, 2, "x"]"#;
let parsed = StandaloneItem::parse(input).unwrap();
// Emit it as CBOR.
let cbor = parsed.to_cbor().unwrap();
// Parse the CBOR
let parsed = StandaloneItem::from_cbor(&cbor).unwrap();
let edn = parsed.serialize();
assert_eq!(edn.as_str(), input);
§Implementation remarks
The parser used by this crate is a PEG (Parsing Expression Grammar) parser built from the ABNF used in the EDN specification.
The types' data model is oriented more towards EDN than towards CBOR, as that has richer information and is generally needed for tasks such as annotation or delayed processing of application-oriented literals.
Parsed values are expected to round-trip to identical representations when serialized. Most manipulations of the values will ensure that their serialization output can also be round-tripped from the internal format to the EDN serialization and back into the internal format, but not all manipulations can guarantee this. (For example, removing all optional commas while retaining comments would erase the distinction between a comment placed before a comma and one placed after it.)
Correct parsing does not guarantee that the value can also be encoded into CBOR. While there
are aspects that could be handled at parsing time and are not (e.g. tag numbers exceeding the
encodable number space), there are cases that cannot be handled by a library without further
context or privileges (e.g. the e'' application-oriented literal that needs application context,
or the ref'' application-oriented literal that defers to relative files, accessing which can
involve file or network access). Consequently, conversion to CBOR through the various
.to_cbor() methods is inherently fallible, while .serialize()ing into EDN is not.
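As an illustration of the tag number case only, the following standard-library sketch shows why a value can be acceptable as text and still fail at encoding time. It assumes that a tag number is written as a plain run of decimal digits and relies on the fact that CBOR arguments are at most 64 bits wide. The function `encodable_tag_number` is hypothetical and is not how this crate performs the check.

```rust
/// Try to interpret a run of decimal digits as a CBOR tag number.
/// CBOR arguments are at most 64 bits wide, so anything that does not fit
/// into a u64 cannot be encoded, however valid it looks as text.
/// (Hypothetical helper for illustration, not part of this crate.)
fn encodable_tag_number(digits: &str) -> Option<u64> {
    // Reject anything that is not purely decimal digits (e.g. a leading '+').
    if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    // `parse` fails on overflow, which is exactly the case of interest here.
    digits.parse::<u64>().ok()
}
```

Under these assumptions, `encodable_tag_number("18446744073709551615")` succeeds because that is the largest 64-bit value, while one more (`"18446744073709551616"`) yields `None`. A fallible conversion step is the natural place to report this kind of problem.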
§Completeness
Known limitations are:
- Support for inspecting and constructing CBOR items is incomplete. The most common types can be constructed; constructing or inspecting more exotic items is possible through parsing hand-crafted EDN/CBOR and using the generated serializations, respectively.
- Options for attaching comments and space are limited and immature:
  - Item::with_comment() & StandaloneItem::set_comment can be used to add comments, but mainly produce top-level items. Deeper items are not configurable that way, as the comments don't live in the item but in its container.
  - Comments can be added to items through visitors such as Item::visit_map_elements; both the success and the error path of a visiting function can set comments around a tag.
  - Replacing an item with hand-crafted EDN (possibly from a serialized item) is always an option.
- Indenting EDN works for the easy cases, but more exotic cases such as overflowing the limited width, long keys, or hash comments easily disrupt the visual result.
§Security
This library does not access the network or file system in any surprising ways and does not
endanger memory safety on its own. The main threat in using it is that its resource usage is
not bounded: even without packed CBOR, heavy nesting can easily overflow the stack, and the
float conversions are costly in time. Unless resource usage per user is limited, it is
recommended to limit untrusted user input to a length at which a run of repeated { characters
does not yet overflow the stack.
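One way to apply such a limit before handing text to the parser is sketched below, using only the standard library. The function name `within_nesting_budget` and both thresholds are hypothetical; suitable values depend on the stack size of the deployment and need to be determined by testing. The check is deliberately conservative: it counts every opening delimiter, including those inside strings and comments, so it can reject harmless input. It bounds the nesting depth only under the assumption that the parser recurses solely at these delimiters.

```rust
/// Conservative pre-check for untrusted EDN text.
/// Returns true only if the input is at most `max_len` bytes long and contains
/// at most `max_openers` opening delimiters. The opener count is an upper bound
/// on the nesting depth, assuming nesting is only introduced by these characters.
/// (Hypothetical helper for illustration, not part of this crate.)
fn within_nesting_budget(input: &str, max_len: usize, max_openers: usize) -> bool {
    if input.len() > max_len {
        return false;
    }
    let openers = input
        .bytes()
        .filter(|&b| matches!(b, b'[' | b'{' | b'(' | b'<'))
        .count();
    openers <= max_openers
}
```

A caller would run this check first and refuse input that fails it, for example with a length limit of a few kilobytes and an opener budget well below the depth at which repeated { characters were observed to overflow the stack.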
The crate has not been audited internally or externally. As the licenses state, the software is provided “as is”.
§CLI application
Some functionality is available through a binary included with this crate:
$ echo "[1, 2, 'x', ip'2001:db1::/64']" | cbor-edn diag2diag
[1, 2, 'x', ip'2001:db1::/64']
Modules§
- application
- Converters for application-oriented literals
- error
- Error types
Structs§
- Item
- A CBOR Item.
- Sequence
- A CBOR Sequence.
- StandaloneItem
- A CBOR Item, including any space and comments surrounding it in a serialization.
Enums§
- DelimiterPolicy
- Rule set for the set_delimiters() family of methods
- TrailingNewlinePolicy