binary-data-schema 0.2.0

Meta language for raw binary serialization

Binary Data Schema (BDS) is an extension of JSON Schema. With this extension it is possible to convert JSON documents into raw bytes and back.

The intention is to use BDS in WoT Thing Descriptions in order to allow application/octet-stream as a content type for forms.

Capabilities

The following list shows the types that can be encoded. It may not be exhaustive. Further explanation can be found in the description of the respective modules.

  • Boolean values (true, false)
  • Integer values:
    • Length 1 to 8 bytes
    • Big or little endian
    • Signed or unsigned
  • Number values:
    • Single and double precision according to IEEE 754
    • Encoded as an integer via a linear transformation (value / scale - offset)
  • Boolean, integer and numeric values can be encoded as bitfields (cover only a certain number of bits instead of whole bytes)
  • Object and array schemata allow for encoding complex data structures
  • UTF-8 strings
  • Hex-encoded strings (regex: ^([0-9a-f]{2})*$)
  • Variable-sized values, i.e. strings and arrays, have different ways to define their length:
    • Fixed size
    • Explicit length → Length of the value is encoded at the beginning of the value
    • End pattern → The end of the value is marked by a sentinel value, like in C with \0
    • Capacity → A fixed space is reserved. Unused space is filled with padding
    • Till end → The value continues until the end of the message
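
The length strategies listed above can be illustrated with a small, self-contained sketch. It does not use the crate's API: the `Length` enum and `encode_str` function are hypothetical names made up for this illustration, and the one-byte length prefix for the explicit variant is an assumption (BDS may use a different width).

```rust
/// Ways to delimit a variable-sized value.
/// Hypothetical type for illustration only, not part of binary-data-schema.
enum Length {
    /// The value must be exactly this many bytes long.
    Fixed(usize),
    /// The length is written in front of the value (assumed here: one byte).
    Explicit,
    /// The value is followed by a sentinel, like `\0` in C strings.
    EndPattern(Vec<u8>),
    /// A fixed space is reserved; unused space is filled with `padding`.
    Capacity { size: usize, padding: u8 },
    /// The value runs until the end of the message; nothing extra is written.
    TillEnd,
}

/// Encode a UTF-8 string according to the given length strategy.
/// Returns `None` if the value does not fit the strategy.
fn encode_str(value: &str, length: &Length) -> Option<Vec<u8>> {
    let bytes = value.as_bytes();
    let mut out = Vec::new();
    match length {
        Length::Fixed(size) => {
            if bytes.len() != *size {
                return None;
            }
            out.extend_from_slice(bytes);
        }
        Length::Explicit => {
            // The assumed one-byte prefix limits the value to 255 bytes.
            if bytes.len() > u8::MAX as usize {
                return None;
            }
            out.push(bytes.len() as u8);
            out.extend_from_slice(bytes);
        }
        Length::EndPattern(sentinel) => {
            out.extend_from_slice(bytes);
            out.extend_from_slice(sentinel);
        }
        Length::Capacity { size, padding } => {
            if bytes.len() > *size {
                return None;
            }
            out.extend_from_slice(bytes);
            // Fill the remaining reserved space with the padding byte.
            out.resize(*size, *padding);
        }
        Length::TillEnd => out.extend_from_slice(bytes),
    }
    Some(out)
}
```

With this sketch, encoding "hi" with `Length::Explicit` yields the bytes 02 68 69, while `Length::Capacity { size: 4, padding: 0x20 }` yields 68 69 20 20.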

Features

The specific features of each schema are explained in its submodule.

Each feature is explained with an example. The examples follow the same structure as the (commented) default example below.

BDS is far from feature complete. If a feature is not described, it is probably safe to assume that it is not yet implemented. If you require a specific feature, please file an issue. PRs are also welcome.

default

The only feature described on this level is default.

Binary protocols often have some kind of magic start and end bytes. To model those, BDS uses the default keyword: when encoding a JSON document, fields whose schema has a default value do not have to be provided.

  • Fields with default are not required for encoding but are included when decoding.
  • To keep BDS aligned with JSON schema it is recommended to add "required" to object schemata.

Example

use binary_data_schema::*;
use valico::json_schema;
use serde_json::{json, from_value};
let schema = json!({
    "type": "object",
    "properties": {
        "start": {
            "type": "string",
            "format": "binary",
            "minLength": 2,
            "maxLength": 2,
            "default": "fe",
            "position": 1
        },
        "is_on": {
            "type": "boolean",
            "position": 5
        },
        "end": {
            "type": "string",
            "format": "binary",
            "minLength": 2,
            "maxLength": 2,
            "default": "ef",
            "position": 10
        }
    },
    "required": ["is_on"]
});
let mut scope = json_schema::Scope::new();
// Valid JSON schema
let j_schema = scope.compile_and_return(schema.clone(), false)?;
// Valid Binary Data schema
let schema = from_value::<DataSchema>(schema)?;

let value = json!({ "is_on": true });
// 'value' is valid for the JSON schema
assert!(j_schema.validate(&value).is_valid());
let mut encoded = Vec::new();
// 'value' is valid for the Binary Data schema
schema.encode(&mut encoded, &value)?;
// Encoded bytes: start (0xfe), is_on (1), end (0xef)
let expected_bytes = [0xfe, 1, 0xef];
assert_eq!(&expected_bytes, encoded.as_slice());

let mut encoded = std::io::Cursor::new(encoded);
let back = schema.decode(&mut encoded)?;
let expected = json!({
    "start": "fe",
    "is_on": true,
    "end": "ef"
});
// The retrieved value is valid for the JSON schema
assert!(j_schema.validate(&back).is_valid());
// The retrieved value is as expected
assert_eq!(back, expected);
Ok::<(), anyhow::Error>(())
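
The byte layout in the example (fe 01 ef) can be reproduced by hand with the standard library alone. The sketch below is not how the crate is implemented. It assumes that "position" determines the order of the fields in the encoded message and that a boolean true is written as a single byte 1, which is consistent with the bytes expected in the example. `Field`, `hex_to_bytes` and `layout` are hypothetical helpers.

```rust
/// A field of the example schema, reduced to what matters for the layout.
/// Hypothetical helper type, not part of binary-data-schema.
struct Field {
    /// Value of the "position" keyword (assumed to define the field order).
    position: u32,
    /// Already encoded bytes of the field.
    bytes: Vec<u8>,
}

/// Decode a lowercase hex string such as "fe" into bytes.
/// Returns `None` for odd lengths or characters outside [0-9a-f].
fn hex_to_bytes(hex: &str) -> Option<Vec<u8>> {
    let valid = hex.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'));
    if !valid || hex.len() % 2 != 0 {
        return None;
    }
    (0..hex.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&hex[i..i + 2], 16).ok())
        .collect()
}

/// Concatenate the fields in ascending `position` order.
fn layout(mut fields: Vec<Field>) -> Vec<u8> {
    fields.sort_by_key(|f| f.position);
    fields.into_iter().flat_map(|f| f.bytes).collect()
}
```

Passing the defaults "fe" (position 1) and "ef" (position 10) together with the byte 1 for is_on (position 5) to `layout`, in any order, yields the three bytes fe 01 ef.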