Module binn_ir::specification


Specification

Below is a copy of Binn’s specification, with some minor changes of formatting.


Binn Specification

Format

Each value is stored with 4 possible parameters:

[type][size][count][data]

Most of these are optional; only the type parameter is present in every value. The parameters used by the basic data types are:

boolean, null:
[type]

int, float (storage: byte, word, dword or qword):
[type][data]

string, blob:
[type][size][data]

list, object, map:
[type][size][count][data]

Each parameter can be stored with polymorphic size:

Parameter  Size
[type]     1 or 2 bytes
[size]     1 or 4 bytes
[count]    1 or 4 bytes
[data]     n bytes

[Type]

Each value is stored starting with the data type. It can use 1 or 2 bytes. The first byte is divided as follows:

 +-------- Storage type
 |  +----- Sub-type size
 |  |  +-- Sub-type
000 0 0000

Storage

The 3 most significant bits are used for the storage type. It has information about how many bytes the data will use. The storage type can be any of:

  • No additional bytes
  • 1 Byte
  • Word (2 bytes, big endian)
  • Dword (4 bytes, big endian)
  • Qword (8 bytes, big endian)
  • String (UTF-8, null terminated)
  • Blob
  • Container

And the constants are:

Storage    Bits        Hex   Dec
NOBYTES    000 0 0000  0x00  0
BYTE       001 0 0000  0x20  32
WORD       010 0 0000  0x40  64
DWORD      011 0 0000  0x60  96
QWORD      100 0 0000  0x80  128
STRING     101 0 0000  0xA0  160
BLOB       110 0 0000  0xC0  192
CONTAINER  111 0 0000  0xE0  224

Sub-type size

The next bit informs if the type uses 1 or 2 bytes.

If the bit is 0, the type uses only 1 byte, and the sub-type has 4 bits (0 to 15)

 +-------- Storage type
 |  +----- Sub-type size
 |  |  +-- Sub-type
000 0 0000

When the bit is 1, another byte is used for the type and the sub-type has 12 bits (0 to 4095)

 +-------- Storage type
 |  +----- Sub-type size
 |  |
000 1 0000  0000 0000
      |  Sub-type   |
      +-------------+

Sub-type

Each storage type can have up to 4096 sub-types. The sub-type identifies what kind of value is stored in that storage space.

Example: a DWORD can contain a signed integer, an unsigned integer, a single precision floating point number, and many more, including user defined types.
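
The bit layout above can be sketched as a small parser. This is an illustrative sketch only: the names `Storage` and `parse_type` are hypothetical and not necessarily part of this crate's API. It assumes that, in the 2-byte form, the low 4 bits of the first byte are the high bits of the 12-bit sub-type, as the diagram suggests.

```rust
/// Storage classes, taken from the top 3 bits of the first type byte.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Storage {
    NoBytes,
    Byte,
    Word,
    Dword,
    Qword,
    String,
    Blob,
    Container,
}

const STORAGE_MASK: u8 = 0xE0; // 111 0 0000
const SUBTYPE_SIZE_BIT: u8 = 0x10; // 000 1 0000
const SUBTYPE_MASK: u8 = 0x0F; // 000 0 1111

/// Parses a [type] parameter.
/// Returns (storage, sub-type, bytes consumed), or None if the input is too short.
pub fn parse_type(input: &[u8]) -> Option<(Storage, u16, usize)> {
    let first = *input.first()?;
    let storage = match first & STORAGE_MASK {
        0x00 => Storage::NoBytes,
        0x20 => Storage::Byte,
        0x40 => Storage::Word,
        0x60 => Storage::Dword,
        0x80 => Storage::Qword,
        0xA0 => Storage::String,
        0xC0 => Storage::Blob,
        _ => Storage::Container, // only 0xE0 is left after masking
    };
    if (first & SUBTYPE_SIZE_BIT) == 0 {
        // 1-byte type: the sub-type is the low 4 bits.
        Some((storage, u16::from(first & SUBTYPE_MASK), 1))
    } else {
        // 2-byte type: 4 bits from the first byte, 8 bits from the second.
        let second = *input.get(1)?;
        let sub_type = (u16::from(first & SUBTYPE_MASK) << 8) | u16::from(second);
        Some((storage, sub_type, 2))
    }
}
```

For instance, the single byte 0xE2 parses as a container with sub-type 2, and the byte pair 0xB0 0x15 as a string with sub-type 21.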

Here are the values for the basic data types. In each value, the sub-type is the low 4 bits:

Type        Storage    Bits       Hex   Dec
Null        NOBYTES    0000 0000  0x00  0
True        NOBYTES    0000 0001  0x01  1
False       NOBYTES    0000 0010  0x02  2
UInt8       BYTE       0010 0000  0x20  32
Int8        BYTE       0010 0001  0x21  33
UInt16      WORD       0100 0000  0x40  64
Int16       WORD       0100 0001  0x41  65
UInt32      DWORD      0110 0000  0x60  96
Int32       DWORD      0110 0001  0x61  97
Float       DWORD      0110 0010  0x62  98
UInt64      QWORD      1000 0000  0x80  128
Int64       QWORD      1000 0001  0x81  129
Double      QWORD      1000 0010  0x82  130
Text        STRING     1010 0000  0xA0  160
DateTime    STRING     1010 0001  0xA1  161
Date        STRING     1010 0010  0xA2  162
Time        STRING     1010 0011  0xA3  163
DecimalStr  STRING     1010 0100  0xA4  164
Blob        BLOB       1100 0000  0xC0  192
List        CONTAINER  1110 0000  0xE0  224
Map         CONTAINER  1110 0001  0xE1  225
Object      CONTAINER  1110 0010  0xE2  226

User Defined Types

An application can use a different DateTime type and store the value in a DWORD or QWORD.

Storage = QWORD (0x80)
Sub-type = 5 (0x05) [choose any unused]

Type DateTime = (0x80 | 0x05 => 0x85)

An application can send HTML inside a Binn structure and can define a type to distinguish it from plain text.

Storage = STRING (0xA0)
Sub-type = 9 (0x09) [choose any unused]

Type HTML = (0xA0 | 0x09 => 0xA9)

If the sub-type is greater than 15, a new byte must be used, and the sub-type size bit must be set:

Storage = STRING (0xA000)
Sub-type size = (0x1000)
Sub-type = 21 (0x0015)

Type HTML = (0xA000 | 0x1000 | 0x0015 => 0xB015)

The created type parameter must be stored as big-endian.
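
The composition rules above can be sketched as a helper that builds the type parameter from a storage constant and a sub-type. The function name `encode_type` is hypothetical and only illustrates the arithmetic; it is not claimed to be this crate's API.

```rust
/// Builds an encoded [type] parameter.
/// `storage` must be one of the storage constants (0x00, 0x20, ..., 0xE0)
/// and `sub_type` must fit in 12 bits; otherwise None is returned.
pub fn encode_type(storage: u8, sub_type: u16) -> Option<Vec<u8>> {
    if (storage & 0x1F) != 0 || sub_type > 0x0FFF {
        return None;
    }
    if sub_type <= 0x0F {
        // Fits in 4 bits: a single byte, sub-type size bit left clear.
        Some(vec![storage | sub_type as u8])
    } else {
        // Needs a second byte: set the sub-type size bit (0x1000)
        // and store the 16-bit result as big-endian.
        let word = (u16::from(storage) << 8) | 0x1000 | sub_type;
        Some(word.to_be_bytes().to_vec())
    }
}
```

With the examples from this section, `encode_type(0x80, 5)` gives the byte 0x85, `encode_type(0xA0, 9)` gives 0xA9, and `encode_type(0xA0, 21)` gives the two bytes 0xB0 0x15.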

[Size]

This parameter is used in strings, blobs and containers. It can have 1 or 4 bytes.

If the most significant bit of the first size byte is 0, the parameter uses only 1 byte. So when the data size is up to 127 (0x7F) bytes, the size parameter uses only 1 byte.

Otherwise a 4-byte size parameter is used, with the most significant bit set to 1. That leaves 31 bits for the value, giving an upper limit of 2 gigabytes (0x7FFFFFFF).

Data size     Size Parameter Uses
<= 127 bytes  1 byte
> 127 bytes   4 bytes

There is no problem if a small size is stored using 4 bytes. The reader must accept both.

For strings, the size parameter does not include the null terminator.

For containers, the size parameter includes the type parameter. It stores the size of the whole structure.
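
A minimal sketch of the size parameter rules follows. The names `encode_size` and `decode_size` are hypothetical. The text above does not state the byte order of the 4-byte form, so the sketch assumes big-endian, matching the other multi-byte fields in this specification.

```rust
/// Encodes a [size] (or [count]) parameter in its shortest form.
/// Returns None if the value does not fit in 31 bits.
pub fn encode_size(size: u32) -> Option<Vec<u8>> {
    if size <= 0x7F {
        Some(vec![size as u8])
    } else if size <= 0x7FFF_FFFF {
        // Set the most significant bit to mark the 4-byte form.
        Some((size | 0x8000_0000).to_be_bytes().to_vec())
    } else {
        None
    }
}

/// Decodes a [size] (or [count]) parameter.
/// Returns (value, bytes consumed), or None if the input is too short.
pub fn decode_size(input: &[u8]) -> Option<(u32, usize)> {
    let first = *input.first()?;
    if (first & 0x80) == 0 {
        return Some((u32::from(first), 1));
    }
    let b = input.get(..4)?;
    let raw = u32::from_be_bytes([b[0], b[1], b[2], b[3]]);
    // Clear the marker bit. A small value stored in 4 bytes is accepted too.
    Some((raw & 0x7FFF_FFFF, 4))
}
```

The decoder does not reject a small value stored in the long form, since the reader must accept both encodings.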

[Count]

This parameter is used only in containers to inform the number of items inside them. It can have 1 or 4 bytes, formatted exactly as the size parameter.

Count         Count Parameter Uses
<= 127 items  1 byte
> 127 items   4 bytes

Containers

List

Lists are containers that store values one after another.

The count parameter informs the number of values inside the container.

[123, "test", 2.5, true]

Map

Maps are associative arrays using integer numbers for the keys.

Each key is stored as a big-endian DWORD (4 bytes) and is read as a signed integer.

The current key range is therefore INT32_MIN to INT32_MAX, though the format leaves room to extend it if needed.

The count parameter informs the number of key/value pairs inside the container.

{1: 10, 5: "the value", 7: true}
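
As a small illustration of the map key format, the following sketch converts between an `i32` and its 4-byte big-endian form. The function names are hypothetical.

```rust
/// Encodes a map key as a big-endian signed DWORD.
pub fn encode_map_key(key: i32) -> [u8; 4] {
    key.to_be_bytes()
}

/// Reads a map key from the start of `input`.
/// Returns None if fewer than 4 bytes are available.
pub fn decode_map_key(input: &[u8]) -> Option<i32> {
    let b = input.get(..4)?;
    Some(i32::from_be_bytes([b[0], b[1], b[2], b[3]]))
}
```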

Object

Objects are associative arrays using text for the keys.

Keys are not null terminated and are limited to 255 bytes.

Each key is preceded by its length, stored in a single byte.

The count parameter informs the number of key/value pairs inside the container.

{"id": 1, "name": "John", "points": 30.5, "active": true}
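
Reading a length-prefixed object key can be sketched as below. The name `read_object_key` is hypothetical, and the sketch assumes keys are valid UTF-8, which the text above does not state explicitly.

```rust
/// Reads one object key: a single length byte followed by that many bytes.
/// Returns (key, bytes consumed), or None if the input is truncated
/// or the key is not valid UTF-8 (an assumption of this sketch).
pub fn read_object_key(input: &[u8]) -> Option<(&str, usize)> {
    let len = usize::from(*input.first()?);
    let raw = input.get(1..1 + len)?;
    let key = std::str::from_utf8(raw).ok()?;
    Some((key, 1 + len))
}
```

Because the length is a single byte, the 255-byte key limit follows directly from the encoding.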

Limits

Type                    Min        Max
Integers                INT64_MIN  UINT64_MAX
Floating point numbers  IEEE 754
Strings                 0          2 GB
Blobs                   0          2 GB
Containers              4          2 GB

Associative Arrays

Key type  Min        Max
Number    INT32_MIN  INT32_MAX
Text      0          255 bytes

Sub-types: up to 4096 for each storage type

Example Structures

JSON data such as {"hello":"world"} is serialized as:

Binn: (17 bytes)

  \xE2           // [type] object (container)
  \x11           // [size] container total size
  \x01           // [count] key/value pairs
  \x05hello      // key
  \xA0           // [type] = string
  \x05           // [size]
  world\x00      // [data] (null terminated)
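
The byte layout above can be reproduced with a short encoder. This is a sketch under stated simplifications: it handles only objects whose values are all text, and only containers small enough for 1-byte size and count parameters. The names `encode_text` and `encode_small_object` are hypothetical.

```rust
/// Encodes a Text value: [type 0xA0][size][data][NUL].
/// Sketch only: strings longer than 127 bytes are rejected.
fn encode_text(s: &str) -> Option<Vec<u8>> {
    if s.len() > 127 {
        return None;
    }
    let mut out = vec![0xA0, s.len() as u8];
    out.extend_from_slice(s.as_bytes());
    out.push(0x00); // null terminator, not counted in [size]
    Some(out)
}

/// Encodes an object with text values: [type 0xE2][size][count][pairs...].
/// Sketch only: the whole container must fit in 127 bytes.
pub fn encode_small_object(pairs: &[(&str, &str)]) -> Option<Vec<u8>> {
    let mut body = Vec::new();
    for &(key, value) in pairs {
        if key.len() > 255 {
            return None;
        }
        body.push(key.len() as u8); // key length prefix
        body.extend_from_slice(key.as_bytes());
        body.extend(encode_text(value)?);
    }
    // The container size covers the type, size and count bytes as well.
    let total = 3 + body.len();
    if total > 127 || pairs.len() > 127 {
        return None;
    }
    let mut out = vec![0xE2, total as u8, pairs.len() as u8];
    out.extend(body);
    Some(out)
}
```

Calling `encode_small_object(&[("hello", "world")])` yields the same 17 bytes as the listing above.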

A list of 3 integers:

Json: (14 bytes) [123, -456, 789]

Binn: (11 bytes)

  \xE0           // [type] list (container)
  \x0B           // [size] container total size
  \x03           // [count] items
  \x20           // [type] = uint8
  \x7B           // [data] (123)
  \x41           // [type] = int16
  \xFE\x38       // [data] (-456)
  \x40           // [type] = uint16
  \x03\x15       // [data] (789)
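
In this example each integer appears to use the smallest storage that fits it, with an unsigned type for non-negative values and a signed type for negative ones. The sketch below follows that reading. It is an assumption drawn from the example rather than a stated rule, and the name `encode_int` is hypothetical. Values above INT64_MAX are out of scope because the input is an `i64`.

```rust
/// Encodes an integer as [type][data] using the smallest storage that fits.
pub fn encode_int(value: i64) -> Vec<u8> {
    let mut out = Vec::with_capacity(9);
    if value >= 0 {
        // Non-negative: unsigned types (sub-type 0).
        if value <= i64::from(u8::MAX) {
            out.push(0x20); // UInt8
            out.push(value as u8);
        } else if value <= i64::from(u16::MAX) {
            out.push(0x40); // UInt16
            out.extend_from_slice(&(value as u16).to_be_bytes());
        } else if value <= i64::from(u32::MAX) {
            out.push(0x60); // UInt32
            out.extend_from_slice(&(value as u32).to_be_bytes());
        } else {
            out.push(0x80); // UInt64
            out.extend_from_slice(&(value as u64).to_be_bytes());
        }
    } else if value >= i64::from(i8::MIN) {
        out.push(0x21); // Int8
        out.push(value as i8 as u8);
    } else if value >= i64::from(i16::MIN) {
        out.push(0x41); // Int16
        out.extend_from_slice(&(value as i16).to_be_bytes());
    } else if value >= i64::from(i32::MIN) {
        out.push(0x61); // Int32
        out.extend_from_slice(&(value as i32).to_be_bytes());
    } else {
        out.push(0x81); // Int64
        out.extend_from_slice(&value.to_be_bytes());
    }
    out
}
```

Under this rule, 123 becomes 0x20 0x7B, -456 becomes 0x41 0xFE 0x38, and 789 becomes 0x40 0x03 0x15, matching the listing.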

A list inside a map:

Json: (25 bytes) {1: "add", 2: [-12345, 6789]}

Binn: (26 bytes)

 \xE1             // [type] map (container)
 \x1A             // [size] container total size
 \x02             // [count] key/value pairs
 \x00\x00\x00\x01 // key
 \xA0             // [type] = string
 \x03             // [size]
 add\x00          // [data] (null terminated)
 \x00\x00\x00\x02 // key
 \xE0             // [type] list (container)
 \x09             // [size] container total size
 \x02             // [count] items
 \x41             // [type] = int16
 \xCF\xC7         // [data] (-12345)
 \x40             // [type] = uint16
 \x1A\x85         // [data] (6789)

A list of objects:

Json: (47 bytes) [ {"id": 1, "name": "John"}, {"id": 2, "name": "Eric"} ]

Binn: (43 bytes)

 \xE0           // [type] list (container)
 \x2B           // [size] container total size
 \x02           // [count] items

 \xE2           // [type] object (container)
 \x14           // [size] container total size
 \x02           // [count] key/value pairs

 \x02id         // key
 \x20           // [type] = uint8
 \x01           // [data] (1)

 \x04name       // key
 \xA0           // [type] = string
 \x04           // [size]
 John\x00       // [data] (null terminated)

 \xE2           // [type] object (container)
 \x14           // [size] container total size
 \x02           // [count] key/value pairs

 \x02id         // key
 \x20           // [type] = uint8
 \x02           // [data] (2)

 \x04name       // key
 \xA0           // [type] = string
 \x04           // [size]
 Eric\x00       // [data] (null terminated)
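
Because every value carries enough information to compute its own length, a reader can walk a list without interpreting the items. The sketch below splits a list into the raw bytes of its items. The names `read_param`, `value_len` and `list_items` are hypothetical, and the 4-byte size and count forms are assumed to be big-endian.

```rust
/// Reads a [size] or [count] parameter: (value, bytes consumed).
fn read_param(input: &[u8]) -> Option<(usize, usize)> {
    let first = *input.first()?;
    if (first & 0x80) == 0 {
        return Some((usize::from(first), 1));
    }
    let b = input.get(..4)?;
    let raw = u32::from_be_bytes([b[0], b[1], b[2], b[3]]) & 0x7FFF_FFFF;
    Some((raw as usize, 4))
}

/// Total encoded length of the value that starts at `input[0]`.
pub fn value_len(input: &[u8]) -> Option<usize> {
    let first = *input.first()?;
    let type_len = if (first & 0x10) == 0 { 1 } else { 2 };
    let rest = input.get(type_len..)?;
    let len = match first & 0xE0 {
        0x00 => type_len,     // NOBYTES
        0x20 => type_len + 1, // BYTE
        0x40 => type_len + 2, // WORD
        0x60 => type_len + 4, // DWORD
        0x80 => type_len + 8, // QWORD
        0xA0 => {
            // STRING: size excludes the null terminator.
            let (size, n) = read_param(rest)?;
            type_len + n + size + 1
        }
        0xC0 => {
            // BLOB
            let (size, n) = read_param(rest)?;
            type_len + n + size
        }
        _ => {
            // CONTAINER: size covers the whole structure.
            let (size, _) = read_param(rest)?;
            if size < type_len + 2 {
                return None; // too small to hold type, size and count
            }
            size
        }
    };
    if len <= input.len() { Some(len) } else { None }
}

/// Splits an encoded list (type 0xE0) into the raw bytes of each item.
pub fn list_items(input: &[u8]) -> Option<Vec<&[u8]>> {
    if *input.first()? != 0xE0 {
        return None;
    }
    let total = value_len(input)?;
    let (_, size_len) = read_param(input.get(1..)?)?;
    let (count, count_len) = read_param(input.get(1 + size_len..)?)?;
    let mut pos = 1 + size_len + count_len;
    let mut items = Vec::new();
    for _ in 0..count {
        let item = input.get(pos..total)?;
        let len = value_len(item)?;
        items.push(&item[..len]);
        pos += len;
    }
    // The items must fill the container exactly.
    if pos == total { Some(items) } else { None }
}
```

Applied to the 43-byte list of objects above, `list_items` returns two slices of 20 bytes each, one per object.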