Module no_proto::schema
Schemas are used to describe the shape and types of buffer objects
NoProto schemas describe how the data in a buffer is stored and what types of data are stored. Schemas are required to create buffers and each buffer is a descendant of the schema that created it.
Schemas can be loaded from JSON, ES6 or Bytes.
As a quick example, the schemas below are identical in what they describe and differ only in syntax.
/* List Of Strings */
// JSON Schema
{"type": "list", "of": {"type": "string"}}
// ES6 Schema
list({of: string()})
// Byte schema (not human readable)
[23, 2, 0, 0, 0, 0, 0]
NoProto provides complete import and export interop for all schema syntax variants. You can create a NoProto factory using any schema syntax then export to any syntax. This means you can compile your schema into bytes using the runtime, then later expand the bytes schema to JSON or IDL if you need to inspect it.
Buffers are forever related to the schema that created them. Buffers created from a given schema can only later be decoded, edited or compacted by that same schema or a safe mutation of it.
Schemas are validated and sanity checked upon creation. You cannot pass an invalid JSON or ES6 schema into a factory constructor and build/parse buffers with it.
Schemas can be as simple as a single scalar type, for example a perfectly valid schema for a buffer that contains only a string:
// JSON
{
"type": "string"
}
// ES6
string()
However, you will likely want to store more complicated objects, so that's easy to do as well.
// JSON
{
"type": "struct",
"fields": [
["userID", {"type": "string"}], // userID field contains a string
["password", {"type": "string"}], // password field contains a string
["email", {"type": "string"}], // email field contains a string
["age", {"type": "u8"}] // age field contains a Uint8 number (0 - 255)
]
}
// ES6
struct({fields: {
userID: string(), // userID field contains a string
password: string(), // password field contains a string
email: string(), // email field contains a string
age: u8() // age field contains a Uint8 number (0 - 255)
}})
There are multiple collection types and they can be nested.
For example, this is a list of structs. Every item in the list is a struct with two fields: id and title. Both fields are a string type.
// JSON
{
"type": "list",
"of": {
"type": "struct",
"fields": [
["id", {"type": "string"}],
["title", {"type": "string"}]
]
}
}
// ES6
list({of: struct({fields: {
id: string(),
title: string()
}})})
You can nest collections as much and however you'd like, up to 255 levels.
A list of strings is just as easy...
// JSON
{
"type": "list",
"of": { "type": "string" }
}
// ES6
list({of: string()})
ES6 Schemas
NoProto's ES6/JavaScript IDL schemas use a very strict subset of the ES6 syntax. Expressions like 2 + 3, variables and most other JavaScript features aren't supported. The ES6 IDL is not intended to provide a JS runtime, only a familiar syntax.
The following ES6 syntax is supported:
- Calling functions with or without arguments like myFn(), myFn(1, 2), or myFn("hello", [1, 2])
- Single line comments on their own line or at the end of a line using double slash //.
- Arrays with any valid JS object. Examples: [], [1, 2], ["hello", myFn()]
- Objects with string keys and any valid JS object for values. Keys cannot use quotes. Examples: {}, {key: "value"}, {foo: "bar", baz: myFn()}
- Arrays and objects can be safely nested. There is a nesting limit of 255 levels.
- Numbers, Strings contained in double quotes ("), and Boolean values.
- Strings can safely contain escaped double quotes (\") inside them.
- ES6 arrow methods that contain comments or statements separated by semicolons. Example: () => { string(); }
If the syntax is not in the above list, it will not be parsed correctly by NoProto.
ES6 schemas are not as expensive to parse as JSON schemas, but nowhere near as fast to parse as byte schemas.
JSON Schemas
If you're familiar with TypeScript, JSON schemas can be described by this recursive interface:
interface NP_Schema {
// struct, string, bytes, etc
type: string;
// used by string & bytes types
size?: number;
// used by decimal type, the number of decimal places every value has
exp?: number;
// used by tuple to indicate bytewise sorting of children
sorted?: boolean;
// used by list types
of?: NP_Schema
// used by map types
value?: NP_Schema
// used by tuple types
values?: NP_Schema[]
// used by struct types
fields?: [string, NP_Schema][];
// used by option/enum types
choices?: string[];
// used by unions
types?: [string, NP_Schema][];
// used by portals
to?: string
// default value for this item
default?: any;
}
Schema Data Types
Each type has trade offs associated with it. The table and documentation below go into further detail.
Supported Data Types
Schema Type | Rust Type | Zero Copy Type | Bytewise Sorting | Bytes (Size) | Limits / Notes |
---|---|---|---|---|---|
struct | NP_Struct | - | 𐄂 | 4 bytes - ~4GB | Set of vtables with up to 255 named fields. |
list | NP_List | - | 𐄂 | 8 bytes - ~4GB | Linked list with integer indexed values and up to 255 items. |
map | NP_Map | - | 𐄂 | 4 bytes - ~4GB | Linked list with &str keys, up to 255 items. |
tuple | NP_Tuple | - | ✓ * | 4 bytes - ~4GB | Static sized collection of specific values. Up to 255 values. |
any | NP_Any | - | 𐄂 | 2 bytes - ~4GB | Generic type. |
string | String | &str | ✓ ** | 2 bytes - ~4GB | Utf-8 formatted string. |
bytes | Vec<u8> | &[u8] | ✓ ** | 2 bytes - ~4GB | Arbitrary bytes. |
int8 | i8 | - | ✓ | 1 byte | -128 to 127 |
int16 | i16 | - | ✓ | 2 bytes | -32,768 to 32,767 |
int32 | i32 | - | ✓ | 4 bytes | -2,147,483,648 to 2,147,483,647 |
int64 | i64 | - | ✓ | 8 bytes | -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 |
uint8 | u8 | - | ✓ | 1 byte | 0 - 255 |
uint16 | u16 | - | ✓ | 2 bytes | 0 - 65,535 |
uint32 | u32 | - | ✓ | 4 bytes | 0 - 4,294,967,295 |
uint64 | u64 | - | ✓ | 8 bytes | 0 - 18,446,744,073,709,551,615 |
float | f32 | - | 𐄂 | 4 bytes | -3.4e38 to 3.4e38 |
double | f64 | - | 𐄂 | 8 bytes | -1.7e308 to 1.7e308 |
enum | NP_Enum | - | ✓ | 1 byte | Up to 255 string based options in schema. |
bool | bool | - | ✓ | 1 byte | |
decimal | NP_Dec | - | ✓ | 8 bytes | Fixed point decimal number based on i64. |
geo4 | NP_Geo | - | ✓ | 4 bytes | 1.1km resolution (city) geographic coordinate |
geo8 | NP_Geo | - | ✓ | 8 bytes | 11mm resolution (marble) geographic coordinate |
geo16 | NP_Geo | - | ✓ | 16 bytes | 110 microns resolution (grain of sand) geographic coordinate |
ulid | NP_ULID | &NP_ULID | ✓ | 16 bytes | 6 bytes for the timestamp (5,224 years), 10 bytes of randomness (1.2e24) |
uuid | NP_UUID | &NP_UUID | ✓ | 16 bytes | v4 UUID, 2e37 possible UUIDs |
date | NP_Date | - | ✓ | 8 bytes | Good to store unix epoch (in milliseconds) until the year 584,866,263 |
portal | - | - | 𐄂 | 0 bytes | A type that just points to another type in the buffer. |
- * sorted must be set to true in the schema for this object to enable sorting.
- ** String & Bytes can be bytewise sorted only if they have a size property in the schema.
Legend
Bytewise Sorting
Bytewise sorting means that two buffers can be compared at the byte level without deserializing and a correct ordering between the buffers' internal values will be found. This is extremely useful for storing ordered keys in databases.
Each type has specific notes on whether it supports bytewise sorting and what things to consider if using it for that purpose.
You can sort by multiple types/values if a tuple is used. The ordering of values in the tuple will determine the sort order. For example if you have a tuple with types (A, B) the ordering will first sort by A, then B where A is identical. This is true for any number of items, for example a tuple with types (A,B,C,D) will sort by D when A, B & C are identical.
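The (A, B) ordering described above can be sketched in plain Rust. This is only an illustration of the idea, not NoProto's buffer layout: `encode_key` and `sort_keys` are hypothetical helpers, and the key is assumed to be a u16 followed by a u32, both written big endian.

```rust
/// Encode a (u16, u32) pair as one big-endian byte key.
/// Hypothetical helper for illustration; not part of the NoProto API.
fn encode_key(a: u16, b: u32) -> [u8; 6] {
    let mut key = [0u8; 6];
    key[..2].copy_from_slice(&a.to_be_bytes());
    key[2..].copy_from_slice(&b.to_be_bytes());
    key
}

/// Sort raw keys with a plain byte comparison, never decoding them.
fn sort_keys(mut keys: Vec<[u8; 6]>) -> Vec<[u8; 6]> {
    // Arrays of u8 compare lexicographically, byte by byte,
    // so A decides the order and B only breaks ties.
    keys.sort();
    keys
}
```

Because every value has a fixed width and is written most significant byte first, comparing the raw bytes gives the same result as comparing the decoded pairs.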
Compaction
Compaction is an optional operation you can perform at any time on a buffer, typically used to recover free space. NoProto Buffers are contiguous, growing arrays of bytes. When you add or update a value sometimes additional memory is used and the old value is dereferenced, meaning the buffer is now occupying more space than it needs to. This space can be recovered with compaction. Compaction involves a recursive, full copy of all referenced & valid values of the buffer. It's an expensive operation that should be avoided where possible.
Sometimes the space you can recover with compaction is minimal, or you can craft your schema and updates in such a way that compactions are never needed. In these cases compaction can be avoided with little to no consequence.
Deleting a value will almost always mean space can be recovered with compaction, but updating values can have different outcomes to the space used depending on the type and options.
Each type will have notes on how updates can lead to wasted bytes and require compaction to recover the wasted space.
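To make the wasted-space idea concrete, here is a toy model in plain Rust. It is a sketch under loud assumptions: `ToyBuffer` is a hypothetical type holding a single dynamically sized value, and it does not reflect NoProto's real pointer or vtable layout.

```rust
/// Toy append-only buffer holding one dynamically sized value.
/// Hypothetical model for illustration only; NoProto's real layout differs.
struct ToyBuffer {
    bytes: Vec<u8>,
    /// (offset, length) of the currently referenced value, if any.
    live: Option<(usize, usize)>,
}

impl ToyBuffer {
    fn new() -> Self {
        ToyBuffer { bytes: Vec::new(), live: None }
    }

    /// Writing appends the new value; the old bytes stay behind, unreferenced.
    fn set(&mut self, value: &[u8]) {
        let offset = self.bytes.len();
        self.bytes.extend_from_slice(value);
        self.live = Some((offset, value.len()));
    }

    fn get(&self) -> Option<&[u8]> {
        self.live.map(|(offset, len)| &self.bytes[offset..offset + len])
    }

    /// Compaction copies only the referenced value into a fresh buffer.
    fn compact(&self) -> ToyBuffer {
        let mut fresh = ToyBuffer::new();
        if let Some(value) = self.get() {
            fresh.set(value);
        }
        fresh
    }
}
```

After two writes the toy buffer holds both values, but only the second is reachable. The compacted copy holds just the reachable one, which is why compaction costs a full copy and why it only pays off when enough bytes have been orphaned.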
Schema Mutations
Once a schema is created all the buffers it creates depend on that schema for reliable de/serialization, data access, and compaction.
There are safe ways you can mutate a schema after it's been created without breaking old buffers, however those updates are limited. The safe mutations will be mentioned for each type; consider any other schema mutations unsafe.
Changing the type property of any value in the schema is unsafe. It's only sometimes safe to modify properties besides type.
Schema Types
Every schema type maps exactly to a native data type in your code.
struct
Structs represent a fixed number of named fields, with each field having its own data type.
- Bytewise Sorting: Unsupported
- Compaction: Fields without values will be removed from the buffer during compaction.
- Schema Mutations: The ordering of items in the fields property must always remain the same. It's safe to add new fields to the bottom of the field list or rename fields, but never to remove fields. Field types cannot be changed safely. If you need to deprecate a field, set its name to an empty string.
Struct schemas have a single required property called fields. The fields property is an array of arrays that represent all possible fields in the struct and their data types. Any type can be used in fields, including other structs. Structs cannot have more than 255 fields, and the field names cannot be longer than 255 UTF8 bytes.
Structs do not store the field names in the buffer, only the field index, so this is a very efficient way to store associated data.
If you need flexible field names use a map type instead.
// JSON
{
"type": "struct",
"fields": [ // can have between 1 and 255 fields
["field name", {"type": "data type for this field"}],
["name", {"type": "string"}],
["tags", {"type": "list", "of": { // nested list of strings
"type": "string"
}}],
["age", {"type": "u8"}], // Uint8 number
["meta", {"type": "struct", "fields": [ // nested struct
["favorite_color", {"type": "string"}],
["favorite_sport", {"type": "string"}]
]}]
]
}
// ES6
struct({fields: {
// data_type() isn't a real data type...
field_name: data_type(),
name: string(),
tags: list({of: string()}),
age: u8(),
meta: struct({fields: {
favorite_color: string(),
favorite_sport: string()
}})
}})
list
Lists represent a dynamically sized list of items. The type for every item in the list is identical and the order of entries is maintained in the buffer. Lists do not have to contain contiguous entries; gaps can be stored safely and efficiently.
- Bytewise Sorting: Unsupported
- Compaction: Indexes that have had their value cleared will be removed from the buffer. If a specific index never had a value, it occupies zero space.
- Schema Mutations: None
Lists have a single required property in the schema: of. The of property contains another schema for the type of data contained in the list. Any type is supported, including another list.
The more items you have in a list, the slower it will be to seek to values towards the end of the list or loop through the list.
// a list of list of strings
// JSON
{
"type": "list",
"of": {
"type": "list",
"of": {"type": "string"}
}
}
// ES6
list({of: list({of: string()})})
// list of numbers
// JSON
{
"type": "list",
"of": {"type": "i32"}
}
// ES6
list({of: i32()})
map
A map is a dynamically sized list of items where each key is a &str. Every value of a map has the same type.
- Bytewise Sorting: Unsupported
- Compaction: Keys without values are removed from the buffer
- Schema Mutations: None
Maps have a single required property in the schema: value. The property is used to describe the schema of the values for the map. Values can be any schema type, including another map.
If you expect to have fixed, predictable keys then use a struct type instead. Maps are less efficient than structs because keys are stored in the buffer.
The more items you have in a map, the slower it will be to seek to values or loop through the map. Structs are far more performant for seeking to values.
// a map where every value is a string
// JSON
{
"type": "map",
"value": {
"type": "string"
}
}
// ES6
map({value: string()})
tuple
A tuple is a fixed size list of items. Each item has its own type and index. Tuples support up to 255 items.
- Bytewise Sorting: Supported if all children are scalars that support bytewise sorting and schema sorted is set to true.
- Compaction: If sorted is true, compaction will not save space. Otherwise, tuples only reduce in size if children are deleted or children with a dynamic size are updated.
- Schema Mutations: No mutations are safe
Tuples have a single required property in the schema called values. It's an array of schemas that represent the tuple values. Any schema is allowed, including other Tuples.
Sorting
You can use tuples to support compound bytewise sorting across multiple values of different types. By setting the sorted property to true you enable a strict mode for the tuple that enables sorting features. When sorted is enabled only scalar values that support sorting are allowed in the schema. For example, strings/bytes types can only be fixed size.
When sorted is true the order of values is guaranteed to be constant in every buffer and all buffers will be identical in size.
// JSON
{
"type": "tuple",
"values": [
{"type": "string"},
{"type": "list", "of": {"type": "string"}},
{"type": "u64"}
]
}
// ES6
tuple({values: [string(), list({of: string()}), u64()]})
// tuple for bytewise sorting
// JSON
{
"type": "tuple",
"sorted": true,
"values": [
{"type": "string", "size": 25},
{"type": "u8"},
{"type": "i64"}
]
}
// ES6
tuple({sorted: true, values: [
string({size: 25}),
u8(),
i64()
]})
string
A string is a fixed or dynamically sized collection of utf-8 encoded bytes.
- Bytewise Sorting: Supported only if size property is set in schema.
- Compaction: If size property is set, compaction cannot reclaim space. Otherwise it will reclaim space unless all updates have been identical in length.
- Schema Mutations: If the size property is set it's safe to make it smaller, but not larger (this may cause existing string values to truncate, though). If the field is being used for bytewise sorting, no mutation is safe.
The size property provides a way to have fixed size strings in your buffers. If a provided string is larger than the size property it will be truncated. Smaller strings will be padded with white space.
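The truncate-or-pad rule can be sketched as a small standalone function. `fit_fixed_string` is a hypothetical helper, and cutting at a raw byte boundary is an assumption made for brevity; how NoProto handles a multi-byte UTF-8 character at the cut point is not stated here.

```rust
/// Fit `input` into exactly `size` bytes: truncate if too long,
/// pad with ASCII spaces if too short.
/// Hypothetical helper; truncation here is byte based and may split a
/// multi-byte UTF-8 character, which a real implementation must consider.
fn fit_fixed_string(input: &str, size: usize) -> Vec<u8> {
    let mut bytes: Vec<u8> = input.bytes().take(size).collect();
    bytes.resize(size, b' ');
    bytes
}
```

Every stored value ends up with the same length, which is what makes in-place updates and bytewise comparison possible for fixed size strings.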
// JSON
{
"type": "string"
}
// ES6
string()
// fixed size
// JSON
{
"type": "string",
"size": 20
}
// ES6
string({size: 20})
// with default value
// JSON
{
"type": "string",
"default": "Default string value"
}
// ES6
string({default: "Default string value"})
bytes
Bytes are a fixed or dynamically sized Vec<u8> collection.
- Bytewise Sorting: Supported only if size property is set in schema.
- Compaction: If size property is set, compaction cannot reclaim space. Otherwise it will reclaim space unless all updates have been identical in length.
- Schema Mutations: If the size property is set it's safe to make it smaller, but not larger (this may cause existing bytes values to truncate, though). If the field is being used for bytewise sorting, no mutation is safe.
The size property provides a way to have fixed size &[u8] in your buffers. If a provided byte slice is larger than the size property it will be truncated. Smaller byte slices will be padded with zeros.
// JSON
{
"type": "bytes"
}
// ES6
bytes()
// fixed size
// JSON
{
"type": "bytes",
"size": 20
}
// ES6
bytes({size: 20})
// with default value
// JSON
{
"type": "bytes",
"default": [1, 2, 3, 4]
}
// ES6
bytes({default: [1, 2, 3, 4]})
int8, int16, int32, int64
Signed integers allow positive or negative whole numbers to be stored. The bytes are stored in big endian format and converted to unsigned types to allow bytewise sorting.
// JSON
{
"type": "i8"
}
// ES6
i8()
// with default value
// JSON
{
"type": "i8",
"default": 20
}
// ES6
i8({default: 20})
- Bytewise Sorting: Supported
- Compaction: Updates are done in place, never use additional space.
- Schema Mutations: None
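Plain big-endian two's complement does not sort correctly across the sign boundary, which is why a conversion to an unsigned form is needed. The exact conversion NoProto applies is not spelled out here; the sketch below assumes the common approach of flipping the sign bit, and both function names are hypothetical.

```rust
/// Convert an i32 to bytes whose lexicographic order matches numeric order.
/// Assumption: sign-bit flip, a common technique; not confirmed as NoProto's.
fn i32_to_sortable(value: i32) -> [u8; 4] {
    // Flipping the sign bit maps i32::MIN..=i32::MAX onto 0..=u32::MAX
    // while preserving order; big endian puts the most significant byte first.
    ((value as u32) ^ 0x8000_0000).to_be_bytes()
}

/// Reverse the conversion to recover the original signed value.
fn i32_from_sortable(bytes: [u8; 4]) -> i32 {
    (u32::from_be_bytes(bytes) ^ 0x8000_0000) as i32
}
```

With this encoding a negative number always produces a smaller byte string than a positive one, so a byte comparison agrees with a numeric comparison.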
uint8, uint16, uint32, uint64
Unsigned integers allow only non-negative whole numbers (zero and up) to be stored. The bytes are stored in big endian format to allow bytewise sorting.
- Bytewise Sorting: Supported
- Compaction: Updates are done in place, never use additional space.
- Schema Mutations: None
// JSON
{
"type": "u8"
}
// ES6
u8()
// with default value
// JSON
{
"type": "u8",
"default": 20
}
// ES6
u8({default: 20})
float, double
Allows the storage of floating point numbers of various sizes. Bytes are stored in big endian format.
- Bytewise Sorting: Unsupported, use decimal type.
- Compaction: Updates are done in place, never use additional space.
- Schema Mutations: None
// JSON
{
"type": "f32"
}
// ES6
f32()
// with default value
// JSON
{
"type": "f32",
"default": 20.283
}
// ES6
f32({default: 20.283})
enum
Allows efficient storage of a selection between a known collection of ordered strings. The selection is stored as a single u8 byte, limiting the max number of choices to 255. Also, the choices themselves cannot be longer than 255 UTF8 bytes each.
- Bytewise Sorting: Supported
- Compaction: Updates are done in place, never use additional space.
- Schema Mutations: You can safely add new choices to the end of the list or update the existing choices in place. If you need to delete a choice, just make it an empty string. Changing the order of the choices is destructive as this type only stores the index of the choice it's set to.
There is one required property of this schema called choices. The property should contain an array of strings that represent all possible choices of the enum.
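Since only the index of the selected choice is stored, the schema mutation rules for this type follow directly. The sketch below uses two hypothetical helpers, `index_of` and `resolve_choice`, to show why appending a choice is safe while reordering changes the meaning of existing data.

```rust
/// Find the one-byte index to store for a given choice.
/// Hypothetical helper; assumes the list holds at most 255 choices.
fn index_of(choices: &[&str], choice: &str) -> Option<u8> {
    choices.iter().position(|c| *c == choice).map(|i| i as u8)
}

/// Resolve a stored one-byte index against the schema's choice list.
fn resolve_choice<'a>(choices: &[&'a str], stored_index: u8) -> Option<&'a str> {
    choices.get(stored_index as usize).copied()
}
```

A value written as "green" against the list red, green, blue is stored as index 1. Reading it back against a list with a new choice appended still yields "green", but reading it against a reordered list yields whatever now sits at index 1.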
// JSON
{
"type": "enum",
"choices": ["choice 1", "choice 2", "etc"]
}
// ES6
enum({choices: ["choice 1", "choice 2", "etc"]})
// with default value
// JSON
{
"type": "enum",
"choices": ["choice 1", "choice 2", "etc"],
"default": "etc"
}
// ES6
enum({choices: ["choice 1", "choice 2", "etc"], default: "etc"})
bool
Allows efficient storage of a true or false value. The value is stored as a single byte that is set to either 1 or 0.
- Bytewise Sorting: Supported
- Compaction: Updates are done in place, never use additional space.
- Schema Mutations: None
// JSON
{
"type": "bool"
}
// ES6
bool()
// with default value
// JSON
{
"type": "bool",
"default": false
}
// ES6
bool({default: false})
decimal
Allows you to store fixed point decimal numbers. The number of decimal places must be declared in the schema as the exp property and will be used for every value.
- Bytewise Sorting: Supported
- Compaction: Updates are done in place, never use additional space.
- Schema Mutations: None
There is a single required property called exp that represents the number of decimal places every value will have.
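A fixed point decimal can be pictured as an integer that has been scaled by 10 to the power of exp. The sketch below is a standalone illustration of that idea; `FixedDec` is a hypothetical type and is not the library's NP_Dec.

```rust
/// Fixed point value: the real number is `mantissa / 10^exp`.
/// Hypothetical type for illustration; not the NoProto NP_Dec API.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct FixedDec {
    mantissa: i64,
    exp: u8,
}

impl FixedDec {
    /// Scale the float by 10^exp and round to the nearest integer.
    fn from_f64(value: f64, exp: u8) -> Self {
        let scale = 10f64.powi(exp as i32);
        FixedDec { mantissa: (value * scale).round() as i64, exp }
    }

    /// Divide the stored integer by 10^exp to get an approximate float back.
    fn to_f64(self) -> f64 {
        self.mantissa as f64 / 10f64.powi(self.exp as i32)
    }
}
```

With exp set to 3, the value 20.293 is held as the integer 20293, and any digits past the third decimal place are rounded away. Because the stored form is an integer, it can be ordered like one.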
// JSON
{
"type": "decimal",
"exp": 3
}
// ES6
decimal({exp: 3})
// with default value
// JSON
{
"type": "decimal",
"exp": 3,
"default": 20.293
}
// ES6
decimal({exp: 3, default: 20.293})
geo4, geo8, geo16
Allows you to store geographic coordinates with varying levels of accuracy and space usage.
- Bytewise Sorting: Not supported
- Compaction: Updates are done in place, never use additional space.
- Schema Mutations: None
Larger geo values take up more space, but allow greater resolution.
Type | Bytes | Earth Resolution | Decimal Places |
---|---|---|---|
geo4 | 4 | 1.1km resolution (city) | 2 |
geo8 | 8 | 11mm resolution (marble) | 7 |
geo16 | 16 | 110 microns resolution (grain of sand) | 9 |
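The Decimal Places column determines how much of a coordinate survives storage. The sketch below only demonstrates that rounding effect; `quantize` is a hypothetical helper and says nothing about how NoProto packs the latitude and longitude into bytes.

```rust
/// Round a coordinate in degrees to a fixed number of decimal places.
/// Hypothetical helper showing the precision kept at each geo size.
fn quantize(degrees: f64, decimal_places: u32) -> f64 {
    let scale = 10f64.powi(decimal_places as i32);
    (degrees * scale).round() / scale
}
```

For example, a latitude of -20.283 keeps only two decimal places at geo4 precision and comes back as roughly -20.28, while the same value is unchanged at the seven places of geo8.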
// JSON
{
"type": "geo4"
}
// ES6
geo4()
// with default value
// JSON
{
"type": "geo4",
"default": {"lat": -20.283, "lng": 19.929}
}
// ES6
geo4({default: {lat: -20.283, lng: 19.929}})
ulid
Allows you to store a unique ID with a timestamp. The timestamp is stored in milliseconds since the unix epoch.
- Bytewise Sorting: Supported, orders by timestamp. Order is random if timestamp is identical between two values.
- Compaction: Updates are done in place, never use additional space.
- Schema Mutations: None
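Ordering by timestamp falls out naturally if the timestamp occupies the leading bytes in big endian form. The sketch below assumes exactly that layout (6 timestamp bytes followed by 10 random bytes); `make_id` and `timestamp_of` are hypothetical helpers, and the random bytes are passed in by the caller to keep the example deterministic.

```rust
/// Build a 16-byte id: 6 bytes of big-endian millisecond timestamp,
/// then 10 caller-supplied random bytes.
/// Assumed layout for illustration; not taken from the NoProto source.
fn make_id(timestamp_ms: u64, random: [u8; 10]) -> [u8; 16] {
    let mut id = [0u8; 16];
    // Keep the low 48 bits of the timestamp (bytes 2..8 of the u64).
    id[..6].copy_from_slice(&timestamp_ms.to_be_bytes()[2..]);
    id[6..].copy_from_slice(&random);
    id
}

/// Read the millisecond timestamp back out of the first 6 bytes.
fn timestamp_of(id: &[u8; 16]) -> u64 {
    let mut buf = [0u8; 8];
    buf[2..].copy_from_slice(&id[..6]);
    u64::from_be_bytes(buf)
}
```

An id created one millisecond later always compares greater, regardless of its random bytes. Two ids with the same timestamp are ordered by their random bytes alone, which matches the note that order is random when timestamps are identical.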
// JSON
{
"type": "ulid"
}
// ES6
ulid()
// no default supported
uuid
Allows you to store a universally unique ID.
- Bytewise Sorting: Supported, but values are random
- Compaction: Updates are done in place, never use additional space.
- Schema Mutations: None
// JSON
{
"type": "uuid"
}
// ES6
uuid()
// no default supported
date
Allows you to store a timestamp as a u64 value. This is just a thin wrapper around the u64 type.
- Bytewise Sorting: Supported
- Compaction: Updates are done in place, never use additional space.
- Schema Mutations: None
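The stored value is a count of milliseconds since the unix epoch. One way to produce such a number with only the Rust standard library is sketched below; `unix_epoch_ms` is a hypothetical helper, and falling back to 0 for times before the epoch is a choice made for this example.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Milliseconds between the unix epoch and `now`, as a u64.
/// Hypothetical helper; times before the epoch are reported as 0.
fn unix_epoch_ms(now: SystemTime) -> u64 {
    now.duration_since(UNIX_EPOCH)
        .map(|elapsed| elapsed.as_millis() as u64)
        .unwrap_or(0)
}
```

Calling `unix_epoch_ms(SystemTime::now())` gives a number in the same form as the default value used in the schema examples for this type.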
// JSON
{
"type": "date"
}
// ES6
date()
// with default value (default should be in ms)
// JSON
{
"type": "date",
"default": 1605909163951
}
// ES6
date({default: 1605909163951})
portal
Portals allow types/schemas to be "teleported" from one part of a schema to another.
You can use these for duplicating a type many times in a schema or for recursive data types.
The one required property is "to". It should be a dot-notated path to the type being teleported. If "to" is an empty string, the root is used.
Recursion works up to 255 levels of depth.
- Bytewise Sorting: Not Supported
- Compaction: Same behavior as type being teleported.
- Schema Mutations: None
// JSON
{
"type": "struct",
"fields": [
["value", {"type": "u8"}],
["next", {"type": "portal", "to": ""}]
]
}
// ES6
struct({fields: {
value: u8(),
next: portal({to: ""})
}})
With the above schema, values can be stored at value, next.value, next.next.next.value, etc.
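The recursive schema above has the same shape as a singly linked node in Rust. The sketch below is only an analogy for how the paths resolve: `Node`, `set_value` and `get_value` are hypothetical and do not use the NoProto buffer API.

```rust
/// Rust analogue of the recursive schema: a u8 value plus an optional
/// link to another node of the same shape.
#[derive(Debug, Default)]
struct Node {
    value: Option<u8>,
    next: Option<Box<Node>>,
}

/// Follow `depth` "next" hops, creating nodes as needed, then set the value.
fn set_value(root: &mut Node, depth: usize, value: u8) {
    let mut node = root;
    for _ in 0..depth {
        node = node.next.get_or_insert_with(Box::default);
    }
    node.value = Some(value);
}

/// Follow `depth` "next" hops and read the value, if the path exists.
fn get_value(root: &Node, depth: usize) -> Option<u8> {
    let mut node = root;
    for _ in 0..depth {
        node = node.next.as_deref()?;
    }
    node.value
}
```

Here a depth of 0 corresponds to the path value, a depth of 1 to next.value, and a depth of 3 to next.next.next.value.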
Here is an example where portal is used to duplicate a type.
// JSON
{
"type": "struct",
"fields": [
["username", {"type": "string"}],
["email", {"type": "portal", "to": "username"}]
]
}
// ES6
struct({fields: {
username: string(),
email: portal({to: "username"})
}})
In the schema above username and email are both resolved to the string type.
Even though structs are the only type used in the examples above, the portal type will work with any collection type.
Next Step
Read about how to initialize a schema into a NoProto Factory.
Enums
NP_TypeKeys | Simple enum to store the schema types |