xpans_spe_midi 0.1.0

Wraps xpans Spatial Property Exchange (SPE) messages in MIDI

xpans SPE MIDI

SPE MIDI allows for sending and receiving spatial properties of virtual audio sources through MIDI SysEx, introducing potential for spatially-aware processing and adaptive rendering in a digital audio workflow.

What is spatially-aware audio processing?

Spatially-aware audio processing is the ability of audio plugins, DAWs, and other applications to receive and interpret spatial properties (e.g. position and extent) while processing audio.

An example of this is a reverb that generates reflections based on where the input signal is located in a virtual room, creating a much more immersive spatial impression than traditional reverb.

What is SPE?

SPE stands for Spatial Property Exchange, a protocol for sending and receiving spatial properties across applications within the xpans Ecosystem.

Why MIDI?

MIDI is natively supported by audio plugin frameworks and DAWs. Sending SPE over MIDI gives the user more predictable control over how spatial properties flow. The way MIDI is routed through plugins and the DAW is almost identical to how spatial properties should behave in an ideal implementation of a dedicated SPE protocol.

Disclaimer

SPE MIDI is only a temporary workaround until a proper SPE protocol is established for plugins, hosts, and other applications. Anticipate breaking changes, and possibly deprecation of this project, once a proper SPE protocol is stable and natively supported in several DAWs and plugin frameworks.

Specification

SPE-MIDI messages are System Exclusive MIDI Messages.

| Offset | Length | Content |
| --- | --- | --- |
| 0 | 1 | SysEx start byte (0xF0) |
| 1 | 2 | 14-bit Source ID |
| 3 | 1 | Value byte length (un-MIDIfied) |
| 4 | Variable | Property |
| ? | Variable | Values |
| ? | 1 | SysEx end byte (0xF7) |

Note: All multi-byte values are little-endian.
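
As a rough illustration of the layout in the table above, the sketch below splits a SysEx message into its raw fields. The `RawFrame` type and `split_frame` function are hypothetical names made up for this example and are not the crate's API. The sketch also assumes the Property field is exactly 2 bytes, which is only true at the time of writing.

```rust
/// Raw fields of an SPE-MIDI frame, still in their 7-bit MIDI form.
/// (Hypothetical type for illustration; not part of the crate's API.)
#[derive(Debug, PartialEq)]
pub struct RawFrame<'a> {
    pub source_id: [u8; 2],
    pub value_byte_len: u8,
    pub property: [u8; 2],
    pub values: &'a [u8],
}

/// Splits a SysEx message into the fields listed in the layout table.
/// ASSUMPTION: the Property field is exactly 2 bytes long.
pub fn split_frame(msg: &[u8]) -> Option<RawFrame<'_>> {
    // Smallest frame: start + 2 ID bytes + length + 2 property bytes + end.
    if msg.len() < 7 || msg[0] != 0xF0 || msg[msg.len() - 1] != 0xF7 {
        return None;
    }
    let body = &msg[1..msg.len() - 1];
    // Every byte between the start and end markers must have its MSB clear.
    if body.iter().any(|b| b & 0x80 != 0) {
        return None;
    }
    Some(RawFrame {
        source_id: [body[0], body[1]],
        value_byte_len: body[2],
        property: [body[3], body[4]],
        values: &body[5..],
    })
}
```

The fields are left undecoded on purpose, so the sketch does not depend on any particular bit order for the Source ID or the values.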

Source ID

The Source ID is 14 bits wide, spread across 2 bytes. Each byte carries 7 bits of the Source ID, so the bitmask for each byte is 0b_01111111.
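
A minimal sketch of splitting and rejoining a Source ID is shown below. The function names are hypothetical. Placing the low 7 bits in the first byte is an assumption based on the little-endian note above, so the crate's actual byte order may differ.

```rust
/// Splits a 14-bit Source ID into two 7-bit bytes.
/// ASSUMPTION: the low 7 bits go first, following the little-endian note.
/// Returns None if the ID does not fit in 14 bits.
pub fn encode_source_id(id: u16) -> Option<[u8; 2]> {
    if id > 0x3FFF {
        return None;
    }
    Some([(id & 0x7F) as u8, ((id >> 7) & 0x7F) as u8])
}

/// Rejoins two 7-bit bytes into a 14-bit Source ID (low 7 bits first).
pub fn decode_source_id(bytes: [u8; 2]) -> u16 {
    // Mask each byte with 0b0111_1111 so a stray MSB is ignored.
    let low = (bytes[0] & 0b0111_1111) as u16;
    let high = (bytes[1] & 0b0111_1111) as u16;
    low | (high << 7)
}
```

With this layout, Source ID 300 becomes the bytes 0x2C and 0x02, and decoding those bytes yields 300 again.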

Value byte length

MIDI SysEx bytes only have 7 usable bits. The MSB is always 0. SPE MIDI spreads bits of values across bytes. For example, a 32-bit float will occupy 5 bytes in a SysEx message. However, its un-MIDIfied byte length would be 4. This is the number used as the value byte length.

Property

The Property field may occupy a variable number of bytes. At the time of writing, it occupies exactly 2 bytes. The first byte identifies the property the message targets, and any additional bytes describe finer details.

The first byte can have the following values:

| Value | Property |
| --- | --- |
| 0 | Position |
| 1 | Extent |

Axis Combo

Both Position and Extent properties include an additional byte to describe the axis or axes of the property targeted by the message.

| Value | Axis/Axes |
| --- | --- |
| 0 | X |
| 1 | Y |
| 2 | XY |
| 3 | Z |
| 4 | XZ |
| 5 | YZ |
| 6 | XYZ |

Tip: The axis combo byte can be generated by combining bitflags for the targeted axis or axes and subtracting 1 from the result. Bit 0 (value 1) corresponds to the X axis, bit 1 (value 2) to the Y axis, and bit 2 (value 4) to the Z axis. For example, XZ is (1 | 4) - 1 = 4.
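
The tip can be sketched in a few lines of Rust. The constant and function names below are hypothetical and chosen only for this example.

```rust
// Hypothetical axis bitflags: one bit per axis.
pub const X: u8 = 0b001;
pub const Y: u8 = 0b010;
pub const Z: u8 = 0b100;

/// Converts a non-empty set of axis flags into the axis combo byte.
/// Returns None for an empty set or for bits outside X, Y, and Z.
pub fn axis_combo_byte(flags: u8) -> Option<u8> {
    match flags {
        1..=7 => Some(flags - 1),
        _ => None,
    }
}

/// Converts an axis combo byte (0 to 6) back into axis flags.
pub fn axis_flags(combo: u8) -> Option<u8> {
    if combo <= 6 {
        Some(combo + 1)
    } else {
        None
    }
}
```

For instance, `axis_combo_byte(X | Z)` gives `Some(4)`, which is the XZ row of the table.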

Values

The Values region of the message is an array of floating-point numbers. The Property field dictates how many values will be in this region.

For example, a message targeting (Position, XYZ) would have 3 values. A message targeting (Position, X) would have just 1 value. There should be neither more nor fewer values than the property indicates.

The values are in the order the property indicates. For example, a message targeting (Position, XZ) would first have the X coordinate, then the Z coordinate.
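
Because each axis maps to one bit once 1 is added back to the axis combo byte, the expected number of values and their order can be derived from that byte alone. The following is a small sketch with hypothetical function names, not the crate's API.

```rust
/// Number of values a message with this axis combo byte should carry.
/// Returns None for bytes outside the 0 to 6 range.
pub fn value_count(axis_combo: u8) -> Option<u32> {
    if axis_combo > 6 {
        return None;
    }
    // Adding 1 restores the bitflags; each set bit is one targeted axis.
    Some((axis_combo + 1).count_ones())
}

/// Axes in the order their values appear in the Values region.
pub fn axes_in_order(axis_combo: u8) -> Option<Vec<char>> {
    if axis_combo > 6 {
        return None;
    }
    let flags = axis_combo + 1;
    let axes = [('X', 0b001u8), ('Y', 0b010), ('Z', 0b100)];
    Some(
        axes.iter()
            .filter(|(_, bit)| (flags & bit) != 0)
            .map(|(name, _)| *name)
            .collect(),
    )
}
```

A receiver could use `value_count` to reject messages whose Values region holds the wrong number of entries.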

Since MIDI SysEx only allows for 7 usable bits in a byte, each value's bits are spread across as many bytes as needed to fully encode the value with no information loss. The MSB of each byte can be thought of as padding.
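
The exact bit layout is not spelled out here, so the sketch below shows only one possible way to spread a 32-bit float across 7-bit bytes: emitting the least-significant 7 bits first, in keeping with the little-endian note. Both the layout and the function names are assumptions for illustration and may not match what the crate does.

```rust
/// Packs an f32 into five 7-bit bytes.
/// ASSUMPTION: least-significant 7 bits first; the real layout may differ.
pub fn pack_f32(value: f32) -> [u8; 5] {
    let mut bits = value.to_bits();
    let mut out = [0u8; 5];
    for byte in out.iter_mut() {
        // Keep 7 bits per byte so the MSB is always 0.
        *byte = (bits & 0x7F) as u8;
        bits >>= 7;
    }
    out
}

/// Reverses `pack_f32` under the same assumed layout.
pub fn unpack_f32(bytes: [u8; 5]) -> f32 {
    let mut bits: u32 = 0;
    for (i, byte) in bytes.iter().enumerate() {
        bits |= ((byte & 0x7F) as u32) << (7 * i);
    }
    f32::from_bits(bits)
}
```

Under this assumed layout, the last byte carries only the top 4 bits of the float, and the remaining bits of that byte are padding.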

The number of bytes used by a MIDIfied value can be calculated as (bits / 7) + 1, using integer division, where bits is the number of bits the value's type normally occupies (32 for a 32-bit float, which gives 5 bytes).
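
The length formula can be written out as follows. The function names are hypothetical. The second function is a plain rounded-up division, included only for comparison: it agrees with the formula for common widths such as 8, 16, 32, and 64 bits, and differs only when the bit count is an exact multiple of 7.

```rust
/// MIDIfied byte count using the formula (bits / 7) + 1 with integer division.
pub const fn midified_len(bits: u32) -> u32 {
    bits / 7 + 1
}

/// MIDIfied byte count derived from the un-MIDIfied value byte length field.
pub const fn midified_len_from_bytes(value_byte_len: u8) -> u32 {
    midified_len(value_byte_len as u32 * 8)
}

/// Rounded-up division, shown for comparison only (not from the spec).
pub const fn ceil_div_7(bits: u32) -> u32 {
    (bits + 6) / 7
}
```

For a 32-bit float, `midified_len(32)` and `midified_len_from_bytes(4)` both give 5, matching the example in the Value byte length section.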