gvariant 0.4.0


A pure-rust implementation of the GVariant serialisation format intended for fast reading of in-memory buffers.

    use gvariant::{aligned_bytes::copy_to_align, gv, Marker};

    // 0x22 is 34 as a little-endian i32; the rest is a NUL-terminated string.
    let data = copy_to_align(b"\x22\x00\x00\x00William\0");
    let (age, name) = gv!("(is)").cast(data.as_ref()).into();
    assert_eq!(
        format!("My name is {} and I am {} years old!", name, age),
        "My name is William and I am 34 years old!"
    );

This library operates by reinterpreting byte buffers as a GVariant type. It doesn't do any of its own allocations. As a result proper alignment of byte buffers is the responsibility of the user. See [aligned_bytes].
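To make "reinterpreting byte buffers" concrete, the (is) buffer from the example above can also be decoded by hand. The following is only a sketch using the standard library: the function name decode_is is hypothetical, it assumes little-endian data, and it rejects unexpected input instead of substituting default values the way this library and GLib do.

```rust
/// Decode a little-endian serialised `(is)` structure by hand.
/// Hypothetical helper for illustration; not part of the gvariant crate.
/// Returns `None` if the buffer is too short, the string is not
/// NUL-terminated, or the string is not valid UTF-8.
fn decode_is(data: &[u8]) -> Option<(i32, &str)> {
    // We need 4 bytes for the integer plus at least the string's NUL byte.
    if data.len() < 5 {
        return None;
    }
    // `i` is a fixed-size 4-byte integer at the start of the structure.
    let age = i32::from_le_bytes([data[0], data[1], data[2], data[3]]);
    // `s` is the last member, so it runs to the end of the buffer and
    // needs no framing offset. Its final byte must be the NUL terminator.
    let (&terminator, text) = data[4..].split_last()?;
    if terminator != 0 {
        return None;
    }
    let name = std::str::from_utf8(text).ok()?;
    Some((age, name))
}
```

Unlike the library, this copies the integer out of the buffer, so it does not care about alignment. Casting in place is what makes alignment the caller's responsibility.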

It's intended to conform to the GVariant specification and match the behaviour of the reference GLib implementation, preferring the latter rather than the former where they disagree. Exceptions to this are described in "Deviations from the Specification and reference implementation" below.

This library assumes you know the types of the data you are dealing with at compile time. This is in contrast to the GLib implementation where you could construct a GVariant type string dynamically. This allows for a much smaller and faster implementation, more in line with Alexander Larsson's GVariant Schema Compiler. As a result GVariant structs are supported through use of code generation via macros. See the gvariant-macro subdirectory.

The library is intended to be sound and safe to run on untrusted input, although the implementation does include use of unsafe. See "Use of unsafe" below. Help with validating the unsafe portions of the library would be gratefully received.

This library works on stable Rust. As a result we can't use const-generics, which would make some of the code much more straightforward. A future version of this library may use const-generics, once they are available in stable Rust.

Status

  • Serialisation and deserialisation are supported
  • Support for all GVariant types is implemented
  • Behaviour is identical to GLib's implementation for all data in "normal form". This has been confirmed with fuzz testing. There are some differences for data not in normal form. See GNOME/glib#2121 for more information.

TODO

  • Benchmarking and performance improvements
  • Ensure that deserialisation of non-normal structures matches GLib in all cases.

Features

std - enabled by default

Required for:

  • our errors to implement [std::error::Error]
  • [Marker::deserialize]
  • [aligned_bytes::read_to_slice]
  • Some CPU-dependent string handling optimisations in the memchr crate
  • Serialisation, although this requirement could be relaxed in the future

Disable this feature for no-std support.

alloc - enabled by default

Required for:

  • Allocating [AlignedSlice]s with [ToOwned], [copy_to_align][aligned_bytes::copy_to_align] and [alloc_aligned][aligned_bytes::alloc_aligned].
  • The convenience API Marker::from_bytes - use Marker::cast instead
  • Correctly displaying strings that are not valid UTF-8
  • Copying unsized GVariant objects with to_owned()
  • The std feature

Deviations from the Specification and reference implementation

This implementation is intended to conform to the GVariant specification and match the behaviour of the reference GLib implementation, preferring the latter rather than the former where they disagree.

Maximum size of objects

The spec says:

2.3.6 Framing Offsets

There is no theoretical upper limit in how large a framing offset can be. This fact (along with the absence of other limitations in the serialisation format) allows for values of arbitrary size.

In this implementation the maximum size of an object is limited by what a [usize] can represent (typically 64 bits). This should not be a problem in practice on 64-bit machines.
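As a rough illustration of where a size limit comes from: a serialiser has to pick some concrete width for each framing offset. The sketch below assumes the scheme used by the GLib serialiser, where the width is the smallest of 1, 2, 4 or 8 bytes that can address the whole container. The function name offset_size is hypothetical and is not this crate's API.

```rust
/// Width in bytes of one framing offset for a container of the given size.
/// Hypothetical helper, assuming the smallest-width-that-fits rule.
/// The size is taken as `u64` so the example also compiles on 32-bit targets.
fn offset_size(container_len: u64) -> usize {
    if container_len > u32::MAX as u64 {
        8
    } else if container_len > u16::MAX as u64 {
        4
    } else if container_len > u8::MAX as u64 {
        2
    } else if container_len > 0 {
        1
    } else {
        // An empty container has no framing offsets at all.
        0
    }
}
```

Offsets wider than 8 bytes are never produced under this rule, so on a 64-bit machine a [usize] can hold any offset that appears in practice.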

Equality of Variant v type for non-normal form data

See note under [Variant].

Validation of non-normal form object path "o" and signature "g" types

The spec says:

2.7.3 Handling Non-Normal Serialised Data

Invalid Object Path

If the serialised form of an object path is not a valid object path followed by a zero byte then the default value is used.

Invalid Signature

If the serialised form of a signature string is not a valid DBus signature followed by a zero byte then the default value is used.

We don't currently do any validation of the object path or signature types, treating them as normal strings.
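Callers that need the behaviour the spec describes can validate after reading. The following is a minimal sketch of such a check for object paths, based on the D-Bus rules (a leading "/", non-empty elements made of ASCII letters, digits and "_", and no trailing "/" except for the root path). Both function names are hypothetical and not part of this crate.

```rust
/// Returns true if `path` is a syntactically valid D-Bus object path.
/// Hypothetical helper for illustration only.
fn is_valid_object_path(path: &str) -> bool {
    // The root path is the only path allowed to end in '/'.
    if path == "/" {
        return true;
    }
    match path.strip_prefix('/') {
        // Every element between slashes must be non-empty and use only
        // ASCII letters, digits and underscores.
        Some(rest) => rest.split('/').all(|element| {
            !element.is_empty()
                && element
                    .bytes()
                    .all(|b| b.is_ascii_alphanumeric() || b == b'_')
        }),
        None => false,
    }
}

/// Apply the spec's rule: an invalid object path reads as the default
/// value, which for the `o` type is "/".
fn object_path_or_default(raw: &str) -> &str {
    if is_valid_object_path(raw) {
        raw
    } else {
        "/"
    }
}
```

Signatures could be handled the same way with a validator for D-Bus signature strings, which is not shown here.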

Data that overlaps framing offsets (non-normal form)

This applies to arrays of non-fixed size type in non-normal form and to structures in non-normal form. We follow the behaviour of the GLib reference implementation rather than the GVariant spec in this instance.

The spec says:

Child Values Overlapping Framing Offsets

If the byte sequence of a child value overlaps the framing offsets of the container it resides within then this error is ignored. The child is given a value that corresponds to the normal deserialisation process performed on this byte sequence (including the bytes from the framing offsets) with the type of the child.

In contrast, we give the child the default value for its type, consistent with the GLib implementation. This is the behaviour in GLib since 2.60, 2.58.2 and 2.56.4.

There are still some differences to GLib in the way we handle non-normal, non-fixed size structures. These will be fixed.

See GNOME/glib#2121 for more information.
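The rule for arrays can be sketched as follows. This is a simplified illustration, not this crate's code: it handles only an array of strings (as) shorter than 256 bytes, so that every framing offset is a single byte, and it uses an empty slice to stand for "use the default value". The function name split_small_string_array is hypothetical.

```rust
/// Split a serialised `as` value into the raw bytes of each element.
/// Hypothetical helper; assumes `data.len() < 256`, so each framing
/// offset is one byte wide.
fn split_small_string_array(data: &[u8]) -> Vec<&[u8]> {
    // The final byte holds the end of the last element, which is also
    // where the table of framing offsets begins.
    let offsets_start = match data.last() {
        Some(&last) if (last as usize) <= data.len() => last as usize,
        // Empty or unusable container: treat it as an empty array.
        _ => return Vec::new(),
    };
    let mut children = Vec::new();
    let mut start = 0usize;
    for &offset in &data[offsets_start..] {
        let end = offset as usize;
        if start <= end && end <= offsets_start {
            children.push(&data[start..end]);
        } else {
            // The child would run into the framing offsets, or would end
            // before it starts: hand out the default value instead.
            children.push(&data[..0]);
        }
        // The next element begins where this one claims to end.
        start = end;
    }
    children
}
```

With the input b"hi\0there\0\x03\x09" this yields the two elements "hi\0" and "there\0". If the first offset is changed to 0x0a, the first child would extend into the offset table, so it is replaced by the default.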

Handling of non-normal form strings

We are consistent with the GLib implementation in this regard rather than the spec. See the note under [Str::to_str].

Design

The intention is to build abstractions that are transparent to the compiler, so that they compile down to simple memory accesses, like reading the fields of a struct. For many of the GVariant types Rust already has a type with the same representation (such as i32 for i or [u8] for ay). For other types this library defines such types (such as [gvariant::Str][Str] for s or [gvariant::NonFixedWidthArray<[i32]>][NonFixedWidthArray] for aai). For structure types the [gv!] macro generates the necessary code.

If we have a type with the same representation as the underlying bytes, we can just cast the data to the appropriate type and then read it. The macro [gv!] maps from GVariant type strings to compatible Rust types, returning a [Marker]. This [Marker] can then be used to cast data into that type by calling Marker::cast.
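The idea of casting rather than parsing can be shown with the standard library alone. The sketch below views a byte buffer as a slice of i32 (the representation of ai in native byte order) using slice::align_to. It illustrates the technique only and is not how this crate is implemented; cast_to_i32s and Aligned4 are hypothetical names.

```rust
/// Hypothetical buffer type that forces 4-byte alignment of its contents.
#[repr(align(4))]
struct Aligned4([u8; 8]);

/// View a byte buffer as a slice of `i32` without copying.
/// Returns `None` unless the buffer is 4-byte aligned and holds a whole
/// number of integers.
fn cast_to_i32s(bytes: &[u8]) -> Option<&[i32]> {
    // SAFETY: every bit pattern is a valid `i32`, so reinterpreting
    // initialised bytes as integers cannot produce an invalid value.
    let (prefix, ints, suffix) = unsafe { bytes.align_to::<i32>() };
    if prefix.is_empty() && suffix.is_empty() {
        Some(ints)
    } else {
        // Misaligned start or trailing bytes: refuse rather than copy.
        None
    }
}
```

Once the cast has succeeded, reading an element is an ordinary slice index, which is the kind of simple memory access the design aims for.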

So typically code might look like:

    use gvariant::{aligned_bytes::alloc_aligned, gv, Marker};
    use std::io::Read;

    fn a() -> std::io::Result<()> {
        let mut file = std::fs::File::open("")?;
        // Read into an aligned buffer so the bytes can be cast in place.
        let mut buf = alloc_aligned(4096);
        let len = file.read(&mut buf)?;
        let data = gv!("a(sia{sv})").cast(&buf[..len]);
        todo!()
    }

For casting data to be valid and safe the byte buffer must be aligned...
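A minimal sketch of what "aligned" means here, using only the standard library. GVariant types need at most 8-byte alignment, so the example uses a buffer type aligned to 8. Aligned8 and is_aligned are hypothetical names for illustration; in this crate the [aligned_bytes] module provides the real equivalents.

```rust
/// Hypothetical wrapper that forces its contents onto an 8-byte boundary.
#[repr(align(8))]
struct Aligned8([u8; 16]);

/// Returns true if `bytes` starts at an address that is a multiple of
/// `align`.
fn is_aligned(bytes: &[u8], align: usize) -> bool {
    (bytes.as_ptr() as usize) % align == 0
}
```

A sub-slice that starts at an odd offset into an aligned buffer is no longer aligned itself, which is why a slice of a plain Vec<u8> or of a larger file buffer cannot be assumed safe to cast.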

Use of unsafe

I've tried to concentrate almost all of the unsafe in [aligned_bytes] and [casting] to make it easier to review. I also take advantage of the [ref_cast] crate to avoid some unsafe casting that I'd otherwise require.

A review of the use of unsafe, or advice on how the amount of unsafe could be reduced would be greatly appreciated.

Comparison to and relationship with other projects

  • GVariant Schema Compiler - Similar to this project the GSC generates code at compile time to represent the types the user is interested in. GSC targets the C language. Unlike this project the types are generated from schema files, allowing structures to have named fields. In gvariant-rs we generate our code just from the plain GVariant type strings using macros. This makes the build process simpler - there are no external tools, and it makes it easier to get started - there is no new schema format to learn. The cost is that the user is responsible for remembering which field means what and what endianness should be used to interpret the data.

It might make sense in the future to extend GSC to generate rust code as well - in which case the generated code may depend on this library.

  • gtk-rs glib::variant - This is a binding to the GLib GVariant implementation in C, so depends on glib. It's currently incomplete. The docs say "Although GVariant supports arbitrarily complex types, this binding is currently limited to the basic ones: bool, u8, i16, u16, i32, u32, i64, u64, f64 and &str/String."
  • zvariant - Implements the similar DBus serialisation format rather than GVariant. Docs say: "GVariant ... will be supported by a future version of this crate."
  • serde_gvariant - Implements the same format, but for serde integration. Described as "WIP" and not published on crates.io
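On the endianness point in the first comparison above: because the data is read in place, an integer comes out in whatever byte order the producer used, and the caller has to convert it. A minimal sketch using the standard library follows. ByteOrder and i32_from_wire are hypothetical names, not part of this crate.

```rust
/// Byte order used by whoever produced the data (hypothetical helper type).
#[derive(Clone, Copy)]
enum ByteOrder {
    Little,
    Big,
}

/// Convert an `i32` that was read in native byte order into the value the
/// producer intended.
fn i32_from_wire(raw: i32, order: ByteOrder) -> i32 {
    match order {
        // `from_le` and `from_be` are no-ops when the host already uses
        // that byte order and byte swaps otherwise.
        ByteOrder::Little => i32::from_le(raw),
        ByteOrder::Big => i32::from_be(raw),
    }
}
```

For example, the bytes 22 00 00 00 read as an i32 and passed through i32_from_wire with ByteOrder::Little give 34 on any host.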