serde_mosaic
Composable serialization and deserialization for Rust structs.
This crate allows a composed struct to be serialized into the serialized forms of its individual components. Likewise, a composed struct can be deserialized from multiple serialized component forms. This enables sharing serialized components across multiple composed structs – even of different types – and reduces duplication when the serialized data is stored in a database.
Currently, only a file-system-based database type is available, but the concept could be extended to other database types (e.g. an in-memory database). Please open an issue on GitHub if you need one.
This crate builds on Serde but is not affiliated with the Serde project.
An introductory example
Suppose we have a Shirt struct which holds the name of its owner, its
Material and its size. Its Material is a cotton-linen blend:
We now want to serialize different shirt instances made from the same material
and store them in a database. Clearly, it would be a waste to store the
serialized form of the material multiple times as part of the Shirt
serialization. This is where serde_mosaic comes into play: It provides the
functions serialize_link and deserialize_link which can be used in
conjunction with the serialize_with and deserialize_with field
attributes of the serde crate to mark the Material component of the
Shirt for composed serialization / deserialization. To provide a unique
identifier of the component, the DatabaseEntry trait needs to be implemented
for Material. The #[typetag::serde] macro from the
typetag crate is also needed:
use std::ffi::OsStr;
use serde::{Deserialize, Serialize};
use serde_mosaic::*;
Now, the location of the database in the file system and its format must be
specified. serde_mosaic provides multiple predefined formats such as
SerdeYaml or SerdeJson, but it is also possible to define your own
format by implementing the Format trait. For the example, let's stick with
SerdeYaml:
use std::ffi::OsStr;
use serde::{Deserialize, Serialize};
use serde_mosaic::*;
let pure_cotton = Material ;
let mikes_shirt = Shirt ;
let joes_shirt = Shirt ;
let mut dbm = new.expect;
// Now serialize the shirt representations. `WriteOptions` lets you specify
// what the actual representation looks like.
let write_options = default;
dbm.write.expect;
dbm.write.expect;
This creates the following files:
/path/to/db/Material/pure_cotton.yaml
/path/to/db/Shirt/joe.yaml
/path/to/db/Shirt/mike.yaml
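The file list suggests that each entry lives at the database root, in a directory named after its type, under its entry name plus the format's file extension. A minimal sketch of that mapping using only the standard library; the `entry_path` helper is hypothetical and not part of serde_mosaic:

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper (not part of serde_mosaic): builds
/// `<root>/<type_name>/<entry_name>.<ext>`, mirroring the file list above.
fn entry_path(root: &Path, type_name: &str, entry_name: &str, ext: &str) -> PathBuf {
    // Append the extension by formatting instead of `set_extension`, so that
    // entry names containing a dot are not truncated.
    root.join(type_name).join(format!("{entry_name}.{ext}"))
}

fn main() {
    let root = Path::new("/path/to/db");
    let entries = [("Material", "pure_cotton"), ("Shirt", "joe"), ("Shirt", "mike")];
    for (type_name, entry_name) in entries {
        println!("{}", entry_path(root, type_name, entry_name, "yaml").display());
    }
}
```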
The files joe.yaml and mike.yaml do not contain the serialized
representation of pure_cotton, but only a link to pure_cotton.yaml. When
deserializing joe.yaml, the DatabaseManager interprets the link,
deserializes pure_cotton.yaml and puts the resulting Material into the
material field of Shirt:
let mut dbm = new.expect;
let joes_shirt: Shirt = dbm.read.expect;
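To make the link idea concrete, here is a standard-library-only sketch of the round trip. It does not use serde_mosaic: the `Store` and `StoredShirt` types are hypothetical stand-ins for the database and the serialized form, and the field types and Joe's shirt size are assumptions chosen for illustration.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, PartialEq)]
struct Material {
    name: String,
    cotton_content: u32,
}

#[derive(Debug, Clone, PartialEq)]
struct Shirt {
    name: String,
    material: Material,
    size: u32,
}

/// Stored form of a shirt: the material is replaced by its name (the "link").
#[derive(Debug, Clone, PartialEq)]
struct StoredShirt {
    name: String,
    material_link: String,
    size: u32,
}

/// Hypothetical in-memory stand-in for the database, with one map per type.
#[derive(Debug, Default)]
struct Store {
    materials: BTreeMap<String, Material>,
    shirts: BTreeMap<String, StoredShirt>,
}

impl Store {
    /// Stores the material once under its name and the shirt with a link to it.
    fn write_shirt(&mut self, shirt: &Shirt) {
        self.materials
            .insert(shirt.material.name.clone(), shirt.material.clone());
        self.shirts.insert(
            shirt.name.clone(),
            StoredShirt {
                name: shirt.name.clone(),
                material_link: shirt.material.name.clone(),
                size: shirt.size,
            },
        );
    }

    /// Follows the link and rebuilds the full shirt; `None` if an entry is missing.
    fn read_shirt(&self, name: &str) -> Option<Shirt> {
        let stored = self.shirts.get(name)?;
        let material = self.materials.get(&stored.material_link)?.clone();
        Some(Shirt {
            name: stored.name.clone(),
            material,
            size: stored.size,
        })
    }
}

fn main() {
    let pure_cotton = Material { name: "pure_cotton".into(), cotton_content: 100 };
    let mikes_shirt = Shirt { name: "mike".into(), material: pure_cotton.clone(), size: 40 };
    let joes_shirt = Shirt { name: "joe".into(), material: pure_cotton, size: 42 };

    let mut store = Store::default();
    store.write_shirt(&mikes_shirt);
    store.write_shirt(&joes_shirt);

    // Two shirts are stored, but the shared material is stored only once.
    println!("{} shirts, {} material(s)", store.shirts.len(), store.materials.len());
    println!("{:?}", store.read_shirt("joe"));
}
```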
"Normal" serialization and deserialization without a DatabaseManager is
still possible: in that case the attribute functions serialize_link and
deserialize_link are no-ops, so serialization and deserialization work as
expected.
Reference-counted components
If the same Material should be shared between different Shirts not just in
the database, but also within memory, a common Rust pattern is to use a
reference counter such as Arc. serde_mosaic supports this with the
serialize_arc_link and deserialize_arc_link functions:
use std::ffi::OsStr;
use std::sync::Arc;
use serde::{Deserialize, Serialize};
use serde_mosaic::*;
The DatabaseManager maintains a cache of the Arc<Material> instances it has
already deserialized. If it encounters a cached link / file name while
deserializing a second Shirt, it reuses the cached instance by cloning the Arc
pointer and inserting the clone into the newly deserialized Shirt.
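The caching behaviour can be sketched with the standard library alone. The `MaterialCache` type and the `load_pure_cotton` function are hypothetical stand-ins for the manager's internal cache and for deserializing a file; they are not serde_mosaic API, and the field types and Joe's shirt size are assumptions.

```rust
use std::collections::HashMap;
use std::sync::Arc;

#[derive(Debug)]
struct Material {
    name: String,
    cotton_content: u32,
}

#[derive(Debug)]
struct Shirt {
    name: String,
    material: Arc<Material>,
    size: u32,
}

/// Hypothetical cache keyed by entry name.
#[derive(Default)]
struct MaterialCache {
    entries: HashMap<String, Arc<Material>>,
    /// Counts how often `load` actually ran, i.e. how often a file would be read.
    loads: usize,
}

impl MaterialCache {
    /// Returns a clone of the cached `Arc` if the name is known; otherwise
    /// runs `load`, caches the result and returns it.
    fn get_or_load(&mut self, name: &str, load: impl FnOnce() -> Material) -> Arc<Material> {
        if let Some(cached) = self.entries.get(name) {
            return Arc::clone(cached);
        }
        self.loads += 1;
        let fresh = Arc::new(load());
        self.entries.insert(name.to_owned(), Arc::clone(&fresh));
        fresh
    }
}

/// Stand-in for deserializing `pure_cotton.yaml`.
fn load_pure_cotton() -> Material {
    Material { name: "pure_cotton".into(), cotton_content: 100 }
}

fn main() {
    let mut cache = MaterialCache::default();
    let mikes_shirt = Shirt {
        name: "mike".into(),
        material: cache.get_or_load("pure_cotton", load_pure_cotton),
        size: 40,
    };
    let joes_shirt = Shirt {
        name: "joe".into(),
        material: cache.get_or_load("pure_cotton", load_pure_cotton),
        size: 42,
    };

    // Both shirts point at the same allocation, and the loader ran only once.
    println!("shared: {}", Arc::ptr_eq(&mikes_shirt.material, &joes_shirt.material));
    println!("loads: {}", cache.loads);
    println!("{:?}\n{:?}", mikes_shirt, joes_shirt);
}
```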
Optional fields
It is also possible to have optional fields used for composition:
use std::ffi::OsStr;
use std::sync::Arc;
use serde::{Deserialize, Serialize};
use serde_mosaic::*;
When the optional field is empty, the link in the serialized representation is simply empty as well.
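A small standard-library sketch of how an optional link could be resolved. The `resolve_optional_link` function is hypothetical, and treating a link to a missing entry as an error is an assumption made here, not documented crate behaviour.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
struct Material {
    name: String,
    cotton_content: u32,
}

/// Hypothetical resolver: an empty link stays empty, a present link is looked
/// up by name. A link to a missing entry is reported as an error (assumption).
fn resolve_optional_link(
    store: &HashMap<String, Material>,
    link: Option<&str>,
) -> Result<Option<Material>, String> {
    match link {
        None => Ok(None),
        Some(name) => store
            .get(name)
            .cloned()
            .map(Some)
            .ok_or_else(|| format!("no entry named {name}")),
    }
}

fn main() {
    let mut store = HashMap::new();
    store.insert(
        "pure_cotton".to_string(),
        Material { name: "pure_cotton".into(), cotton_content: 100 },
    );

    println!("{:?}", resolve_optional_link(&store, None));
    println!("{:?}", resolve_optional_link(&store, Some("pure_cotton")));
    println!("{:?}", resolve_optional_link(&store, Some("linen")));
}
```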
Serialized representation
As mentioned before, the serialized representation of a composed struct contains
a "link" instead of the actual field contents. For example, the YAML
representation of Shirt created by the DatabaseManager looks like this:
---
Shirt:
  name: mike
  material:
    name: pure_cotton
    checksum: 94637245
  size: 40
The "link" consists of the two fields name and checksum. The name field
tells the DatabaseManager to look for a file "pure_cotton.yaml" (the file
extension is derived from DatabaseManager::file_ext) and deserialize that
file; the resulting Material instance is then put into Shirt. The checksum
field is optional and should be omitted when creating a database entry manually.
The number is a hash which is used to check if a file changed during the
lifetime of a DatabaseManager. This avoids a stale cache for
reference-counted components.
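The stale-cache check can be sketched as follows. This is an illustration under assumptions: the hash function (`DefaultHasher`), the `ChecksumCache` type and the use of the raw file text as the "deserialized" value are all stand-ins, and the crate may compute and compare checksums differently.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::Arc;

/// Hashes the file contents. The hash function is an assumption for this sketch.
fn checksum(contents: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    contents.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical cache that remembers the checksum each value was built from.
#[derive(Default)]
struct ChecksumCache {
    entries: HashMap<String, (u64, Arc<String>)>,
}

impl ChecksumCache {
    /// Returns the cached value if `contents` still matches the stored
    /// checksum; otherwise rebuilds the value and replaces the cache entry.
    fn get(&mut self, name: &str, contents: &str) -> Arc<String> {
        let current = checksum(contents);
        if let Some((stored, value)) = self.entries.get(name) {
            if *stored == current {
                return Arc::clone(value);
            }
        }
        // Stand-in for deserialization: keep the trimmed text.
        let fresh = Arc::new(contents.trim().to_owned());
        self.entries
            .insert(name.to_owned(), (current, Arc::clone(&fresh)));
        fresh
    }
}

fn main() {
    let mut cache = ChecksumCache::default();
    let first = cache.get("pure_cotton", "cotton_content: 100");
    let second = cache.get("pure_cotton", "cotton_content: 100");
    let changed = cache.get("pure_cotton", "cotton_content: 95");

    println!("unchanged file reuses cache: {}", Arc::ptr_eq(&first, &second));
    println!("changed file reuses cache: {}", Arc::ptr_eq(&first, &changed));
}
```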
One difference from the "standard" YAML representation of Shirt is that the type
is stated at the very top of the hierarchy. This is necessary because
internally, Shirt is serialized as a DatabaseEntry trait object via typetag
(which in turn is necessary to allow for arbitrary Formats). Since typetag
treats trait objects as enum variants, the "variant name" (which is the type
name) needs to be stated explicitly.
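Why the type name has to appear in the data can be shown with a simplified, standard-library-only sketch: when deserializing into a trait object, something must select the concrete type, and a tag-to-constructor registry does that. The `Entry` trait, the `registry` and `build` functions and the single `u32` payload are hypothetical simplifications; this is not how typetag is implemented in detail.

```rust
use std::collections::HashMap;

/// Stand-in for a trait whose implementors are stored as trait objects.
trait Entry {
    fn describe(&self) -> String;
}

struct Shirt {
    size: u32,
}

struct Material {
    cotton_content: u32,
}

impl Entry for Shirt {
    fn describe(&self) -> String {
        format!("Shirt of size {}", self.size)
    }
}

impl Entry for Material {
    fn describe(&self) -> String {
        format!("Material with {}% cotton", self.cotton_content)
    }
}

/// Builds a boxed trait object from a (heavily simplified) payload.
type Constructor = fn(u32) -> Box<dyn Entry>;

fn make_shirt(size: u32) -> Box<dyn Entry> {
    Box::new(Shirt { size })
}

fn make_material(cotton_content: u32) -> Box<dyn Entry> {
    Box::new(Material { cotton_content })
}

/// Maps the type name found at the top of the data to a constructor.
fn registry() -> HashMap<&'static str, Constructor> {
    let mut map: HashMap<&'static str, Constructor> = HashMap::new();
    map.insert("Shirt", make_shirt);
    map.insert("Material", make_material);
    map
}

/// Picks the concrete type by its tag; unknown tags yield `None`.
fn build(tag: &str, payload: u32) -> Option<Box<dyn Entry>> {
    registry().get(tag).map(|constructor| constructor(payload))
}

fn main() {
    for (tag, payload) in [("Shirt", 40), ("Material", 100), ("Trousers", 1)] {
        match build(tag, payload) {
            Some(entry) => println!("{}", entry.describe()),
            None => println!("unknown type tag: {tag}"),
        }
    }
}
```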
Creating a database entry for a pure cotton shirt manually could look like this:
/path/to/db/Shirt/sarah.yaml
---
Shirt:
  name: sarah
  material:
    name: pure_cotton
  size: 39

/path/to/db/Material/pure_cotton.yaml
---
Material:
  name: pure_cotton
  cotton_content: 100
Predefined database formats
This crate offers several predefined Formats which are gated behind feature
flags.
JSON
Enabling the serde_json feature provides the SerdeJson database format.
This format uses the serde_json crate for serializing and deserializing the
database entries.
YAML
Enabling the serde_yaml feature provides the SerdeYaml database format.
This format uses the serde_yaml crate for serializing and deserializing the
database entries.
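Because the formats are feature-gated, the matching feature has to be enabled in the dependent crate's Cargo.toml. A minimal sketch, assuming the 0.2.0 release referenced in the documentation link and that only the YAML format is needed:

```toml
[dependencies]
# Enables the SerdeYaml format; use "serde_json" instead (or in addition) for SerdeJson.
serde_mosaic = { version = "0.2.0", features = ["serde_yaml"] }
```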
Examples in the /tests directory
The repository contains a fully-fledged database within test/test_database as
well as various examples for reading from and writing to that database in
tests. I tried very hard to make these tests as self-explanatory as possible,
but please open an issue on GitHub if help is needed.
tests/basic_db_manipulation.rs: Interaction with the database via the DatabaseManager (e.g. checking if an entry already exists, clearing database entries based on their name etc.)
tests/read.rs: Deserializing composed structs from the database, with examples for Arc (incl. in-memory sharing), Option and nested composed structs.
tests/serialize_and_deserialize.rs: Serializing and deserializing structs with the .._link attributes without a DatabaseManager (i.e. "normal" serde behaviour).
tests/utilities.rs: Definition of the structs used within the tests.
tests/write_and_read.rs: Serializing to and deserializing from the database, basically a composition of tests/read.rs and tests/write.rs.
tests/write.rs: Serializing composed structs into the database, with examples for Arc, Option and nested composed structs.
It is recommended to first check out tests/write.rs and tests/read.rs to
understand how to work with this crate.
Documentation
The full API documentation is available at https://docs.rs/serde_mosaic/0.2.0/serde_mosaic/.