Expand description
This module contains type and trait declarations that support the DBSP dynamic dispatch architecture.
§Background
DBSP operators manipulate batches of records that represent keys, values, and weights in various collections. Each record is represented as a concrete Rust type, e.g., a struct or a tuple. Operators are parameterized by the types of their input and output batches, which are in turn parameterized by the types of keys, values, and weights in the batch. This means that, had we used static typing (which is the default in Rust), the Rust compiler would generate a specialized implementation of each operator for each combination of input and output types that occur in the program. In an earlier version of this library, this approach led to very long compilation times for even medium-sized programs.
The solution we adopt relies on dynamic dispatch instead. We bundle primitive operations on types (comparison, cloning, etc.) as object-safe traits and expose them to operators as trait objects. This speeds up compilation without sacrificing much performance.
Implementing this approach involves some machinery. First, trait objects cannot be
efficiently combined into any kind of containers (vectors, sets, or even tuples or structs)
without wrapping each object in a Box or some other heap-allocated type, leading to many
small allocations. To avoid this, we represent such containers as separate trait objects.
For instance, options (Option<T>), 2-tuples ((T1, T2)), vectors (Vec<T>), and sets
(Set<T>) are modeled as traits.
Second, this design introduces unsafe code that cannot be nicely encapsulated. By
replacing concrete types with trait objects, we lose the ability to distinguish between trait
objects backed by different types. Operations like comparison and cloning must downcast
their arguments to the same concrete type. This can be done safely by checking the
TypeIds
of the arguments, but this check is expensive in Rust when performed on the critical
path. We therefore elide this check in release builds, making all such operations unsafe.
Finally, operators require the ability to create new values, which in turn requires passing a factory object for each type the operator may need to instantiate. Most operators (as well as low-level data structures they use internally like batches and traces) require multiple factory objects.
§Dynamic API design considerations
Since trait objects don’t have a statically known size, they cannot be returned by value (without boxing them, which we don’t want to do in most cases). We therefore pass mutable references for functions to write return values to. This requires the caller to allocate space for the return value using a factory object.
Likewise, trait objects cannot be passed by value; therefore functions that would normally
take an owned value instead take a mutable reference, which they are allowed to use without
cloning, e.g., with ClonableTrait::move_to, which
moves the value, leaving the default value in its place.
§Trait hierarchy
We encapsulate common operations in a hierarchy of object-safe traits, i.e., traits for which
the compiler can build vtables and expose them as trait objects (dyn <Trait>).
§Basic traits
The following traits must be implemented for all values that DBSP can compute on. All of these traits include blanket implementation for all the types that can support them:
AsAny- convert a reference to a type to&dyn Any.Clonable- an object-safe version ofClone.Comparable- an object-safe version ofOrd.SerializeDyn- an object-safe version ofrkyv::Serialize.DeserializableDyn- an object-safe version ofrkyv::Deserialize.Data- the lowest common denominator for all types that DBSP can compute on. Combines all of the above traits, along withDebug,SizeOf, andSend. This is an object-safe version ofcrate::DBData.
In addition, the Erase trait is provided to convert concrete types into trait objects.
§Trait object traits
The following traits are meant to be implemented by trait objects rather than regular
types, i.e., an instance of one of these traits is a dyn Trait (which is a type in Rust).
The reason we need these traits is ergonomics, since not all useful behaviors can be
captured nicely in an object-safe way. For instance Eq, Ord, and Clone traits
are not object-safe, so we cannot expose them directly via the vtable. We instead introduce
Comparable and Clonable traits mentioned above. These traits define an unsafe
API that accepts one of the arguments as a raw byte pointer (*mut u8). These are not
meant to be used directly. Instead, we use them to implement Eq, Ord, and Clone for
trait objects.
DowncastTrait- cast a trait object reference to a concrete type.ClonableTrait- trait for trait objects whose concrete type can be cloned and moved.ArchiveTrait- trait for trait objects that can be serialized and deserialized withrkyv.DataTrait- the lowest common denominator for all trait objects that DBSP can compute on. Combines all of the above traits, along withData. This is the trait object version ofcrate::DBData.
§Creating trait objects
There are three ways to create a trait object:
-
The
Factorytrait can be used to allocate an instance of a concrete type with the default value on the heap or on the stack and return a reference to it as a trait object. -
The
Erasetrait converts a reference to a concrete type into a trait object. -
DeserializableDynandDeserializeDyndeserialize values of the concrete type from a byte array or from an archived representation respectively and returns them as trait objects.
§Derived traits
The traits described above are applicable to all DBSP values. The following traits specify extended functionality available for certain types and their combinations.
DynWeight(WeightTrait) - types with the addition operation and a zero element. This is an object-safe version ofcrate::DBWeight.DynOpt- a dynamically typed version of theOptiontype.DynPair- a dynamically typed version of a two-tuple(T1, T2).DynVec- a dynamically typed vector.DynSet- a dynamically typed B-tree set.DynPairs- a vector of pairs.DynWeightedPairs- a vector of key-value pairs, where the value behaves as weight, meaning that tuples with the same key can be consolidated by adding their weights.
Re-exports§
Modules§
Structs§
- BSet
- Wrapper around BTreeSet whose archived representation implements the required ordering traits.
- Deserialize
Impl - LeanVec
- RawIter
- RawVec
Traits§
- Archive
Trait - Trait for trait objects that can be serialized and deserialized with
rkyv. - ArchivedDB
Data - Trait for DBData that can be deserialized with
rkyv. - AsAny
- Trait for types that return a reference to self as
dyn Any. - Clonable
- Clonable
Trait - Trait for trait objects whose concrete type can be cloned and moved.
- Comparable
- An object-safe verson of comparison traits.
- Data
- Defines the minimal set of operations that must be supported by all data types stored in DBSP batches.
- Data
Trait - The base trait for trait objects that DBSP can compute on.
- Data
Trait Typed - A trait for trait objects that include the concrete type behind the trait object in their type signatures.
- Deserializable
Dyn - Deserialize
Dyn - Object-safe version of the
Deserializetrait. - Downcast
Trait - A trait for trait objects that can be downcast to a reference to the concrete type they wrap.
- Erase
- Type erasure operations.
- Factory
- Create instances of a concrete type wrapped in a trait object.
- Opt
- An dynamically typed interface to
Option<>. - Pairs
- A dynamically typed interface to
LeanVec<Tup2<_,_>> - Pairs
Trait - Serialize
Dyn - Object-safe version of the
Serializetrait. - Set
- A dynamically typed interface to
BTreeSet. - SetTrait
- VecTrait
- Vector
- A dynamically typed interface to
LeanVec - Weight
- A trait for types that can be used as weights in DBSP.
- Weight
Trait - A trait for trait objects that represent weights.
- Weight
Trait Typed - A trait for trait objects that include the concrete weight type in their type signatures.
- Weighted
Pairs - A vector of (key, value) pairs, where the value implements
WeightTrait, meaning that tuples with the same key can be consolidated by adding their weights. Tuples that have zero weights after consolidation are removed from the vector. - Weighted
Pairs Trait - With
Factory - Trait for trait objects that can be created from instances of a concrete type
T.
Type Aliases§
- DynBool
- A trait object backed by
bool. - DynData
- DynData
Typed - DynOpt
- DynPairs
- DynSet
- DynUnit
- A trait object backed by the unit type
(). - DynVec
- DynWeight
- DynWeight
Typed - DynWeighted
Pairs