Struct StorageBackend

pub struct StorageBackend<D>
where
    D: DB,
{ /* private fields */ }

A storage back-end that wraps interactions with the database and provides in-memory caching. Its public API provides a mapping from ArenaHash keys to OnDiskObject objects, and a way to persist such objects to the DB, taking care of reference counting along the way.

§Pub access to `StorageBackend` objects

There is no pub API to construct a StorageBackend. Rather, library users are expected to construct a crate::Storage, and then access the StorageBackend via the crate::arena::Arena::with_backend method of the crate::Storage::arena field.

§Overview

This module mediates between the in-memory arena and the persistent database, managing reference counts for the Merkle-ized DAGs we store in the DB, and providing a caching layer. The cache addresses several concerns:

1. reducing disk reads: by keeping data read from the on-disk DB in memory, we can avoid going to disk again the next time that data is accessed.
2. reducing disk writes: the arena creates a lot of temporary data structures that will never be persisted – i.e. be marked as a GC root or be in the transitive closure of a GC root – and so instead of eagerly writing these data to disk, we keep them in memory in case they get dropped right away. Also, we track reference counts in the DB, and so by doing reference count updates only in cache where possible, we avoid having to write intermediate states to disk.
3. bulking disk writes: transactions in SQLite (used by crate::db::SqlDB) are very expensive, and so we want to do many writes at once when possible. By collecting the potential writes in memory in the cache, we can then flush them periodically en masse.

However, this caching layer also adds complexity, because we now have multiple potential sources of truth: the DB and the cache. The main complexity here arises from concern (2), reducing disk writes. We optimize the common case, where the arena creates temporary data structures and caches them with StorageBackend::cache, only to drop them soon after. The tricky thing here is deducing that the net change of a bunch of arena cache mutations is the identity, i.e. that we don’t need to write anything to the DB. In the case of new object creation – meaning objects that aren’t already in the DB – this is easy: when the arena announces it’s done with a particular key – by calling StorageBackend::uncache on that key – we simply check if that key has any remaining references to it, and drop it if not. The harder case is when the arena changes the reference counts for keys that already exist in the DB, by creating larger structures that reference these existing keys. To handle this case, we track reference count deltas instead of cardinal reference counts: if the net effect of all the deltas on a key is zero, then we know the key can safely be dropped from the cache.
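The delta-tracking idea can be illustrated with a minimal sketch. This is not the crate's implementation: `RefDeltas`, `adjust`, `pending_writes`, and the `u64` key type are hypothetical names chosen for illustration, standing in for the real cache state keyed by ArenaHash.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an `ArenaHash` key.
type Key = u64;

/// Toy tracker of net reference-count changes for keys that already exist
/// in the database. It stores deltas only, never absolute counts.
#[derive(Default)]
struct RefDeltas {
    deltas: HashMap<Key, i64>,
}

impl RefDeltas {
    /// Record that some in-memory structure gained (`+1`) or lost (`-1`)
    /// a reference to `key`. If the net change returns to zero, the entry
    /// is dropped, because nothing would need to be written for it.
    fn adjust(&mut self, key: Key, change: i64) {
        let delta = self.deltas.entry(key).or_insert(0);
        *delta += change;
        if *delta == 0 {
            self.deltas.remove(&key);
        }
    }

    /// The reference-count updates a flush would have to write, sorted by
    /// key so that the result is deterministic.
    fn pending_writes(&self) -> Vec<(Key, i64)> {
        let mut out: Vec<(Key, i64)> =
            self.deltas.iter().map(|(k, d)| (*k, *d)).collect();
        out.sort();
        out
    }
}
```

In this sketch, a temporary structure that references key 7 and is then dropped produces `adjust(7, 1)` followed by `adjust(7, -1)`, after which `pending_writes()` is empty: the net change is the identity and no disk write is needed.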

Another source of complexity arises from a concern that has nothing to do with the goals of caching, and instead conflicts with caching: we want to support the creation of data structures that are too large to fit in memory, without the user needing to carefully checkpoint the construction. To deal with this, we track the size of the write cache, and provide StorageBackend::flush_cache_evictions_to_db to bulk-write mutations to disk when the write cache has become too large. In these cases we can’t know at the time of disk-writing if the written mutations will persist. So, we may end up with unused temporary values in the DB, and a separate GC operation is responsible for cleaning these up periodically.
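A rough sketch of size-triggered flushing follows. It is an illustration under simplifying assumptions, not the behavior of StorageBackend::flush_cache_evictions_to_db: the `WriteCache` type, its fields, the byte limit, and the `HashMap` used as a stand-in for the database are all hypothetical, and the sketch flushes every pending entry rather than selecting evictions.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an `ArenaHash` key.
type Key = u64;

/// Toy write cache that tracks the total size of its pending objects.
struct WriteCache {
    pending: HashMap<Key, Vec<u8>>,
    pending_bytes: usize,
    /// Assumed size limit above which pending writes are flushed.
    max_bytes: usize,
    /// Stand-in for the on-disk database.
    db: HashMap<Key, Vec<u8>>,
    /// Number of bulk writes performed so far.
    flushes: usize,
}

impl WriteCache {
    fn new(max_bytes: usize) -> Self {
        WriteCache {
            pending: HashMap::new(),
            pending_bytes: 0,
            max_bytes,
            db: HashMap::new(),
            flushes: 0,
        }
    }

    /// Queue an object for writing, keeping the byte total accurate even
    /// when an existing pending entry is replaced.
    fn insert(&mut self, key: Key, bytes: Vec<u8>) {
        let len = bytes.len();
        if let Some(old) = self.pending.insert(key, bytes) {
            self.pending_bytes -= old.len();
        }
        self.pending_bytes += len;
    }

    /// Write all pending objects in one batch if the cache has grown past
    /// its limit. Returns whether a flush happened.
    fn flush_if_over_limit(&mut self) -> bool {
        if self.pending_bytes <= self.max_bytes {
            return false;
        }
        for (key, bytes) in self.pending.drain() {
            self.db.insert(key, bytes);
        }
        self.pending_bytes = 0;
        self.flushes += 1;
        true
    }
}
```

Note that after such a flush the stand-in database may hold objects that are later dropped by the caller, which is the situation the separate GC operation described above is meant to clean up.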

§Assumptions

- the database is not changing under our feet, meaning in particular that there is only one back-end. When an object obj is cached into the back-end, what happens depends on whether obj is already in the database. To support the database changing under our feet, we would need to handle the case where obj was in the database when it was cached, but was then removed from or modified in the database by another thread before obj was uncached.
- there is only one “logical” arena calling into / manipulating the back-end. Here “one logical arena” includes multiple clones of a single initial arena, since cloned arenas share their metadata structures and avoid caching the same object more than once. Indeed, the key assumption we make here is that no object will be cached more than once, independently. It would be possible to support multiple, distinct arenas – non-clones, with distinct metadata – manipulating the back-end, but this would require more careful tracking of cache calls, to handle the case where two different arenas cache the same object, and we need to be careful to keep it around until all these arenas are done using it.
- objects are cached only after all of their children have been cached or are already in the DB. In particular, this means the arena is responsible for sanitizing user-controlled inputs before passing them to the back-end, to make sure they are well formed in terms of child references.
- if the caller caches an object obj, and wants obj to continue to exist in the back-end, then before uncaching obj, the caller needs to first do either of:
  - cache but not uncache another object which has obj in its transitive closure.
  - persist an object that has obj in its transitive closure.

  Note that it’s okay for the caller to e.g. uncache interior nodes of a large data structure, as long as a reference to the root has been persisted or cached-but-not-uncached.
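The children-before-parents assumption can be expressed as a small check. The sketch below is purely illustrative: `first_ill_formed` and the `u64` key type are hypothetical, and the real arena may enforce this ordering differently.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for an `ArenaHash` key.
type Key = u64;

/// Given the keys already in the database and a proposed sequence of
/// `(key, children)` cache calls, return the first key that would be
/// cached before one of its children is known, or `None` if the whole
/// sequence respects the children-first ordering.
fn first_ill_formed(db: &HashSet<Key>, order: &[(Key, Vec<Key>)]) -> Option<Key> {
    // Keys that are either in the DB or cached earlier in the sequence.
    let mut known: HashSet<Key> = db.clone();
    for (key, children) in order {
        if children.iter().any(|child| !known.contains(child)) {
            return Some(*key);
        }
        known.insert(*key);
    }
    None
}
```

For example, with key 1 already in the database, caching 2 (child 1) and then 3 (children 2 and 1) is well formed, whereas caching 3 (child 2) before 2 is not.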

§Terminology and APIs

A key will never be in read_cache and write_cache at the same time, but may be in the database and either cache at the same time. If a key is in either cache, then we say it’s “in memory”. If a key is in memory, then the value stored in memory under that key describes the canonical version of the object.

The back-end provides various public APIs related to manipulating in-memory representations of objects. The get API brings an object into memory from the database. The cache API attempts to create a new object in memory, but falls back on any existing version already in memory or the DB. The uncache API informs the back-end that an object is no longer of interest to the caller, which allows the back-end to remove it from memory if it has no references or pending updates in memory. These APIs act on ArenaHash keys and OnDiskObject values, but internally manipulate more complex states, in the form of CacheValue values.
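The relationship between the two caches and the three APIs can be sketched with a toy model. Everything here is an assumption made for illustration: `ToyBackend`, the `String` object type, and the method bodies are hypothetical, and the model deliberately omits reference counts and pending updates, so its `uncache` always removes the key from memory.

```rust
use std::collections::HashMap;

/// Hypothetical stand-ins for `ArenaHash` and `OnDiskObject`.
type Key = u64;
type Object = String;

#[derive(Default)]
struct ToyBackend {
    db: HashMap<Key, Object>,
    read_cache: HashMap<Key, Object>,
    write_cache: HashMap<Key, Object>,
}

impl ToyBackend {
    /// A key is "in memory" if it is in either cache.
    fn in_memory(&self, key: &Key) -> bool {
        self.read_cache.contains_key(key) || self.write_cache.contains_key(key)
    }

    /// Invariant from the text: no key is in both caches at once.
    fn caches_disjoint(&self) -> bool {
        self.read_cache.keys().all(|k| !self.write_cache.contains_key(k))
    }

    /// Return the in-memory version, loading it from the DB if needed.
    fn get(&mut self, key: &Key) -> Option<&Object> {
        if self.write_cache.contains_key(key) {
            return self.write_cache.get(key);
        }
        if !self.read_cache.contains_key(key) {
            let obj = self.db.get(key)?.clone();
            self.read_cache.insert(*key, obj);
        }
        self.read_cache.get(key)
    }

    /// Create a new object in memory, falling back on any version that
    /// already exists in memory or in the DB.
    fn cache(&mut self, key: Key, obj: Object) {
        if self.in_memory(&key) {
            return;
        }
        if let Some(existing) = self.db.get(&key).cloned() {
            self.read_cache.insert(key, existing);
        } else {
            self.write_cache.insert(key, obj);
        }
    }

    /// Simplified: with no reference counts modeled, uncaching always
    /// removes the key from memory.
    fn uncache(&mut self, key: &Key) {
        self.read_cache.remove(key);
        self.write_cache.remove(key);
    }
}
```

In this model an object that already exists in the stand-in database lands in `read_cache` when cached, while a genuinely new object lands in `write_cache`, which keeps the two caches disjoint.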

Implementations§

impl<D> StorageBackend<D> where D: DB,

pub fn get( &mut self, key: &ArenaHash<<D as DB>::Hasher>, ) -> Option<&OnDiskObject<<D as DB>::Hasher>>

pub fn get_stats(&self) -> StorageBackendStats

pub fn get_roots(&self) -> HashMap<ArenaHash<<D as DB>::Hasher>, u32>

pub fn unpersist(&mut self, key: &ArenaHash<<D as DB>::Hasher>)

pub fn persist(&mut self, key: &ArenaHash<<D as DB>::Hasher>)

pub fn pre_fetch( &mut self, key: &ArenaHash<<D as DB>::Hasher>, max_depth: Option<usize>, truncate: bool, )

pub fn flush_cache_evictions_to_db(&mut self)

§Note

pub fn flush_all_changes_to_db(&mut self)

pub fn gc(&mut self)

§Note

pub fn get_write_cache_len(&self) -> usize

§Note

pub fn get_write_cache_obj_bytes(&self) -> usize

§Note

Trait Implementations§

impl<D> Debug for StorageBackend<D> where D: Debug + DB, <D as DB>::Hasher: Debug,

fn fmt(&self, f: &mut Formatter<'_>) -> Result<(), Error>

Auto Trait Implementations§

impl<D> !Freeze for StorageBackend<D>

impl<D> !RefUnwindSafe for StorageBackend<D>

impl<D> Send for StorageBackend<D>

impl<D> !Sync for StorageBackend<D>

impl<D> Unpin for StorageBackend<D> where D: Unpin, <<<D as DB>::Hasher as OutputSizeUser>::OutputSize as ArrayLength<u8>>::ArrayType: Unpin,

impl<D> UnsafeUnpin for StorageBackend<D> where D: UnsafeUnpin,

impl<D> UnwindSafe for StorageBackend<D> where D: UnwindSafe, <<<D as DB>::Hasher as OutputSizeUser>::OutputSize as ArrayLength<u8>>::ArrayType: UnwindSafe + RefUnwindSafe,

Blanket Implementations§

impl<T> Any for T where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

impl<T> Borrow<T> for T where T: ?Sized,

fn borrow(&self) -> &T

impl<T> BorrowMut<T> for T where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

impl<T> Conv for T

fn conv<T>(self) -> T where Self: Into<T>,

impl<T> Fake for T

fn fake<U>(&self) -> U where Self: FakeBase<U>,

fn fake_with_rng<U, R>(&self, rng: &mut R) -> U where R: Rng + ?Sized, Self: FakeBase<U>,

impl<T> FmtForward for T

fn fmt_binary(self) -> FmtBinary<Self> where Self: Binary,

fn fmt_display(self) -> FmtDisplay<Self> where Self: Display,

fn fmt_lower_exp(self) -> FmtLowerExp<Self> where Self: LowerExp,

fn fmt_lower_hex(self) -> FmtLowerHex<Self> where Self: LowerHex,

fn fmt_octal(self) -> FmtOctal<Self> where Self: Octal,

fn fmt_pointer(self) -> FmtPointer<Self> where Self: Pointer,

fn fmt_upper_exp(self) -> FmtUpperExp<Self> where Self: UpperExp,

fn fmt_upper_hex(self) -> FmtUpperHex<Self> where Self: UpperHex,

fn fmt_list(self) -> FmtList<Self> where &'a Self: for<'a> IntoIterator,

impl<T> From<T> for T

fn from(t: T) -> T

impl<T> Instrument for T

fn instrument(self, span: Span) -> Instrumented<Self>

fn in_current_span(self) -> Instrumented<Self>

impl<T, U> Into<U> for T where U: From<T>,

fn into(self) -> U

impl<T> Pipe for T where T: ?Sized,

fn pipe<R>(self, func: impl FnOnce(Self) -> R) -> R where Self: Sized,

fn pipe_ref<'a, R>(&'a self, func: impl FnOnce(&'a Self) -> R) -> R where R: 'a,

fn pipe_ref_mut<'a, R>(&'a mut self, func: impl FnOnce(&'a mut Self) -> R) -> R where R: 'a,

fn pipe_borrow<'a, B, R>(&'a self, func: impl FnOnce(&'a B) -> R) -> R where Self: Borrow<B>, B: 'a + ?Sized, R: 'a,

fn pipe_borrow_mut<'a, B, R>( &'a mut self, func: impl FnOnce(&'a mut B) -> R, ) -> R where Self: BorrowMut<B>, B: 'a + ?Sized, R: 'a,

fn pipe_as_ref<'a, U, R>(&'a self, func: impl FnOnce(&'a U) -> R) -> R where Self: AsRef<U>, U: 'a + ?Sized, R: 'a,

fn pipe_as_mut<'a, U, R>(&'a mut self, func: impl FnOnce(&'a mut U) -> R) -> R where Self: AsMut<U>, U: 'a + ?Sized, R: 'a,

fn pipe_deref<'a, T, R>(&'a self, func: impl FnOnce(&'a T) -> R) -> R where Self: Deref<Target = T>, T: 'a + ?Sized, R: 'a,

fn pipe_deref_mut<'a, T, R>( &'a mut self, func: impl FnOnce(&'a mut T) -> R, ) -> R where Self: DerefMut<Target = T> + Deref, T: 'a + ?Sized, R: 'a,

impl<T> PolicyExt for T where T: ?Sized,

fn and<P, B, E>(self, other: P) -> And<T, P> where T: Policy<B, E>, P: Policy<B, E>,

fn or<P, B, E>(self, other: P) -> Or<T, P> where T: Policy<B, E>, P: Policy<B, E>,

impl<T> Same for T

type Output = T

impl<T> Tap for T

fn tap(self, func: impl FnOnce(&Self)) -> Self

fn tap_mut(self, func: impl FnOnce(&mut Self)) -> Self

fn tap_borrow<B>(self, func: impl FnOnce(&B)) -> Self
where Self: Borrow<B>, B: ?Sized,

fn tap_borrow_mut<B>(self, func: impl FnOnce(&mut B)) -> Self
where Self: BorrowMut<B>, B: ?Sized,

fn tap_ref<R>(self, func: impl FnOnce(&R)) -> Self
where Self: AsRef<R>, R: ?Sized,

fn tap_ref_mut<R>(self, func: impl FnOnce(&mut R)) -> Self
where Self: AsMut<R>, R: ?Sized,

fn tap_deref<T>(self, func: impl FnOnce(&T)) -> Self
where Self: Deref<Target = T>, T: ?Sized,

fn tap_deref_mut<T>(self, func: impl FnOnce(&mut T)) -> Self
where Self: DerefMut<Target = T> + Deref, T: ?Sized,

fn tap_borrow_dbg<B>(self, func: impl FnOnce(&B)) -> Self
where Self: Borrow<B>, B: ?Sized,

fn tap_borrow_mut_dbg<B>(self, func: impl FnOnce(&mut B)) -> Self
where Self: BorrowMut<B>, B: ?Sized,

fn tap_ref_dbg<R>(self, func: impl FnOnce(&R)) -> Self
where Self: AsRef<R>, R: ?Sized,

fn tap_ref_mut_dbg<R>(self, func: impl FnOnce(&mut R)) -> Self
where Self: AsMut<R>, R: ?Sized,

fn tap_deref_dbg<T>(self, func: impl FnOnce(&T)) -> Self
where Self: Deref<Target = T>, T: ?Sized,

fn tap_deref_mut_dbg<T>(self, func: impl FnOnce(&mut T)) -> Self
where Self: DerefMut<Target = T> + Deref, T: ?Sized,

fn try_conv<T>(self) -> Result<T, Self::Error>
where Self: TryInto<T>,

impl<T, U> TryFrom<U> for T
where U: Into<T>,

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,