Crate value_log

Generic value log implementation for key-value separated storage.

This crate is intended as a building block for key-value separated storage. You probably want to use https://github.com/fjall-rs/fjall instead.

The value log’s contents are split into segments; each segment holds a sorted list of key-value pairs:

[k0, v0][k1, v1][k2, v2][k3, v3][k4, v4]

The value log itself does not have an index. To retrieve an item efficiently, a ValueHandle must first be obtained from an IndexReader; the handle can then be used to load the value from the value log.
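The two-step lookup described above can be sketched with a toy, in-memory model. This is only an illustration under stated assumptions: `Handle`, `ToyValueLog`, `ToyIndex` and `point_read` are hypothetical names invented here, and the handle layout (segment ID plus position) is a guess, not the crate's actual `ValueHandle` or `IndexReader` API.

```rust
use std::collections::BTreeMap;

/// Hypothetical handle: which segment holds the value, and where inside it.
/// The crate's real `ValueHandle` may carry different fields.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Handle {
    segment_id: u64,
    offset: usize,
}

/// Toy in-memory "value log": each segment is a list of pairs sorted by key.
struct ToyValueLog {
    segments: BTreeMap<u64, Vec<(Vec<u8>, Vec<u8>)>>,
}

/// Toy external index: maps a key to the handle of its value.
struct ToyIndex {
    handles: BTreeMap<Vec<u8>, Handle>,
}

impl ToyValueLog {
    /// Sorts a batch by key, stores it as a new segment, and records
    /// one handle per key in the external index.
    fn write_segment(
        &mut self,
        id: u64,
        mut items: Vec<(Vec<u8>, Vec<u8>)>,
        index: &mut ToyIndex,
    ) {
        items.sort_by(|a, b| a.0.cmp(&b.0));
        for (offset, (key, _)) in items.iter().enumerate() {
            index
                .handles
                .insert(key.clone(), Handle { segment_id: id, offset });
        }
        self.segments.insert(id, items);
    }

    /// Resolves a handle to a value. No key search happens here:
    /// the log only follows the pointer it is given.
    fn get(&self, handle: Handle) -> Option<&[u8]> {
        self.segments
            .get(&handle.segment_id)?
            .get(handle.offset)
            .map(|(_, value)| value.as_slice())
    }
}

/// Point read: index lookup first, then value log lookup.
fn point_read<'a>(index: &ToyIndex, log: &'a ToyValueLog, key: &[u8]) -> Option<&'a [u8]> {
    let handle = *index.handles.get(key)?;
    log.get(handle)
}
```

In this sketch the index is the only structure that knows about keys; the log resolves handles and nothing else, which is why a point read costs one extra hop.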

Recently retrieved (“hot”) items may be cached by an in-memory value cache to avoid repeated disk accesses.
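As a rough illustration of how such a cache avoids repeated disk accesses, here is a minimal read-through sketch. It does not implement the crate's `BlobCache` trait; `ToyBlobCache`, `CacheKey` and `get_or_load` are hypothetical names, the key layout is an assumption, and the cache is unbounded, whereas a real one would evict entries.

```rust
use std::collections::HashMap;

/// Hypothetical cache key: (segment ID, offset within the segment).
type CacheKey = (u64, u64);

/// Toy read-through cache. It never evicts, so it is only suitable
/// for illustrating the hit/miss behaviour.
struct ToyBlobCache {
    entries: HashMap<CacheKey, Vec<u8>>,
    /// Counts how often the loader (standing in for a disk read) ran.
    disk_reads: usize,
}

impl ToyBlobCache {
    fn new() -> Self {
        Self {
            entries: HashMap::new(),
            disk_reads: 0,
        }
    }

    /// Returns the cached blob, or calls `load` on a miss and caches the result.
    fn get_or_load<F>(&mut self, key: CacheKey, load: F) -> &[u8]
    where
        F: FnOnce() -> Vec<u8>,
    {
        if !self.entries.contains_key(&key) {
            self.disk_reads += 1;
            self.entries.insert(key, load());
        }
        &self.entries[&key]
    }
}
```

On a second request for the same key the loader closure is never invoked, so `disk_reads` stays unchanged.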

As data changes, old values will unnecessarily occupy disk space. As space amplification increases, stale data needs to be discarded by rewriting old segments (garbage collection). This process can happen on-line.
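To make the garbage collection trade-off concrete, the sketch below computes space amplification as bytes on disk divided by bytes still live, and selects segments whose stale fraction reaches a threshold. It is an assumption-laden illustration: `SegmentStats`, `space_amp` and `pick_stale` are hypothetical, the ratio is measured in bytes (the crate's `StaleThresholdStrategy` is described in terms of stale blobs), and none of this is the crate's `GcStrategy` interface.

```rust
/// Hypothetical per-segment statistics; field names are illustrative only.
#[derive(Debug, Clone, PartialEq)]
struct SegmentStats {
    id: u64,
    total_bytes: u64,
    stale_bytes: u64,
}

/// Space amplification = bytes on disk / bytes still referenced.
/// Returns 1.0 for an empty log and infinity if everything is stale.
fn space_amp(segments: &[SegmentStats]) -> f64 {
    let total: u64 = segments.iter().map(|s| s.total_bytes).sum();
    let stale: u64 = segments.iter().map(|s| s.stale_bytes).sum();
    let live = total.saturating_sub(stale);
    if total == 0 {
        return 1.0;
    }
    if live == 0 {
        return f64::INFINITY;
    }
    total as f64 / live as f64
}

/// Picks the IDs of segments whose stale ratio is at least `threshold`
/// (expected range: 0.0..=1.0). Empty segments are skipped.
fn pick_stale(segments: &[SegmentStats], threshold: f64) -> Vec<u64> {
    segments
        .iter()
        .filter(|s| s.total_bytes > 0)
        .filter(|s| s.stale_bytes as f64 / s.total_bytes as f64 >= threshold)
        .map(|s| s.id)
        .collect()
}
```

For example, three segments of 100, 100 and 200 bytes with 80, 20 and 100 stale bytes hold 400 bytes on disk for 200 live bytes, a space amplification of 2.0; with a threshold of 0.5, the first and third segments would be chosen for rewriting.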

Segments are sorted internally, which may help with range scans, but data may not be stored contiguously, which hurts range read performance. Point reads also require an extra level of indirection, because the value handle must first be retrieved from the index. However, this index is generally small, so ideally it can be cached efficiently. Because compaction needs to rewrite less data, more disk I/O is available for serving write and read requests.

In summary, a value log accepts higher read and space amplification in exchange for lower write amplification when storing large blobs.

Use a value log when:

  • you are storing large values (HTML pages, big JSON, small images, archiving, …)
  • your data is rarely deleted or updated, or you do not have strict disk space requirements
  • your access pattern is point read heavy

Structs

Config
Value log configuration
GcReport
Statistics report for garbage collection
SegmentWriter
Segment writer, may write multiple segments
Slice
An immutable byte slice that can be cloned without additional heap allocation
SpaceAmpStrategy
Tries to find a least-effort-selection of segments to merge to reach a certain space amplification
StaleThresholdStrategy
Picks segments that have a certain percentage of stale blobs
ValueHandle
A value handle points into the value log
ValueLog
A disk-resident value log

Enums

Error
Represents errors that can occur in the value log
Version
Disk format version

Traits

BlobCache
Blob cache, in which blobs are cached in-memory after being retrieved from disk
Compressor
Generic compression trait
GcStrategy
GC strategy
IndexReader
Trait that allows reading from an external index
IndexWriter
Trait that allows writing into an external index

Type Aliases

Result
Value log result
UserKey
User defined key
UserValue
User defined data (blob of bytes)
ValueLogId
Unique value log ID