Crate spillover

Generic, disk-spilling external sort with pluggable deduplication.

spillover provides the core machinery for sorting datasets that exceed available memory by flushing sorted runs to temporary files on disk and merging them back via a k-way merge. It is deliberately unopinionated about the data being sorted, the sort key, the deduplication strategy, and the on-disk serialization format.
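The run-then-merge approach described above can be sketched with the standard library alone. This is a minimal illustration, not spillover's API: the functions `external_sort`, `spill_run`, and `next_value`, the `u64` item type, and the one-number-per-line text format are all assumptions made for the example.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;
use std::fs::{self, File};
use std::io::{self, BufRead, BufReader, BufWriter, Lines, Write};
use std::path::{Path, PathBuf};

/// Hypothetical helper: sort one in-memory chunk and flush it to a run file.
fn spill_run(chunk: &mut Vec<u64>, dir: &Path, index: usize) -> io::Result<PathBuf> {
    chunk.sort_unstable();
    let path = dir.join(format!("run-{index}.txt"));
    let mut out = BufWriter::new(File::create(&path)?);
    // Assumed on-disk format: one decimal number per line.
    for value in chunk.drain(..) {
        writeln!(out, "{value}")?;
    }
    out.flush()?;
    Ok(path)
}

/// Hypothetical helper: read the next value from a run file, if any.
fn next_value(lines: &mut Lines<BufReader<File>>) -> io::Result<Option<u64>> {
    match lines.next() {
        None => Ok(None),
        Some(line) => line?
            .trim()
            .parse::<u64>()
            .map(Some)
            .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e)),
    }
}

/// Sort `input` while buffering at most `chunk_size` items before spilling.
/// The result is collected into a Vec for brevity; a real pipeline would
/// stream the merged output instead.
fn external_sort(
    input: impl IntoIterator<Item = u64>,
    chunk_size: usize,
    tmp_dir: &Path,
) -> io::Result<Vec<u64>> {
    assert!(chunk_size > 0, "chunk_size must be positive");
    fs::create_dir_all(tmp_dir)?;

    // Phase 1: cut the input into sorted runs on disk.
    let mut run_paths: Vec<PathBuf> = Vec::new();
    let mut chunk: Vec<u64> = Vec::with_capacity(chunk_size);
    for item in input {
        chunk.push(item);
        if chunk.len() == chunk_size {
            let index = run_paths.len();
            run_paths.push(spill_run(&mut chunk, tmp_dir, index)?);
        }
    }
    if !chunk.is_empty() {
        let index = run_paths.len();
        run_paths.push(spill_run(&mut chunk, tmp_dir, index)?);
    }

    // Phase 2: k-way merge. The min-heap holds one head value per run.
    let mut readers = Vec::new();
    let mut heap = BinaryHeap::new();
    for (run, path) in run_paths.iter().enumerate() {
        let mut lines = BufReader::new(File::open(path)?).lines();
        if let Some(value) = next_value(&mut lines)? {
            heap.push(Reverse((value, run)));
        }
        readers.push(lines);
    }

    let mut sorted = Vec::new();
    while let Some(Reverse((value, run))) = heap.pop() {
        sorted.push(value);
        // Refill the heap from the run that just yielded its head.
        if let Some(next) = next_value(&mut readers[run])? {
            heap.push(Reverse((next, run)));
        }
    }

    // Best-effort cleanup of the temporary run files.
    for path in &run_paths {
        let _ = fs::remove_file(path);
    }
    Ok(sorted)
}
```

Calling `external_sort(values, 3, &dir)` on seven values writes three run files and merges them through a heap that never holds more than three entries, which is the property that keeps memory use bounded by the chunk size and the number of runs rather than by the dataset size.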

Domain-specific crates (such as spillover-bio for genomics) supply their own implementations of the traits behind these extension points to build a complete sorting pipeline tailored to their data types and workflows.
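To make the idea of injected behavior concrete, the following sketch shows how a key-extraction trait and a deduplication trait might be combined over an already-sorted stream. The traits `KeyOf` and `Dedup`, the function `dedup_sorted`, and the `Record` type are hypothetical names invented for this illustration; they are not taken from spillover, whose actual trait names and signatures are documented in the modules listed below.

```rust
/// Hypothetical trait: extracts the sort key from an item.
trait KeyOf<T> {
    type Key: Ord;
    fn key(&self, item: &T) -> Self::Key;
}

/// Hypothetical trait: picks the survivor among two items with equal keys.
trait Dedup<T> {
    fn resolve(&self, kept: T, candidate: T) -> T;
}

/// Collapse adjacent equal-key items in a sorted sequence.
/// Because the input is sorted by key, duplicates are always neighbors,
/// so only the most recently emitted item needs to be compared.
fn dedup_sorted<T, K, D>(sorted: impl IntoIterator<Item = T>, key: &K, dedup: &D) -> Vec<T>
where
    K: KeyOf<T>,
    D: Dedup<T>,
{
    let mut out: Vec<T> = Vec::new();
    for item in sorted {
        match out.pop() {
            // Same key as the previous item: let the strategy decide.
            Some(prev) if key.key(&prev) == key.key(&item) => {
                out.push(dedup.resolve(prev, item));
            }
            // Different key: keep both.
            Some(prev) => {
                out.push(prev);
                out.push(item);
            }
            None => out.push(item),
        }
    }
    out
}

// A domain crate would supply its own item type and strategies (all hypothetical).
#[derive(Debug, Clone, PartialEq)]
struct Record {
    id: u32,
    score: u32,
}

struct ById;
impl KeyOf<Record> for ById {
    type Key = u32;
    fn key(&self, item: &Record) -> u32 {
        item.id
    }
}

struct KeepHighestScore;
impl Dedup<Record> for KeepHighestScore {
    fn resolve(&self, kept: Record, candidate: Record) -> Record {
        if candidate.score > kept.score { candidate } else { kept }
    }
}
```

With these definitions, `dedup_sorted(records, &ById, &KeepHighestScore)` keeps one record per `id`, choosing the one with the highest score. Swapping in a different `Dedup` implementation changes the policy without touching the merge loop, which is the kind of separation the crate description points to.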

Modules§

chunk
In-memory chunk sorting strategies.
codec
Serialization traits for reading and writing items to disk.
compare
Key comparison traits and built-in comparators.
dedup
Post-merge deduplication strategies.
key
Sort key extraction from items.
merge
K-way merge of sorted runs from temporary files on disk.
sorter
The external sorter — the main entry point for sorting larger-than-memory datasets.

Enums§

SpilloverError
Errors originating from spillover’s own operations.

Traits§

GetSize
Determine the size in bytes an object occupies inside RAM.

Derive Macros§

GetSize