Generic, disk-spilling external sort with pluggable deduplication.
spillover provides the core machinery for sorting datasets that
exceed available memory by flushing sorted runs to temporary files
on disk and merging them back via a k-way merge. It is deliberately
unopinionated about the data being sorted, the sort key, the
deduplication strategy, and the on-disk serialization format.
Domain-specific crates (such as spillover-bio for genomics) supply
their own implementations of the corresponding traits to build a
complete sorting pipeline tailored to their data types and workflows.
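To make the spill-and-merge idea concrete, here is a minimal, std-only sketch of an external sort. It does not use spillover's API: the function names (`external_sort`, `spill`, `merge_runs`, `next_item`), the fixed `u64` item type, the item-count budget `chunk_len`, and the one-integer-per-line file format are all assumptions made for illustration. It fills a chunk, sorts it, writes it out as a sorted run, and then merges the runs with a min-heap.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;
use std::fs::{self, File};
use std::io::{self, BufRead, BufReader, BufWriter, Lines, Write};
use std::path::{Path, PathBuf};

/// Sort `input` while holding at most `chunk_len` items in memory per chunk.
/// Sorted runs are spilled to `dir`, which must already exist.
/// (Hypothetical helper, not part of spillover.)
pub fn external_sort(
    input: impl IntoIterator<Item = u64>,
    chunk_len: usize,
    dir: &Path,
) -> io::Result<Vec<u64>> {
    assert!(chunk_len > 0, "chunk_len must be positive");
    let mut runs: Vec<PathBuf> = Vec::new();
    let mut chunk: Vec<u64> = Vec::with_capacity(chunk_len);

    for item in input {
        chunk.push(item);
        if chunk.len() == chunk_len {
            let path = spill(&mut chunk, dir, runs.len())?;
            runs.push(path);
        }
    }
    if !chunk.is_empty() {
        let path = spill(&mut chunk, dir, runs.len())?;
        runs.push(path);
    }

    let merged = merge_runs(&runs);
    // Best-effort cleanup of the temporary run files.
    for path in &runs {
        let _ = fs::remove_file(path);
    }
    merged
}

/// Sort one chunk in memory and write it to disk, one integer per line.
/// The text format is a stand-in for a real serialization codec.
fn spill(chunk: &mut Vec<u64>, dir: &Path, index: usize) -> io::Result<PathBuf> {
    chunk.sort_unstable();
    let path = dir.join(format!("run-{index}.txt"));
    let mut out = BufWriter::new(File::create(&path)?);
    for item in chunk.drain(..) {
        writeln!(out, "{item}")?;
    }
    out.flush()?;
    Ok(path)
}

/// Read the next item from a run, or `None` once the run is exhausted.
fn next_item(lines: &mut Lines<BufReader<File>>) -> io::Result<Option<u64>> {
    match lines.next() {
        None => Ok(None),
        Some(line) => line?
            .trim()
            .parse::<u64>()
            .map(Some)
            .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e)),
    }
}

/// K-way merge: the heap holds the current head of every non-empty run.
fn merge_runs(runs: &[PathBuf]) -> io::Result<Vec<u64>> {
    let mut readers = Vec::with_capacity(runs.len());
    for path in runs {
        readers.push(BufReader::new(File::open(path)?).lines());
    }

    // `Reverse` turns the max-heap into a min-heap keyed on (value, run index).
    let mut heap = BinaryHeap::new();
    for (i, reader) in readers.iter_mut().enumerate() {
        if let Some(value) = next_item(reader)? {
            heap.push(Reverse((value, i)));
        }
    }

    let mut out = Vec::new();
    while let Some(Reverse((value, i))) = heap.pop() {
        out.push(value);
        // Refill the heap from the run that just yielded its head.
        if let Some(next) = next_item(&mut readers[i])? {
            heap.push(Reverse((next, i)));
        }
    }
    Ok(out)
}
```

Calling `external_sort(data, 3, &dir)` on seven items would write three run files and merge them. A real implementation would typically budget by bytes rather than item count and stream the merged output instead of collecting it into a `Vec`.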
Modules
- chunk
- In-memory chunk sorting strategies.
- codec
- Serialization traits for reading and writing items to disk.
- compare
- Key comparison traits and built-in comparators.
- dedup
- Post-merge deduplication strategies.
- key
- Sort key extraction from items.
- merge
- K-way merge of sorted runs from temporary files on disk.
- sorter
- The external sorter — the main entry point for sorting larger-than-memory datasets.
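The modules above suggest a pipeline assembled from small, swappable pieces. The sketch below shows one way key extraction and deduplication could be made pluggable. Every name in it (`KeyExtractor`, `DedupStrategy`, `Record`, `ById`, `SameId`, `sort_then_dedup`) is hypothetical and chosen for illustration; none of it is taken from spillover's actual trait definitions, and the sort here runs in memory rather than spilling to disk.

```rust
/// Hypothetical trait: derive an orderable key from an item.
pub trait KeyExtractor<T> {
    type Key: Ord;
    fn key(&self, item: &T) -> Self::Key;
}

/// Hypothetical trait: decide whether `next` duplicates the item already kept.
pub trait DedupStrategy<T> {
    fn is_duplicate(&self, kept: &T, next: &T) -> bool;
}

#[derive(Debug, Clone, PartialEq)]
pub struct Record {
    pub id: u32,
    pub payload: &'static str,
}

/// Sort records by their numeric id.
pub struct ById;

impl KeyExtractor<Record> for ById {
    type Key = u32;
    fn key(&self, item: &Record) -> u32 {
        item.id
    }
}

/// Treat records with the same id as duplicates, keeping the first one seen.
pub struct SameId;

impl DedupStrategy<Record> for SameId {
    fn is_duplicate(&self, kept: &Record, next: &Record) -> bool {
        kept.id == next.id
    }
}

/// Sort by the extracted key, then drop adjacent duplicates.
/// Deduplicating after the sort works because equal keys end up adjacent.
pub fn sort_then_dedup<T, K, D>(mut items: Vec<T>, key: &K, dedup: &D) -> Vec<T>
where
    K: KeyExtractor<T>,
    D: DedupStrategy<T>,
{
    // `sort_by` is stable, so items with equal keys keep their input order.
    items.sort_by(|a, b| key.key(a).cmp(&key.key(b)));
    // `dedup_by` passes (later, earlier) and removes `later` when the closure is true.
    items.dedup_by(|next, kept| dedup.is_duplicate(kept, next));
    items
}
```

Because the function is generic over both traits, a domain-specific crate could swap in a different key or a different notion of "duplicate" without touching the sorting logic.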
Enums
- SpilloverError
- Errors originating from spillover’s own operations.
Traits
- GetSize
- Determine the size in bytes an object occupies inside RAM.