shardio 0.6.0

Out-of-memory sorting and streaming of large datasets
docs.rs failed to build shardio-0.6.0
Please check the build logs for more information.
See Builds for ideas on how to fix a failed build, or Metadata for how to configure docs.rs builds.
If you believe this is docs.rs' fault, open an issue.
Visit the last successful build: shardio-0.8.2

rust-shardio

Crates.io Downloads Crates.io Version Crates.io License Build Status Coverage Status API Docs

Library for handling out-of-memory sorting of large datasets which need to be processed in multiple passes map / sort / reduce passes.

You write a stream of items of type T implementing Serialize and Deserialize to a ShardWriter. The items are buffered, sorted according to a customizable sort key, then serialized to disk in chunks with serde + lz4 and written to disk, while maintaing an index of the position and key range of each chunk. You use a ShardReader to stream through a item in a selected interval of the key space, in sorted order.

See Docs for API and examples.

Note: Enable the 'full-test' feature in Release mode to turn on some long-running stress tests.