pub fn iter_from_counts<Find>(
    counts: Vec<Count>,
    db: Find,
    progress: Box<dyn DynNestedProgress + 'static>,
    _: Options
) -> impl Iterator<Item = Result<(SequenceId, Vec<Entry>), Error>> + Finalize<Reduce = Statistics<Error>>
where
    Find: Find + Send + Clone + 'static,
Available on crate feature generate only.

Given a known list of object counts, calculate entries ready to be put into a data pack.

This allows objects to be written quite soon, without having to wait for the entire pack to be built in memory. A chunk of objects is held in memory, compressed using DEFLATE, and served as the output of this iterator. That way slow writers naturally apply back pressure, which tells the implementation that more time can be spent compressing objects.
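The chunk-at-a-time behaviour described above can be illustrated with a minimal, single-threaded sketch. Everything in it is an assumption made for illustration and not part of this crate's API: the `Entry` struct, the `chunked_entries` function, and the `compress` helper, which is a trivial run-length encoder standing in for DEFLATE so that the example needs no external crates. The point is only that a chunk is compressed when the consumer asks for it, so a slow consumer delays further work.

```rust
/// Hypothetical stand-in for a pack entry: an id plus its compressed bytes.
#[derive(Debug, PartialEq)]
pub struct Entry {
    pub id: u32,
    pub compressed: Vec<u8>,
}

/// Placeholder for DEFLATE (assumption): a trivial run-length encoding that
/// emits `(run_length, byte)` pairs. It is not real compression.
fn compress(data: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut iter = data.iter().copied().peekable();
    while let Some(byte) = iter.next() {
        let mut run = 1u8;
        while run < u8::MAX && iter.peek() == Some(&byte) {
            iter.next();
            run += 1;
        }
        out.push(run);
        out.push(byte);
    }
    out
}

/// Lazily yields `(sequence_number, chunk)` pairs, one chunk per `next()` call.
/// No object is compressed until the consumer pulls the chunk containing it.
pub fn chunked_entries(
    objects: Vec<(u32, Vec<u8>)>,
    chunk_size: usize,
) -> impl Iterator<Item = (usize, Vec<Entry>)> {
    assert!(chunk_size > 0, "chunk_size must be non-zero");
    let mut objects = objects.into_iter();
    let mut sequence = 0usize;
    std::iter::from_fn(move || {
        // Take at most `chunk_size` objects and compress only those.
        let chunk: Vec<Entry> = objects
            .by_ref()
            .take(chunk_size)
            .map(|(id, data)| Entry { id, compressed: compress(&data) })
            .collect();
        if chunk.is_empty() {
            return None; // input exhausted
        }
        let current = sequence;
        sequence += 1;
        Some((current, chunk))
    })
}
```

A writer would loop over `chunked_entries(objects, n)` and write each chunk before requesting the next one. If the write blocks, the next chunk is simply not computed yet, which is the back pressure mentioned above.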

  • counts
    • A list of previously counted objects to add to the pack. Duplication checks are not performed; no object is expected to be duplicated.
  • progress
    • A way to obtain progress information.
  • options
    • More configuration.

Returns the checksum of the pack

Discussion

Advantages

  • Begins writing immediately and supports back-pressure.
  • Abstracts over object databases and how input is provided.

Disadvantages

  • Currently there is no way to easily write the pack index, even though the state here is uniquely positioned to do so with minimal overhead (especially compared to gix index-from-pack). This probably works now by chaining iterators, or by keeping enough state to write a pack and then generating an index from the recorded data.
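
The "chaining iterators" idea in the last point could look roughly like the sketch below: an adapter passes entries through unchanged while recording where each one would land in the pack, so an index could be generated afterwards from the recorded data. `IndexRecord` and `Recording` are hypothetical names invented for this illustration, entries are reduced to `(id, bytes)` pairs, and nothing here reflects how this crate actually writes an index.

```rust
/// Hypothetical record of where an entry would start within the pack.
#[derive(Debug, PartialEq, Clone)]
pub struct IndexRecord {
    pub id: u32,
    pub offset: u64,
}

/// Iterator adapter (assumption) that forwards entries and records their offsets.
pub struct Recording<I> {
    inner: I,
    offset: u64,
    records: Vec<IndexRecord>,
}

impl<I> Recording<I> {
    /// `start_offset` is the position of the first entry, i.e. the header size.
    pub fn new(inner: I, start_offset: u64) -> Self {
        Recording { inner, offset: start_offset, records: Vec::new() }
    }

    /// Hand back the recorded data once all entries were consumed.
    pub fn into_records(self) -> Vec<IndexRecord> {
        self.records
    }
}

impl<I> Iterator for Recording<I>
where
    I: Iterator<Item = (u32, Vec<u8>)>,
{
    type Item = (u32, Vec<u8>);

    fn next(&mut self) -> Option<Self::Item> {
        let (id, bytes) = self.inner.next()?;
        // Remember where this entry starts, then advance past its bytes.
        self.records.push(IndexRecord { id, offset: self.offset });
        self.offset += bytes.len() as u64;
        Some((id, bytes))
    }
}
```

The writer consumes the adapter through `by_ref()` as it would consume the plain iterator, then calls `into_records()` to obtain the id and offset pairs. A start offset of 12 would correspond to the 12-byte pack header.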