Crate den

A difference library similar to rsync.

Performance

This library does not (for now) use a rolling hash algorithm, but it builds on the same theory. It is performant enough for similar files and for files under 10 MiB. Please compile with the release profile for about 10x the performance.
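For context on what a rolling hash would add, here is a minimal sketch of an rsync-style weak checksum built from two running sums. It is not this crate's code. `RollingSum` and its methods are hypothetical names chosen for illustration. The point is that sliding the window by one byte costs O(1), whereas hashing every window from scratch costs O(window length).

```rust
/// Hypothetical rsync-style weak checksum over a sliding window.
/// Illustration only; not part of this crate.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RollingSum {
    /// Sum of all bytes in the window.
    a: u32,
    /// Sum of all prefix sums, which weights earlier bytes more heavily.
    b: u32,
    /// Window length in bytes.
    len: u32,
}

impl RollingSum {
    /// Computes the checksum of `window` from scratch in O(window.len()).
    pub fn new(window: &[u8]) -> Self {
        let mut a = 0u32;
        let mut b = 0u32;
        for &byte in window {
            a = a.wrapping_add(byte as u32);
            b = b.wrapping_add(a);
        }
        Self { a, b, len: window.len() as u32 }
    }

    /// Slides the window one byte forward in O(1): `outgoing` leaves at the
    /// front, `incoming` enters at the back.
    pub fn roll(&mut self, outgoing: u8, incoming: u8) {
        self.a = self
            .a
            .wrapping_sub(outgoing as u32)
            .wrapping_add(incoming as u32);
        self.b = self
            .b
            .wrapping_sub(self.len.wrapping_mul(outgoing as u32))
            .wrapping_add(self.a);
    }

    /// Packs the low 16 bits of each sum into one 32-bit value.
    pub fn digest(&self) -> u32 {
        (self.b << 16) | (self.a & 0xffff)
    }
}
```

A checksum this weak collides easily, so a design based on it would normally confirm each candidate match with a stronger hash.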

Allocating data and keeping it in memory is very fast compared to hashing. The reason to move to readers would be to save memory. Even then, the implementation could abstract the file system so that this library is only given 64 KiB chunks at a time.
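As a rough sketch of the chunking idea, the helper below reads any `std::io::Read` source and hands the data on in pieces of at most 64 KiB. `for_each_chunk` and `CHUNK_SIZE` are hypothetical names, and nothing here is part of this crate. It only shows how a caller might bound memory use.

```rust
use std::io::{self, Read};

/// Chunk size mentioned in the text: 64 KiB.
pub const CHUNK_SIZE: usize = 64 * 1024;

/// Reads `reader` to the end and passes each chunk (at most `CHUNK_SIZE`
/// bytes) to `on_chunk`. Returns the total number of bytes read.
/// Hypothetical helper, not part of this crate.
pub fn for_each_chunk<R: Read>(
    mut reader: R,
    mut on_chunk: impl FnMut(&[u8]),
) -> io::Result<u64> {
    let mut buf = vec![0u8; CHUNK_SIZE];
    let mut total = 0u64;
    loop {
        // Fill the buffer as far as possible, so every chunk except the
        // last one is exactly CHUNK_SIZE bytes long.
        let mut filled = 0;
        while filled < buf.len() {
            match reader.read(&mut buf[filled..]) {
                Ok(0) => break, // end of input
                Ok(n) => filled += n,
                Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
                Err(e) => return Err(e),
            }
        }
        if filled == 0 {
            return Ok(total);
        }
        total += filled as u64;
        on_chunk(&buf[..filled]);
        if filled < buf.len() {
            // A short chunk means the reader hit end of input.
            return Ok(total);
        }
    }
}
```

Only one buffer of `CHUNK_SIZE` bytes is alive at any time, regardless of how large the input is.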

How-to & examples

Keep in mind that applying a diff isn’t guaranteed to reproduce the exact same data. Please check the result with a hash, for example SHA-3, to ensure consistency.
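The check could look roughly like the sketch below. The standard library has no SHA-3, so this uses 64-bit FNV-1a purely as a dependency-free stand-in. FNV-1a is not cryptographic, and real code should swap in a SHA-3 implementation. `fnv1a_64` and `verify` are hypothetical helpers, not part of this crate.

```rust
/// 64-bit FNV-1a. NOT cryptographic: a stand-in for SHA-3 so that this
/// sketch needs no external crates.
pub fn fnv1a_64(data: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    for &byte in data {
        hash ^= byte as u64;
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    hash
}

/// Compares the digest of the reconstructed data with the digest the
/// sender computed over the original data.
pub fn verify(reconstructed: &[u8], expected_digest: u64) -> Result<(), String> {
    let actual = fnv1a_64(reconstructed);
    if actual == expected_digest {
        Ok(())
    } else {
        Err(format!(
            "digest mismatch: expected {expected_digest:016x}, got {actual:016x}"
        ))
    }
}
```

The sender would transmit the digest of its data next to the diff. On a mismatch the receiver could fall back to requesting the full data.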

Get a remote’s data

To get someone else’s data, we construct a Signature of our own data and send it. The remote calculates a Difference using Signature::diff and sends it back. We then apply it with Difference::apply.
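To make the exchange concrete, here is a self-contained toy version of the three steps (signature, diff, apply). It is a sketch of the general technique under simplifying assumptions, not this crate's implementation or API. `ToySignature`, `ToySegment`, `signature`, `diff` and `apply` are hypothetical names, the block size must be non-zero, and each window is hashed from scratch with the standard library's `DefaultHasher` rather than with a rolling hash.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::Hasher;

/// Toy signature: hash of every full block of the base data -> block index.
pub struct ToySignature {
    block_size: usize,
    blocks: HashMap<u64, usize>,
}

/// Toy segment: a reference to a block of the base data, or literal bytes.
#[derive(Debug, PartialEq)]
pub enum ToySegment {
    Ref { block_index: usize },
    Unknown(Vec<u8>),
}

fn hash_block(block: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    hasher.write(block);
    hasher.finish()
}

/// Step 1 (we): summarize the data we already have.
pub fn signature(base: &[u8], block_size: usize) -> ToySignature {
    let blocks = base
        .chunks_exact(block_size)
        .enumerate()
        .map(|(index, block)| (hash_block(block), index))
        .collect();
    ToySignature { block_size, blocks }
}

/// Step 2 (remote): describe `target` in terms of blocks named in `sig`.
pub fn diff(sig: &ToySignature, target: &[u8]) -> Vec<ToySegment> {
    let mut segments = Vec::new();
    let mut literal = Vec::new();
    let mut pos = 0;
    while pos < target.len() {
        let end = pos + sig.block_size;
        // Hash the window starting at `pos` from scratch (no rolling hash).
        let hit = if end <= target.len() {
            sig.blocks.get(&hash_block(&target[pos..end])).copied()
        } else {
            None
        };
        match hit {
            Some(block_index) => {
                if !literal.is_empty() {
                    segments.push(ToySegment::Unknown(std::mem::take(&mut literal)));
                }
                segments.push(ToySegment::Ref { block_index });
                pos = end;
            }
            None => {
                literal.push(target[pos]);
                pos += 1;
            }
        }
    }
    if !literal.is_empty() {
        segments.push(ToySegment::Unknown(literal));
    }
    segments
}

/// Step 3 (we): rebuild the remote's data from our base and the segments.
pub fn apply(base: &[u8], block_size: usize, segments: &[ToySegment]) -> Vec<u8> {
    let mut out = Vec::new();
    for segment in segments {
        match segment {
            ToySegment::Ref { block_index } => {
                let start = block_index * block_size;
                out.extend_from_slice(&base[start..start + block_size]);
            }
            ToySegment::Unknown(bytes) => out.extend_from_slice(bytes),
        }
    }
    out
}
```

The remote only ever sees block hashes, so a hash collision would go unnoticed in `diff`. This is why the result should be verified independently.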

Push my data to remote

This is what rsync does.

Ask the remote for their Signature. They calculate it and send it back. We calculate a Difference using Signature::diff and send it to them. They apply it with Difference::apply. Their data should now be equal to ours.

Get the difference of a local file

This produces a small diff to send to others, similar to how git works.

base_data is considered prior knowledge. target_data is the modified data.

The data segments can be any size. Performance should still be good.

use den::Signature;

let base_data = b"This is a document everyone has. It's about some new difference library.";
let target_data = b"This is a document only I have. It's about some new difference library.";

let mut signature = Signature::new(128);
signature.write(base_data);
let signature = signature.finish();

let diff = signature.diff(target_data);

// This is the small diff you could serialize with Serde and send.
let minified = diff.minify(8, base_data)
    .expect("This won't panic, as the data hasn't changed from calling the other functions.");

Future improvements

  • Rolling hash
  • Multi-threaded Signature::diff
  • Support read/write
    • Support to diff a reader
    • Support to apply to a writer
    • Fetch API for apply to get data on demand.
      • This could slow things down dramatically.
    • Implement Write for HashBuilder.
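As an illustration of the last item, the sketch below shows one way a block-hashing builder could implement `std::io::Write`, so that data can be streamed in with `write_all` or `std::io::copy`. `BlockHashWriter` is a hypothetical type that uses the standard library's `DefaultHasher`. It is not this crate's HashBuilder and makes no claim about how that type works internally.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;
use std::io::{self, Write};

/// Hypothetical builder that hashes fixed-size blocks of whatever is
/// written to it. Illustration only; not part of this crate.
pub struct BlockHashWriter {
    block_size: usize,
    /// Bytes of the current, not yet complete block.
    pending: Vec<u8>,
    /// One hash per completed block, in order.
    hashes: Vec<u64>,
}

impl BlockHashWriter {
    pub fn new(block_size: usize) -> Self {
        assert!(block_size > 0, "block size must be non-zero");
        Self {
            block_size,
            pending: Vec::with_capacity(block_size),
            hashes: Vec::new(),
        }
    }

    fn hash(block: &[u8]) -> u64 {
        let mut hasher = DefaultHasher::new();
        hasher.write(block);
        hasher.finish()
    }

    /// Hashes any trailing partial block and returns all block hashes.
    pub fn finish(mut self) -> Vec<u64> {
        if !self.pending.is_empty() {
            let last = Self::hash(&self.pending);
            self.hashes.push(last);
        }
        self.hashes
    }
}

impl Write for BlockHashWriter {
    fn write(&mut self, mut buf: &[u8]) -> io::Result<usize> {
        let written = buf.len();
        while !buf.is_empty() {
            // Top up the pending block; hash it once it is full.
            let take = (self.block_size - self.pending.len()).min(buf.len());
            self.pending.extend_from_slice(&buf[..take]);
            buf = &buf[take..];
            if self.pending.len() == self.block_size {
                let full = Self::hash(&self.pending);
                self.hashes.push(full);
                self.pending.clear();
            }
        }
        Ok(written)
    }

    fn flush(&mut self) -> io::Result<()> {
        // Nothing to flush: completed blocks are hashed eagerly.
        Ok(())
    }
}
```

Because block boundaries depend only on the byte stream, the resulting hashes are the same however the caller splits its writes.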

Structs

A delta between the local data and the data the Signature represents.

Several SegmentRef after each other.

A segment with a reference to the base data.

A segment with unknown contents. This will transmit the data.

An identifier of a file, much smaller than the file itself.

Builder of a Signature. Created using constructors on Signature (e.g. Signature::with_algorithm).

Enums

An error during Difference::apply.

The algorithms which can be used for hashing the data.

A segment of data corresponding to a multiple of Difference::block_size.