| A class representing MuHash sets
|
| MuHash is a hashing algorithm that supports
| adding set elements in any order but
| also deleting in any order. As a result,
| it can maintain a running sum for a set
| of data as a whole, and add/remove when
| data is added to or removed from it. A
| downside of MuHash is that computing
| an inverse is relatively expensive.
| This is solved by representing the running
| value as a fraction, and multiplying
| added elements into the numerator and
| removed elements into the denominator.
| Only when the final hash is desired,
| a single modular inverse and multiplication
| is needed to combine the two. The combination
| is also run on serialization to allow
| for space-efficient storage on disk.
|
| As the update operations are also associative,
| H(a)+H(b)+H(c)+H(d) can in fact be
| computed as (H(a)+H(b)) + (H(c)+H(d)).
| This implies that all of this is perfectly
| parallellizable: each thread can process
| an arbitrary subset of the update operations,
| allowing them to be efficiently combined
| later.
|
| MuHash does not support checking if
| an element is already part of the set.
| That is why this class does not enforce
| the use of a set as the data it represents
| because there is no efficient way to
| do so.
|
| It is possible to add elements more than
| once and also to remove elements that
| have not been added before. However,
| this implementation is intended to
| represent a set of elements.
|
| See also https://cseweb.ucsd.edu/~mihir/papers/inchash.pdf
| and https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2017-May/014337.html.
|