Crate fuzzytags[−][src]

FuzzyTags

Anonymous messaging systems (and other privacy-preserving applications) often require a mechanism for one party to learn that another party has messaged them.

Many schemes rely on a bandwidth-intensive "download everything and attempt-decryption" approach. Others rely on a trusted 3rd party, or non-collusion assumptions, to provide a "private" service.

It would be awesome if we could get an untrusted, adversarial server to do the work for us without compromising metadata-resistance!

fuzzytags is an experimental probabilistic cryptographic tagging structure to do just that!

Specifically fuzzytags provides the following properties:

Correctness: Valid tags constructed for a specific public key will always validate when tested using a derived detection key.
Fuzziness: Tags will produce false positives with probability p related to the security property (γ) when tested against detection keys they were not intended for.
Security: An adversarial server with access to the detection key is unable to distinguish false positives from true positives. (Detection Ambiguity)

Security (hic sunt dracones)

This crate provides an experimental implementation of the FMD2 scheme described in "Fuzzy Message Detection". Using Ristretto as the prime order group.

This code has not undergone any significant review.

Further, the properties provided by this system are highly dependent on selecting a false positive rate p and scheme constant γ for your system. There is no one-size-fits-all approach.

If p is too low, then the probability of false positives will be very high.

If p is too high, then an adversarial server will be able to link messages to recipients with low probability.

Likewise a large γ means higher bandwidth costs, but a small γ reveals more of the secret keys to the server and increases false positives.

More Detailed System Description

There exists a metadata resistant application that uses untrusted servers to mediate communication between parties.

Each party can be identified with a set of cryptographic identifiers and there exists methods in or external to the system to distribute keys securely and authentically.

Now, instead of each party adopting a download-everything approach to metadata privacy (or invoking non-collusion or other assumptions) we can leverage fuzzytags to reduce the number of messages downloaded from the server by each party while maintaining a formalized concept of metadata privacy.

Every party generates a FuzzyTagKeyPair, consisting of a FuzzyTagSecretKey and a FuzzyTagPublicKey. These keys will be generated with a parameter γ that relates to the minimum false-positive probability 2^-γ.

When submitting messages to the server for an intended recipient, the sender will generate a new tag from the recipients FuzzyTagPublicKey.

All parties will extract a FuzzyTagDetectionKey from their key pair. This key will be of length n and provide a false positive detection probability of 0 <= 2^-n <= 2^-γ. This detection key can be given to an adversarial server.

When fetching new messages from the adversarial server, the server first runs a test of the tag of the message against the parties' detection key. If the tag passes the test, the message (along with the tag) is provided to the recipient.

Finally, the recipient runs their own test of the tag against an extracted detection key such that FuzzyTagSecretKey == FuzzyTagDetectionKey i.e. the probability of a false positive will be 2^-n == 2^-γ. This will produce a subset of messages likely intended for the recipient, with a smaller probability of false positives.

Alternatively the recipient can simply try and decrypt every message in the subset of messages that the server provided them (depending on the efficiency of the decryption method).

Usage

Generate a key pair:

let gamma = 3;
let key = FuzzyTagKeyPair::generate(gamma);

key.public_key can be given to parties who you want to be able to communicate with you over a specific anonymous messaging service / privacy-preserving application.

key.detection_key can be given to untrusted adversarial servers.

gamma is security property (γ) in the system. For a given gamma, a tag generated for a specific public key will validate against a random public key with a maximum probability of 2^-gamma.

Generating Tags

Once in possession of a public key, a party in a metadata resistant app can use it to generate tags:

  let tag = public_key.generate_tag();

These tags can then be attached to a message in a metadata resistant system.

Testing Tags

First it is necessary to extract a detection key for a given false positive probability 0 <= 2^-n <= 2^-γ.

This extracted key can then be given to an adversarial server. The server can then test a given tag against the detection key e.g.:

let detection_key = key.extract(5);
if detection_key.test_tag(tag) {
    // the message attached to this tag *might* be for the party associated with the detection key 
} else {
    // the message attached to this tag is definitely *not* for the party associated with the detection key.
}

Integrating FuzzyTags

The properties provided by this system are highly dependent on selecting a false positive rate p. In the following sections we will cover a number of considerations you should take into account when integrating fuzzytags into a larger privacy preserving application.

How bad is it to let people select their own false-positive rates?

The short answer is "it depends".

The longer answer:

When different parties have different false positive rates. The server can calculate the skew between a party's ideal false positive rate and observed false positive rate.

That skew leaks information, especially given certain message distributions. Specifically it leaks parties who receive a larger proportion of system messages than their ideal false positive rate.

i.e. for low false positive rates and high message volume for a specific receiver, the adversarial server can calculate a skew that leaks the receipient of individual messages - breaking privacy for that receiver.

It also removes those messages from the pool of messages that an adversarial server needs to consider for other receivers. Effectively reducing the anonymity set for everyone else.

Which brings us onto:

Differential Attacks

Any kind of differential attacks break this scheme, even for a small number of messages i.e. if you learn (through any means, that a specific set of messages are all likely for 1 party, you can diff them against all other parties keys and very quickly isolate the intended recipient - in simulations of 100-1000 parties it can take as little as 3 messages - even with everyone selecting fairly high false positive rates.

The corollary of the above being that in differential attacks your anonymity set is basically the number of users who download all messages - since you can't diff them. This has the interesting side effect: the more parties who download everything, the more the system can safely tolerate parties with small false-positive rates.

To what extent you can actually account for this in your application is an open question.

Should Senders use an anonymous communication network?

If differential attacks are likely e.g. few parties download everything and multiple messages are expect to originate from a sender to a receiver or there is other information that might otherwise link a set of messages to a receiver then you may want to consider how to remove that context.

One potential way of removing context is by having senders send their message to the server through some kind of anonymous communication network e.g. a mixnet or tor.

Be warned: This may not eliminate all the context!

How bad is it to select a poor choice of p?

Consider a pareto distribution where most users only receive a few messages, and small subset of users receive a large number of messages it seems that increasing the number of parties is generally more important to overall anonymity of the system than any individual selection of p.

Under a certain threshold of parties, trivial breaks (i.e. tags that only match to a single party) are a bigger concern.

Assuming we have large number of parties (N), the following heuristic emerges:

Parties who only expect to receive a small number of messages can safely choose smaller false positive rates, up to a threshold θ, where θ > 2^-N. The lower the value of θ the greater the possibility of random trivial breaks for the party.
Parties who expect a large number of messages should choose to receive all messages for 2 reasons:
1. Even high false positive rates for power users result in information leaks to the server (due to the large skew) i.e. a server can trivially learn what users are power users.
2. By choosing to receive all messages, power users don't sacrifice much in term of bandwidth, but will provide cover for parties who receive a small number of messages and who want a lower false-positive rate.

(We consider a pareto distribution here because we expect many applications to have parties that can be modelled as such - especially over short-time horizons)

Structs

FuzzyDetectionKey	A collection of "secret" data that can be used to determine if a `FuzzyTag` was intended for the derived public key with probability p
FuzzyPublicKey	A public identity that others can create tags for.
FuzzySecretKey	The complete secret key. Can't directly be used for testing. Instead you will need to generate a FuzzyDetectionKey using extract
FuzzyTag	A tag is a probabilistic cryptographic structure. When constructed for a given `FuzzyPublicKey` it will pass the `FuzzyDetectionKey::test` 100% of the time. For other public keys it will pass the test with probability `gamma` related to the security parameter of the system. This system provides the following security properties:
FuzzyTagKeyPair	An identity keypair for generating and validating fuzzy meta tags.