ff_k_center 1.2.2

A linear-time k-center algorithm with fairness conditions and worst-case guarantees that is very fast in practice. Includes python bindings.
Documentation
# Code of ICML submission "Fair and Fast k-Center Clustering for Data Summarization"

This is the Rust implementation for the fast privacy preserving representative k-center (Priv-Rep-kC) algorithm described in our [ICML paper](https://proceedings.mlr.press/v162/angelidakis22a.html).
For an instance with n data points that should be covered by at most k centers, this algorithm has a running time guarantee of $O(nk^2 + k^5)$.

The code is a Rust library that implements a python interface, leading to the following two options to run our algorithm:

- It can be imported into another Rust project.
- After installing the python wheel, the algorithm can be executed from any python script or jupyter notebook.

As the python interface is most convenient, we focus on this way to run our code.

## Requirements

- python >=3.6
- matplotlib (optional)

## Installation of the python wheel

Use the package manager [pip](https://pip.pypa.io/en/stable/) to install the python-wheel:

```bash
pip install ff_k_center-1.2.2-cp36-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
```

Moreover, we recommend to install matplotlib to get plots of a computed clustering:

```bash
pip install matplotlib
```

## Usage via python interface

The following python code presents the essential functions of the python interface.

```python
from ff_k_center import FFKCenter
import numpy as np

model = FFKCenter(4,1,[(1,13)]) # parameters: k, privacy_bound = 1, rep_intervals = []
pos = np.array([[1,2],[0,0],[6,7],[0,-1],[-2,-3],[0,-1.2],[1,3],[6,1]])
colors = np.array([0,0,0,1,1,0,0,1])


model.fit(pos, colors, verbose = 2, thread_count = 6, phase_2_rerun = False, phase_5_gonzalez = True) # positions of points; followed by colors; Optional: verbose (0: silent, 1: brief, 2: verbose; default: 1); thread_count (default: #cores); phase_2_rerun (default: True); phase_5_gonzalez (default: False);

print("\nr^*: ", model.radius) # the final radius of the assignment
print("C^*: ", model.centers) # Chosen centers by the point-index.
print("phi^*: ", model.assignment) # for each point the center it is assigned to.
print("running time: ", model.running_time) # running time in sec
```

The full interface can be seen in the jupyter notebook ```showcase/showcase.ipynb```, which can also be used as playground for testing a toy example.

Alternatively, you can use ```experiments/computational_study.ipynb``` for recreating our experiments. (Warning: As we created the plots in the paper by iterating over many different parameters values this can take several hours, even though a single computation is fast.) The data sets used in the manuscript are included and can be found in ```experiments/data/```.

By running ```python_scripts/create_2d_space.py``` it is possible to easy create new toy examples by left clicking into the figure. Use number keys '0' to '9' to change the color and 'Del' to go into deletion mode. The data set is saved after closing the window.

## Uninstall python wheel

Use the following command to uninstall the python wheel:

```bash
pip uninstall ff_k_center
```

## Building the python wheel by compiling the code

There is no need to recreate the python wheel as it is included in the provided code.
Nevertheless, to be able to verify that the python wheel indeed corresponds to the actual code, we provide information here on how it can be compiled from the Rust code.
This can be done via the tool [maturin](https://github.com/PyO3/maturin).

The most convenient way is to use the docker image [pyo3/maturin](https://ghcr.io/pyo3/maturin):

```bash
docker run --rm -v $(pwd):/io ghcr.io/pyo3/maturin build --release
```

Afterwards the python wheel can be found in ```target/wheels/```.

## Usage as Rust library

There is an official create on [crates.io](https://crates.io/crates/ff_k_center). To use it make sure that the provided Rust crate is listed as under ```dependencies``` in your Cargo.toml:

```toml
ff_k_center = "1.2.2"
```

You can load all functionality to a new Rust project by

```rust
use ff_k_center::*;
```

The code below shows an example that creates a data set of 5000 random points on which Priv-Rep-kC is solved with our algorithm.

```rust
use ff_k_center::*;
fn main() {
    let n = 5_000;
    let k = 8;
    let space = SpaceND::new_random(n);
    let prob = ClusteringProblem{
        k, // number of centers;
        privacy_bound : n/k, // number of points to represent;
        rep_intervals : vec!((0,500),(0,500),(0,500)), // representation interval [a,b] for each color class; for color classes without interval we subsitute [0. inf]
    };
    let (clustering, total_time) = compute_privacy_preserving_representative_k_center(&space, &prob, None);
    println!("Radius: {}; Running time: {}s.", clustering.get_radius(), total_time);
}
```

For more details on how to load a data set, save the computed clustering, set optional parameters, and more, we refer to the [docs.rs](https://docs.rs/ff_k_center/1.2.2/). The documentation can also locally generated and opened with:

```bash
cargo doc --open
```