pgm_index 0.3.2

Ultra-fast learned PGM-Index for efficient sorted key lookup with bounded error
Documentation
# ๐Ÿ“ pgm_index โ€” Learned Index for Sorted Keys

![Crates.io Downloads (recent)](https://img.shields.io/crates/dr/pgm_index)

> PGM-Index is a space-efficient data structure for fast lookup in sorted sequences.  
> It approximates the distribution of keys with piecewise linear models, allowing searches in **O(log ฮต)** with a guaranteed error bound.

---

## ๐Ÿ“„ Algorithm

Based on the work by Paolo Ferragina & Giorgio Vinciguerra:  
> *The PGM-index: a fully-dynamic compressed learned index with provable worst-case bounds* (2020)  
๐Ÿ”— [Paper]https://doi.org/10.1145/3373718.3394764 ยท ๐ŸŒ [Official site]https://pgm.di.unipi.it/

---

## ๐Ÿ“ฆ Installation

Add to your `Cargo.toml`:

```toml
[dependencies]
pgm_index = "*"
````

---

## ๐Ÿ›  Usage

```rust
use pgm_index::PGMIndex;

fn main() {
    let data: Vec<u64> = (0..1_000_000).collect();
    let pgm = PGMIndex::new(&data, 32); // ฮต = 32

    let key = 123456;
    if let Some(pos) = pgm.search(key) {
        println!("Found at position {}", pos);
    } else {
        println!("Not found");
    }
}
```

---

## ๐ŸŽ Benchmarks by ฮต

Dataset: **1,000,000 elements**, 100,000 random queries
CPU: Intel Core i7 12700, Windows 11, single-threaded

| ฮต   | Build Time | Mem Usage | Segments | Single Lookup | Batch Lookup | Avg ns/query |
| --- | ---------: | --------: | -------: | ------------: | -----------: | -----------: |
| 16  |    2.19 ms |   7.99 MB |     3906 |     20.84 M/s |    25.35 M/s |  48.0 / 39.4 |
| 32  |    2.22 ms |   7.87 MB |     1953 |     24.09 M/s |    24.98 M/s |  41.5 / 40.0 |
| 64  |    2.05 ms |   7.75 MB |      976 |     21.96 M/s |    26.06 M/s |  45.5 / 38.4 |
| 128 |    2.07 ms |   7.69 MB |      488 |     19.64 M/s |    25.56 M/s |  50.9 / 39.1 |

**Binary Search (baseline)**:
4.32 ms, 23.13 M/s, 43.2 ns/query.

---

## ๐Ÿ“Š Comparison to Other Indexes (1M elements) *

Using **ฮต = 32** as a balanced configuration:

| Structure            | Memory Usage |   Build Time | Lookup Speed (single) |   Batch Lookup Speed |
| -------------------- | -----------: | -----------: | --------------------: | -------------------: |
| **PGM-Index (ฮต=32)** |  **7.87 MB** |  **2.22 ms** |         **24.09 M/s** |        **24.98 M/s** |
| **Binary Search**    |     \~8.0 MB | โ€” (no build) |   23.13 M/s *(0.96ร—)* |  23.13 M/s *(0.93ร—)* |
| **BTreeMap**         |      \~24 MB |      \~50 ms |   \~4.0 M/s *(0.17ร—)* |  \~4.0 M/s *(0.16ร—)* |
| **HashMap**          |      \~64 MB |      \~15 ms |  \~40.0 M/s *(1.66ร—)* | \~40.0 M/s *(1.60ร—)* |

* our benchmarks
---

### ๐Ÿ“ˆ Relative Performance *

| Metric        | vs Binary Search | vs BTreeMap      | vs HashMap       |
| ------------- | ---------------- | ---------------- | ---------------- |
| Memory        | **1.02ร— better** | **3.05ร— better** | **8.13ร— better** |
| Build Time    | โ€”                | **22.5ร— faster** | **6.8ร— faster**  |
| Single Lookup | **1.04ร— faster** | **6.0ร— faster**  | 0.6ร— slower      |
| Batch Lookup  | **1.08ร— faster** | **6.2ร— faster**  | 0.62ร— slower     |

* our benchmarks
---

## ๐Ÿ“Œ Potential Use Cases

* Indexing large sorted numeric datasets
* Time-series databases
* Read-optimized storage engines
* Scientific & bioinformatics data search
* Columnar store secondary indexes

---

## ๐Ÿ“œ License

MIT