kermit-ds 0.1.1

Data structures used in Kermit
# kermit-ds

Relation data structures for the Kermit workspace. Provides two trie-based implementations that store tuples of `usize` keys:

- [`TreeTrie`](src/ds/tree_trie/implementation.rs): a pointer-based trie where each node owns its sorted children. Simple and direct; preferable for small relations or pedagogical use.
- [`ColumnTrie`](src/ds/column_trie/implementation.rs): a column-oriented (flattened) trie that stores each depth in parallel `data`/`interval` arrays. More compact and cache-friendly on large relations.

Both implement [`Relation`](src/relation.rs) and [`TrieIterable`](../kermit-iters/src/trie.rs), so they're interchangeable in the join algorithms in [`kermit-algos`](../kermit-algos).
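
To make the difference between the two layouts concrete, here is a minimal, self-contained sketch. It is not the crate's code: `PointerTrie`, `Node`, `Level`, `flatten`, and `children` are hypothetical names, and the meaning given to `interval` (the start offset of a node's children in the next depth's `data`) is an assumption made for illustration. The actual `ColumnTrie` encoding may differ.

```rust
// Hypothetical sketch; these types are NOT kermit-ds's TreeTrie / ColumnTrie.

/// Pointer-based layout: each node owns its children, kept sorted by key.
#[derive(Debug)]
struct Node {
    key: usize,
    children: Vec<Node>,
}

#[derive(Debug, Default)]
struct PointerTrie {
    roots: Vec<Node>,
}

impl PointerTrie {
    /// Walk down one level per key, inserting missing nodes in sorted position.
    fn insert(&mut self, tuple: &[usize]) {
        let mut level = &mut self.roots;
        for &key in tuple {
            let idx = match level.binary_search_by_key(&key, |n| n.key) {
                Ok(i) => i,
                Err(i) => {
                    level.insert(i, Node { key, children: Vec::new() });
                    i
                }
            };
            level = &mut level[idx].children;
        }
    }
}

/// Column-oriented layout: one pair of parallel arrays per depth.
/// Assumption: `interval[i]` is where the children of `data[i]` start
/// in the next depth's `data`.
#[derive(Debug, Default, PartialEq)]
struct Level {
    data: Vec<usize>,
    interval: Vec<usize>,
}

/// Flatten the pointer trie breadth-first, one `Level` per depth.
fn flatten(trie: &PointerTrie) -> Vec<Level> {
    let mut levels = Vec::new();
    let mut frontier: Vec<&Node> = trie.roots.iter().collect();
    while !frontier.is_empty() {
        let mut level = Level::default();
        let mut next: Vec<&Node> = Vec::new();
        for &node in &frontier {
            level.data.push(node.key);
            level.interval.push(next.len());
            next.extend(node.children.iter());
        }
        levels.push(level);
        frontier = next;
    }
    levels
}

/// Keys of the children of node `i` at `depth`, read as a contiguous slice.
fn children(levels: &[Level], depth: usize, i: usize) -> &[usize] {
    let Some(next) = levels.get(depth + 1) else {
        return &[];
    };
    let start = levels[depth].interval[i];
    let end = levels[depth]
        .interval
        .get(i + 1)
        .copied()
        .unwrap_or(next.data.len());
    &next.data[start..end]
}
```

In this sketch, descending the pointer trie follows one heap allocation per node, while the flattened form reads a node's children from a single contiguous slice. That contiguity is the kind of property the "compact and cache-friendly" description refers to.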

## Surface

- **Core traits:** `Relation`, `Projectable`, `RelationFileExt`, `HeapSize`.
- **Metadata:** `RelationHeader`, `ModelType`, `RelationError`.
- **Data structures:** `TreeTrie`, `ColumnTrie`, plus the `IndexStructure` CLI enum.

## File loading

Any `Relation` automatically gains `from_csv` and `from_parquet` via the blanket `RelationFileExt` impl. CSV files must be `usize`-valued with a header row; Parquet files must be `Int64`-valued. Both extract the relation name from the file stem.

```rust,ignore
use kermit_ds::{RelationFileExt, TreeTrie};

let edges = TreeTrie::from_csv("edges.csv")?;
```
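
For readers who want to see what the CSV rules above imply, the following standard-library-only sketch derives a relation name from the file stem and parses `usize` rows under a header. The helpers `relation_name` and `parse_usize_csv` are hypothetical and are not part of `kermit-ds`; the crate's own loader may handle quoting, errors, and edge cases differently.

```rust
use std::path::Path;

// Hypothetical helpers for illustration; not the kermit-ds API.

/// Relation name taken from the file stem, e.g. "data/edges.csv" gives "edges".
fn relation_name(path: &Path) -> Option<String> {
    path.file_stem().map(|s| s.to_string_lossy().into_owned())
}

/// Parse CSV text with a header row and `usize` fields.
/// Returns the column names and the tuples, or a message naming the bad row.
fn parse_usize_csv(text: &str) -> Result<(Vec<String>, Vec<Vec<usize>>), String> {
    // Skip blank lines so a trailing newline does not count as a row.
    let mut lines = text.lines().filter(|l| !l.trim().is_empty());

    let header: Vec<String> = lines
        .next()
        .ok_or_else(|| "missing header row".to_string())?
        .split(',')
        .map(|h| h.trim().to_string())
        .collect();

    let mut tuples = Vec::new();
    for (row, line) in lines.enumerate() {
        // Every field must parse as usize; negatives and text are rejected.
        let tuple = line
            .split(',')
            .map(|f| f.trim().parse::<usize>())
            .collect::<Result<Vec<_>, _>>()
            .map_err(|e| format!("data row {}: {e}", row + 1))?;
        if tuple.len() != header.len() {
            return Err(format!(
                "data row {}: expected {} fields, found {}",
                row + 1,
                header.len(),
                tuple.len()
            ));
        }
        tuples.push(tuple);
    }
    Ok((header, tuples))
}
```

The sketch parses from a string rather than a file so that it stays easy to test; reading the text with `std::fs::read_to_string` first would give a file-based variant.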

## Extending

To add a new data structure:

1. Implement `Relation + TrieIterable + HeapSize` in this crate.
2. Add a variant to `IndexStructure` in [`src/ds/mod.rs`](src/ds/mod.rs).
3. Wire it into the CLI dispatch (`run_ds_bench` / `run_benchmark` in [`../kermit/src/main.rs`](../kermit/src/main.rs)).

See [`ARCHITECTURE.md`](../ARCHITECTURE.md) for the design rationale behind the trie layouts.
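
The three extension steps can be pictured with a toy, self-contained miniature. Everything in it is hypothetical: `MiniRelation` stands in for `Relation + TrieIterable + HeapSize`, `MiniIndex` for `IndexStructure`, and `build` for the CLI dispatch. The real trait methods and signatures are defined in the crate sources and are not reproduced here.

```rust
// Hypothetical miniature of the three steps; no name or signature below
// is taken from kermit-ds.

/// Step 1 stand-in: a tiny trait in place of the real trait bounds.
trait MiniRelation {
    /// Returns true if the tuple was not already present.
    fn insert(&mut self, tuple: Vec<usize>) -> bool;
    fn len(&self) -> usize;
}

/// Stand-in for an existing structure: an unsorted, deduplicated list.
#[derive(Default)]
struct ListRelation {
    tuples: Vec<Vec<usize>>,
}

impl MiniRelation for ListRelation {
    fn insert(&mut self, tuple: Vec<usize>) -> bool {
        if self.tuples.contains(&tuple) {
            return false;
        }
        self.tuples.push(tuple);
        true
    }
    fn len(&self) -> usize {
        self.tuples.len()
    }
}

/// The "new" structure: tuples kept sorted, so lookups can binary-search.
#[derive(Default)]
struct SortedVecRelation {
    tuples: Vec<Vec<usize>>,
}

impl MiniRelation for SortedVecRelation {
    fn insert(&mut self, tuple: Vec<usize>) -> bool {
        match self.tuples.binary_search(&tuple) {
            Ok(_) => false,
            Err(pos) => {
                self.tuples.insert(pos, tuple);
                true
            }
        }
    }
    fn len(&self) -> usize {
        self.tuples.len()
    }
}

/// Step 2 stand-in: the selector enum gains a variant for the new structure.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum MiniIndex {
    List,
    SortedVec, // newly added variant
}

/// Step 3 stand-in: the dispatch site gains a matching arm.
fn build(kind: MiniIndex, tuples: &[Vec<usize>]) -> Box<dyn MiniRelation> {
    let mut rel: Box<dyn MiniRelation> = match kind {
        MiniIndex::List => Box::new(ListRelation::default()),
        MiniIndex::SortedVec => Box::new(SortedVecRelation::default()),
    };
    for t in tuples {
        rel.insert(t.clone());
    }
    rel
}
```

Because the `match` in `build` is exhaustive, adding a variant in step 2 without the arm in step 3 is a compile error, which makes a forgotten dispatch branch hard to miss. Whether the real CLI uses trait objects or generics for this dispatch is not stated here; the boxed trait object is only a convenience of the sketch.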