kermit-ds
Relation data structures for the Kermit workspace. Provides two trie-based implementations that store tuples of usize keys:
TreeTrie— a pointer-based trie where each node owns its sorted children. Simple and direct; preferable for small relations or pedagogical use.ColumnTrie— a column-oriented (flattened) trie that stores each depth in paralleldata/intervalarrays. More compact and cache-friendly on large relations.
Both implement Relation and TrieIterable, so they're interchangeable in the join algorithms in kermit-algos.
Surface
- Core traits —
Relation,Projectable,RelationFileExt,HeapSize. - Metadata —
RelationHeader,ModelType,RelationError. - Data structures —
TreeTrie,ColumnTrie, plus theIndexStructureCLI enum.
File loading
Any Relation automatically gains from_csv and from_parquet via the blanket RelationFileExt impl. CSV files must be usize-valued with a header row; Parquet files must be Int64-valued. Both extract the relation name from the file stem.
use ;
let edges = from_csv?;
Extending
To add a new data structure:
- Implement
Relation + TrieIterable + HeapSizein this crate. - Add a variant to
IndexStructureinsrc/ds/mod.rs. - Wire it into the CLI dispatch (
run_ds_bench/run_benchmarkin../kermit/src/main.rs).
See ARCHITECTURE.md for the design rationale behind the trie layouts.