Shape-bucketed compile cache.
Lets variable-shape callers (e.g., embedding-model wrappers that vary
batch + seq per request) amortize the per-shape compile cost. Cache
keys are caller-provided u64s, so the caller decides what counts as a
shape bucket. Typical recipe: (batch as u64) << 32 | seq as u64, which
is collision-free as long as batch and seq each fit in 32 bits.
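The key recipe can be wrapped in a small helper that refuses values too large to pack. This is a minimal sketch: shape_key and unpack_shape_key are hypothetical names, not part of this module, and the choice to return None on overflow is an assumption about what a caller would want.

```rust
use std::convert::TryFrom;

/// Hypothetical helper (not part of this module): pack (batch, seq)
/// into one u64 bucket key, batch in the high 32 bits, seq in the low 32.
/// Returns None if either value does not fit in 32 bits, so two
/// different shapes can never map to the same key.
fn shape_key(batch: usize, seq: usize) -> Option<u64> {
    let b = u32::try_from(batch).ok()?;
    let s = u32::try_from(seq).ok()?;
    Some((u64::from(b) << 32) | u64::from(s))
}

/// Inverse of `shape_key`: recover (batch, seq) from a packed key.
fn unpack_shape_key(key: u64) -> (u32, u32) {
    ((key >> 32) as u32, key as u32)
}
```

Callers that prefer coarser buckets (for example, rounding seq up to a multiple of 64 before packing) can do so before calling the helper; the cache only ever sees the resulting u64.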
The cache stores one CompiledGraph per key. Params loaded onto a
cached entry persist for that entry, so re-fetching it from the cache
does not require re-running set_param. Eviction is FIFO, capped at
capacity entries. That is good enough for the current “a handful of
common shapes” usage pattern; switch to LRU if a real workload shows churn.
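The FIFO policy described above can be illustrated with a toy cache. This is a sketch of the general technique only, assuming a HashMap plus an insertion-order queue; FifoCache and its methods are hypothetical and say nothing about how CompileCache is actually implemented.

```rust
use std::collections::{HashMap, VecDeque};

/// Toy FIFO-bounded cache keyed by u64 (hypothetical, for illustration).
struct FifoCache<V> {
    capacity: usize,
    entries: HashMap<u64, V>,
    /// Keys in insertion order, oldest at the front.
    order: VecDeque<u64>,
}

impl<V> FifoCache<V> {
    fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be at least 1");
        Self { capacity, entries: HashMap::new(), order: VecDeque::new() }
    }

    /// Return the entry for `key`, calling `build` only on a miss.
    /// A hit does not touch `order`, which is what makes this FIFO
    /// rather than LRU: the oldest *inserted* key is evicted first.
    fn get_or_insert_with(&mut self, key: u64, build: impl FnOnce() -> V) -> &mut V {
        if !self.entries.contains_key(&key) {
            if self.entries.len() == self.capacity {
                if let Some(oldest) = self.order.pop_front() {
                    self.entries.remove(&oldest);
                }
            }
            self.order.push_back(key);
            self.entries.insert(key, build());
        }
        self.entries
            .get_mut(&key)
            .expect("entry is present: either it was a hit or it was just inserted")
    }

    fn contains(&self, key: u64) -> bool {
        self.entries.contains_key(&key)
    }
}
```

Because hits never reorder entries, a frequently used shape is still evicted once capacity newer keys have been inserted after it. That is the churn case where an LRU policy would behave better.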
§Example
let mut cache = CompileCache::new(Device::Metal, 8);
let key = ((batch as u64) << 32) | seq as u64;
let mut compiled = cache.get_or_compile(key, || build_my_graph(batch, seq));
// First call for `key`: compiles. Subsequent calls: cache hit.
compiled.run(&[("x", &input_data)]);
Structs§
- BucketedCompileCache
- CacheRunInput - Named runtime input for BucketedCompileCache::run_padded_mixed.
- CompileCache
- DynamicDimCompileCache - Compile-once / specialize-at-runtime cache for symbolic HIR modules.
Functions§
- pad_rows - Pad data (interpreted as [actual, inner] row-major) up to upper rows by appending zeros. Returns a Vec<f32> of length upper * inner. Companion of slice_rows for the “compile at max, run at less” workflow with BucketedCompileCache.
- slice_rows - Slice data (interpreted as [upper, inner] row-major) down to actual rows. Companion of pad_rows.
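The pad/slice round trip behind the “compile at max, run at less” workflow can be sketched as follows. The function names carry a _sketch suffix and the parameter order (data, then row counts and inner) is an assumption; the real pad_rows and slice_rows signatures are not shown on this page and may differ.

```rust
/// Sketch of padding: `data` is [actual, inner] row-major; the result is
/// [upper, inner] row-major with the extra rows filled with zeros.
/// Parameter order is assumed, not taken from the real `pad_rows`.
fn pad_rows_sketch(data: &[f32], actual: usize, inner: usize, upper: usize) -> Vec<f32> {
    assert_eq!(data.len(), actual * inner, "data must hold actual * inner values");
    assert!(actual <= upper, "upper must be at least actual");
    let mut out = Vec::with_capacity(upper * inner);
    out.extend_from_slice(data);
    // Rows are contiguous in row-major layout, so padding rows is
    // the same as appending zeros at the end of the buffer.
    out.resize(upper * inner, 0.0);
    out
}

/// Sketch of slicing: `data` is [upper, inner] row-major; the result keeps
/// only the first `actual` rows.
fn slice_rows_sketch(data: &[f32], upper: usize, inner: usize, actual: usize) -> Vec<f32> {
    assert_eq!(data.len(), upper * inner, "data must hold upper * inner values");
    assert!(actual <= upper, "actual must not exceed upper");
    data[..actual * inner].to_vec()
}
```

In this sketch a 2-row input padded to 4 rows and then sliced back to 2 rows yields the original values, which is the property the two functions need to be companions.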