pub struct BucketedCompileCache { /* private fields */ }
Implementations
impl BucketedCompileCache
pub fn new(device: Device, buckets: Vec<Range<u64>>) -> Self
pub fn power_of_two_ladder(device: Device, min: u64, max: u64) -> Self
Power-of-two ladder over [1, max], with extents
[min_pow2, 2·min_pow2, 4·min_pow2, …, max_pow2] where
min_pow2 = min.next_power_of_two() and max_pow2 is the smallest
power of two ≥ max. Each bucket compiles at its upper-bound
extent, so an actual value in bucket (prev_extent .. ext] runs
kernels at extent ext (not at the worst case of the whole range).
Padding waste is at most 2×: every bucket except possibly the smallest
starts at ext / 2 + 1, so actual > ext / 2. The smallest bucket starts
at 1, so its waste can be as large as min_pow2×.
Example: power_of_two_ladder(Device::Cpu, 8, 256) yields buckets
1..9, 9..17, 17..33, 33..65, 65..129, 129..257 with compile
extents 8, 16, 32, 64, 128, 256. An actual = 17 runs at extent
32 instead of the 255 a single wide 1..256 bucket would compile
at — that’s the “skip compute” win, paid for with O(log max)
compiled artifacts instead of one.
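The bucket layout described above can be reproduced in a few lines of standalone Rust. This is a sketch under stated assumptions, not the crate's implementation: `ladder_sketch` is a hypothetical helper name, and the first bucket is assumed to start at 1, as in the example.

```rust
use std::ops::Range;

/// Hypothetical helper (not part of the crate): returns
/// `(bucket range, compile extent)` pairs for a power-of-two ladder.
fn ladder_sketch(min: u64, max: u64) -> Vec<(Range<u64>, u64)> {
    let mut out = Vec::new();
    // min_pow2 and max_pow2 as defined in the text above.
    let mut ext = min.max(1).next_power_of_two();
    let max_pow2 = max.max(1).next_power_of_two();
    // Assumption: the first bucket starts at 1.
    let mut start = 1u64;
    while ext <= max_pow2 {
        // Ranges are half-open, so the compile extent is `end - 1`.
        out.push((start..ext + 1, ext));
        start = ext + 1;
        // Stop instead of overflowing for very large extents.
        match ext.checked_mul(2) {
            Some(next) => ext = next,
            None => break,
        }
    }
    out
}
```

For `min = 8` and `max = 256` this produces the six buckets listed in the example, and every bucket after the first starts above half of its extent.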
pub fn power_of_two_ladder_with_policy( device: Device, min: u64, max: u64, policy: Option<PrecisionPolicy>, ) -> Self
pub fn with_policy( device: Device, buckets: Vec<Range<u64>>, policy: Option<PrecisionPolicy>, ) -> Self
pub fn get_or_compile<F: FnOnce(u64) -> Graph>( &mut self, key: u64, build: F, ) -> Option<(u64, &mut CompiledGraph)>
Find the bucket containing key, compile if needed, return
(upper, &mut CompiledGraph) where upper = range.end - 1 is the
extent the graph was compiled for. Caller pads inputs to upper
before calling run. Returns None if key is outside every
bucket — caller decides whether to fall back to a one-off compile.
build receives upper and must return a Graph specialized for
that extent.
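The lookup-then-lazily-compile contract can be illustrated with a toy cache that does not depend on the crate. Everything here is an assumption for illustration: `ToyBucketCache` is a hypothetical type, and the generic `T` stands in for `CompiledGraph`; the real internals may differ.

```rust
use std::ops::Range;

/// Toy stand-in for the cache; `T` plays the role of `CompiledGraph`.
struct ToyBucketCache<T> {
    buckets: Vec<Range<u64>>,
    compiled: Vec<Option<T>>,
}

impl<T> ToyBucketCache<T> {
    fn new(buckets: Vec<Range<u64>>) -> Self {
        let compiled: Vec<Option<T>> = buckets.iter().map(|_| None).collect();
        Self { buckets, compiled }
    }

    /// Returns `(upper, artifact)`; `build` runs at most once per bucket.
    fn get_or_compile<F: FnOnce(u64) -> T>(
        &mut self,
        key: u64,
        build: F,
    ) -> Option<(u64, &mut T)> {
        // `?` yields None when no bucket contains the key.
        let idx = self.buckets.iter().position(|r| r.contains(&key))?;
        // Half-open range: the compile extent is `end - 1`.
        let upper = self.buckets[idx].end - 1;
        let artifact = self.compiled[idx].get_or_insert_with(|| build(upper));
        Some((upper, artifact))
    }

    fn compiled_count(&self) -> usize {
        self.compiled.iter().filter(|c| c.is_some()).count()
    }
}
```

With buckets `1..9` and `9..17`, keys 5 and 8 both resolve to extent 8 and share one artifact, while key 17 returns `None` and leaves the fallback decision to the caller.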
pub fn get_or_compile_with_options<F: FnOnce(u64) -> Graph>( &mut self, key: u64, build: F, options: &CompileOptions, ) -> Option<(u64, &mut CompiledGraph)>
Like Self::get_or_compile, but with explicit CompileOptions.
pub fn get_or_compile_hir<F: FnOnce(u64) -> HirModule>( &mut self, key: u64, build: F, ) -> Option<(u64, &mut CompiledGraph)>
Like Self::get_or_compile but builds and compiles HIR directly
through the fusion-first pipeline (Session::compile_hir).
pub fn get_or_compile_hir_with_options<F: FnOnce(u64) -> HirModule>( &mut self, key: u64, build: F, options: &CompileOptions, ) -> Option<(u64, &mut CompiledGraph)>
Like Self::get_or_compile_hir, but with explicit CompileOptions (tier-1 profile, fusion target, …).
pub fn bucket_for(&self, key: u64) -> Option<usize>
Index of the bucket containing key, or None if out of range.
Linear scan — bucket counts are small in practice.
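Because buckets are half-open `Range<u64>` values, boundary keys are worth checking explicitly. The following is a minimal sketch of a linear scan with the same return shape; `bucket_for_sketch` is a hypothetical free function, not the method itself.

```rust
use std::ops::Range;

/// Hypothetical free-function version of the lookup: index of the first
/// bucket whose half-open range contains `key`, found by linear scan.
fn bucket_for_sketch(buckets: &[Range<u64>], key: u64) -> Option<usize> {
    buckets.iter().position(|r| r.contains(&key))
}
```

With buckets `1..9, 9..17, 17..33`, key 8 lands in bucket 0 and key 9 in bucket 1, while keys 0 and 33 fall outside every bucket.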
pub fn buckets(&self) -> impl Iterator<Item = &Range<u64>>
pub fn compiled_count(&self) -> usize
Number of buckets that have been compiled so far (≤ total buckets).
pub fn total_buckets(&self) -> usize
pub fn run_padded<F: FnOnce(u64) -> Graph>( &mut self, key: u64, actual_rows: usize, build: F, inputs: &[(&str, &[f32], usize)], output_inners: &[usize], ) -> Option<(u64, Vec<Vec<f32>>)>
“Compile at max, run at less” convenience for inputs and outputs whose outer dimension is the bucket key:
- Find or compile the bucket containing key.
- For each input, pad to upper rows along the outer dim using pad_rows (caller passes the inner-dim stride per input; inner = 1 for purely 1D inputs).
- Run the compiled graph at full extent.
- Slice each output back to actual_rows along its outer dim. Outputs flagged with inner = 0 in output_inners are returned unsliced (use this for extent-independent outputs like a pooled [hidden] embedding). Missing entries past the end of output_inners are also returned unsliced.
Returns (upper, outputs). Returns None if key falls outside
every bucket.
Compute scope: kernels execute at the bucket’s compile
extent (upper), not at actual_rows. This means smaller
buckets directly translate to less padded compute. With
power_of_two_ladder the worst-case waste is bounded at 2×;
with hand-tuned buckets it can be
arbitrarily tight. True active-extent dispatch — one big
compile, kernels short-circuit at runtime — is a separate
per-backend change.
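The pad and slice steps can be sketched on plain row-major `f32` buffers. The helpers below are hypothetical stand-ins, not the crate's `pad_rows`; in particular, zero-filling the padded rows is an assumption, since the fill value is not stated here.

```rust
/// Hypothetical stand-in for the padding step: copy `data` into a buffer of
/// `upper * inner` elements. Assumption: padded rows are zero-filled.
fn pad_rows_sketch(data: &[f32], inner: usize, upper: usize) -> Vec<f32> {
    let mut out = vec![0.0; upper * inner];
    let n = data.len().min(out.len());
    out[..n].copy_from_slice(&data[..n]);
    out
}

/// Hypothetical stand-in for the slicing step: keep the first `actual_rows`
/// rows. `inner == 0` marks an extent-independent output, returned unsliced.
fn slice_rows_sketch(out: Vec<f32>, inner: usize, actual_rows: usize) -> Vec<f32> {
    if inner == 0 {
        return out;
    }
    let n = (actual_rows * inner).min(out.len());
    out[..n].to_vec()
}
```

Three rows with `inner = 2` padded to `upper = 4` become eight elements; slicing back with `actual_rows = 3` recovers the original six, and `inner = 0` leaves the buffer untouched.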
pub fn ensure_graph_with_params<F>( &mut self, key: u64, build: F, options: &CompileOptions, ) -> Option<(u64, &mut CompiledGraph)>
Like Self::get_or_compile_with_options but also uploads params on first compile.
pub fn ensure_hir_with_params<F>( &mut self, key: u64, build: F, options: &CompileOptions, ) -> Option<(u64, &mut CompiledGraph)>
HIR variant of Self::ensure_graph_with_params.
pub fn run_padded_mixed<F>( &mut self, key: u64, actual_rows: usize, build: F, inputs: &[CacheRunInput<'_>], output_inners: &[usize], ) -> Option<(u64, Vec<Vec<f32>>)>
Like Self::run_padded, but row padding is optional per input (see CacheRunInput).
Auto Trait Implementations
impl Freeze for BucketedCompileCache
impl !RefUnwindSafe for BucketedCompileCache
impl Send for BucketedCompileCache
impl !Sync for BucketedCompileCache
impl Unpin for BucketedCompileCache
impl UnsafeUnpin for BucketedCompileCache
impl !UnwindSafe for BucketedCompileCache
Blanket Implementations
impl<T> BorrowMut<T> for T where T: ?Sized
fn borrow_mut(&mut self) -> &mut T
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.