Runtime execution for svod kernels.
Provides a generic kernel execution interface with backend-specific implementations (LLVM JIT, native shared libraries, CUDA, etc.).
§Execution Model
execution_plan is the canonical runtime path and executes prepared
operations in dependency order with hazard-aware host parallelism.
The executor module remains available for explicit parallel scheduling
scenarios.
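As a rough illustration of what "dependency order with hazard-aware host parallelism" can mean, the sketch below groups operations into waves that could run concurrently. It is not the crate's implementation: `Op`, `conflicts`, and `schedule_waves` are hypothetical names, and buffers are modeled as plain integer IDs. Two operations are treated as conflicting when they touch the same buffer and at least one of them writes it (read-after-write, write-after-read, or write-after-write).

```rust
/// Hypothetical stand-in for a prepared operation: the buffer IDs it reads and writes.
#[derive(Debug, Clone)]
struct Op {
    name: &'static str,
    reads: Vec<u32>,
    writes: Vec<u32>,
}

/// True if `later` must wait for `earlier` because of a RAW, WAR, or WAW hazard.
fn conflicts(earlier: &Op, later: &Op) -> bool {
    let raw = later.reads.iter().any(|b| earlier.writes.contains(b));
    let war = later.writes.iter().any(|b| earlier.reads.contains(b));
    let waw = later.writes.iter().any(|b| earlier.writes.contains(b));
    raw || war || waw
}

/// Groups ops (given in program order) into waves of indices.
/// Each op lands one wave after the latest earlier op it conflicts with,
/// so ops that share a wave are hazard-free with respect to each other.
fn schedule_waves(ops: &[Op]) -> Vec<Vec<usize>> {
    let mut wave_of: Vec<usize> = Vec::with_capacity(ops.len());
    let mut waves: Vec<Vec<usize>> = Vec::new();
    for (i, op) in ops.iter().enumerate() {
        let wave = (0..i)
            .filter(|&j| conflicts(&ops[j], op))
            .map(|j| wave_of[j] + 1)
            .max()
            .unwrap_or(0);
        if wave == waves.len() {
            waves.push(Vec::new());
        }
        waves[wave].push(i);
        wave_of.push(wave);
    }
    waves
}

fn main() {
    let ops = vec![
        Op { name: "a", reads: vec![0], writes: vec![1] },
        Op { name: "b", reads: vec![0], writes: vec![2] },
        Op { name: "c", reads: vec![1, 2], writes: vec![3] },
        Op { name: "d", reads: vec![], writes: vec![0] },
    ];
    for (n, wave) in schedule_waves(&ops).iter().enumerate() {
        let names: Vec<&str> = wave.iter().map(|&i| ops[i].name).collect();
        println!("wave {n}: {names:?}");
    }
}
```

Here `a` and `b` only read buffer 0, so they share the first wave; `c` reads what they wrote and `d` overwrites what they read, so both wait for the second wave. A real scheduler would likely dispatch each op as soon as its own dependencies finish rather than waiting for a whole wave.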
§Benchmarking
The benchmark module provides timing utilities for measuring kernel
execution performance, used by beam search auto-tuning.
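The general shape of such a timing utility can be sketched with the standard library alone. The names below (`TimingConfig`, `TimingSummary`, `time_closure`) are hypothetical and are not the signatures of `BenchmarkConfig`, `BenchmarkResult`, or `benchmark_kernel`; the sketch only assumes the common pattern of untimed warmup runs followed by timed runs.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical configuration: untimed warmup runs, then timed runs.
struct TimingConfig {
    warmup_iters: usize,
    timed_iters: usize,
}

/// Hypothetical summary of the timed runs.
struct TimingSummary {
    min: Duration,
    mean: Duration,
}

/// Runs `f` for the configured warmup and timed iterations and
/// reports the minimum and mean wall-clock time of the timed runs.
fn time_closure<F: FnMut()>(cfg: &TimingConfig, mut f: F) -> TimingSummary {
    assert!(cfg.timed_iters > 0, "need at least one timed iteration");
    for _ in 0..cfg.warmup_iters {
        f();
    }
    let mut min = Duration::MAX;
    let mut total = Duration::ZERO;
    for _ in 0..cfg.timed_iters {
        let start = Instant::now();
        f();
        let elapsed = start.elapsed();
        min = min.min(elapsed);
        total += elapsed;
    }
    TimingSummary { min, mean: total / cfg.timed_iters as u32 }
}

fn main() {
    let cfg = TimingConfig { warmup_iters: 3, timed_iters: 10 };
    let data: Vec<u64> = (0..100_000).collect();
    // black_box keeps the optimizer from discarding the measured work.
    let summary = time_closure(&cfg, || {
        black_box(black_box(&data).iter().sum::<u64>());
    });
    println!("min = {:?}, mean = {:?}", summary.min, summary.mean);
}
```

An auto-tuner comparing candidate kernels would typically rank them by the minimum or a robust average, since the first runs often include one-off costs such as cache misses or thread-pool startup.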
Re-exports§
pub use benchmark::BenchmarkConfig;
pub use benchmark::BenchmarkResult;
pub use benchmark::benchmark_kernel;
pub use benchmark::warmup_thread_pool;
pub use custom_function::run_custom_function;
pub use device_registry::DEVICE_FACTORIES;
pub use devices::cpu::CpuBackend;
pub use devices::cpu::create_cpu_device;
pub use devices::cpu::create_cpu_device_with_backend;
pub use devices::cpu_queue::CpuQueue;
pub use execution_plan::ExecutionPlan;
pub use execution_plan::ExecutionPlanBuilder;
pub use execution_plan::PreparedBufferView;
pub use execution_plan::PreparedCopy;
pub use execution_plan::PreparedCustomFunction;
pub use execution_plan::PreparedKernel;
pub use execution_plan::PreparedOp;
pub use executor::DeviceContext;
pub use executor::ExecutionGraph;
pub use executor::ExecutionNode;
pub use executor::KernelBufferAccess;
pub use executor::SyncStrategy;
pub use executor::UnifiedExecutor;
pub use executor::global_executor;
pub use profiler::KernelProfile;
pub use error::*;
pub use kernel_cache::*;
pub use llvm::*;
Modules§
- benchmark
- Kernel benchmarking infrastructure for auto-tuning.
- clang
- Clang compilation backend for C codegen.
- custom_function
- device_registry
- Device factory registry for runtime device creation and caching.
- devices
- Device implementations for different backends.
- error
- Error types for runtime execution.
- execution_plan
- Pre-compiled execution plan for kernel execution.
- executor
- Unified parallel execution for heterogeneous devices.
- jit_loader
- JIT ELF loader: compiles C source via clang stdin→stdout, parses the relocatable ELF with the object crate, copies sections into an anonymous mmap, applies relocations, and returns an executable function pointer.
- kernel_cache
- Global kernel deduplication cache.
- llvm
- LLVM JIT compilation via external clang + ELF loader.
- profiler
- Per-kernel execution profiling.