Module executor

Unified parallel execution for heterogeneous devices.

The UnifiedExecutor runs kernels across any mix of devices (CPU, CUDA, Metal, etc.), synchronizing between devices and tracking buffer dependencies between kernels.

§Design Principles

  1. Single abstraction - One executor handles any device mix
  2. Device-agnostic sync - Timeline signals abstract over device-specific primitives
  3. Zero overhead for single-device - Fast path skips synchronization when possible
  4. Buffer dependency tracking - Following Tinygrad’s _access_resources() pattern
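
To make the second principle more concrete, here is a minimal host-side sketch of a timeline signal: a counter that only moves forward, which a producer advances and a consumer waits on. `TimelineSketch` and its methods are hypothetical names chosen for illustration and are not this crate's API. A real implementation would presumably map the same idea onto each device's own primitives, which this CPU-only model does not attempt.

```rust
use std::sync::{Condvar, Mutex};

/// Hypothetical host-side timeline: a monotonically increasing counter.
/// Illustration only; not a type from this module.
struct TimelineSketch {
    value: Mutex<u64>,
    reached: Condvar,
}

impl TimelineSketch {
    fn new() -> Self {
        Self {
            value: Mutex::new(0),
            reached: Condvar::new(),
        }
    }

    /// Advances the timeline to `v` (it never moves backwards) and wakes all waiters.
    fn signal(&self, v: u64) {
        let mut cur = self.value.lock().unwrap();
        if v > *cur {
            *cur = v;
        }
        self.reached.notify_all();
    }

    /// Blocks the calling thread until the timeline has reached at least `v`.
    fn wait(&self, v: u64) {
        let mut cur = self.value.lock().unwrap();
        while *cur < v {
            cur = self.reached.wait(cur).unwrap();
        }
    }

    /// Returns the current timeline value.
    fn current(&self) -> u64 {
        *self.value.lock().unwrap()
    }
}
```

In this model, a kernel that produces a buffer would call `signal(n)` when it finishes, and any consumer would call `wait(n)` before it starts. Because the counter is monotonic, a late or repeated `signal` with a smaller value is harmless.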

§Example

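// Create an executor and register the devices it may run on.
let mut executor = UnifiedExecutor::new();
executor.add_device(DeviceSpec::Cpu)?;

// Execute the schedule; dependencies are handled automatically.
// (`schedule` is assumed to have been built earlier.)
let output_id = executor.execute(&schedule)?;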

§Execution Graph

For complex schedules with multiple devices, the executor builds an execution graph (DAG) where nodes are kernel operations and edges are buffer dependencies. Independent kernels on the same device can be batched, and kernels on different devices can run in parallel (with appropriate synchronization).
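
The graph construction described above can be sketched as follows. This is an illustrative model under stated assumptions, not the crate's `ExecutionGraph`: `NodeSketch`, `build_edges`, `waves`, and `cross_device_edges` are hypothetical names, buffers are identified by plain `u64` ids, and the input order is assumed to already be a valid schedule order, so edges only point from earlier nodes to later ones.

```rust
/// Hypothetical node: a device label plus the buffer ids it reads and writes.
struct NodeSketch {
    device: &'static str,
    reads: Vec<u64>,
    writes: Vec<u64>,
}

/// For each node, lists the indices of earlier nodes it depends on.
/// An edge exists on any read-after-write, write-after-write,
/// or write-after-read hazard on a shared buffer.
fn build_edges(nodes: &[NodeSketch]) -> Vec<Vec<usize>> {
    let mut deps = vec![Vec::new(); nodes.len()];
    for j in 0..nodes.len() {
        for i in 0..j {
            let (a, b) = (&nodes[i], &nodes[j]);
            let hazard = a
                .writes
                .iter()
                .any(|x| b.reads.contains(x) || b.writes.contains(x))
                || a.reads.iter().any(|x| b.writes.contains(x));
            if hazard {
                deps[j].push(i);
            }
        }
    }
    deps
}

/// Assigns each node a wave: 0 with no dependencies, otherwise one more than
/// the latest wave among its dependencies. Nodes in the same wave are independent.
fn waves(deps: &[Vec<usize>]) -> Vec<usize> {
    let mut wave = vec![0usize; deps.len()];
    for j in 0..deps.len() {
        wave[j] = deps[j].iter().map(|&i| wave[i] + 1).max().unwrap_or(0);
    }
    wave
}

/// Lists the edges (producer, consumer) whose endpoints sit on different devices;
/// in this model these are the only edges that need cross-device synchronization.
fn cross_device_edges(nodes: &[NodeSketch], deps: &[Vec<usize>]) -> Vec<(usize, usize)> {
    let mut out = Vec::new();
    for (j, ds) in deps.iter().enumerate() {
        for &i in ds {
            if nodes[i].device != nodes[j].device {
                out.push((i, j));
            }
        }
    }
    out
}
```

Nodes that share a wave and a device are candidates for batching, while nodes that share a wave on different devices can run in parallel. The pairwise hazard scan is quadratic in the number of nodes, which keeps the sketch short. A production graph builder would more likely track the last writer and readers per buffer.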

Structs§

DeviceContext
Per-device execution context.
ExecutionGraph
Execution graph representing a DAG of kernel operations.
ExecutionNode
A node in the execution graph representing a kernel or transfer operation.
KernelBufferAccess
Buffer access information for parallel kernel execution.
UnifiedExecutor
Unified executor for heterogeneous device execution.

Enums§

SyncStrategy
Cross-device synchronization strategy.

Functions§

global_executor
Get access to the global executor.