Module executor

Unified parallel execution for heterogeneous devices.

The UnifiedExecutor handles kernel execution across any mix of devices (CPU, CUDA, Metal, etc.) with proper synchronization and dependency tracking.

§Design Principles

Single abstraction - One executor handles any device mix
Device-agnostic sync - Timeline signals abstract over device-specific primitives
Zero overhead for single-device - Fast path skips synchronization when possible
Buffer dependency tracking - Following Tinygrad’s _access_resources() pattern
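
The buffer dependency tracking principle can be illustrated with a small hazard check. This is only a sketch under stated assumptions: `BufferId`, `Access`, and `must_precede` are hypothetical names introduced here, not the crate's API (the real `KernelBufferAccess` may look quite different). The idea is that two kernels need ordering whenever one writes a buffer that the other reads or writes.

```rust
use std::collections::HashSet;

/// Hypothetical buffer identifier; the real crate's type may differ.
type BufferId = u32;

/// Hypothetical per-kernel access summary (an assumption for illustration,
/// not the crate's `KernelBufferAccess`).
#[derive(Debug, Clone, Default)]
struct Access {
    reads: HashSet<BufferId>,
    writes: HashSet<BufferId>,
}

impl Access {
    fn new(reads: &[BufferId], writes: &[BufferId]) -> Self {
        Access {
            reads: reads.iter().copied().collect(),
            writes: writes.iter().copied().collect(),
        }
    }

    /// Returns true if a kernel with access set `later`, submitted after
    /// `self`, has to wait for `self` to finish.
    fn must_precede(&self, later: &Access) -> bool {
        // Read-after-write: `later` reads something `self` wrote.
        let raw = !self.writes.is_disjoint(&later.reads);
        // Write-after-write: both write the same buffer.
        let waw = !self.writes.is_disjoint(&later.writes);
        // Write-after-read: `later` overwrites something `self` still reads.
        let war = !self.reads.is_disjoint(&later.writes);
        raw || waw || war
    }
}
```

Kernels that only share read-only inputs never conflict under this check, which is what allows them to run concurrently.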

§Example

// Create an executor and register each device it may run on.
let mut executor = UnifiedExecutor::new();
executor.add_device(DeviceSpec::Cpu)?;

// Execute a previously built `schedule`; dependencies are handled automatically.
let output_id = executor.execute(&schedule)?;

§Execution Graph

For complex schedules with multiple devices, the executor builds an execution graph (DAG) where nodes are kernel operations and edges are buffer dependencies. Independent kernels on the same device can be batched, and kernels on different devices can run in parallel (with appropriate synchronization).
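
To make the graph construction concrete, here is a minimal sketch of how such a DAG could be derived from per-kernel buffer accesses. It is an illustration only: `Kernel`, `build_edges`, `waves`, and `cross_device_edges` are hypothetical names, and the crate's `ExecutionGraph` and `ExecutionNode` may be organized differently. Kernels are assumed to arrive in program order, so every dependency points to an earlier index.

```rust
use std::collections::HashMap;

/// Hypothetical buffer identifier; the real crate's type may differ.
type BufferId = u32;

/// Hypothetical kernel description: a device index plus the buffers it touches.
struct Kernel {
    device: usize,
    reads: Vec<BufferId>,
    writes: Vec<BufferId>,
}

/// For each kernel, the sorted indices of earlier kernels it depends on.
fn build_edges(kernels: &[Kernel]) -> Vec<Vec<usize>> {
    let mut last_writer: HashMap<BufferId, usize> = HashMap::new();
    let mut readers: HashMap<BufferId, Vec<usize>> = HashMap::new();
    let mut deps = Vec::with_capacity(kernels.len());
    for (i, k) in kernels.iter().enumerate() {
        let mut d: Vec<usize> = Vec::new();
        for b in &k.reads {
            // Read-after-write: wait for the last writer of this buffer.
            if let Some(&w) = last_writer.get(b) {
                d.push(w);
            }
        }
        for b in &k.writes {
            // Write-after-write: wait for the previous writer.
            if let Some(&w) = last_writer.get(b) {
                d.push(w);
            }
            // Write-after-read: wait for readers since that write.
            if let Some(rs) = readers.get(b) {
                d.extend(rs.iter().copied());
            }
        }
        // Record this kernel's own accesses for later kernels.
        for b in &k.writes {
            last_writer.insert(*b, i);
            readers.remove(b);
        }
        for b in &k.reads {
            readers.entry(*b).or_default().push(i);
        }
        d.sort_unstable();
        d.dedup();
        deps.push(d);
    }
    deps
}

/// Groups kernels into waves; kernels in the same wave are independent.
fn waves(deps: &[Vec<usize>]) -> Vec<Vec<usize>> {
    let mut level = vec![0usize; deps.len()];
    let mut out: Vec<Vec<usize>> = Vec::new();
    for (i, d) in deps.iter().enumerate() {
        // A kernel runs one wave after its latest dependency.
        let l = d.iter().map(|&j| level[j] + 1).max().unwrap_or(0);
        level[i] = l;
        if out.len() <= l {
            out.resize_with(l + 1, Vec::new);
        }
        out[l].push(i);
    }
    out
}

/// Edges (from, to) whose endpoints sit on different devices; only these
/// would need a cross-device signal in this sketch.
fn cross_device_edges(kernels: &[Kernel], deps: &[Vec<usize>]) -> Vec<(usize, usize)> {
    let mut edges = Vec::new();
    for (i, d) in deps.iter().enumerate() {
        for &j in d {
            if kernels[j].device != kernels[i].device {
                edges.push((j, i));
            }
        }
    }
    edges
}
```

In this sketch, edges between kernels on the same device need no explicit signal if that device executes its queue in order, which is one way a single-device fast path could avoid synchronization entirely.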

Structs§

DeviceContext: Per-device execution context.
ExecutionGraph: Execution graph representing a DAG of kernel operations.
ExecutionNode: A node in the execution graph representing a kernel or transfer operation.
KernelBufferAccess: Buffer access information for parallel kernel execution.
UnifiedExecutor: Unified executor for heterogeneous device execution.

Enums§

SyncStrategy: Cross-device synchronization strategy.

Functions§

global_executor: Get access to the global executor.
