Module link

Link-time optimisation and JIT linking of multiple PTX modules.

This module wraps the CUDA linker API (cuLinkCreate, cuLinkAddData, cuLinkAddFile, cuLinkComplete, cuLinkDestroy) for combining multiple PTX, cubin, or fatbin inputs into a single linked binary.
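The lifecycle behind those five driver calls (create, add inputs, complete, destroy) maps naturally onto an RAII guard. The sketch below illustrates that pattern only. It does not call the CUDA driver, and `MockLinkState` and `LinkGuard` are hypothetical names that are not part of this module; a shared flag stands in for the real handle so the destroy step can be observed.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical stand-in for a raw link-state handle (CUlinkState in the real API).
struct MockLinkState {
    destroyed: Rc<Cell<bool>>,
}

/// Hypothetical RAII guard: construction acquires the handle, `Drop` releases it.
struct LinkGuard {
    state: MockLinkState,
    inputs: Vec<String>,
}

impl LinkGuard {
    /// Plays the role of the cuLinkCreate step.
    fn create(destroyed: Rc<Cell<bool>>) -> Self {
        LinkGuard {
            state: MockLinkState { destroyed },
            inputs: Vec::new(),
        }
    }

    /// Plays the role of cuLinkAddData / cuLinkAddFile: records one input by name.
    fn add(&mut self, name: &str) {
        self.inputs.push(name.to_string());
    }

    /// Plays the role of cuLinkComplete: returns the names of the recorded inputs.
    fn complete(&self) -> Vec<String> {
        self.inputs.clone()
    }
}

impl Drop for LinkGuard {
    fn drop(&mut self) {
        // Plays the role of cuLinkDestroy: runs on every exit path,
        // including early returns caused by `?`.
        self.state.destroyed.set(true);
    }
}
```

Because the release step lives in `Drop`, a caller cannot forget it, and an error midway through adding inputs still releases the handle.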

§Platform behaviour

On macOS (where NVIDIA dropped CUDA support), all linker operations use a synthetic in-memory implementation. PTX inputs are accumulated and concatenated into a synthetic cubin blob so that the full API surface can be exercised in tests without a GPU.
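A minimal sketch of how such an accumulate-and-concatenate fallback could look is shown here. It is an assumption for illustration, not this module's actual implementation: `SyntheticLinker`, its methods, and the blob layout (a comment line with the input name followed by the PTX text) are all hypothetical.

```rust
/// Hypothetical in-memory linker for platforms without a CUDA driver.
#[derive(Default)]
struct SyntheticLinker {
    /// (name, PTX source) pairs, kept in the order they were added.
    inputs: Vec<(String, String)>,
}

impl SyntheticLinker {
    /// Accumulates one PTX input; rejects input that is empty or only whitespace.
    fn add_ptx(&mut self, ptx: &str, name: &str) -> Result<(), String> {
        if ptx.trim().is_empty() {
            return Err(format!("empty PTX input: {}", name));
        }
        self.inputs.push((name.to_string(), ptx.to_string()));
        Ok(())
    }

    /// Concatenates all accumulated inputs into one synthetic blob.
    /// The layout is an assumption: a `// name` line, then the PTX text.
    fn complete(self) -> Result<Vec<u8>, String> {
        if self.inputs.is_empty() {
            return Err("no inputs were added to the linker".to_string());
        }
        let mut blob = Vec::new();
        for (name, ptx) in &self.inputs {
            blob.extend_from_slice(b"// ");
            blob.extend_from_slice(name.as_bytes());
            blob.push(b'\n');
            blob.extend_from_slice(ptx.as_bytes());
            blob.push(b'\n');
        }
        Ok(blob)
    }
}
```

Keeping the inputs in insertion order makes the output deterministic, which is what lets tests compare blobs without a GPU.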

§Example

let opts = LinkerOptions::default();
let mut linker = Linker::new(opts)?;

linker.add_ptx(r#"
    .version 7.0
    .target sm_70
    .address_size 64
    .visible .entry kernel_a() { ret; }
"#, "module_a.ptx")?;

linker.add_ptx(r#"
    .version 7.0
    .target sm_70
    .address_size 64
    .visible .entry kernel_b() { ret; }
"#, "module_b.ptx")?;

let linked = linker.complete()?;
println!("cubin size: {} bytes", linked.cubin_size());

Structs§

LinkedModule: The output of a successful link operation.
Linker: RAII wrapper around the CUDA link state (CUlinkState).
LinkerOptions: Options controlling the JIT linker’s behaviour.

Enums§

FallbackStrategy: Strategy when an exact binary match is not found for the target GPU.
LinkInputType: The type of input data being added to the linker.
OptimizationLevel: JIT optimisation level for the linker.
