Module link

Link-time optimisation for JIT-linking multiple PTX modules.

This module wraps the CUDA linker API (cuLinkCreate, cuLinkAddData, cuLinkAddFile, cuLinkComplete, cuLinkDestroy) for combining multiple PTX, cubin, or fatbin inputs into a single linked binary.

Platform behaviour

On macOS, where NVIDIA no longer supports CUDA, all linker operations fall back to a synthetic in-memory implementation. PTX inputs are accumulated and concatenated into a synthetic cubin blob, so the full API surface can be exercised in tests without a GPU.
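The accumulate-and-concatenate behaviour described above can be illustrated with a minimal sketch. This is not the crate's implementation: `SyntheticLinker`, `SyntheticLinkError`, the per-input comment header, and the rejection of empty inputs are all assumptions made for illustration.

```rust
/// Hypothetical GPU-free linker; a sketch, not the crate's actual type.
#[derive(Debug, Default)]
pub struct SyntheticLinker {
    /// (name, PTX source) pairs, kept in the order they were added.
    inputs: Vec<(String, String)>,
}

/// Hypothetical error type for the sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum SyntheticLinkError {
    /// The named input contained no PTX text.
    EmptyInput(String),
    /// `complete` was called before any input was added.
    NoInputs,
}

impl SyntheticLinker {
    pub fn new() -> Self {
        Self::default()
    }

    /// Record one PTX input. Nothing is compiled; the text is only stored.
    pub fn add_ptx(&mut self, ptx: &str, name: &str) -> Result<(), SyntheticLinkError> {
        if ptx.trim().is_empty() {
            return Err(SyntheticLinkError::EmptyInput(name.to_owned()));
        }
        self.inputs.push((name.to_owned(), ptx.to_owned()));
        Ok(())
    }

    /// Concatenate every recorded input into one byte blob that stands in
    /// for a linked cubin. Each input is preceded by a comment line with its name.
    pub fn complete(self) -> Result<Vec<u8>, SyntheticLinkError> {
        if self.inputs.is_empty() {
            return Err(SyntheticLinkError::NoInputs);
        }
        let mut blob = Vec::new();
        for (name, ptx) in &self.inputs {
            blob.extend_from_slice(format!("// {}\n", name).as_bytes());
            blob.extend_from_slice(ptx.as_bytes());
            blob.push(b'\n');
        }
        Ok(blob)
    }
}
```

Because the blob is plain concatenated text, tests can check that every input survived and that input order is preserved, which is usually enough to exercise calling code without a device.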

Example

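// Create a linker with default options.
let opts = LinkerOptions::default();
let mut linker = Linker::new(opts)?;

// Add the first PTX module, which defines the entry point kernel_a.
linker.add_ptx(r#"
    .version 7.0
    .target sm_70
    .address_size 64
    .visible .entry kernel_a() { ret; }
"#, "module_a.ptx")?;

// Add the second PTX module, which defines the entry point kernel_b.
linker.add_ptx(r#"
    .version 7.0
    .target sm_70
    .address_size 64
    .visible .entry kernel_b() { ret; }
"#, "module_b.ptx")?;

// Finish the link and report the size of the linked binary.
let linked = linker.complete()?;
println!("cubin size: {} bytes", linked.cubin_size());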

Structs

LinkedModule
The output of a successful link operation.
Linker
RAII wrapper around the CUDA link state (CUlinkState).
LinkerOptions
Options controlling the JIT linker’s behaviour.
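The RAII pattern named in the Linker summary can be sketched without a GPU by replacing the driver calls with mocks. Everything here is hypothetical: `LinkGuard`, `MockLinkState`, `mock_link_create`, and `mock_link_destroy` are illustrative stand-ins for the handle and for cuLinkCreate and cuLinkDestroy, and do not reflect the crate's internals.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts calls to the mock destroy function so the release can be observed.
static DESTROY_CALLS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for the opaque `CUlinkState` handle.
type MockLinkState = usize;

/// Mock of the create call; a real wrapper would call into the CUDA driver.
fn mock_link_create() -> MockLinkState {
    1
}

/// Mock of the destroy call; it only records that it was invoked.
fn mock_link_destroy(_state: MockLinkState) {
    DESTROY_CALLS.fetch_add(1, Ordering::SeqCst);
}

/// Owns a link state and releases it exactly once when dropped.
pub struct LinkGuard {
    state: MockLinkState,
}

impl LinkGuard {
    pub fn new() -> Self {
        Self { state: mock_link_create() }
    }
}

impl Drop for LinkGuard {
    fn drop(&mut self) {
        // Runs on every exit path, including early returns via `?`.
        mock_link_destroy(self.state);
    }
}
```

Tying the destroy call to `Drop` means a caller cannot forget to release the handle, even when an intermediate step fails and the function returns early.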

Enums

FallbackStrategy
Strategy when an exact binary match is not found for the target GPU.
LinkInputType
The type of input data being added to the linker.
OptimizationLevel
JIT optimisation level for the linker.