Rust PTX Linker

LLVM NVPTX bitcode linker for Rust 🔥 without any external dependencies 🔥!

Purpose

It is definitely possible to create CUDA (PTX) kernels written in Rust even without the linker.

You can emit PTX code with the --emit asm flag. Unfortunately, --emit asm cannot link several modules into a single PTX file. Problems arise as soon as you write kernels of any real complexity that use functions from external crates.

A discussion pointed to another solution: using the LLVM API.

The linker does this without requiring any external dependency to be installed. How is that possible? Thanks to rustc-llvm-proxy, it avoids depending on an external LLVM library and uses the one that ships with rustc.

Windows users!

Unfortunately, due to rustc-llvm-proxy#1, MSVC targets are not supported yet.

You might run into an error like this one:

Unable to find symbol 'LLVMContextCreate' in the LLVM shared lib

For now the only solution is to use GNU targets.
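
As a rough illustration of the restriction above, a build script could refuse to continue on an affected host. The helper name `host_is_supported` is hypothetical, and the check assumes that only host triples ending in `-msvc` are affected, which is all this section states.

```rust
/// Returns true when the host triple is expected to work with the linker.
///
/// Assumption: only `*-msvc` hosts are affected, as described in this section.
/// `host_is_supported` is a hypothetical helper, not part of ptx-linker.
fn host_is_supported(host_triple: &str) -> bool {
    !host_triple.ends_with("-msvc")
}
```

In a build.rs, the triple could be read from the `HOST` environment variable that Cargo sets for build scripts, and the build could stop with a message suggesting a GNU toolchain when the check fails.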

Issues

According to the Rust NVPTX metabug, it is quite realistic to solve some of the bugs within this repo:

Non-inlined functions can't be used cross crate - rust#38787
No "undefined reference" error is raised when it should be - rust#38786

Approach

The trick is to build the kernels crate as a "dylib" and let the linker handle the "linking".

For that, you need a special target definition JSON file, and you need to specify the crate type in Cargo.toml:

[lib]
crate_type = ["dylib"]

Convenient usage

The easiest way is to rely on ptx-builder to build the device crate. It runs xargo (which in turn invokes the linker) and sets all the environment variables needed for a comfortable development flow.

You can also refer to a tutorial about using CUDA kernels written in Rust.

Advanced usage

Alternatively, you can use the linker on its own. First, install the tools:

$ cargo install ptx-linker
$ cargo install xargo

Then, create an nvptx64-nvidia-cuda target definition:

$ cd /path/to/kernels/crate
$ ptx-linker --print-target-json nvptx64-nvidia-cuda > nvptx64-nvidia-cuda.json

Finally, run a build with the proper environment variables:

$ export RUST_TARGET_PATH="/path/to/kernels/crate"
$ xargo build --target nvptx64-nvidia-cuda --release

The linker is then used to produce PTX assembly, which can usually be found at target/nvptx64-nvidia-cuda/release/KERNELS_CRATE_NAME.ptx.
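
If you drive the build from Rust (for example from a host crate's build script), the last step can be sketched with the standard library alone. The function names `xargo_build_command` and `expected_ptx_path` are hypothetical, and the output location is only the usual one mentioned above, not a guarantee.

```rust
use std::path::{Path, PathBuf};
use std::process::Command;

/// Target name used throughout this section.
const TARGET: &str = "nvptx64-nvidia-cuda";

/// Prepares (but does not run) the `xargo build` invocation for a kernels crate.
/// `RUST_TARGET_PATH` points at the directory that holds the target JSON,
/// which this sketch assumes is the kernels crate directory itself.
fn xargo_build_command(kernels_crate: &Path) -> Command {
    let mut cmd = Command::new("xargo");
    cmd.current_dir(kernels_crate)
        .env("RUST_TARGET_PATH", kernels_crate)
        .args(["build", "--target", TARGET, "--release"]);
    cmd
}

/// Path where the PTX assembly is usually found after a release build.
/// Assumption: `crate_name` is given exactly as it appears in the file name.
fn expected_ptx_path(kernels_crate: &Path, crate_name: &str) -> PathBuf {
    kernels_crate
        .join("target")
        .join(TARGET)
        .join("release")
        .join(format!("{}.ptx", crate_name))
}
```

Calling `.status()` on the returned command would actually start the build; after it succeeds, the path from `expected_ptx_path` is a reasonable first place to look for the assembly.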

Target definition

A common definition for nvptx64-nvidia-cuda looks like this:

{
    "arch": "nvptx64",
    "cpu": "sm_20",
    "data-layout": "e-i64:64-v16:16-v32:32-n16:32:64",
    "linker": "ptx-linker",
    "linker-flavor": "ld",
    "linker-is-gnu": true,
    "dll-prefix": "",
    "dll-suffix": ".ptx",
    "dynamic-linking": true,
    "llvm-target": "nvptx64-nvidia-cuda",
    "max-atomic-width": 0,
    "os": "cuda",
    "obj-is-bitcode": true,
    "panic-strategy": "abort",
    "target-endian": "little",
    "target-pointer-width": "64",
    "target-c-int-width": "32"
}

The properties that matter most to the linker are:

"linker" - the name of the linker executable in PATH.
"linker-flavor" - the linker currently supports parsing ld-style arguments.
"linker-is-gnu" - must be true for Rust to pass optimisation flags.
"dll-suffix" - specifies the correct assembly file extension.
"dynamic-linking" - allows Rust to create a dylib for the target.
"obj-is-bitcode" - stores bitcode instead of object files, which is significantly easier to work with.
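
When maintaining a hand-edited target definition, a quick sanity check for the properties listed above can catch an accidentally dropped key. The sketch below is deliberately naive: it only searches the text for each quoted key and does not parse JSON or validate values. The names `LINKER_PROPERTIES` and `missing_linker_properties` are hypothetical.

```rust
/// Target properties singled out above as important for the linker.
const LINKER_PROPERTIES: [&str; 6] = [
    "linker",
    "linker-flavor",
    "linker-is-gnu",
    "dll-suffix",
    "dynamic-linking",
    "obj-is-bitcode",
];

/// Returns the listed properties whose quoted key does not appear in `target_json`.
/// This is a plain substring search for `"key"`, not a JSON parser, so it can be
/// fooled by a string value that happens to equal a key name.
fn missing_linker_properties(target_json: &str) -> Vec<&'static str> {
    LINKER_PROPERTIES
        .iter()
        .copied()
        .filter(|key| {
            let quoted = format!("\"{}\"", key);
            !target_json.contains(quoted.as_str())
        })
        .collect()
}
```

An empty result means every listed key is present somewhere in the file; any returned names are worth checking against the definition shown earlier.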

ptx-linker 0.8.1