ptx-linker 0.3.0

NVPTX modules linker

Rust PTX Linker

Purpose

For some time now, it has been possible to write CUDA (PTX) kernels in Rust, even without this linker.

One can emit PTX code with the --emit asm flag. Problems arise, however, once kernels become more complex and use functions from external crates.

Unfortunately, --emit asm can't link several modules into a single PTX file. Another solution emerged from discussion:

  1. Emit LLVM bitcode for every crate.
  2. Link the bitcodes with llvm-link.
  3. Compile output bitcode into PTX with llc.
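
As a rough illustration of these three steps, the Rust sketch below only assembles the command lines and does not claim to be how this crate works internally. The file names, the intermediate linked.bc, the sm_20 CPU and the one-source-file-per-crate layout are assumptions made for the example; real crates would normally be built through Cargo.

```rust
use std::io;
use std::process::{Command, ExitStatus};

/// Converts a slice of string slices into an owned argument vector.
fn argv(parts: &[&str]) -> Vec<String> {
    parts.iter().map(|part| part.to_string()).collect()
}

/// Builds (without running) the commands of the manual pipeline.
/// Assumption: every crate is a single `.rs` file, and `linked.bc` is a
/// hypothetical name for the intermediate linked bitcode.
fn manual_pipeline(crate_sources: &[&str], output_ptx: &str) -> Vec<Vec<String>> {
    let mut commands = Vec::new();
    let mut bitcode_files = Vec::new();

    // 1. Emit LLVM bitcode for every crate.
    for source in crate_sources {
        let bitcode = format!("{}.bc", source.trim_end_matches(".rs"));
        commands.push(argv(&[
            "rustc", "--target", "nvptx64-nvidia-cuda",
            "--emit", "llvm-bc", "-o", bitcode.as_str(), *source,
        ]));
        bitcode_files.push(bitcode);
    }

    // 2. Link the bitcode files with llvm-link.
    let linked = "linked.bc";
    let mut link = argv(&["llvm-link"]);
    link.extend(bitcode_files.iter().cloned());
    link.extend(argv(&["-o", linked]));
    commands.push(link);

    // 3. Compile the linked bitcode into PTX with llc.
    commands.push(argv(&["llc", "-mcpu=sm_20", linked, "-o", output_ptx]));
    commands
}

/// Runs one command line; requires the tool to be available in PATH.
fn run(command: &[String]) -> io::Result<ExitStatus> {
    Command::new(&command[0]).args(&command[1..]).status()
}
```

Calling manual_pipeline(&["kernels.rs", "helpers.rs"], "kernels.ptx") yields two rustc invocations followed by one llvm-link and one llc invocation; each can then be passed to run in order.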

Issues

According to the Rust NVPTX metabug, it is quite realistic to solve some of the bugs within this repo:

  • Non-inlined functions can't be used cross crate - rust#38787
  • No "undefined reference" error is raised when it should be - rust#38786

Approach

The trick is to compile the kernels crate as a dylib.

So you usually have to add the following to your Cargo.toml:

[lib]
crate_type = ["dylib"]

Some modifications also have to be made to the target definition:

{
    "arch": "nvptx64",
    "cpu": "sm_20",
    "data-layout": "e-i64:64-v16:16-v32:32-n16:32:64",
    "linker": "ptx-linker",
    "linker-flavor": "ld",
    "linker-is-gnu": true,
    "dll-prefix": "",
    "dll-suffix": ".ptx",
    "dynamic-linking": true,
    "llvm-target": "nvptx64-nvidia-cuda",
    "max-atomic-width": 0,
    "os": "cuda",
    "obj-is-bitcode": true,
    "panic-strategy": "abort",
    "target-endian": "little",
    "target-pointer-width": "64",
    "target-c-int-width": "32"
}

The options that matter most for the linker are:

  • "linker": "ptx-linker" - the linker executable in PATH.
  • "linker-flavor": "ld" - currently only ld-flavor argument parsing is supported.
  • "linker-is-gnu": true - needed for Rust to pass the optimisation flag.
  • "dll-suffix": ".ptx" - correct file extension for PTX assembly output.
  • "dynamic-linking": true - allows Rust to create a dylib.
  • "obj-is-bitcode": true - store bitcode instead of object files.
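
To make the "ld flavor" point more concrete, here is a minimal sketch of parsing an ld-style command line. It is not the crate's actual parser: the LinkerArgs struct, the parse_ld_args function and the set of recognised flags (-o, -L, -O<level>) are assumptions chosen for illustration, and every other flag is simply ignored.

```rust
/// Hypothetical summary of an ld-style invocation (not the crate's real type).
#[derive(Debug, Default, PartialEq)]
struct LinkerArgs {
    output: Option<String>,
    search_paths: Vec<String>,
    inputs: Vec<String>,
    optimise: bool,
}

/// Parses a small, assumed subset of ld-style arguments.
fn parse_ld_args(args: &[&str]) -> LinkerArgs {
    let mut parsed = LinkerArgs::default();
    let mut iter = args.iter().copied();

    while let Some(arg) = iter.next() {
        match arg {
            // `-o <file>`: where the final PTX assembly should be written.
            "-o" => parsed.output = iter.next().map(String::from),
            // `-L <dir>`: library search path.
            "-L" => parsed.search_paths.extend(iter.next().map(String::from)),
            // `-O<level>`: any level other than 0 enables optimisation.
            _ if arg.starts_with("-O") => parsed.optimise = arg != "-O0",
            // Any other flag is ignored in this sketch.
            _ if arg.starts_with('-') => {}
            // Everything else is treated as an input file.
            _ => parsed.inputs.push(arg.to_string()),
        }
    }
    parsed
}
```

With "obj-is-bitcode": true the input files listed on such a command line contain LLVM bitcode rather than machine code, which is what lets a linker of this kind merge them before generating PTX.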

After that you can:

$ echo "Installing PTX linker"
$ cargo install ptx-linker

$ cd /path/to/kernels/crate
$ echo "Building PTX assembly output"
$ xargo rustc --target nvptx64-nvidia-cuda --release

We are not going to run any LLVM tools, because they are unlikely to be in PATH, or their version may differ from Rust's LLVM. Instead, we use the LLVM API through librustc_llvm.

For that purpose we have to find the library and link against it; the build script at build.rs is responsible for that job. A significant drawback is that you will very likely need to recompile the linker after every Rust update. This happens because the Rust commit identifier embedded in the library name changes (e.g. rustc_llvm-697fdfdd74f1fb5d.so), so the dynamic loader can no longer find the old library.
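
The lookup can be sketched roughly as follows. This is not the actual build.rs: the function names are hypothetical, the optional lib prefix and the .so suffix assume a Linux-style layout, and the caller is assumed to supply the file names found in the toolchain's library directory.

```rust
/// Returns the first file name that looks like the rustc LLVM shared library,
/// e.g. `rustc_llvm-697fdfdd74f1fb5d.so`, with or without a `lib` prefix.
fn find_rustc_llvm<'a>(file_names: &[&'a str]) -> Option<&'a str> {
    file_names.iter().copied().find(|&name| {
        let stem = name.strip_prefix("lib").unwrap_or(name);
        stem.starts_with("rustc_llvm-") && stem.ends_with(".so")
    })
}

/// Builds the line a build script would print so that Cargo links against
/// the library. The hash is part of the library name, so a new toolchain
/// produces a different directive.
fn link_directive(file_name: &str) -> Option<String> {
    let stem = file_name.strip_prefix("lib").unwrap_or(file_name);
    let lib_name = stem.strip_suffix(".so")?;
    Some(format!("cargo:rustc-link-lib=dylib={}", lib_name))
}
```

Because the hash ends up in the directive itself, a binary linked against one toolchain's library refers to a file name that no longer exists after an update, which matches the drawback described above.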