krnl
Safe, portable, high performance compute (GPGPU) kernels.
Developed for autograph.
- Similar functionality to CUDA and OpenCL.
- Supports GPUs and other Vulkan 1.2 capable devices.
- macOS / iOS supported via MoltenVK.
- Kernels are written inline, entirely in Rust.
- Simple iterator patterns can be implemented without unsafe.
- Supports inline SPIR-V assembly.
- DebugPrintf integration, generates backtraces for panics.
- Buffers on the host can be accessed natively as Vecs and slices.
krnlc
Kernel compiler for krnl.
- Built on RustGPU's spirv-builder.
- Supports dependencies defined in Cargo.toml.
- Uses spirv-tools to validate and optimize.
- Compiles to "krnl-cache.rs", so the crate will build on stable Rust.
See the docs for installation and usage instructions.
Installing
For device functionality (kernels), install Vulkan for your platform.
- For development, it's recommended to install the LunarG Vulkan SDK, which includes additional tools:
    - vulkaninfo
    - Validation layers
        - DebugPrintf
    - spirv-tools
        - This is used by krnlc for SPIR-V validation and optimization.
        - krnlc builds by default without needing spirv-tools to be installed.
Test
- Check that `vulkaninfo --summary` shows your devices.
    - Instance version should be >= 1.2.
- Alternatively, check that `cargo test --test integration_tests -- --exact none` shows your devices.
    - You can run all the tests with `cargo test`.
Getting Started
See the docs or build them locally with `cargo doc --all-features --open`.
Example
use ;
Performance
NVIDIA GeForce GTX 1060 with Max-Q Design
alloc
| | krnl | cuda | ocl |
|---|---|---|---|
| 1,000,000 | 319.07 ns (✅ 1.00x) | 112.83 us (❌ 353.62x slower) | 486.10 ns (❌ 1.52x slower) |
| 10,000,000 | 318.22 ns (✅ 1.00x) | 1.11 ms (❌ 3494.06x slower) | 493.02 ns (❌ 1.55x slower) |
| 64,000,000 | 318.40 ns (✅ 1.00x) | 6.31 ms (❌ 19803.98x slower) | 493.07 ns (❌ 1.55x slower) |
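The relative figures in parentheses can be read as a ratio against the krnl column, which serves as the 1.00x baseline. The following is a minimal sketch of that arithmetic; the function name `relative` is hypothetical and not part of krnl or the benchmark tooling. Because the displayed times are already rounded, recomputed ratios may differ slightly from the table in the last digit.

```rust
/// Describe `time_ns` relative to `baseline_ns` (both in nanoseconds).
/// `relative` is an illustrative helper, not an API of krnl.
fn relative(baseline_ns: f64, time_ns: f64) -> String {
    if time_ns > baseline_ns {
        // Slower: how many times longer than the baseline it took.
        format!("{:.2}x slower", time_ns / baseline_ns)
    } else if time_ns < baseline_ns {
        // Faster: how many times shorter than the baseline it took.
        format!("{:.2}x faster", baseline_ns / time_ns)
    } else {
        "1.00x".to_string()
    }
}

fn main() {
    // alloc, 1,000,000: krnl 319.07 ns vs cuda 112.83 us and ocl 486.10 ns.
    println!("cuda: {}", relative(319.07, 112_830.0));
    println!("ocl:  {}", relative(319.07, 486.10));
}
```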
upload
| | krnl | cuda | ocl |
|---|---|---|---|
| 1,000,000 | 339.76 us (✅ 1.00x) | 363.93 us (✅ 1.07x slower) | 789.44 us (❌ 2.32x slower) |
| 10,000,000 | 4.90 ms (✅ 1.00x) | 3.81 ms (✅ 1.29x faster) | 8.84 ms (❌ 1.80x slower) |
| 64,000,000 | 25.92 ms (✅ 1.00x) | 24.58 ms (✅ 1.05x faster) | 56.74 ms (❌ 2.19x slower) |
download
| | krnl | cuda | ocl |
|---|---|---|---|
| 1,000,000 | 593.88 us (✅ 1.00x) | 461.01 us (✅ 1.29x faster) | 20.12 ms (❌ 33.88x slower) |
| 10,000,000 | 5.66 ms (✅ 1.00x) | 4.07 ms (✅ 1.39x faster) | 20.13 ms (❌ 3.55x slower) |
| 64,000,000 | 29.50 ms (✅ 1.00x) | 25.71 ms (✅ 1.15x faster) | 37.48 ms (❌ 1.27x slower) |
zero
| | krnl | cuda | ocl |
|---|---|---|---|
| 1,000,000 | 38.49 us (✅ 1.00x) | 25.31 us (✅ 1.52x faster) | 35.16 us (✅ 1.09x faster) |
| 10,000,000 | 254.52 us (✅ 1.00x) | 243.01 us (✅ 1.05x faster) | 252.41 us (✅ 1.01x faster) |
| 64,000,000 | 1.54 ms (✅ 1.00x) | 1.55 ms (✅ 1.01x slower) | 1.56 ms (✅ 1.02x slower) |
saxpy
| | krnl | cuda | ocl |
|---|---|---|---|
| 1,000,000 | 88.59 us (✅ 1.00x) | 81.25 us (✅ 1.09x faster) | 89.24 us (✅ 1.01x slower) |
| 10,000,000 | 742.25 us (✅ 1.00x) | 770.35 us (✅ 1.04x slower) | 780.49 us (✅ 1.05x slower) |
| 64,000,000 | 4.68 ms (✅ 1.00x) | 4.91 ms (✅ 1.05x slower) | 4.92 ms (✅ 1.05x slower) |
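For readers unfamiliar with the name, saxpy is the standard "single-precision a times x plus y" operation, `y[i] += alpha * x[i]`. The sketch below is a plain host-side Rust reference, written as a safe iterator pattern over slices; it does not use krnl, and `saxpy_host` is a hypothetical name chosen for illustration rather than the code used in the benchmark.

```rust
/// Host-side reference for saxpy: y[i] += alpha * x[i].
/// `saxpy_host` is an illustrative name, not part of krnl.
fn saxpy_host(alpha: f32, x: &[f32], y: &mut [f32]) {
    // `zip` stops at the shorter slice, so no indexing or unsafe is needed.
    for (x, y) in x.iter().copied().zip(y.iter_mut()) {
        *y += alpha * x;
    }
}

fn main() {
    let x = vec![1.0_f32, 2.0, 3.0];
    let mut y = vec![10.0_f32, 20.0, 30.0];
    saxpy_host(2.0, &x, &mut y);
    println!("{y:?}"); // prints [12.0, 24.0, 36.0]
}
```

Each element is touched once and independently of the others, which is why the operation maps naturally onto one GPU invocation per element.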
Recent Changes
See Releases.md
License
Dual-licensed to be compatible with the Rust project.
Licensed under the Apache License, Version 2.0 http://www.apache.org/licenses/LICENSE-2.0 or the MIT license http://opensource.org/licenses/MIT, at your option. This file may not be copied, modified, or distributed except according to those terms.
Contribution
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.