SASS Assembler
A specialized assembler for NVIDIA's Shader Assembly (SASS) language, targeting native GPU machine code generation.
🏛️ Architecture
graph TB
subgraph Flow [SASS Compilation Flow]
direction TB
A[SASS Instructions] --> B[Kernel Builder]
B --> C[Instruction Encoder]
C --> D[ELF Encapsulation]
D --> E[CUDA Binary .cubin]
end
subgraph Groups [Instruction Groups]
direction LR
F[Floating Point]
G[Integer Ops]
H[Memory Ops]
I[Control Flow]
end
B -.-> F
B -.-> G
B -.-> H
B -.-> I
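
The dotted edges from the Kernel Builder to the instruction groups can be read as a classification step. The following is a minimal sketch of that idea, not the crate's actual API: the `Group` enum and `classify` function are hypothetical names, and the integer mnemonics (`IADD3`, `IMAD`) are assumed examples that this README does not list.

```rust
/// Hypothetical instruction groups mirroring the diagram above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Group {
    FloatingPoint,
    Integer,
    Memory,
    ControlFlow,
}

/// Map a mnemonic to its group; returns None for anything unrecognized.
/// The integer mnemonics are assumptions chosen for illustration.
pub fn classify(mnemonic: &str) -> Option<Group> {
    match mnemonic {
        "FADD" | "FMUL" => Some(Group::FloatingPoint),
        "IADD3" | "IMAD" => Some(Group::Integer),
        "LDG" | "STG" => Some(Group::Memory),
        "EXIT" | "NOP" => Some(Group::ControlFlow),
        _ => None,
    }
}
```

Returning `Option` rather than panicking lets a builder report unsupported mnemonics as ordinary errors.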
🚀 Features
Core Capabilities
- Native GPU Encoding: Direct encoding of SASS instructions into their binary representation for NVIDIA GPUs.
- Kernel Management: Organizes instructions into valid GPU kernels with appropriate ELF section headers (.text, .nv_fatbin).
- Instruction Support: Covers a wide range of SASS instructions, including floating-point arithmetic (FADD, FMUL), integer operations, and global memory access (LDG, STG).
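
To make "direct encoding" concrete, here is a minimal sketch of packing an instruction into a fixed-width word. It assumes the 128-bit instruction width used by Volta and later architectures. The `Instr` struct, the bit layout, and any opcode values passed in are hypothetical placeholders, not real SASS encodings and not this crate's API.

```rust
/// Hypothetical instruction description; field names are illustrative only.
#[derive(Debug, Clone, Copy)]
pub struct Instr {
    pub opcode: u16, // placeholder value, NOT a real SASS opcode
    pub dst: u8,     // destination register index
    pub src_a: u8,   // first source register index
    pub src_b: u8,   // second source register index
}

/// Pack the fields into a 128-bit word using an assumed layout:
/// bits 0..12 opcode, 16..24 dst, 24..32 src_a, 32..40 src_b.
pub fn encode(i: Instr) -> u128 {
    (u128::from(i.opcode) & 0xFFF)
        | (u128::from(i.dst) << 16)
        | (u128::from(i.src_a) << 24)
        | (u128::from(i.src_b) << 32)
}

/// Serialize the word as 16 little-endian bytes, ready to append to a code section.
pub fn to_bytes(word: u128) -> [u8; 16] {
    word.to_le_bytes()
}
```

A real encoder would also fill in scheduling and control bits, which this sketch leaves as zero.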
Advanced Features
- ELF/CUBIN Generation: Automatically wraps generated machine code into standard ELF containers compatible with the CUDA driver API.
- Disassembly View: (🚧) Provides tools to disassemble binary kernels back into human-readable SASS text.
- Control Flow: Supports basic control flow instructions and termination codes (EXIT, NOP).
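
As an illustration of the ELF wrapping step, the sketch below builds a bare 64-byte ELF64 file header tagged with the CUDA machine type (`EM_CUDA`, value 190). The function name `elf64_header` and its parameters are hypothetical. The `ET_EXEC` file type is an assumption, and the NVIDIA-specific OS ABI bytes and `e_flags` contents that real cubins carry are left to the caller or zeroed.

```rust
/// ELF machine type registered for NVIDIA CUDA.
pub const EM_CUDA: u16 = 190;

/// Build a minimal little-endian ELF64 header with no program headers.
/// `e_flags` (architecture flags) is passed through uninterpreted.
pub fn elf64_header(e_flags: u32, shoff: u64, shnum: u16, shstrndx: u16) -> [u8; 64] {
    let mut h = [0u8; 64];
    h[0..4].copy_from_slice(&[0x7f, b'E', b'L', b'F']); // magic
    h[4] = 2; // EI_CLASS: ELFCLASS64
    h[5] = 1; // EI_DATA: ELFDATA2LSB (little-endian)
    h[6] = 1; // EI_VERSION: EV_CURRENT
    h[16..18].copy_from_slice(&2u16.to_le_bytes()); // e_type: ET_EXEC (assumed)
    h[18..20].copy_from_slice(&EM_CUDA.to_le_bytes()); // e_machine
    h[20..24].copy_from_slice(&1u32.to_le_bytes()); // e_version
    h[40..48].copy_from_slice(&shoff.to_le_bytes()); // e_shoff
    h[48..52].copy_from_slice(&e_flags.to_le_bytes()); // e_flags
    h[52..54].copy_from_slice(&64u16.to_le_bytes()); // e_ehsize
    h[58..60].copy_from_slice(&64u16.to_le_bytes()); // e_shentsize
    h[60..62].copy_from_slice(&shnum.to_le_bytes()); // e_shnum
    h[62..64].copy_from_slice(&shstrndx.to_le_bytes()); // e_shstrndx
    h
}
```

A loadable cubin additionally needs section headers, a section-name string table, and NVIDIA metadata sections, none of which this header-only sketch produces.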
💻 Usage
Generating a Simple CUDA Kernel
The following example shows how to create a basic SASS kernel and write it to a binary file.
use ;
use fs;
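
A minimal sketch of the write-to-file step, assuming the kernel has already been encoded as a sequence of 128-bit instruction words. `write_kernel_bytes` is a hypothetical helper built only on the standard library. It is not this crate's API, and it writes raw instruction bytes rather than a complete .cubin.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: serialize instruction words as little-endian bytes
/// and write them to `path`. Returns the number of bytes written.
pub fn write_kernel_bytes(path: &Path, words: &[u128]) -> io::Result<usize> {
    let mut bytes = Vec::with_capacity(words.len() * 16);
    for w in words {
        bytes.extend_from_slice(&w.to_le_bytes());
    }
    fs::write(path, &bytes)?;
    Ok(bytes.len())
}
```

Returning `io::Result` lets the caller decide how to handle a failed write instead of panicking inside the helper.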
🛠️ Support Status
| Feature | Support Level | Architectures |
|---|---|---|
| Float Ops (FADD/FMUL) | ✅ Full | Volta, Turing, Ampere |
| Integer Ops | ✅ Full | Volta, Turing, Ampere |
| Global Memory (LDG/STG) | ✅ Full | Volta, Turing, Ampere |
| Shared Memory (LDS/STS) | 🚧 Partial | - |
| Predicates | 🚧 Partial | - |
Legend: ✅ Supported, 🚧 In Progress, ❌ Not Supported
🔗 Relations
- gaia-types: Provides the error-handling and binary types used for instruction encoding.
- gaia-assembler: Serves as the native NVIDIA GPU backend for the Gaia project, allowing high-performance offloading of compute tasks.