SASS Assembler
A specialized assembler for NVIDIA's Shader Assembly (SASS) language, targeting native GPU machine code generation.
🏛️ Architecture
graph TB
    subgraph "SASS Compilation Flow"
        A[SASS Instructions] --> B[Kernel Builder]
        B --> C[Instruction Encoder]
        C --> D[ELF Encapsulation]
        D --> E["CUDA Binary (.cubin)"]
        subgraph "Instruction Groups"
            F["Floating Point (FADD/FMUL)"]
            G["Integer Ops (IMAD/ISETP)"]
            H["Memory Ops (LDG/STG)"]
            I["Control Flow (BRA/EXIT)"]
        end
        B --> F
        B --> G
        B --> H
        B --> I
    end
🚀 Features
Core Capabilities
- Native GPU Encoding: Direct encoding of SASS instructions into their binary representation for NVIDIA GPUs.
- Kernel Management: Organizes instructions into valid GPU kernels with appropriate ELF section headers (.text, .nv_fatbin).
- Instruction Support: Covers a wide range of SASS instructions including floating-point arithmetic (FADD, FMUL), integer operations, and global memory access (LDG, STG).
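As a rough illustration of what direct encoding involves, the sketch below packs an opcode and three register indices into a 128-bit instruction word (the instruction width used from Volta onward) and serializes it as little-endian bytes. The `Instr` type, the bit positions, and the opcode value are placeholders invented for this example; they are not real SASS encodings and not this crate's API.

```rust
/// Hypothetical instruction description (not the crate's real type).
#[derive(Debug, Clone, Copy)]
struct Instr {
    opcode: u16, // placeholder value, NOT a real SASS opcode
    dst: u8,     // destination register index
    src_a: u8,   // first source register index
    src_b: u8,   // second source register index
}

/// Pack the fields into one 128-bit word using an assumed layout:
/// bits 0..12 opcode, 16..24 dst, 24..32 src_a, 32..40 src_b.
fn encode(i: Instr) -> u128 {
    (u128::from(i.opcode) & 0xFFF)
        | (u128::from(i.dst) << 16)
        | (u128::from(i.src_a) << 24)
        | (u128::from(i.src_b) << 32)
}

/// Serialize instructions as little-endian bytes, 16 bytes per instruction.
fn encode_all(instrs: &[Instr]) -> Vec<u8> {
    let mut out = Vec::with_capacity(instrs.len() * 16);
    for &i in instrs {
        out.extend_from_slice(&encode(i).to_le_bytes());
    }
    out
}

fn main() {
    let add = Instr { opcode: 0x021, dst: 0, src_a: 1, src_b: 2 };
    let bytes = encode_all(&[add]);
    println!("{} bytes: {:02x?}", bytes.len(), bytes);
}
```

A real encoder would also have to fill in per-architecture fields such as predicates and scheduling control bits, which this sketch leaves at zero.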
Advanced Features
- ELF/CUBIN Generation: Automatically wraps generated machine code into standard ELF containers compatible with the CUDA driver API.
- Disassembly View: (🚧) Provides tools to disassemble binary kernels back into human-readable SASS text.
- Control Flow: Supports basic control flow instructions and termination codes (EXIT, NOP).
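To give a sense of the ELF wrapping step, here is a minimal sketch that emits only the 64-byte ELF64 file header with the machine type set to `EM_CUDA` (190). It is not this crate's implementation: the function name is hypothetical, the object type is assumed to be `ET_EXEC`, and the OS ABI and flags are left as caller-supplied or zero placeholders. A loadable cubin additionally needs section headers, string tables, and NVIDIA-specific sections, all omitted here.

```rust
const EM_CUDA: u16 = 190; // ELF machine number for NVIDIA CUDA
const ET_EXEC: u16 = 2;   // object type assumed for this sketch

/// Build a bare ELF64 little-endian file header (hypothetical helper).
/// `shoff`, `shnum`, and `shstrndx` describe a section header table the
/// caller would append; `flags` is passed through unchanged.
fn elf64_header(shoff: u64, shnum: u16, shstrndx: u16, flags: u32) -> Vec<u8> {
    let mut h = Vec::with_capacity(64);
    // e_ident: magic, 64-bit class, little-endian, version 1, OS ABI zeroed
    h.extend_from_slice(&[0x7f, b'E', b'L', b'F', 2, 1, 1, 0]);
    // e_ident continued: ABI version and padding, all zero
    h.extend_from_slice(&[0u8; 8]);
    h.extend_from_slice(&ET_EXEC.to_le_bytes());  // e_type
    h.extend_from_slice(&EM_CUDA.to_le_bytes());  // e_machine
    h.extend_from_slice(&1u32.to_le_bytes());     // e_version
    h.extend_from_slice(&0u64.to_le_bytes());     // e_entry
    h.extend_from_slice(&0u64.to_le_bytes());     // e_phoff (no program headers)
    h.extend_from_slice(&shoff.to_le_bytes());    // e_shoff
    h.extend_from_slice(&flags.to_le_bytes());    // e_flags
    h.extend_from_slice(&64u16.to_le_bytes());    // e_ehsize
    h.extend_from_slice(&0u16.to_le_bytes());     // e_phentsize
    h.extend_from_slice(&0u16.to_le_bytes());     // e_phnum
    h.extend_from_slice(&64u16.to_le_bytes());    // e_shentsize (ELF64 section header size)
    h.extend_from_slice(&shnum.to_le_bytes());    // e_shnum
    h.extend_from_slice(&shstrndx.to_le_bytes()); // e_shstrndx
    h
}

fn main() {
    let header = elf64_header(64, 0, 0, 0);
    println!("header is {} bytes", header.len());
}
```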
💻 Usage
Generating a Simple CUDA Kernel
The following example shows how to create a basic SASS kernel and write it to a binary file.
use ;
use std::fs;
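A minimal sketch of the overall shape such a program could take, assuming a hypothetical `KernelBuilder` that collects pre-encoded 16-byte instruction words and hands back the raw bytes, which are then written with `std::fs::write`. Every type and method name below is an assumption rather than this crate's API, the instruction bytes are zero placeholders rather than real SASS, and the output is raw code bytes, not a loadable `.cubin`.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical builder; the real crate's types and method names may differ.
struct KernelBuilder {
    name: String,
    code: Vec<u8>,
}

impl KernelBuilder {
    fn new(name: &str) -> Self {
        Self { name: name.to_string(), code: Vec::new() }
    }

    /// Append one pre-encoded 128-bit instruction word.
    fn push(mut self, word: [u8; 16]) -> Self {
        self.code.extend_from_slice(&word);
        self
    }

    /// Finish and return the kernel name and raw code bytes (no ELF wrapping here).
    fn build(self) -> (String, Vec<u8>) {
        (self.name, self.code)
    }
}

/// Build a two-instruction placeholder kernel and write its bytes to `path`.
fn write_kernel(path: &Path) -> io::Result<usize> {
    let placeholder = [0u8; 16]; // placeholder bytes, not a real SASS instruction
    let (_name, bytes) = KernelBuilder::new("my_kernel")
        .push(placeholder)
        .push(placeholder)
        .build();
    fs::write(path, &bytes)?;
    Ok(bytes.len())
}

fn main() -> io::Result<()> {
    let path = std::env::temp_dir().join("kernel.bin");
    let n = write_kernel(&path)?;
    println!("wrote {} bytes to {}", n, path.display());
    Ok(())
}
```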
🛠️ Support Status
| Feature | Support Level | Architectures |
|---|---|---|
| Float Ops (FADD/FMUL) | ✅ Full | Volta, Turing, Ampere |
| Integer Ops | ✅ Full | Volta, Turing, Ampere |
| Global Memory (LDG/STG) | ✅ Full | Volta, Turing, Ampere |
| Shared Memory (LDS/STS) | 🚧 Partial | - |
| Predicates | 🚧 Partial | - |
Legend: ✅ Supported, 🚧 In Progress, ❌ Not Supported
🔗 Relations
- gaia-types: Provides the basic error-handling and binary types this crate uses for instruction encoding.
- gaia-assembler: This crate serves as the native NVIDIA GPU backend for the Gaia project, allowing compute tasks to be offloaded to the GPU.