SASS Assembler

A specialized assembler for NVIDIA's Shader Assembly (SASS) language that encodes instructions into native GPU machine code.

🏛️ Architecture

graph TB
    subgraph Flow [SASS Compilation Flow]
        direction TB
        A[SASS Instructions] --> B[Kernel Builder]
        B --> C[Instruction Encoder]
        C --> D[ELF Encapsulation]
        D --> E[CUDA Binary .cubin]
    end

    subgraph Groups [Instruction Groups]
        direction LR
        F[Floating Point]
        G[Integer Ops]
        H[Memory Ops]
        I[Control Flow]
    end

    B -.-> F
    B -.-> G
    B -.-> H
    B -.-> I
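The dotted edges in the diagram suggest that the Kernel Builder sorts each instruction into one of four groups. A minimal sketch of that classification step follows. The `Group` enum and `classify` function are hypothetical names introduced for illustration and are not taken from this crate's API, and the mnemonic list is deliberately tiny.

```rust
/// Hypothetical instruction groups mirroring the diagram (not the crate's real types).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Group {
    FloatingPoint,
    Integer,
    Memory,
    ControlFlow,
}

/// Classify a SASS mnemonic into one of the four groups.
/// Only a handful of mnemonics are listed; anything else yields `None`.
fn classify(mnemonic: &str) -> Option<Group> {
    // Drop modifiers such as ".E" so that "LDG.E" is treated as "LDG".
    let base = mnemonic.split('.').next().unwrap_or(mnemonic);
    match base.to_ascii_uppercase().as_str() {
        "FADD" | "FMUL" | "FFMA" => Some(Group::FloatingPoint),
        "IADD3" | "IMAD" | "LOP3" => Some(Group::Integer),
        "LDG" | "STG" | "LDS" | "STS" => Some(Group::Memory),
        "EXIT" | "BRA" | "NOP" => Some(Group::ControlFlow),
        _ => None,
    }
}
```

Returning `Option` rather than panicking lets a caller report an unsupported mnemonic as an ordinary error.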

🚀 Features

Core Capabilities

Native GPU Encoding: Direct encoding of SASS instructions into their binary representation for NVIDIA GPUs.
Kernel Management: Organizes instructions into valid GPU kernels with appropriate ELF section headers (.text, .nv_fatbin).
Instruction Support: Covers floating-point arithmetic (FADD, FMUL), integer operations, and global memory access (LDG, STG). See Support Status for per-feature coverage.
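To make "direct encoding" concrete, here is a minimal sketch of packing one instruction into a fixed-width word. Volta and later architectures use 128-bit instructions, so the sketch emits 16 bytes. Everything else is an assumption made for illustration: `parse_reg` and `pack` are hypothetical helpers, the opcode value is a placeholder, and the bit positions do not reflect the real SASS layout or this crate's encoder.

```rust
/// Parse a register name such as "R7" into its index.
/// "RZ" (the zero register) is mapped to 255 here.
fn parse_reg(name: &str) -> Option<u8> {
    if name == "RZ" {
        return Some(255);
    }
    let digits = name.strip_prefix('R')?;
    // Accept plain decimal digits only (rejects "", "+5", "x").
    if digits.is_empty() || !digits.chars().all(|c| c.is_ascii_digit()) {
        return None;
    }
    let idx: u8 = digits.parse().ok()?;
    // 255 is reserved for RZ in this sketch, so only R0..=R254 are accepted.
    if idx < 255 { Some(idx) } else { None }
}

/// Pack an opcode and three register indices into one 128-bit word,
/// returned as 16 little-endian bytes.
/// ILLUSTRATIVE field layout: opcode in bits 0..12, dst at bit 16,
/// src0 at bit 24, src1 at bit 32.
fn pack(opcode: u16, dst: u8, src0: u8, src1: u8) -> [u8; 16] {
    let word = u128::from(opcode & 0x0FFF)
        | u128::from(dst) << 16
        | u128::from(src0) << 24
        | u128::from(src1) << 32;
    word.to_le_bytes()
}
```

A caller would resolve the operand names first, for example `parse_reg("R1")`, and then pass the indices to `pack` together with the opcode for the mnemonic.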

Advanced Features

ELF/CUBIN Generation: Automatically wraps generated machine code into standard ELF containers compatible with the CUDA driver API.
Disassembly View: (🚧 In Progress) Tools to disassemble binary kernels back into human-readable SASS text.
Control Flow: Supports basic control-flow instructions such as EXIT for kernel termination, along with NOP.
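As a rough illustration of what an ELF container for CUDA looks like at the byte level, the sketch below checks the first few header fields of a buffer. It assumes a 64-bit little-endian ELF and uses the `EM_CUDA` machine id (190) from the standard ELF headers. `looks_like_cubin` is a hypothetical helper, not part of this crate, and passing this check does not mean the CUDA driver will accept the file.

```rust
/// ELF machine id assigned to NVIDIA CUDA binaries (EM_CUDA).
const EM_CUDA: u16 = 190;

/// Return true if `bytes` begins with a 64-bit little-endian ELF header
/// whose e_machine field equals EM_CUDA. Only the header prefix is inspected.
fn looks_like_cubin(bytes: &[u8]) -> bool {
    // e_machine occupies bytes 18..20, so at least 20 bytes are needed.
    if bytes.len() < 20 {
        return false;
    }
    let magic_ok = bytes.starts_with(b"\x7fELF");
    let class64 = bytes[4] == 2; // EI_CLASS == ELFCLASS64
    let little_endian = bytes[5] == 1; // EI_DATA == ELFDATA2LSB
    let machine = u16::from_le_bytes([bytes[18], bytes[19]]);
    magic_ok && class64 && little_endian && machine == EM_CUDA
}
```

A check like this could serve as a quick sanity test on generated output before handing it to other tools, provided the writer emits a real ELF header.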

💻 Usage

Generating a Simple CUDA Kernel

The following example shows how to create a basic SASS kernel and write it to a binary file.

use sass_assembler::{SassProgram, SassWriter, instructions::SassInstruction};
use std::fs;

fn main() {
    // 1. Define SASS instructions
    let instructions = vec![
        SassInstruction::FAdd { dst: "R0".into(), src0: "R1".into(), src1: "R2".into() },
        SassInstruction::Exit,
    ];

    // 2. Build program structure
    let program = SassProgram::new("my_kernel", instructions);

    // 3. Write to binary (simulated ELF)
    let writer = SassWriter::new();
    let binary = writer.write(&program).expect("Failed to generate SASS binary");
    
    // 4. Save the generated bytes to disk
    fs::write("kernel.cubin", binary).expect("Failed to write kernel.cubin");
}

🛠️ Support Status

Feature	Support Level	Architectures
Float Ops (FADD/FMUL)	✅ Full	Volta, Turing, Ampere
Integer Ops	✅ Full	Volta, Turing, Ampere
Global Memory (LDG/STG)	✅ Full	Volta, Turing, Ampere
Shared Memory (LDS/STS)	🚧 Partial	-
Predicates	🚧 Partial	-

Legend: ✅ Supported, 🚧 In Progress, ❌ Not Supported

🔗 Relations

gaia-types: This crate uses its basic error-handling and binary types for instruction encoding.
gaia-assembler: This crate serves as its native NVIDIA GPU backend within the Gaia project, allowing compute tasks to be offloaded to the GPU.

sass-assembler 0.0.4