PYC Assembler
A specialized toolchain for Python bytecode, enabling the assembly, disassembly, and manipulation of compiled Python files (.pyc).
🏛️ Architecture
graph TB
subgraph "Python Bytecode Pipeline"
A[Python Source / Bytecode] --> B[Python Program Model]
B --> C[Marshal Serializer/Deserializer]
C --> D[PYC Header Generator]
D --> E["Python Binary (.pyc)"]
subgraph "Internal Structure"
F[Code Object Builder]
G[Opcode Encoder]
H[Constant Table]
end
B --> F
F --> G
F --> H
end
🚀 Features
Core Capabilities
- Version Support: Optimized for Python 3.x bytecode formats with support for version-specific magic numbers.
- Marshal Format: Full implementation of Python's `marshal` serialization protocol for complex objects like code objects, strings, and tuples.
- Instruction Encoding: Precise encoding of Python opcodes and their arguments, including support for extended arguments (🚧).
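As a rough illustration of the instruction encoding mentioned above, the following Python sketch uses only CPython's standard library, not this project's API. It assumes the 2-byte "wordcode" layout used since CPython 3.6, where an argument larger than one byte is prefixed by `EXTENDED_ARG` instructions. The helper names `encode_instruction` and `decode_instructions` are hypothetical, and the inline cache entries added in Python 3.11 are deliberately ignored.

```python
import dis

# Opcode numbers differ between Python versions, so look them up at runtime.
EXTENDED_ARG = dis.opmap["EXTENDED_ARG"]
LOAD_CONST = dis.opmap["LOAD_CONST"]


def encode_instruction(opcode: int, arg: int = 0) -> bytes:
    """Encode one instruction as (opcode, low byte) pairs (hypothetical helper)."""
    if not 0 <= arg <= 0xFFFFFFFF:
        raise ValueError("argument must fit in 32 bits")
    out = bytearray()
    # Emit EXTENDED_ARG prefixes for the higher bytes, most significant first.
    for shift in (24, 16, 8):
        if arg >> shift:
            out += bytes([EXTENDED_ARG, (arg >> shift) & 0xFF])
    out += bytes([opcode, arg & 0xFF])
    return bytes(out)


def decode_instructions(code: bytes) -> list:
    """Decode wordcode back into (opcode, full argument) pairs."""
    result = []
    extended = 0
    for offset in range(0, len(code), 2):
        opcode, low = code[offset], code[offset + 1]
        arg = low | extended
        if opcode == EXTENDED_ARG:
            # Carry the accumulated bits into the next instruction.
            extended = arg << 8
        else:
            result.append((opcode, arg))
            extended = 0
    return result


if __name__ == "__main__":
    raw = encode_instruction(LOAD_CONST, 300)
    print(raw.hex(), decode_instructions(raw))
```

An argument of 300 (0x012C) needs one prefix: the high byte 0x01 travels in `EXTENDED_ARG` and the low byte 0x2C stays with the real opcode.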
Advanced Features
- PYC Header Management: Automatically generates valid `.pyc` headers including magic numbers, bitfield flags, and source timestamps/hash.
- Code Object Manipulation: High-level API for constructing and modifying `PyCodeObject` structures (co_code, co_consts, co_names, etc.).
- Bi-directional Conversion: Supports both reading from and writing to `.pyc` files, facilitating bytecode-level analysis and instrumentation.
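To make the header layout and the read/write round trip concrete, here is a minimal Python sketch built on CPython's own `marshal`, `struct`, and `importlib.util` modules. It is not this project's API. It assumes the 16-byte header defined by PEP 552 for Python 3.7+ (4-byte magic, 4-byte flags, then a timestamp and source size when the flags are zero), and the names `build_pyc` and `parse_pyc` are hypothetical.

```python
import importlib.util
import marshal
import struct

HEADER_SIZE = 16  # magic (4) + flags (4) + mtime (4) + source size (4)


def build_pyc(code, mtime: int = 0, source_size: int = 0) -> bytes:
    """Serialize a code object as timestamp-based .pyc bytes (hypothetical helper)."""
    flags = 0  # 0 selects the timestamp-based variant rather than the hash-based one
    header = importlib.util.MAGIC_NUMBER + struct.pack(
        "<III", flags, mtime & 0xFFFFFFFF, source_size & 0xFFFFFFFF
    )
    return header + marshal.dumps(code)


def parse_pyc(data: bytes):
    """Split .pyc bytes into (magic, flags, code object)."""
    if len(data) < HEADER_SIZE:
        raise ValueError("truncated .pyc header")
    magic = data[:4]
    flags, _mtime, _size = struct.unpack("<III", data[4:HEADER_SIZE])
    # marshal data is only loadable by a matching interpreter version.
    if magic != importlib.util.MAGIC_NUMBER:
        raise ValueError("magic number does not match the running interpreter")
    return magic, flags, marshal.loads(data[HEADER_SIZE:])


if __name__ == "__main__":
    code = compile("answer = 6 * 7", "<demo>", "exec")
    magic, flags, loaded = parse_pyc(build_pyc(code))
    namespace = {}
    exec(loaded, namespace)
    print(magic, flags, namespace["answer"])
```

The sketch refuses to unmarshal data written for another interpreter version, since the marshal format is not guaranteed to be stable across releases.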
💻 Usage
Generating a Python Bytecode File
The following example demonstrates how to build a Python code object and save it as a .pyc file.
use PyProgram;
use PycWriter;
use File;
🛠️ Support Status
| Python Version | Magic Number | Marshal Protocol | Opcode Set |
|---|---|---|---|
| Python 3.7 | ✅ | ✅ | Full |
| Python 3.8 | ✅ | ✅ | Full |
| Python 3.9 | ✅ | ✅ | Full |
| Python 3.10 | ✅ | ✅ | Full |
| Python 3.11+ | 🚧 | 🚧 | Partial |
Legend: ✅ Supported, 🚧 In Progress, ❌ Not Supported
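The version-specific magic numbers in the table can be illustrated with a small lookup. This is a Python sketch, not this project's API: the numeric values are the ones CPython's importlib uses for the final releases of each listed version, as far as I am aware, and `magic_bytes` and `detect_version` are hypothetical helper names. On disk the magic is a little-endian 16-bit number followed by `\r\n`.

```python
import struct

# Magic numbers for final releases (values taken from CPython; treat as assumptions).
MAGIC_BY_VERSION = {
    (3, 7): 3394,
    (3, 8): 3413,
    (3, 9): 3425,
    (3, 10): 3439,
    (3, 11): 3495,
}


def magic_bytes(number: int) -> bytes:
    """Return the 4-byte on-disk form of a magic number."""
    return struct.pack("<H", number) + b"\r\n"


def detect_version(header: bytes):
    """Map the first four bytes of a .pyc file to a (major, minor) tuple, or None."""
    if len(header) < 4 or header[2:4] != b"\r\n":
        return None
    (number,) = struct.unpack("<H", header[:2])
    for version, magic in MAGIC_BY_VERSION.items():
        if magic == number:
            return version
    return None


if __name__ == "__main__":
    print(magic_bytes(3413), detect_version(magic_bytes(3495)))
```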
🔗 Relations
- gaia-types: Uses the `BinaryReader` and `BinaryWriter` to handle Python's custom little-endian binary formats.
- gaia-assembler: Functions as a target for Gaia IR when compiling for Python-based execution environments.