
§Advanced Subleq Assembler
This Subleq assembler assembles a custom language, called Sublang, into Subleq.
§Features
- Interpreter and debugger
- Friendly and detailed assembler feedback
- Powerful macros
- Syntax sugar for common constructs like dereferencing
- Optional typing system
- Fully fledged standard library including routines and high level control flow constructs like If or While
- Fine grained control over your code and the assembler
- Module and inclusion system
- 16-bit
- Extensive documentation
§What is Subleq?
Subleq or SUBtract and jump if Less than or EQual to zero is an assembly language that has only the SUBLEQ instruction, which has three operands: A, B, C.
The value at memory address A is subtracted from the value at address B. If the resulting number is less than or equal to zero, a jump takes place to address C. Otherwise the next instruction is executed.
Since there is only one instruction, the assembly does not contain opcodes.
So:
SUBLEQ 1 2 3
would just be
1 2 3
A very basic subleq interpreter written in Python would look as follows:
pc = 0
while True:
    a = mem[pc]
    b = mem[pc + 1]
    c = mem[pc + 2]
    result = mem[b] - mem[a]
    mem[b] = result
    if result <= 0:
        pc = c
    else:
        pc += 3
Most subleq implementations, this one included, also include the IO operations: INPUT, OUTPUT and HALT.
These can be achieved by respectively having A = -1, B = -1 and C = -1. INPUT and OUTPUT read or write single ASCII characters.
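As a slightly fuller sketch, the loop above can be extended with the three IO operations. The name run_subleq, the input_text parameter, storing the input character directly in mem[B], and letting IO instructions always fall through to the next instruction are assumptions made for illustration. This is not the assembler's actual runtime, and 16-bit wrapping is left out.

```python
def run_subleq(mem, input_text=""):
    """Run a Subleq program stored in the list `mem`; return what it printed."""
    mem = list(mem)               # work on a copy of the program
    pending = list(input_text)    # characters still available to INPUT
    output = []
    pc = 0
    while pc != -1:               # a jump to -1 is HALT
        a, b, c = mem[pc], mem[pc + 1], mem[pc + 2]
        if a == -1:
            # INPUT (assumption: the character code is stored in mem[b],
            # and 0 is stored once the input is exhausted)
            mem[b] = ord(pending.pop(0)) if pending else 0
            pc += 3
        elif b == -1:
            # OUTPUT: write mem[a] as a single character
            output.append(chr(mem[a]))
            pc += 3
        else:
            mem[b] -= mem[a]
            pc = c if mem[b] <= 0 else pc + 3
    return "".join(output)


# Two OUTPUT instructions followed by a jump to -1 (HALT).
program = [9, -1, 3,   10, -1, 6,   11, 11, -1,   ord("H"), ord("i"), 0]
print(run_subleq(program))  # Hi
```

The data words are placed after the halting instruction so that they are never executed.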
§Installation
cargo install asa
or download the binary from the releases tab
§Usage
asa MySubleq.sbl
§Syntax highlighting
See https://github.com/Kat9-123/sublang-highlighting
§Project status
This project is functionally complete, but the documentation for the assembler, Sublang and Sublib is still lacking. Sublib itself is also not finished yet.
It is best to view the Introduction to Sublang at https://github.com/Kat9-123/asa/blob/master/Sublang.md, where you get at least a modicum of syntax highlighting.
§Introduction to Sublang
Sublang is a bare bones assembly-like language consisting of four main elements:
- The SUBLEQ instruction
- Labels to refer to areas of memory easily
- Macros for code reuse
- Syntax sugar for common constructs
§Subleq
1 2 3 ; This is interpreted as standard subleq: mem[2] -= mem[1], then jump to 3 if the result is LEQ zero, otherwise continue with the next instruction.
; This syntax is valid, but pointless.
; memory addresses and instructions may be labeled
a -> 1
b -> 2
c ->
a -= b ; This syntax is used for the SUBLEQ instruction: Subtract b from a
a -= b c ; Subtract b from a and jump to c if the result is less than or equal to zero
; If no `c` argument is given, the next instruction will always be executed, even if the result is LEQ to zero
; So these two are equivalent
a -= b
a -= b $1 ; `$1` gives a relative address with offset one
a -= b ; is equivalent to
b a $1 ; This syntax works but is not recommended, since it makes it harder for the assembler to give hints
; Other examples, literals and labels may freely be combined
a -= b 0x0000
'\0' -= 0 c
**
Block comment.
Important NOTE: Labeled values take up space in memory and will be
executed if passed
**
a -> 2
4 10 5 ; will be executed as 2 4 10, NOT 4 10 5. To prevent this, jump over any
; label definitions, or use the assignment syntax
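The operand order of the `-=` syntax can be sketched in Python. The helper name encode_instruction and its parameters are hypothetical. It only illustrates that `a -= b c` is stored as the words b a c, and that a missing c behaves like `$1`, assumed here to resolve to the address directly after the instruction's third word.

```python
def encode_instruction(a, b, address, c=None):
    """Return the three memory words for `a -= b c` placed at `address`."""
    if c is None:
        # `$1` sits in the third word (address + 2) and points one past it
        c = address + 3
    return [b, a, c]


print(encode_instruction(a=7, b=8, address=0))        # [8, 7, 3]
print(encode_instruction(a=7, b=8, address=3, c=0))   # [8, 7, 0]
```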
§Labels and Literals
a -> 123 ; Decimal
b -> 0x4C6 ; Hex
c -> "Hello, World!" ; Strings are null-terminated by the assembler
d -> 'P' ; Character literals
.label ->
a -= b .label ; Repeats as long as (a -= b) <= 0
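How these literals might end up as memory words can be sketched as follows. The function encode_literal is a hypothetical helper, not part of the assembler, and escape sequences such as `\n` are deliberately not handled.

```python
def encode_literal(token):
    """Turn one Sublang literal token into a list of memory words."""
    if token.startswith('"') and token.endswith('"'):
        # Strings become one word per character plus a terminating zero
        return [ord(ch) for ch in token[1:-1]] + [0]
    if token.startswith("'") and token.endswith("'"):
        return [ord(token[1:-1])]
    return [int(token, 0)]   # base 0 accepts both 123 and 0x4C6


print(encode_literal("123"))      # [123]
print(encode_literal("0x4C6"))    # [1222]
print(encode_literal("'P'"))      # [80]
print(encode_literal('"Hi"'))     # [72, 105, 0]
```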
§IO
char -> 'a'
input -> 0
W -> 0
-1 -= char ; Prints 'a' to the screen
input -= -1
-1 -= input ; Echoes back the user's input
W -= W -1 ; Halts execution
§Scopes
Scoping works like in most other languages. Note: only labels are affected by scopes; macro definitions inside scopes are still globally accessible.
Z -> 123
X -> 456
Y -> 0
{
    Z -> -789
    {
        X -= Z ; 456 - -789
    }
}
Y -= Z ; 0 - 123
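The lookup rule in this example can be mimicked with Python's collections.ChainMap, where the innermost definition of a name wins. This is only an analogy under simplifying assumptions: the dictionaries hold values directly, whereas real labels are addresses, and none of the variable names below exist in the assembler.

```python
from collections import ChainMap

global_scope = ChainMap({"Z": 123, "X": 456, "Y": 0})
outer = global_scope.new_child({"Z": -789})   # { Z -> -789
inner = outer.new_child()                     #     {

# X -= Z inside the scopes sees the shadowing Z
x_after = inner["X"] - inner["Z"]
# Y -= Z outside the scopes sees the global Z
y_after = global_scope["Y"] - global_scope["Z"]
print(x_after, y_after)  # 1245 -123
```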
§Macros
§Definition
@Name {
    ...
}
@Name ; This is allowed as well, but discouraged
{
    ...
}
; with parameters
@Name a? b? c? {
    ...
}
; It is also possible to define a macro that isn't scoped:
@Name a? [
    ...
]
; This, however, is dangerous when label definitions take place in the macro, so it is generally discouraged.
; Linebreaks are allowed between parameters
@Name a?
      b?
      c?
      d?
      e? {
    ...
}
§Expanding
!Name
; With arguments
!Name2 a b c
; Linebreaks are NOT allowed between arguments
§Hygiene
Macros are hygienic. Variables won’t be shadowed.
; Macros
@MyMacro b {
    a -= b
    a -> 123
}
a -> 0
!MyMacro a
; Is completely fine, and will become the following:
{
    ?MyMacro?a -= a
    ?MyMacro?a -> 123
}
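The renaming shown above can be sketched in Python. The function expand_macro and its token-list representation are hypothetical. The sketch assumes that a label counts as local when the body defines it with `->`, and it reuses the `?MyMacro?a` naming from the example.

```python
def expand_macro(name, params, body, args):
    """Expand a macro body (a list of token lists) hygienically."""
    # Labels defined inside the body are the tokens followed by `->`
    local_labels = {line[0] for line in body if len(line) > 1 and line[1] == "->"}
    substitutions = dict(zip(params, args))
    expanded = []
    for line in body:
        new_line = []
        for token in line:
            if token in local_labels:
                new_line.append(f"?{name}?{token}")   # rename the local label
            else:
                new_line.append(substitutions.get(token, token))
        expanded.append(new_line)
    return expanded


body = [["a", "-=", "b"], ["a", "->", "123"]]
for line in expand_macro("MyMacro", ["b"], body, ["a"]):
    print(" ".join(line))
# ?MyMacro?a -= a
# ?MyMacro?a -> 123
```

Local labels are renamed before arguments are substituted, so the caller's `a` is never confused with the macro's own `a`.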
§Compound macro arguments
You may pass scopes as macro arguments
@Mac s_my_scope? {
    s_my_scope?
}
!Mac { a -= b }
; =>
{
    { a -= b }
}
; If a macro takes multiple scopes, they can be chained as follows:
!Mac {
    ...
} {
    ...
} {
    ...
}
If you don’t want the argument to be wrapped in a scope of its own, you can use braces, written as parentheses:
@Mac b_my_braced? {
    b_my_braced?
}
!Mac ( a -= b )
; =>
{
    a -= b
}
This means that you can ‘curry’ macros (using that term loosely)
@Mac b_some_macro? {
    b_some_macro? 10
    b_some_macro? 3
}
@CurriedMacro l_a? l_b? {
    l_a? -= l_b?
}
!Mac ( !CurriedMacro 5 )
; =>
{
    {
        5 -= 10
        5 -= 3
    }
}
§Types
The assembler has a simple type-checker for macro arguments, which can be disabled.
- value: normal label
- l_value: literal value
- s_value: scoped value
- b_value: a braced value
- m_value: a macro call passed as argument, must be braced. In practice it’s the same as b_value
- a_value: anything, no type checking
§Pointers
§Referencing
To create a pointer to a value, the relative address syntax $1 must be used to get the address of the next token
ptr -> $1 0x1234 ; This takes up two words of memory, one for the pointer and one for the value
ptr -> $1 "String"
; or
ptr -> &'A'
ptr -> &"String"
ptr -> &123
; & is equivalent to $1
§Dereferencing
; Generic sequence for dereferencing. The value that 'ptr' points to will be subtracted from 'a'
!Copy ptr b
a -= (b -> 0)
; If 'ptr' is constant the following is also legal. Note: ptr doesn't have to be constant but this syntax will give unexpected results if it isn't
a -= (b -> PTR)
; The '*' operator may also be used:
a -= *ptr
; This is effectively syntax sugar for
!Copy ptr b
a -= (b -> 0)
; (But in reality it is exactly equivalent to)
_ASM _ASM &1
*ID*ptr *ID*ptr &1
ptr _ASM &1
_ASM *ID*ptr &1
a -= (*ID*ptr -> 0)
; *ID*ptr is a safe and automatically generated name
Remember that because of how Subleq works, what are called ‘Labels’ here are also just pointers! Since Subleq dereferences them, we can think of them as values. Keep in mind, however, that literals require indirection: a -= 10 doesn’t subtract 10 from a, it subtracts the value stored at address 10.
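The difference between a plain operand and a dereferenced pointer can be illustrated on a small Python memory array. The addresses chosen for A, PTR and TARGET are arbitrary assumptions. The sketch shows only the arithmetic, not the instruction sequence the assembler actually emits.

```python
mem = [0] * 16
A, PTR, TARGET = 1, 2, 10   # hypothetical addresses of the labels
mem[A] = 50
mem[TARGET] = 7
mem[PTR] = TARGET           # ptr holds the address of the value

# `a -= 10`: operands are addresses, so mem[10] is subtracted, not 10
mem[A] -= mem[10]
after_literal = mem[A]

# `a -= *ptr`: follow the pointer first, then subtract what it points to
mem[A] -= mem[mem[PTR]]
after_deref = mem[A]
print(after_literal, after_deref)  # 43 36
```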
§Inclusions
The # symbol may be used to include another .sbl file anywhere
#MySblFile.sbl
; You may leave out the .sbl extension:
#MySblFile
; You can also do this:
...
Z -= Z
P -= Z
#IncludeMe
!Macro P
...
; But of course beware of the contents of the included file
If you want to create a module (a set of .sbl files in a folder) you must create a folder with the name of the module (for example ‘sublib’) and in that folder create a Lib.sbl file. Whenever the ‘sublib’ folder is imported, this is automatically resolved to ‘sublib/Lib.sbl’. In this .sbl file you may include any other files you might need. Includes are initially resolved relative to the file being assembled, and otherwise they are searched for in the LIBS folder, defined using the -l command line argument.
; ./subleq/MyFile.sbl
#math/FastSqrt
The order in which files are checked is as follows. The first one that exists will be included.
./subleq/math/FastSqrt/Lib.sbl
./subleq/math/FastSqrt.sbl
LIBS/math/FastSqrt/Lib.sbl
LIBS/math/FastSqrt.sbl
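The lookup order listed above can be written down as a small Python function. The name include_candidates and its parameters are hypothetical, and treating an explicit .sbl extension the same as a bare name is an assumption. The real resolver lives in the assembler and may differ in detail.

```python
from pathlib import PurePosixPath


def include_candidates(include, source_dir, libs_dir):
    """List the paths checked for an include, in order of priority."""
    # The .sbl extension is optional in an include
    name = include[:-4] if include.endswith(".sbl") else include
    candidates = []
    for root in (source_dir, libs_dir):
        base = PurePosixPath(root) / name
        candidates.append(str(base / "Lib.sbl"))   # module folder
        candidates.append(str(base) + ".sbl")      # single file
    return candidates


for path in include_candidates("math/FastSqrt", "subleq", "LIBS"):
    print(path)
```

The first candidate that exists on disk would be the one that gets included.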
See subleq/libs/sublib for an example.
§Miscellaneous Syntax sugar
§Mult operator
When the ‘*’ is placed before a literal n, the previous token is repeated n times.
label * 3 ; =>
label label label
0x123 * 0x4 ; =>
0x123 0x123 0x123 0x123
; mind that 3 * label will dereference `label`!
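A minimal Python sketch of this expansion over a token list follows. The functions is_literal and expand_mult are hypothetical and only cover numeric literals. A `*` that is not followed by a literal is left untouched, since it would then be the dereference operator.

```python
def is_literal(token):
    """True for decimal or hexadecimal number tokens."""
    try:
        int(token, 0)
        return True
    except ValueError:
        return False


def expand_mult(tokens):
    """Expand `token * n` into n copies of `token` (n must be a literal)."""
    out = []
    i = 0
    while i < len(tokens):
        followed_by_literal = i + 1 < len(tokens) and is_literal(tokens[i + 1])
        if tokens[i] == "*" and out and followed_by_literal:
            previous = out.pop()
            out.extend([previous] * int(tokens[i + 1], 0))
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out


print(expand_mult(["label", "*", "3"]))     # ['label', 'label', 'label']
print(expand_mult(["3", "*", "label"]))     # ['3', '*', 'label']
```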
§Assignments
The = operator can be used to both declare a label and assign it a value every time execution passes it. You can assign a label to another label or a literal.
.loop ->
a = 2 ; a will be set to 2 every iteration of the loop
a -= a .loop
; Label to literal
a = 2
; is equivalent to
_ASM -= _ASM
a -= a .assign
a -> 0 ; declaration
.assign -> {
    lit -> 2
    _ASM -= lit
    a -= _ASM
}
; Label to zero (special optimised case)
b = 0
; is equivalent to
b -= b .fin
b -> 0 ; declaration
.fin ->
; Label to Label
b = a
_ASM -= _ASM .assign
b -> 0 ; declaration
.assign ->
b -= b
_ASM -= a
b -= _ASM
; Note that there is a small memory and performance cost to assignments
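The arithmetic behind the label to label case can be checked with a few lines of Python. The dictionary of named cells and the helper subleq_sub are illustrative assumptions. The jump over the declaration is left out, and _ASM is assumed to have been cleared already, as the first instruction of the expansion does.

```python
def subleq_sub(mem, dest, src):
    """One `dest -= src` step on a dict of named cells."""
    mem[dest] -= mem[src]


mem = {"a": 42, "b": 7, "_ASM": 0}   # _ASM starts out cleared
subleq_sub(mem, "b", "b")        # b -= b      -> b = 0
subleq_sub(mem, "_ASM", "a")     # _ASM -= a   -> _ASM = -a
subleq_sub(mem, "b", "_ASM")     # b -= _ASM   -> b = 0 - (-a) = a
print(mem["b"])  # 42
```

Three subtractions per assignment is where the small performance cost mentioned above comes from.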
§Namespacing
The format Namespace::Macro or Namespace::label should be used. This is solely a naming convention and not enforced in any way. This means that module authors must decide what namespace their macros or labels should have. This is obviously bad design, but it is simple.
§Sublib
Sublib is the standard library. Its features range from very basic ones (Prelude.sbl, IO.sbl and Symbols.sbl) to quite advanced ones like functions and control flow.
§Style guide
Adhere to the naming conventions and type system, and make sure it looks good :). Ideally you should follow the style of Sublib.
§Naming conventions
- @MyMacro: macros in PascalCase
- my_label: labels in snake_case, with the exception of single character ‘registers’, like Z or W
- p_value: pointer (not type-checked)
- p_p_value: pointer to pointer (not type-checked)
- n_value: negated value (not type-checked)
- value?: macro parameters
- .value: a label to jump to
- CONST_VALUE: constant, can be applied to all of the above and should be applied to macro arguments, but NOT to literals (l_name), since they are always constant by definition
- Namespace::label or Namespace::Macro: for namespacing
- Namespace::SubNamespace::label
- MyFile.sbl: files in PascalCase
- module: modules (folders) in snake_case
§Assembler specific additions to Subleq
The assembler’s runtimes will treat a jump to -2 as a breakpoint and -2 -= a as printing a as a signed integer. Note that these features are non-canonical. They should be accessed using ASM::Breakpoint and ASM::Debug from the ASM library. When pedantic mode is turned on, the assembler will warn that these features won’t work in other subleq interpreters.
§Runtimes
§Interpreter
The interpreter is the default runtime for subleq. When an error is encountered it exits and prints a trace. To halt the program while it is running, press CTRL-C; while it is prompting for input, press DELETE.
§Debugger
To run a program with the debugger, add the -d command line flag. Interactive debugging will only start when an error or breakpoint is encountered.
§Examples
§Basic
; Very basic Sublang, without using the standard library Sublib
; Output: Hello, World!
!Print p_string ; Call the macro Print
Z -= Z -1 ; Jumping to -1 halts, equivalent to !Halt
p_string -> &"Hello, World!\n"
Z -> 0 ; Temp register
N_ONE -> -1 ; Store the literal negative one
**
Pure no dependency implementation of print
**
@Print P_STRING? {
    ; Copy the pointer into the local variable ptr, because we don't want to
    ; modify the original pointer
    Z -= Z ; clear Z
    Z -= P_STRING? ; Z = -P_STRING?
    ptr -= Z ; ptr = -Z = --P_STRING? = P_STRING?
    Z -= Z
    .loop ->
    char -= char ; Clear char
    Z -= (ptr -> 0) ; Z -= *ptr, dereferences ptr to get the actual character
    char -= Z .fin ; Flip the character, since it is negative, and jump if
                   ; result is LEQ zero (i.e. finish if it is a ZERO/NULL)
    -1 -= char ; Writes the character to the screen. -1 is a special register used
               ; for IO operations
    ptr -= N_ONE ; Increment the pointer to go to the next character
    Z -= Z .loop ; Infinite loop
    char -> 0 ; This point is never reached, so it is safe to define the
              ; label 'char' here. It is very important to keep in mind
              ; that, in this case the zero, will be put in memory in
              ; this exact place and, if execution crosses it, it will
              ; be interpreted as an instruction. To define values in between
              ; instructions, use the '=' operator
    .fin ->
}
§Sublib
; This is how Sublang should be written, making extensive use of macros
; Output: Hello, Sublang!
#sublib
#sublib/Control
p_string -> &"Hello, Sublang!\n"
**
Print a string using macros from standard lib
**
@PrintStdLib P_STRING? {
    p_local = P_STRING?
    char = 0
    !Loop {
        !DerefAndCopy p_local char ; char = *p_local
        !IfFalse char {
            !Break
        }
        !IO -= char
        !Inc p_local
    }
}
; Executing starts here
.main -> {
    !PrintStdLib p_string
    !Halt
}
; Or you can just use one of the Print macros from sublib/IO
#sublib
.main -> {
    !IO::PrintLnLit "Hello, Sublib!"
    !Halt
}
§Conway’s Game of Life
./subleq/examples/GameOfLife.sbl
§Conclusion
For many more examples, see Sublib or the end-to-end tests, though they are messy and not idiomatic.
Modules§
- args: Parses command line arguments
- assembler: Dispatches the lexer, parser and code generator
- codegen: Generate a vec of executable words from a vector of tokens
- feedback: Generates and prints friendly feedback messages for the user
- files: Management of input and output files
- lexer: Converts a string into a vector of tokens, resolving includes along the way
- mem_view: Print the memory of a subleq program
- parser: Parses a vector of tokens
- runtimes: Contains the interpreter and the debugger
- symbols: Global constants
- tokens: Tokens emitted by the lexer and processed by the parser
- utils: Utility functions
Macros§
- asm_details: A sub message for extra details
- asm_error
- asm_error_no_terminate: Since the asm_error macro always terminates, sometimes we don’t want that, if we want to write more information and then terminate for example
- asm_hint: Display a small hint message under an asm_msg
- asm_info
- asm_trace: A type of sub message used for macro traces
- asm_warn
- error: Error message for an error that doesn’t originate from a token
- println_silenceable: These prints will be silenced by the silence command line argument
- terminate: Terminate is used for FAILURES