moongen 0.0.1 - Docs.rs

moongen provides both:

a command line program for assembling/disassembling/analyzing moonsharp bytecode
a library for interacting with it

CLI

there are three commands:

moongen asm <path> assembles the assembly format into a bytecode dump
moongen disasm <path> disassembles a bytecode dump into the assembly format
moongen analyze <path> analyzes a bytecode dump and prints a diagnostic for each rule it violates, along with the full path taken to reach the offending instruction

all three commands:

accept - as their path, indicating they should read data from stdin
emit their results to stdout
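
as a rough sketch of driving the CLI from a script, the helper below builds the argument list for one of the three commands and pipes data through stdin using the - path. it assumes a moongen binary is on your PATH and that it exits non-zero on failure (neither is confirmed here); moongen_argv and run_moongen are hypothetical names used only for illustration.

```python
import subprocess

COMMANDS = ("asm", "disasm", "analyze")


def moongen_argv(command, path=None):
    """Build the argument list for one of the three documented commands.

    A path of None becomes "-", which tells moongen to read from stdin.
    """
    if command not in COMMANDS:
        raise ValueError(f"unknown moongen command: {command}")
    return ["moongen", command, "-" if path is None else path]


def run_moongen(command, data):
    """Pipe bytes into moongen over stdin and return what it writes to stdout.

    Assumption: a `moongen` binary is on PATH and exits non-zero on failure.
    """
    result = subprocess.run(
        moongen_argv(command),
        input=data,
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout
```

for example, run_moongen("disasm", dump_bytes) would hand a bytecode dump to moongen disasm - and return the assembly text as bytes.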

assembly format

for an instruction reference, review the Inst documentation

syntax is defined by grammar.pest and has the following format:

each line may start with a label definition: @ident:
each line may have one instruction, made up of the following parts, in order:
- an instruction name (ident)
- if the instruction takes addr, one of the following:
  - an integer specifying the instruction address relative to the start of the chunk
  - ~, followed by an integer specifying the instruction address relative to the current instruction
  - @, followed by an ident referring to a label
- if the instruction takes arg1, an integer
- if the instruction takes arg2, an integer
- if the instruction takes name, a string
- if the instruction takes value, an =, followed by one of the following:
  - null
  - nil
  - void
  - true
  - false
  - a float
  - a string
  - {} (creates an empty table)
- if the instruction takes symbol, a symbol
- if the instruction takes symbol_list, [, comma-separated symbols, ]
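
to make the three addr forms concrete, here is a minimal sketch of how an operand could be turned into an absolute instruction index. it is not moongen's own implementation: resolve_addr is a hypothetical helper, and it assumes that "relative to the current instruction" means the current index plus the offset and that a label maps to the index of the instruction on its line.

```python
import re

# The regexes are the ones given in the terminology section.
INTEGER = r"-?(?:0|[1-9][0-9]*)"
IDENT = r"[a-zA-Z_][a-zA-Z0-9_]*"


def resolve_addr(operand, current, labels):
    """Return the absolute instruction index an addr operand refers to.

    operand: operand text such as "12", "~-3", or "@over_greet"
    current: index of the instruction carrying the operand
    labels:  mapping of label name -> instruction index
    """
    # Plain integer: address relative to the start of the chunk.
    if re.fullmatch(INTEGER, operand):
        return int(operand)
    # "~" + integer: address relative to the current instruction
    # (assumed to mean current + offset).
    if operand.startswith("~") and re.fullmatch(INTEGER, operand[1:]):
        return current + int(operand[1:])
    # "@" + ident: look the label up.
    if operand.startswith("@") and re.fullmatch(IDENT, operand[1:]):
        return labels[operand[1:]]
    raise ValueError(f"not a valid addr operand: {operand!r}")
```

labels are usually the most convenient form, since inserting or removing an instruction does not change what they refer to.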

terminology

idents follow the regex /[a-zA-Z_][a-zA-Z0-9_]*/
integers follow the regex /-?(?:0|[1-9][0-9]*)/
floats follow the regex /-?(?:0|[1-9][0-9]*)(?:\.[0-9]*)/
strings are either
- JSON-escaped content wrapped in quotes ("this is a string with \"embedded\" quotes")
- base64-encoded content wrapped in quotes and prefixed with b (b"dGhpcyBpcyBhIHN0cmluZyB3aXRoICJlbWJlZGRlZCIgcXVvdGVz", useful for binary data)
symbols are one of the following:
- &, symbol name (local name), :, integer (local index)
- ^, symbol name (upvalue name), :, integer (upvalue index)
- %, symbol name (global name), :, symbol (global _ENV)
- env (_ENV symbol)
- nullref (null symbol)
symbol names are one of the following:
- an ident (name)
- an ident, @, integer (name + disambiguation)
- ... (vararg)
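
the lexical rules above can be checked with a few lines of python. this is a sketch, not the pest grammar itself: classify and decode_string are hypothetical helpers, the regexes are copied from the list above, and decoding the quoted form to UTF-8 bytes is an assumption made so that both string forms can be compared.

```python
import base64
import json
import re

IDENT = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")
INTEGER = re.compile(r"-?(?:0|[1-9][0-9]*)")
FLOAT = re.compile(r"-?(?:0|[1-9][0-9]*)(?:\.[0-9]*)")


def classify(token):
    """Return "integer", "float", or "ident" for a token, or None if none match."""
    for name, pattern in (("integer", INTEGER), ("float", FLOAT), ("ident", IDENT)):
        if pattern.fullmatch(token):
            return name
    return None


def decode_string(literal):
    """Decode either string form into bytes."""
    # b"..." holds base64-encoded content.
    if literal.startswith('b"') and literal.endswith('"'):
        return base64.b64decode(literal[2:-1])
    # "..." holds JSON-escaped content (encoded to UTF-8 here as an assumption).
    if literal.startswith('"') and literal.endswith('"'):
        return json.loads(literal).encode("utf-8")
    raise ValueError(f"not a string literal: {literal!r}")
```

note that the float regex requires a decimal point, so 1 is only an integer while 1. and 1.5 are floats, and leading zeros such as 01 match neither. the two string examples given above decode to the same bytes.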

full demonstration

#![has_env]
// useful for debugging purposes
meta 25 1 "greeter" =null
// does nothing but is in the function header anyways
fn 0 -1 []

closure @greet []
upv.ld ^_ENV:0
// %hello:^_ENV:0 isn't necessary, but moonsharp emits it anyways
// you can use nullref for index.set
index.set 0 0 ="hello" %hello:^_ENV:0

// moonsharp likes to generate closures by emitting their instructions and jumping over them
// you don't have to do it this way though (it also saves an instruction to Not Do That)
// but this example will do it moonsharp's way
jmp @over_greet
	@greet:
	meta 9 1 "greet" =null
	fn 1 0 [&who:0]
	args [&who:0]

	lit ="hello "
	loc.ld &who:0
	lit ="!"
	op.concat
	op.concat

	ret 1
	// moonsharp also generates unreachable `ret 0`s even when the last instruction in a function is a `ret 1`...
	ret 0
@over_greet:

// indentation isn't forced either way! lay it out in a way that makes more sense if you'd like
			upv.ld ^_ENV:0
		index ="print"
				upv.ld ^_ENV:0
			index ="greet"
			lit ="dolly"
		call 1 "calling greet"
	call 1 "calling print"
pop 1

ret 0
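
to show what the body of greet computes, the sketch below replays its six instructions on a plain python list used as a stack. the semantics are assumptions inferred from the instruction names (lit pushes a value, loc.ld pushes a local, op.concat pops two values and pushes left + right, ret n returns the top n values); check the Inst documentation for the real behaviour. run_greet is a hypothetical name.

```python
def run_greet(who):
    """Replay the body of `greet` from the demonstration on a simple stack."""
    program = [
        ("lit", "hello "),
        ("loc.ld", "who"),
        ("lit", "!"),
        ("op.concat", None),
        ("op.concat", None),
        ("ret", 1),
    ]
    local_vars = {"who": who}
    stack = []
    for name, operand in program:
        if name == "lit":
            stack.append(operand)  # push a literal value
        elif name == "loc.ld":
            stack.append(local_vars[operand])  # push the value of a local
        elif name == "op.concat":
            # Assumed: the right operand is on top of the stack.
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif name == "ret":
            # Assumed: return the top `operand` values.
            return stack[len(stack) - operand:]
    return []
```

under these assumptions the first op.concat joins who and "!", and the second prepends "hello ", so the two trailing concats evaluate right to left.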