Extendable VM
Easily extendable stack virtual machine written in Rust.
About
This is a stack VM that supports:
- Booleans, Strings, Ints and functions
- Basic operations like addition, multiplication, concatenation, etc.
- Conditional jumps, function calls and returns
- Exceptions that halt the machine and print the stack trace
Why extendable?
- Generic VM: this project provides a generic stack VM that can be used to implement a concrete VM by defining concrete data types, a bytecode format and instructions.
- Jex VM: the actual runnable VM, which just implements the generic VM.
This separation makes it easy to change the concrete VM independently of the generic one.
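As a rough sketch of how such a split could look, the generic part might own the stack and the dispatch loop, while the concrete VM supplies the value type and an instruction table. Every name below (`Machine`, `Instruction`, `concrete_table`, and so on) is hypothetical and is not taken from this crate's actual API.

```rust
// Hypothetical sketch; none of these names come from the crate itself.

/// Generic part: a stack machine parameterized over the value type `V`.
pub struct Machine<V> {
    pub stack: Vec<V>,
    pub ip: usize,
}

/// An instruction is a name, an opcode, and a function that mutates the machine.
/// The function receives the whole code slice so it could read operands via `ip`.
pub struct Instruction<V> {
    pub name: &'static str,
    pub opcode: u8,
    pub run: fn(&mut Machine<V>, &[u8]) -> Result<(), String>,
}

impl<V> Machine<V> {
    pub fn new() -> Self {
        Machine { stack: Vec::new(), ip: 0 }
    }

    /// Generic dispatch loop: look each opcode up in the supplied table.
    pub fn run(&mut self, code: &[u8], table: &[Instruction<V>]) -> Result<(), String> {
        while self.ip < code.len() {
            let opcode = code[self.ip];
            let instr = table
                .iter()
                .find(|i| i.opcode == opcode)
                .ok_or_else(|| format!("unknown opcode {opcode}"))?;
            self.ip += 1;
            (instr.run)(self, code)?;
        }
        Ok(())
    }
}

// Concrete part: pick a value type and define the instructions.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Int(i32),
    Bool(bool),
}

fn op_true(m: &mut Machine<Value>, _code: &[u8]) -> Result<(), String> {
    m.stack.push(Value::Bool(true));
    Ok(())
}

fn op_not(m: &mut Machine<Value>, _code: &[u8]) -> Result<(), String> {
    match m.stack.pop() {
        Some(Value::Bool(b)) => {
            m.stack.push(Value::Bool(!b));
            Ok(())
        }
        other => Err(format!("NOT expects a bool, got {other:?}")),
    }
}

/// Opcodes 2 and 11 match the True and Not rows of the instruction table.
pub fn concrete_table() -> Vec<Instruction<Value>> {
    vec![
        Instruction { name: "True", opcode: 2, run: op_true },
        Instruction { name: "Not", opcode: 11, run: op_not },
    ]
}
```

With this shape, changing the concrete VM means editing only `Value` and the table, while the dispatch loop stays untouched.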
How to Run
You can download the latest version from the Releases page, or you can build it from source.
Linux
After you have the binary executable extendable_vm, you can run it:
./extendable_vm path/to/bytecode
Windows
After you have the binary executable extendable_vm.exe, you can run it:
./extendable_vm.exe path/to/bytecode
Run with logging
To run with logging, set the environment variable RUST_LOG=extendable_vm.
For example,
RUST_LOG=extendable_vm ./extendable_vm path/to/bytecode
Instructions
u8 represents an unsigned 8-bit integer.
Name | Opcode (u8) | Arguments (name: type) | Stack (old → new) | Description |
---|---|---|---|---|
Constant | 0 | i: u8 | [] → [value] | Loads the i-th constant from the constant pool onto the stack |
Null | 1 | | [] → [null] | Loads null onto the stack |
True | 2 | | [] → [true] | Loads true onto the stack |
False | 3 | | [] → [false] | Loads false onto the stack |
Pop | 4 | | [x, y] → [x] | Pops the last value from the stack |
Get local | 5 | offset: u8 | [..., x] → [..., x, y] | Gets the offset-th operand in the current call frame and loads it onto the stack |
Set local | 6 | offset: u8 | [..., x, ..., y] → [..., y, ...] | Pops the value and sets the offset-th operand in the current call frame to it |
Get global | 7 | identifier_i: u8 | [...] → [..., x] | Loads a global value onto the stack by its identifier, which is fetched from the constant pool at index identifier_i |
Define global | 8 | identifier_i: u8 | [x] → [] | Defines a global value with the given identifier |
Set global | 9 | identifier_i: u8 | [x] → [] | Sets a global value with the given identifier |
Print | 10 | | [x] → [] | Prints a value |
Not | 11 | | [x] → [!x] | Logical NOT |
Equal | 12 | | [x, y] → [x == y] | Checks if 2 values are equal |
Greater | 13 | | [x, y] → [x > y] | Checks if the first is greater than the second |
Less | 14 | | [x, y] → [x < y] | Checks if the first is less than the second |
Negate | 15 | | [x] → [-x] | Negates an integer |
Add | 16 | | [x, y] → [x + y] | Adds integers or concatenates strings |
Subtract | 17 | | [x, y] → [x - y] | Subtracts integers |
Multiply | 18 | | [x, y] → [x * y] | Multiplies integers |
Divide | 19 | | [x, y] → [x / y] | Divides integers |
Jump forward | 20 | offset: u8 | | Jumps forward by offset bytes |
Jump forward if false | 21 | offset: u8 | [x] → [] | Jumps forward by offset bytes if the value is false |
Jump backward | 22 | offset: u8 | | Jumps backward by offset bytes |
Call | 23 | arity: u8 | | Calls a function with arity arguments. For example, CALL 3 will call f(a, b, c) when the stack is [f, a, b, c] |
Return | 24 | | Pops the last call frame and puts the returned value on top | Returns from the function |
ToString | | | | |
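To make the opcode and operand layout in the table concrete, here is a minimal interpreter sketch for a handful of these instructions. It is not the VM's real implementation: the `Value` enum and `run_sketch` function are hypothetical, jump offsets are assumed to be counted from the byte after the operand, and only the boolean `false` is assumed to count as false.

```rust
// Illustrative only: a tiny interpreter for a few opcodes from the table.
// `Value` and `run_sketch` are hypothetical names, not part of the crate.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Bool(bool),
    Int(i32),
    Str(String),
}

pub fn run_sketch(code: &[u8], constants: &[Value]) -> Result<Vec<Value>, String> {
    let mut stack: Vec<Value> = Vec::new();
    let mut ip = 0;
    while ip < code.len() {
        let opcode = code[ip];
        ip += 1;
        match opcode {
            // Constant i: push the i-th entry of the constant pool.
            0 => {
                let i = *code.get(ip).ok_or("missing operand")? as usize;
                ip += 1;
                stack.push(constants.get(i).cloned().ok_or("bad constant index")?);
            }
            1 => stack.push(Value::Null),
            2 => stack.push(Value::Bool(true)),
            3 => stack.push(Value::Bool(false)),
            // Pop: discard the top value.
            4 => {
                stack.pop().ok_or("stack underflow")?;
            }
            // Add: [x, y] -> [x + y] for ints, concatenation for strings.
            16 => {
                let y = stack.pop().ok_or("stack underflow")?;
                let x = stack.pop().ok_or("stack underflow")?;
                stack.push(match (x, y) {
                    (Value::Int(a), Value::Int(b)) => Value::Int(a + b),
                    (Value::Str(a), Value::Str(b)) => Value::Str(a + &b),
                    _ => return Err("Add expects two ints or two strings".to_string()),
                });
            }
            // Jump forward (assumption: relative to the byte after the operand).
            20 => {
                let offset = *code.get(ip).ok_or("missing operand")? as usize;
                ip += 1 + offset;
            }
            // Jump forward if false: pops the condition first.
            21 => {
                let offset = *code.get(ip).ok_or("missing operand")? as usize;
                ip += 1;
                if stack.pop().ok_or("stack underflow")? == Value::Bool(false) {
                    ip += offset;
                }
            }
            other => return Err(format!("unsupported opcode {other}")),
        }
    }
    Ok(stack)
}
```

For instance, under these assumptions the bytes `[0, 0, 0, 1, 16]` load constants 0 and 1 and leave their sum (or concatenation) as the only value on the stack.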
Bytecode format
This section describes the format of the bytecode that the VM reads from a file.
Notation
structs are used as a way to demonstrate what each byte means. Each struct should be viewed as an array of bytes where each value directly follows the previous one (without padding and packing).
For example, struct A represents an array [a1, a2, b] where a1 and a2 correspond to a: u16 and b to b: u8.
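The struct A example can be read as the following decoding sketch. The definition of struct A is not reproduced in the text above, so the field order (a: u16 followed by b: u8) is taken from the description, and big-endian byte order is purely an assumption, since this document does not state which order is used. The helper name `read_a` is hypothetical.

```rust
// Assumption: multi-byte integers are big-endian. The README does not specify
// the byte order, so swap in `from_le_bytes` if the real format differs.
#[derive(Debug, PartialEq)]
pub struct A {
    pub a: u16,
    pub b: u8,
}

/// Decodes the first three bytes [a1, a2, b] into an `A`, with no padding
/// between fields. Returns `None` if fewer than three bytes are available.
pub fn read_a(bytes: &[u8]) -> Option<A> {
    match bytes {
        [a1, a2, b, ..] => Some(A {
            a: u16::from_be_bytes([*a1, *a2]),
            b: *b,
        }),
        _ => None,
    }
}
```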
Bytecode
Bytecode is an array of bytecode chunks. The first chunk is a global script that is run first; the other chunks can be called as functions.
Each chunk has n_constants constants (the constant pool) and n_code_bytes executable bytes that contain instructions and their arguments.
Constants
Bytecode constants are literal values that are included in the code. There are 3 types of constants: ints, strings and functions.
Each constant type has a unique constant_type which is used to distinguish it from the other types.
// Constant := IntConstant | StringConstant | FunctionConstant
Calling functions
All chunks except for the first one can be called as a function with a CALL instruction.
A callable chunk must have these 2 constants:
- the first constant must be a function name (string)
- the second constant must be a function arity (int)
To call a chunk you need to load it onto the stack with a Constant instruction, load some arguments onto the stack and call it with a CALL arity instruction.
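As an illustration of that sequence, the sketch below emits the instruction bytes for a call in which the function and every argument are loaded from the constant pool. The opcodes 0 (Constant) and 23 (Call) come from the instruction table; the helper `emit_call` and the constant indices are hypothetical, and real arguments could just as well be produced by other instructions.

```rust
// Opcodes taken from the instruction table.
const CONSTANT: u8 = 0;
const CALL: u8 = 23;

/// Hypothetical helper: emits `Constant f`, then `Constant arg` for each
/// argument, then `Call arity`. Returns `None` if the arity does not fit in a u8.
pub fn emit_call(function_index: u8, argument_indices: &[u8]) -> Option<Vec<u8>> {
    let arity = u8::try_from(argument_indices.len()).ok()?;
    let mut code = vec![CONSTANT, function_index];
    for &index in argument_indices {
        code.extend_from_slice(&[CONSTANT, index]);
    }
    code.push(CALL);
    code.push(arity);
    Some(code)
}
```

Calling `emit_call(0, &[1, 2])` yields the bytes `[0, 0, 0, 1, 0, 2, 23, 2]`, so the stack holds [f, a, b] right before CALL 2 executes.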
Building from source
Build a development version
The executable will be located under target/debug.
cargo build
Build a release version
The executable will be located under target/release.
cargo build --release
Run tests
cargo test
Future plans
I plan to extract the Jex VM into a separate repository, leaving only the generic VM here. You will then be able to use the generic VM as a crate to build your own virtual machine!