
Oonta is a compiler front-end for the OCaml programming language: it generates LLVM intermediate representation (IR) from OCaml source code.
Oonta uses the JJIK parser generator and JLEK lexer generator to perform the parsing and lexing stages. For building the IR, Oonta does not depend on the LLVM API.
[!IMPORTANT] This project is still a work in progress, many OCaml features are not yet supported. For example, polymorphic functions and types, and modules are not yet supported. Additionally, the runtime does not yet provide a garbage collection service. See the issues tab for the list of work items.
[!NOTE] This project is part of the "Compiler Toys" project, originally meant as a learning exercise on Compilers.
Quick Start
# 120
Benchmark
Two benchmarks are provided:
Merge sort (benchmark/merge_sort.ml):
match lst (
let ( = f x in
map f l
match n acc
create_lst_aux
create_lst_aux n Empty
match lst
split_aux rest acc_right
split_aux lst Empty Empty
match right
lest
if left_head <= right_head then
Cat
else
Cat
match lst lst
lst
let splitted_lst = split lst in
let left =
match splitted_lst lst
in
let right =
match splitted_lst lst
in
merge
let n = read_int (
let lst = create_lst n
let sorted_lst = merge_sort lst
let ( = map print_int sorted_lst
And insertion sort (benchmark/insertion_sort.ml):
match lst (
let ( = f x in
map f l
match n acc
create_lst_aux
create_lst_aux n Empty
match lst Cat
if elem <= head then
Cat
else
Cat
match lst lst
insert head
let n = read_int (
let lst = create_lst n
let sorted_lst = insertion_sort lst
let ( = map print_int sorted_lst
Execute the following command inside the repository root folder to run the benchmarks.
[!IMPORTANT] Currently, for running the benchmark on large inputs, we have to increase the stack size limit with
ulimit -s unlimited
On my machine (AMD Ryzen™ 7 7700X × 16), the result of running the above benchmarks are as follows:
Benchmarking: merge_sort.ml
Elapsed time (./benchmark/ocamlopt.out): 0.8710 seconds
Elapsed time (./benchmark/oonta.out): 0.7323 seconds
> 1.19 times faster
Benchmarking: insertion_sort.ml
Elapsed time (./benchmark/ocamlopt.out): 0.1323 seconds
Elapsed time (./benchmark/oonta.out): 0.3363 seconds
> 2.54 times slower
Dependencies
The oonta command provides the --opt, -O, --compile / -c and --exec / -e options to optimize the generated IR, compile the generated IR to an object
code and executable, respectively. Internally, oonta will invoke the
following commands:
# with --opt
# with --compile
# with --exec
On Ubuntu, install the llvm package to make those commands available.
[!NOTE] I will be working on my own LLVM backend as part of my compiler learning journey! ✨
[!WARNING] The generated IR uses opaque pointers. If your LLVM version is older than version 15, the
--compile/--execoptions might not work.
Additionally, with --exec, you need the Oonta runtime library:
liboonta_runtime.a. If you're running an x64 linux machine, you can obtain
the static library from the [release page]. If not, you would need to build the
runtime from source by following the guide below.
[!NOTE] Currently the runtime only provides allocation requests using bump allocation. Garbage collection service is not yet available.
User Guide
Feature Highlights
Debug compile phases
Use the --verbose / -v option to debug each compile phase.
=> Lexing & Parsing Start
=> Lexing & Parsing End (1 ms)
=> Build AST Start
factorial =
FunExpr
├─▸ parameters: [x]
├─▸ captures: []
├─▸ recursive: yes
└─▸ body:
CondExpr
├─▸ condition:
│ BinOpExpr
│ ├─▸ operator: <=
│ ├─▸ lhs:
│ │ VarExpr ("x")
│ └─▸ rhs:
│ LiteralExpr (1)
├─▸ then expr:
│ LiteralExpr (1)
└─▸ else expr:
BinOpExpr
├─▸ operator: *
├─▸ lhs:
│ VarExpr ("x")
└─▸ rhs:
ApplicationExpr
├─▸ function:
│ VarExpr ("factorial")
└─▸ binds:
└─▸ (0)
BinOpExpr
├─▸ operator: -
├─▸ lhs:
│ VarExpr ("x")
└─▸ rhs:
LiteralExpr (1)
() =
ApplicationExpr
├─▸ function:
│ VarExpr ("print_int")
└─▸ binds:
└─▸ (0)
ApplicationExpr
├─▸ function:
│ VarExpr ("factorial")
└─▸ binds:
└─▸ (0)
LiteralExpr (5)
=> Build AST End (0 ms)
=> Resolve types Start
Top level bindings:
factorial: (int -> int)
=> Resolve types End (0 ms)
=> Transform application expressions Start
=> Transform application expressions End (0 ms)
=> Build LLVM module Start
=> Build LLVM module End (0 ms)
=> Write LLVM module Start
=> Write LLVM module End (1 ms)
=> LLVM backend Start
=> LLVM backend End (117 ms)
Error reporting
Line 1|let x = foo 3
^--
Error: cannot infer expression type: Unbound value foo
Line 1|let rec f x = f
^
Error: cannot infer expression type: Cannot unify 'b with ('a -> 'b)
Line 1|let () = 1 + 2
^----
Error: cannot bind expression of type int to ()
Building from source
- Install C++ build dependencies
sudo apt install build-essential cmake
- Install
cargotool:
|
- Clone repository.
- Build
liboonta_runtime.a
cmake -S runtime -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build
- Install
liboonta_runtime.a
sudo cmake --install build
- Build
oonta.
Why is it called Oonta?
Oonta, is based on the Indonesian word unta, which translates to "camel".