VSL Compiler
Introduction
This project is the culmination of a series of assignments in the course Compiler Construction at NTNU. The goal of the course is to teach the basics of compiler construction, and the project is to implement a compiler for a hypothetical language called VSL (Very Simple Language). The language is similar to Go, but with a few differences. The compiler is implemented in C.
For a detailed guide on the steps and how to run the application, see the GitHub repository.
A high-level overview
Compiling some source code involves a lot of steps. I have attempted to abstract many things away, and present a high-level overview of the process:
1. Generating the abstract syntax tree (AST)
In the image above, we start with the source code hello.vsl
. This is parsed using the yyparse()
function, originating from yacc. The string representation of the things we are interested in are then tokenized, for instance from 'func'
to the token FUNC
.
Then, entire statements are parse. For instance, a function has an identifier
name, followed by an opening parenthesis, a list of parameters, a closing parenthesis, a list of statements, and finally a return statement.
If we'd like a visual representation of this, we can use the print_syntax_tree()
function, which will print the AST in a tree-like structure:
2. Simplifying the abstract syntax tree
Now that we have an abstract syntax tree, i.e. a tree representation of our code, we can simplify it. After all, not all information is relevant for the next steps. For instance 3 + 5
is equal to 8
, so we can simplify the expression to just 8
.
3. Creating the symbol table
Next, we create a symbol table keeping track of all symbols (variables, strings, functions, etc.) in our program.
4. Generating the assembly code
Finally, we generate the assembly code. This is done by traversing the AST, and generating the corresponding assembly code for each node. When the corresponding assembly code is generated, we can write it to a file, and then run it as an ordinary executable file on our machine.