2 min read

VSL Compiler

A compiler for VSL, a hypothetical language with Go-like syntax. Implemented in C.

Introduction

This project is the culmination of a series of assignments in the course Compiler Construction at NTNU. The goal of the course is to teach the basics of compiler construction, and the project is to implement a compiler for a hypothetical language called VSL (Very Simple Language). The language is similar to Go, but with a few differences. The compiler is implemented in C.

For a detailed guide on the steps and how to run the application, see the GitHub repository.

A high-level overview

Compiling some source code involves a lot of steps. I have attempted to abstract many things away, and present a high-level overview of the process:

1. Generating the abstract syntax tree (AST)

OverviewOverview

In the image above, we start with the source code hello.vsl. This is parsed using the yyparse() function, originating from yacc. The string representation of the things we are interested in are then tokenized, for instance from 'func' to the token FUNC.

Then, entire statements are parse. For instance, a function has an identifier name, followed by an opening parenthesis, a list of parameters, a closing parenthesis, a list of statements, and finally a return statement.

If we'd like a visual representation of this, we can use the print_syntax_tree() function, which will print the AST in a tree-like structure:

2. Simplifying the abstract syntax tree

Simplifying the ASTSimplifying the AST

Now that we have an abstract syntax tree, i.e. a tree representation of our code, we can simplify it. After all, not all information is relevant for the next steps. For instance 3 + 5 is equal to 8, so we can simplify the expression to just 8.

3. Creating the symbol table

Symbol tableSymbol table

Next, we create a symbol table keeping track of all symbols (variables, strings, functions, etc.) in our program.

4. Generating the assembly code

Assembly codeAssembly code

Finally, we generate the assembly code. This is done by traversing the AST, and generating the corresponding assembly code for each node. When the corresponding assembly code is generated, we can write it to a file, and then run it as an ordinary executable file on our machine.