LatteCompiler

A compiler for Latte, a custom Java-like language, written for the MRJP course at MIMUW. It is written in C# and runs on Linux, OS X and Windows. The language supports arrays, classes, methods, virtual methods, and inheritance.

A compiler for the Latte language, written in C# as part of the Programming Language Implementation Methods course. It runs on both Mono, the open-source .NET implementation, and Microsoft's .NET Framework.

Environment

The compiler runs on Linux, macOS, and Windows and builds programs for the host platform only; cross-compilation is not supported. Per-platform requirements are listed in the Targets section.

Compilation

The build follows the standard Mono workflow for .NET programs and uses xbuild. Running make invokes xbuild to build the program, copies the build output to the latc_data/ directory, and copies the scripts/latc_x86 script to the main directory. This script simply runs the compiler from the lib folder.

Tests

Running make test executes the prepared tests.

Scope

The compiler includes a frontend (type checker, AST optimizer) and a backend (intermediate code generator and x86 code generator). Currently, the following language features are supported:

  • Structures
  • Objects
  • Inheritance
  • Methods (virtual)
  • Arrays

Frontend

The frontend checks for type errors, constant overflows, argument types, argument names, and the presence of return statements. Additionally, constants are evaluated and optimized. For example, the program:

int main() {
    if (!(true == false))
        return 1 + 2;
}

is optimized to:

int main() {
    return 3;
}

As a result, the program is accepted even though its only return statement sits inside an if without an else, which on its own would leave it unclear whether main always returns a value. Once the constant condition is folded, the if disappears and the return is unconditional.
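The interaction between constant folding and the return check can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the node classes (Const, BinOp, Not, Return, If) and the functions fold, simplify, and always_returns are hypothetical names, not the ones used in LatteBase or LatteTreeOptimizer, and overflow checking is left out.

```python
from dataclasses import dataclass
from typing import List, Optional, Union

# Hypothetical mini-AST; the real Latte AST classes are not shown in this README.
@dataclass
class Const:
    value: Union[int, bool]

@dataclass
class BinOp:
    op: str  # only "+" and "==" are handled in this sketch
    left: "Expr"
    right: "Expr"

@dataclass
class Not:
    operand: "Expr"

@dataclass
class Return:
    expr: "Expr"

@dataclass
class If:
    cond: "Expr"
    then: List["Stmt"]
    otherwise: Optional[List["Stmt"]] = None

Expr = Union[Const, BinOp, Not]
Stmt = Union[Return, If]

def fold(e):
    """Evaluate constant subexpressions bottom-up."""
    if isinstance(e, Not):
        inner = fold(e.operand)
        return Const(not inner.value) if isinstance(inner, Const) else Not(inner)
    if isinstance(e, BinOp):
        l, r = fold(e.left), fold(e.right)
        if isinstance(l, Const) and isinstance(r, Const):
            if e.op == "+":
                return Const(l.value + r.value)
            if e.op == "==":
                return Const(l.value == r.value)
        return BinOp(e.op, l, r)
    return e

def simplify(stmts):
    """Fold expressions and replace an if on a constant condition by the taken branch."""
    out = []
    for s in stmts:
        if isinstance(s, Return):
            out.append(Return(fold(s.expr)))
        elif isinstance(s, If):
            cond = fold(s.cond)
            then = simplify(s.then)
            other = simplify(s.otherwise) if s.otherwise is not None else None
            if isinstance(cond, Const):
                out.extend(then if cond.value else (other or []))
            else:
                out.append(If(cond, then, other))
    return out

def always_returns(stmts):
    """Conservative check: a return at this level, or an if/else returning on both paths."""
    for s in stmts:
        if isinstance(s, Return):
            return True
        if isinstance(s, If) and s.otherwise is not None:
            if always_returns(s.then) and always_returns(s.otherwise):
                return True
    return False

# Body of main() from the example: if (!(true == false)) return 1 + 2;
main_body = [If(Not(BinOp("==", Const(True), Const(False))),
                [Return(BinOp("+", Const(1), Const(2)))])]
optimized = simplify(main_body)
```

In this sketch the unoptimized body fails always_returns, while the simplified body is a single Return(Const(3)) and passes, which mirrors why running the optimizer before the return check lets the program through.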

Backend

The backend consists of two parts:

  • An intermediate code generator (quadruples) from the AST
  • An intermediate code translator to x86 assembly

The intermediate code is based on virtual registers, which are then assigned to hardware registers. This design makes it straightforward to add other backends, such as x86_64 or other assembly languages.
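To make the split between virtual and hardware registers concrete, here is a minimal Python sketch. The Quad layout, the v0/v1 naming, the register pool, and the first-come assignment with stack spilling are all assumptions made for illustration; the README does not describe the allocation strategy the compiler actually uses.

```python
from typing import Dict, List, NamedTuple

class Quad(NamedTuple):
    """One quadruple: operation, destination, and up to two operands (hypothetical layout)."""
    op: str
    dst: str
    a: str
    b: str = ""

# Assumed pool of 32-bit x86 registers available to the allocator.
HARDWARE = ["eax", "ebx", "ecx", "edx"]

def assign_registers(quads: List[Quad]) -> Dict[str, str]:
    """Map each virtual register (named v<N>) to a hardware register in order of
    first appearance, spilling to stack slots once the pool is exhausted.
    A real allocator would use liveness information to reuse registers."""
    mapping: Dict[str, str] = {}
    spilled = 0
    for q in quads:
        for name in (q.dst, q.a, q.b):
            if not name.startswith("v") or name in mapping:
                continue  # constants, empty operands, or already assigned
            used = len(mapping) - spilled
            if used < len(HARDWARE):
                mapping[name] = HARDWARE[used]
            else:
                spilled += 1
                mapping[name] = f"[ebp-{4 * spilled}]"
    return mapping

def rewrite(quads: List[Quad], mapping: Dict[str, str]) -> List[Quad]:
    """Substitute assigned locations for virtual registers."""
    return [Quad(q.op, *(mapping.get(n, n) for n in (q.dst, q.a, q.b)))
            for q in quads]

quads = [Quad("mov", "v0", "1"), Quad("mov", "v1", "2"), Quad("add", "v2", "v0", "v1")]
mapping = assign_registers(quads)
lowered = rewrite(quads, mapping)
```

Only HARDWARE and the spill format are target-specific here, which is the property that would make another backend a matter of swapping the final stage.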

Projects

The program consists of the following projects:

  • Backend - Initializes the compiler's structures, passes the program (AST) for compilation, writes the assembly file, and calls nasm and ld.
  • Backend.Tests - Integration tests for the compiler. The program is compiled to a binary file, then run to compare output.
  • CLI - Command line interface, responsible for reading user arguments, then passing them to the frontend and backend.
  • Frontend - Initializes the frontend's structures, including converting the text file to an AST and type-checking.
  • LatteAntlr - Files generated by Antlr for parsing Latte and generating an AST from Antlr's CST (Concrete Syntax Tree).
  • LatteBase - Basic Latte AST structures.
  • LatteTreeOptimizer - Classes for optimizing the AST (constant evaluation).
  • LatteTypeChecker - Class for type-checking the AST.
  • LatteTypeChecker.Tests - Tests for the type checker.
  • QuadruplesCommon - Basic structures for intermediate code.
  • QuadruplesGenerator - Intermediate code generator based on the AST (assumes the tree is correct).
  • QuadruplesGenerator.Tests - Tests for the intermediate code generator.
  • TestPrograms - Project with ASTs of test programs.
  • Utils - Helper tools.
  • X86Assembly - Basic structures representing x86 Assembly.
  • X86Generator - x86 assembly generator based on intermediate code.
  • X86IntelAsm - Textual x86 assembly generator based on structures from X86Assembly.

Optimizations

The following optimizations are applied:

  • Constant evaluation during compilation (see the Frontend section)
  • Removal of unreachable code (based on the AST)
  • Tail recursion optimization
  • Shared vtable for objects of the same class
  • Constants are not stored on the stack
  • Removal of redundant instruction pairs, where the second mov only copies back the value the first one just moved:
    mov A, B
    mov B, A
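The last item is a classic peephole optimization. A minimal Python sketch of the idea follows, assuming instructions are represented as tuples such as ("mov", dst, src) and that labels appear as their own entries, so two adjacent movs are never separated by a jump target. The representation and the function name are hypothetical and not taken from X86Assembly or X86Generator.

```python
from typing import List, Tuple

Instr = Tuple[str, ...]

def remove_redundant_moves(instrs: List[Instr]) -> List[Instr]:
    """Drop `mov B, A` when it directly follows `mov A, B`:
    after the first mov both operands already hold the same value."""
    out: List[Instr] = []
    for ins in instrs:
        if out and ins[0] == "mov" and out[-1][0] == "mov":
            _, dst, src = ins
            _, prev_dst, prev_src = out[-1]
            if dst == prev_src and src == prev_dst:
                continue  # second half of the pair is a no-op
        out.append(ins)
    return out

code = [
    ("mov", "eax", "ebx"),
    ("mov", "ebx", "eax"),  # redundant: ebx already equals eax
    ("add", "eax", "1"),
    ("ret",),
]
cleaned = remove_redundant_moves(code)
```

Comparing against the last emitted instruction rather than the last input instruction keeps the pass single-sweep and avoids removing both halves of a pair.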
    

Tools Used

The compiler is written in C# and tested with Mono.

Libraries used:

  • Antlr4 for parsing
  • NUnit for tests (make test)

Targets

Linux

For Linux compilation, nasm and gcc are required.

macOS

For macOS compilation, nasm and gcc are required. macOS Mojave cannot compile 32-bit programs because the required libraries are missing. To work around this, download the 10.13 SDK (https://github.com/phracker/MacOSX-SDKs/releases) and set the LATTE_OS_X_SDK environment variable to the SDK folder path.

Windows

For Windows compilation, Visual Studio with Visual C++ is required. The directory containing the cl.exe compiler must be on the PATH environment variable (typically: C:\Program Files (x86)\Microsoft Visual Studio\[version]\[version]\VC\Tools\MSVC\[version]\bin\Hostx86\x86). The LIB and INCLUDE environment variables must also be set to the values used by the Developer Command Prompt for VS: open that prompt, run echo %LIB% and echo %INCLUDE%, and copy the printed values into your environment. The nasm assembler is required as well; download it from https://www.nasm.us/pub/nasm/releasebuilds/ and add its directory to PATH.
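Before running the compiler on Windows, it can help to confirm that these prerequisites are visible from the current shell. The following Python sketch is not part of the project; the function name missing_prerequisites is hypothetical, and the script only checks what the paragraph above lists (cl and nasm on PATH, LIB and INCLUDE set).

```python
import os
import shutil
from typing import Callable, List, Mapping, Optional

def missing_prerequisites(
    env: Mapping[str, str],
    find_tool: Callable[..., Optional[str]] = shutil.which,
) -> List[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems: List[str] = []
    # Both tools must be resolvable through the PATH of the given environment.
    for tool in ("cl", "nasm"):
        if find_tool(tool, path=env.get("PATH")) is None:
            problems.append(f"{tool} not found on PATH")
    # LIB and INCLUDE must be present and non-empty.
    for var in ("LIB", "INCLUDE"):
        if not env.get(var):
            problems.append(f"{var} is not set")
    return problems
```

Calling missing_prerequisites(os.environ) from the shell that will run the compiler and printing the result shows which of the four requirements still needs attention.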
