L2

A minimalist type-inferred programming language with procedural macro support

L2 is a small statically typed programming language. Roughly speaking it looks like Scheme, it behaves like C, and it type-checks like ML. More precisely, L2 has the following characteristics:

I recommend that you take a look at the implementation of the self-hosting compiler for L2 that accompanies this project and compare it with the bootstrap compiler written in C to get a feeling for what L2 is like.

There are 9 language primitives, and for each of them I describe its syntax, what exactly it does in English, the i386 assembly it translates into, and an example of its usage. Following this comes a listing of L2's syntactic sugar, then a brief description of L2's internal representation and the 9 functions that manipulate it, and then a description of how a meta-expression is compiled. These descriptions take about 8 pages and are essentially a complete description of L2. At the end there is a list of reductions that shows how some of C's constructs can be defined in terms of L2. There I have also demonstrated closures, to hint at how more exotic things like coroutines and generators are possible using L2's continuations.

Contents

| Getting Started | Expressions | Examples |
|:--- |:--- |:--- |
| Building L2 | Constrain | Numbers |
| The Compiler | Literal | Commenting |
| Syntactic Sugar | Storage | Backquoting |
| Internal Representation | If | Variable Binding |
| Constraint System | Function | Boolean Expressions |
| | Invoke | Switch Expression |
| | With | Characters |
| | Continuation | Strings |
| | Jump | Sequencing |
| | Meta | Conditional Compilation |
| | | Assume |
| | | Fields |
| | | With Variables |

Getting Started

Building L2

./build_bootstrap
./build_selfhost

This project contains two implementations of the L2 compiler: a bootstrap compiler written in C and a self-hosting compiler written in L2. (The source code for the self-hosting compiler is larger because it has to define its own control flow, literals, and other such features that come built into C.) Both compilers produce identical object code (modulo padding bytes in the ELFs) when given identical inputs.

Compiling the bootstrap compiler requires a Linux distribution running on the x86-64 architecture with the GNU C compiler installed. To bootstrap the L2 compiler, run the build_bootstrap script at the root of the repository, as shown above. This creates a directory called bin containing the file l2compile, a compiler of L2 code whose interface is described in the next section. To self-compile the L2 compiler, run the build_selfhost script at the root of the repository. This replaces l2compile with a new compiler that has the same command line interface.

The Compiler

./bin/l2compile source1.l2 ... - intrinsic1 ... - object1.o ...

In L2, top-level functions can be invoked at compile-time in addition to run-time. To enable this, the L2 compiler begins by loading the program into memory. For the parts of the program that are object files, the loading is straightforward. The parts that are L2 files cannot simply be compiled and loaded, as they may themselves need to be preprocessed. Hence a lazy compilation scheme is used: an object file exposing the same global symbols as the L2 file is loaded, and the corresponding L2 function is compiled only later, when one of these functions is actually used as a macro. The important gain from doing this is that the compilation now happens in the environment of the entire program; that is, the program can use its entire self to preprocess itself. Once the program is loaded in memory, its parts are linked together and to the compiler's interface for metaprogramming. Finally, each part of the program source is compiled into an object file with the assistance of the copy of itself that has been loaded into memory.
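The lazy scheme described above can be modelled in a few lines of Python. This is only a sketch of the idea, not the compiler's actual data structures: the class `LazyUnit`, the method `call_as_macro`, and the stand-in compiler `toy_compile` are all hypothetical names, and the "source" being compiled is a Python lambda rather than L2 code.

```python
class LazyUnit:
    """Stand-in for the placeholder object that exposes an L2 file's global symbols."""

    def __init__(self, sources, compile_fn):
        self._sources = dict(sources)   # symbol name -> source text
        self._compile = compile_fn      # hypothetical compiler callback
        self._compiled = {}             # symbol name -> compiled function

    def symbols(self):
        # The symbols are visible immediately, before anything is compiled.
        return sorted(self._sources)

    def call_as_macro(self, symbol, *args):
        # Compile a function only the first time it is used as a macro.
        if symbol not in self._compiled:
            self._compiled[symbol] = self._compile(self._sources[symbol])
        return self._compiled[symbol](*args)


compile_log = []

def toy_compile(source):
    # Assumption for illustration: "compiling" just evaluates a Python lambda.
    compile_log.append(source)
    return eval(source)


unit = LazyUnit({"double": "lambda x: x * 2", "negate": "lambda x: -x"}, toy_compile)
symbols_before = unit.symbols()          # both symbols are already exposed
compiled_before = len(compile_log)       # nothing has been compiled yet
first = unit.call_as_macro("double", 21)   # triggers compilation of "double"
second = unit.call_as_macro("double", 4)   # reuses the compiled function
```

In this model `negate` is never compiled because it is never used as a macro, which mirrors the point that compilation is deferred until a function is actually needed.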

Expressions

Constrain

(constrain expression0 sigfunction0)

Evaluates expression0. The resulting value of expression0 then becomes the resulting value of the constrain expression.

The constrain expression will be further explained in the constraint system section.

Literal

(literal b63b62...b0)

The resulting value is the 64-bit number specified in binary inside the brackets. Specifying fewer or more than 64 bits is an error. This expression is useful for implementing character and string literals, and numbers in other bases.

This expression is implemented by emitting an instruction to mov an immediate value into a memory location designated by the surrounding expression.

Say the expression [putchar x] prints the character x. Then [putchar (literal 0...01100001)] prints the text "a" to standard output.
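To make the 64-bit rule concrete, here is a small Python model of how a literal's bit string maps to a value. It is a sketch rather than the compiler's code: `parse_literal` and `char_literal` are hypothetical helper names, and the `0...0` abbreviation used in the examples is written out in full.

```python
WORD_BITS = 64  # the text above states that a literal is exactly 64 bits wide


def parse_literal(bits):
    """Return the unsigned value spelled out by a string of 64 '0'/'1' characters."""
    if len(bits) != WORD_BITS:
        raise ValueError(f"literal needs exactly {WORD_BITS} bits, got {len(bits)}")
    if set(bits) - {"0", "1"}:
        raise ValueError("literal may contain only the characters 0 and 1")
    return int(bits, 2)


def char_literal(ch):
    """Hypothetical helper: spell a character as a full (literal ...) expression."""
    return "(literal " + format(ord(ch), "064b") + ")"


bits_for_a = "0" * 56 + "01100001"   # the 0...01100001 of the example, unabbreviated
value = parse_literal(bits_for_a)
print(chr(value))                    # prints: a
```

A helper like `char_literal` hints at why the primitive is enough to build character and string literals on top of it.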

Storage

(storage storage0 expression1 expression2 ... expressionN)

If this expression occurs inside a function, then enough space for N contiguous values has already been reserved in its stack frame. If it occurs outside a function, then static memory has been reserved instead. storage0 is a reference to the beginning of this space. This expression evaluates each of its sub-expressions in an environment containing storage0 and stores the resulting values in contiguous locations of memory beginning at storage0 in the same order as they were specified. The resulting value of this expression is storage0.

N contiguous words must be reserved in the current function's stack-frame plan. The expression is implemented by emitting the instructions for each of the subexpressions, with the location of each resulting value fixed to the corresponding reserved word, until the instructions for all the subexpressions have been emitted. Then an instruction is emitted to lea the address of the beginning of the contiguous words into a memory location designated by the surrounding expression.

The expression [putchar [get (storage _ (literal 0...01100001))]], for example, prints the text "a" to standard output.
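The semantics of storage can be approximated with a toy memory model in Python. This is a sketch under several assumptions: addresses are word indices rather than byte addresses, space is reserved at evaluation time instead of in advance in a stack frame, sub-expressions are evaluated left to right, and the names `Memory`, `reserve`, `get`, and `storage` are invented for the illustration.

```python
class Memory:
    """Toy word-addressed memory; an address is just an index into a list."""

    def __init__(self):
        self.words = []

    def reserve(self, n):
        # Reserve n contiguous words and return the address of the first one.
        base = len(self.words)
        self.words.extend([0] * n)
        return base

    def get(self, address):
        return self.words[address]


def storage(memory, name, subexprs, env):
    """Evaluate subexprs with `name` bound to the block's base address."""
    base = memory.reserve(len(subexprs))
    inner_env = {**env, name: base}      # sub-expressions can refer to storage0
    for offset, subexpr in enumerate(subexprs):
        memory.words[base + offset] = subexpr(inner_env)
    return base                          # the resulting value is storage0


mem = Memory()
# Counterpart of [get (storage _ (literal 0...01100001))]
cell = storage(mem, "_", [lambda env: 0b01100001], {})
print(chr(mem.get(cell)))                # prints: a

# A block whose first word holds the block's own address.
pair = storage(mem, "pair", [lambda env: env["pair"], lambda env: 7], {})
```

The second call shows why the sub-expressions are evaluated in an environment that already contains storage0: a block can store a reference to itself.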

If

(if expression0 expression1 expression2)

If expression0 is non-zero, then only expression1 is evaluated and its resulting value becomes that of the whole expression. If expression0 is zero, then only expression2 is evaluated and its resulting value becomes that of the whole expression.

This expression is implemented by first emitting an instruction to or expression0 with itself. Then an instruction to je to expression2's label is emitted. Then the instructions for expression1 are emitted with the location of the resulting value fixed to the same memory address designated for the resulting value of the if expression. Then an instruction is emitted to jmp to the end of all the instructions that are emitted for this if expression. Then the label for expression2 is emitted. Then the instructions for expression2 are emitted with the location of the resulting value fixed to the same memory address designated for the resulting value of the if expression.

The expression [putchar (if (literal 0...0) (literal 0...01100001) (literal 0...01100010))] prints the text "b" to standard output.
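The emission order described above can be sketched as a small Python code generator that produces an assembly-like listing. The operand names (`cond`, `result`), the label naming scheme, and the explicit end label are assumptions made for the illustration; the output is pseudo-assembly meant to show the ordering of instructions, not text that an assembler would accept as is.

```python
import itertools


class Labels:
    """Hands out unique label names (the naming scheme is made up for this sketch)."""

    def __init__(self):
        self._ids = itertools.count()

    def fresh(self, prefix):
        return f"{prefix}_{next(self._ids)}"


def emit_literal(value):
    """Return an emitter that moves an immediate into the designated location."""
    def emit(result_loc):
        return [f"mov {result_loc}, {value}"]
    return emit


def emit_if(cond_loc, emit_then, emit_else, result_loc, labels):
    else_label = labels.fresh("else")
    end_label = labels.fresh("end")
    lines = [
        f"or {cond_loc}, {cond_loc}",   # sets the zero flag exactly when the condition is zero
        f"je {else_label}",             # condition was zero: skip to expression2
    ]
    lines += emit_then(result_loc)      # expression1 writes to the if's result location
    lines.append(f"jmp {end_label}")    # skip over expression2
    lines.append(f"{else_label}:")
    lines += emit_else(result_loc)      # expression2 writes to the same location
    lines.append(f"{end_label}:")
    return lines


listing = emit_if("cond", emit_literal(97), emit_literal(98), "result", Labels())
print("\n".join(listing))
```

Because both branches are emitted with the same result location, whatever follows the if expression can read its value from one place regardless of which branch ran.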

Function

(function function0 (param1 param2 ... paramN) expression0)

Makes a function to be invoked with exactly N arguments. When the function is invoked, expression0 is evaluated in an environment where function0 is a reference to the function itself and param1, param2, up to paramN are the resulting values of evaluating the corresponding arguments in the invoke expression invoking this function. Once the evaluation is complete, control flow returns to the invoke expression and the invoke expression's resulting value is the resulting value of evaluating expression0. The resulting value of this function expression is a reference to the function.

This expression is implemented by first emitting an instruction to mov the address function0 (a label to be emitted later) into the memory location designated by the surrounding expression. Then an instruction is emitted to jmp to the end of all the instructions that are emitted for this function. Then the label named function0 is emitted. Then instructions to push each callee-saved register onto the stack are emitted. Then an instruction to push the frame-pointer onto the stack is emitted. Then an instruction to move the value of the stack-pointer into the frame-pointer is emitted. Then an instruction to sub from the stack-pointer the amount of words reserved on this function's stack-frame is emitted. After this the inst
