SkillAgentSearch skills...

Gocaml

:camel: Statically typed functional programming language implementation with Go and LLVM

Install / Use

/learn @rhysd/Gocaml
About this skill

Quality Score

0/100

Supported Platforms

Universal

README

GoCaml :camel:

[![Linux and macOS Build Status][]][Travis CI] [![Windows Build Status][]][Appveyor] [![Coverage Status][]][Codecov]

GoCaml is subset of OCaml in Go based on [MinCaml][] using [LLVM][]. GoCaml adds many features to original MinCaml. MinCaml is a minimal subset of OCaml for educational purpose. It is statically-typed and compiled into a binary.

This project aims incremental compiler development for my own programming language. Type inference, closure transform, mid-level IR are implemented.

Example:

let rec gcd m n =
  if m = 0 then n else
  if m <= n then gcd m (n - m) else
  gcd n (m - n) in
print_int (gcd 21600 337500)

You can see [more examples][examples]. (e.g. [Brainfxxk interpreter][Brainfxxk interpreter example], [N-Queens puzzle][N-Queens puzzle example])

Tasks

  • [x] Lexer -> ([doc][syntax doc])
  • [x] Parser with [goyacc][] -> ([doc][syntax doc])
  • [x] Alpha transform ([doc][sema doc])
  • [x] Type inference (Hindley Milner monomorphic type system) -> ([doc][sema doc])
  • [x] mid-level intermediate representation (MIR) ([doc][mir doc])
  • [x] K normalization from AST into MIR ([doc][mir doc])
  • [x] Closure transform ([doc][closure doc])
  • [x] Code generation (LLVM IR, assembly, object, executable) using [LLVM][] ([doc][codegen doc])
  • [x] LLVM IR level optimization passes
  • [x] Garbage collection with [Boehm GC][]
  • [x] Debug information (DWARF) using LLVM's Debug Info builder

Difference from Original MinCaml

  • MinCaml assumes external symbols' types are int when it can't be inferred. GoCaml requires explicit declaration for external symbols with external syntax like OCaml. All external symbols (except for builtins) must be declared. Otherwise, undeclared symbols are reported undefined symbols as compilation error.
  • MinCaml allows - unary operator for float literal. So for example -3.14 is valid but -f (where f is float) is not valid. GoCaml does not allow - unary operator for float values totally. You need to use -. unary operator instead (e.g. -.3.14).
  • GoCaml adds more operators. * and / for integers, && and || for booleans.
  • GoCaml has string type. String value is immutable and used with slices.
  • GoCaml does not have Array.create, which is an alias to Array.make. Array.length is available to obtain the size of array and [| ... |] literal is available to create an array with specific elements.
  • Some useful built-in functions are added (described in below section).
  • [Option type][] is implemented in GoCaml. Please see below 'Option Type' section or [test cases][option type test cases].
  • GoCaml has fun syntax to make an anonymous funcion or closure like fun x y -> x + y.
  • GoCaml has type annotations syntax. Users can specify types explicitly.
  • Symbols named _ are ignored.
  • Type alias using type keyword.

Language Spec

Program

Program is represented as one expression which MUST be evaluated as unit type. So () is the smallest program for GoCaml.

Sequence Expression

Sequenced program can be represented by joining multiple expressons with ;.

e1; e2; e3; e4

In above program, expressions are evaluated in order of e1 -> e2 -> e3 -> e4 and the sequenced expression is evaluated to the value of e4. Program must be evaluated to unit type, so the e4 expression must be evaluated to () (unit value).

Comments

There is a block comment syntax. It starts with (* and ends with *). Any comment must be closed with *), otherwise it falls into syntax error.

(*
   This is comment.
*)

Constants

There are unit, integer, boolean, float and string constants.

(* integer *)
42;

(* float *)
3.0;
3.14e+10;
3.14e-10;
1.;

(* boolean *)
true;
false;

(* string *)
"hello, world";
"contains\tescapes\n";

(* only one constant which is typed to unit *)
()

Show values

print_* and println_* built-in functions are available to output values to stdout.

print_int 42;
println_bool true

Please see 'Built-in functions' section below for more detail.

Unary operators

You can use some unary prefixed operators.

-42;

(* GoCaml distinguishes float and int in operators. -. is a float version of - *)
-.3.14;

not true;

()

Arithmetic binary operators

As mentioned above, GoCaml distinguishes int and float in operators. Operators for float values are suffixed by . (dot).

(* integer calculation *)
1 + 2;
1 - 2;
1 * 2;
1 / 2;

(* float calculation *)
1.0 +. 2.0;
1.0 -. 2.0;
1.0 *. 2.0;
1.0 /. 2.0;

()

Integer operators must have integer values as their operands. And float operators must have float values as their operands. There is no implicit conversion. You need to convert explicitly by using built-in functions (e.g. 3.14 +. (int_to_float 42)).

Note that strings don't have any operators for concatenating two strings or slicing sub string. They can be done with str_concat and str_sub built-in functions (See 'Built-in Functions' section).

Relational operators

Equal operator is = (NOT ==), Not-equal operator is <>. Compare operators are the same as C (<, <=, > and >=).

42 = 42; (* => true *)
42 <> 42; (* => false *)

3.14 > 2.0;
1.0 < 3.0;
2.0 >= 2.0;
1.0 <= 3.0;

()

Tuples (described below) and strings can be compared with = or <>, but cannot be compared with <, <=, > and >=. Arrays (described below) cannot be compared directly with any compare operators. You need to compare each element explicitly.

Logical operators

&& and || are available for boolean values.

println_bool (true || false && false || false)

Variable

let expression binds some value to a variable.

(* 42 is bound to a *)
let a = 42 in
(* You can use the variable as value in any expression *)
a + 4;

()

The syntax is let {name} = {e1} in {e2}. e2 will be evaluated where e1 is bound to name. By chain let, you can define multiple variables.

let pi = 3.141592 in
let r = 2 in
let area = r *. r *. pi in
print_float area

And you can redefine the same name variable as already-defined ones.

let a = 42 in
println_int a;
let a = true in
println_bool a

The first a and the second a are different variable. Second one just shadows first one. So you can always redefine any variable names as any type. Shadowed variable can be no longer referred.

Functions are first-class object in GoCaml. So you can also bind functions to variable as value.

let rec hello _ = println_str "hello" in
let f = hello in

(* Shows "hello" *)
f ();

(* Binds external function *)
let p = println_str in

(* Shows "hi" *)
p "hi"

Functions

let rec is a keyword to define a function. Syntax is let rec name params... = e1 in e2 where function name is defined as e1 and then e2 will be evaluated. f a b c is an expression to apply function f with argument a, b and c. As long as the argument is simple, you don't need to use ().

Note that, if you use some complicated expression (for example, binary operators), you need to use () like f (a+b) c. If you specify f a + b c, it would be parsed as (f a) + (b c).

let rec f a b c = a + b + c in
let d = f 10 20 30 in

(* Output: 60 *)
println_int d

You can make a recursive function as below.

let rec fib x =
    if x <= 1 then 1 else
    fib (x - 1) + fib (x - 2)
in
println_int (fib 10)

Functions can be nested.

let rec sqrt x =
    let rec abs x = if x > 0.0 then x else -.x in
    let rec go z p =
        if abs (p -. z) <= 0.00001 then z else
        let (p, z) = z, z -. (z *. z -. x) /. (2.0 *. z) in
        go z p
    in
    go x 0.0
in
println_float (sqrt 10.0)

(* Error because of out of scope: go 10.0 0.0 *)

In above example, abs and go is nested in sqrt. Nested function is a hidden implementation of the outer function because inner scope is not visible from outside.

Functions can capture any outer variables (=environment). Functions which captured outer environment are called 'closure'. As many functional languages or modern languages, GoCaml has closure functions.

(* Define variable *)
let pi = 3.14 in

(* Captures outer defined variable 'pi' into its body *)
let rec circle r = r *. r *. pi in

(* Invoke closure *)
println_float (circle 10.0)

Below is a bit more complicated example:

let rec make_adder x =
    let z = 1 in
    let rec f y = x + y + z in
    f
in
let add = make_adder 3 in

(* Output: 104 *)
println_int (add 100)

Here, inner function f captures hidden variable special_value. make_special_value_adder returns a closure which captured the variable.

Lambda

Functions can be made without names using fun syntax.

(* Make a lambda and bind it to foo *)
let add = fun x y -> x + y in

(* Above is like below, but the function is anonymous *)
let rec add2 x y in x + y in

println_int (add 1 10);
println_int (add2 1 10);

It's useful when passing a function without considering its name.

let rec quick_sort xs pred =
    (* ...snip *)
in
let a = Array.make 10 0 in
let sorted = quick_sort a (fun l r -> l < r) in
()

Lambda does not have its name, so it cannot be called recursively.

Using lambda, above make_adder can be implemented as following:

let rec make_adder x =
    let z = 1 in
    fun x y -> x + y + z
in
...

Type Annotation

Type can be specified explicitly at any expression, parameter and return type of function with :

Types can be written in the same syntax as other ML languages.

  • Primitive: int, float, bool, string
  • Any type: _
  • Tuple: t1 * t2 * ... * tn (e.g. int * bool)
View on GitHub
GitHub Stars758
CategoryDevelopment
Updated28d ago
Forks29

Languages

Go

Security Score

100/100

Audited on Feb 28, 2026

No findings