Mylang
A simple programming language inspired by Python, JavaScript and C
MyLang
What is MyLang?
MyLang is a simple educational programming language inspired by Python,
JavaScript, and C, written as a personal challenge, in a short time,
mostly to have fun writing a recursive descent parser and explore the
world of interpreters. Don't expect a full-blown scripting language with
libraries and frameworks ready for production use. However, MyLang has
a minimal set of builtins, and it could be used for practical purposes
as well.
Contents
- Maintenance
- Syntax
- Builtins
Maintenance
Building MyLang
MyLang is written in portable C++17: at the moment, the project has no
dependencies other than the standard C++ library. To build it, if you have
GNU make installed, just run:
$ make -j
Otherwise, just pass all the .cpp files to your compiler and add the src/
directory to the include search path. One of the nicest things about not
having dependencies is that there's no need for a build system for one-time
builds.
Out-of-tree builds
Just pass the BUILD_DIR option to make:
$ make -j BUILD_DIR=other_build_directory
Testing MyLang
If you want to run MyLang's tests as well, just compile with TESTS=1 and disable optimizations with OPT=0 for a better debugging experience:
$ make -j TESTS=1 OPT=0
Then, run all the tests with:
$ ./build/mylang -rt
It's worth noting that, while test frameworks like GoogleTest and Boost.Test
are far more powerful and flexible than the trivial test engine we have
in src/tests.cpp, they are external dependencies. The fewer dependencies, the
better, right? :-)
Syntax
The shortest way to describe MyLang is: a C-looking dynamic Python-ish
language. Probably, the fastest way to learn this language is to check
out the scripts in the samples/ directory while taking a look at the short
documentation below.
Core concepts
MyLang is a dynamic duck-typing language, like Python. If you know Python
and you're willing to use { } braces, you'll be automatically able to use it.
No surprises. Strings are immutable like in Python, arrays can be defined using
[ ] like in Python, and dictionaries can be defined using { }, as well.
The language also supports array-slices using the same [start:end] syntax used
by Python.
That said, MyLang differs from Python and other scripting languages in several
aspects:
- There's support for parse-time constants declared using const.
- All variables must be declared using var.
- Variables have a scope like in C. Shadowing is supported when a variable is explicitly re-declared using var, in a nested block.
- All expression statements must end with ; like in C, C++, and Java.
- The keywords true and false exist, but there's no boolean type. Like in C, 0 is false, everything else is true. However, strings, arrays, and dictionaries have a boolean value, exactly like in Python (e.g. an empty array is considered false). The true builtin is just an alias for the integer 1.
- The assignment operator = can be used like in C, inside expressions, but there's no such thing as the comma operator, because of the array-expansion feature.
- MyLang supports both the classic for loop and an explicit foreach loop.
- MyLang does not support custom types, at the moment. However, dictionaries support a nice syntactic sugar: in addition to the main syntax d["key"], for string keys the syntax d.key is supported as well.
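For readers who think in Python, the truth rule from the list above can be modeled in a few lines. This is a Python sketch, not MyLang code and not taken from the interpreter's sources; is_truthy, TRUE and FALSE are names invented here, and the value 0 for false is assumed from the C-like rule.

```python
# Python model of the truth rule described above (illustrative only).
TRUE = 1   # the text says `true` is an alias for the integer 1
FALSE = 0  # assumed counterpart; the text only says that 0 is false


def is_truthy(value):
    """Return TRUE or FALSE for a number, string, array or dictionary."""
    if isinstance(value, (str, list, dict)):
        # Strings, arrays and dictionaries are false only when empty.
        return TRUE if len(value) > 0 else FALSE
    # Numbers: 0 is false, everything else is true.
    return FALSE if value == 0 else TRUE


print(is_truthy(0), is_truthy(42), is_truthy([]), is_truthy("text"))
```

For these value kinds, the result is the same as Python's own bool(), just expressed as an integer.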
Declaring variables
Variables are always declared with var and live in the scope in which they've
been declared (while being visible in nested scopes). For example:
# Variable declared in the global scope
var a = 42;
{
    var b = 12;
    # Here we can see both `a` and `b`
    print(a, b);
}
# But here we cannot see `b`.
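The visibility rules can be pictured as a chain of scopes, one per block. The following Python sketch is only a mental model under that assumption: the Scope class and its methods are hypothetical and do not describe how the interpreter is actually written.

```python
class Scope:
    """Variables of one block, linked to the enclosing block."""

    def __init__(self, parent=None):
        self.vars = {}
        self.parent = parent

    def declare(self, name, value=None):
        # `var` creates the name in the current block, shadowing any
        # outer variable with the same name. Re-declaring a name inside
        # the same block is not covered by the text; this model overwrites.
        self.vars[name] = value

    def lookup(self, name):
        # Walk outwards from the current block to the global scope.
        scope = self
        while scope is not None:
            if name in scope.vars:
                return scope.vars[name]
            scope = scope.parent
        raise NameError(f"undefined variable: {name}")


global_scope = Scope()
global_scope.declare("a", 42)

block = Scope(parent=global_scope)  # models the { ... } block
block.declare("b", 12)

print(block.lookup("a"), block.lookup("b"))  # both names resolve here

try:
    global_scope.lookup("b")  # `b` does not exist outside the block
except NameError as err:
    print(err)
```

Shadowing falls out of the same model: declaring a in the inner Scope hides the outer a only until the block ends.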
It's possible to declare multiple variables using the following familiar syntax:
var a,b,c;
But there's a caveat, probably the only "surprising" feature of MyLang:
initializing variables doesn't work like in C. Consider the following
statement:
var a,b,c = 42;
In this case, instead of just declaring a and b and initializing c to 42,
we're initializing all three variables to the value 42. To initialize each
variable to a different value, use the array-expansion syntax:
var a,b,c = [1,2,3];
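The rule is easier to remember next to its Python counterparts: var a,b,c = 42; behaves like a = b = c = 42, while var a,b,c = [1,2,3]; behaves like a, b, c = [1, 2, 3]. The small Python model below makes the two cases explicit. Here init_vars is a hypothetical helper, and since this section does not say what MyLang does when the array length differs from the number of variables, the model simply raises an error in that case.

```python
def init_vars(names, value):
    """Model of `var n1,n2,... = value;` returning a name -> value mapping."""
    if len(names) == 1:
        # A single variable just receives the value, array or not.
        return {names[0]: value}
    if isinstance(value, list):
        # Array expansion: one element per variable.
        if len(value) != len(names):
            # Behavior on mismatch is this sketch's own choice.
            raise ValueError("array length does not match variable count")
        return dict(zip(names, value))
    # Any other value is copied to every declared variable.
    return {name: value for name in names}


print(init_vars(["a", "b", "c"], 42))
print(init_vars(["a", "b", "c"], [1, 2, 3]))
```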
Declaring constants
Constants are declared in a similar way to variables, but they cannot be shadowed in nested scopes. For example:
const c = 42;
{
    # That's not allowed
    const c = 1;
    # That's not allowed as well
    var c = 99;
}
In MyLang, constants are evaluated at parse-time, in a similar fashion to C++'s
constexpr declarations (although in C++ the evaluation happens at compile time).
When initializing a const, any kind of literal can be used, in addition to the
whole set of const builtins. For example:
const val = sum([1,2,3]);
const x = "hello" + " world" + " " + join(["a","b","c"], ",");
To understand how exactly a constant has been evaluated, run the interpreter
with the -s option, to dump the abstract syntax tree before running the script.
For the example above:
$ cat > t
const val = sum([1,2,3]);
const x = "hello" + " world" + " " + join(["a","b","c"], ",");
$ ./build/mylang t
$ ./build/mylang -s t
Syntax tree
--------------------------
Block(
)
--------------------------
Surprised? Well, constants other than arrays and dictionaries are not even
instantiated as variables. They just don't exist at runtime. Let's add a
statement using x:
$ cat >> t
print(x);
$ cat t
const val = sum([1,2,3]);
const x = "hello" + " world" + " " + join(["a","b","c"], ",");
print(x);
$ ./build/mylang -s t
Syntax tree
--------------------------
Block(
  CallExpr(
    Id("print")
    ExprList(
      "hello world a,b,c"
    )
  )
)
--------------------------
hello world a,b,c
Now, everything should make sense. Almost the same thing happens with arrays and dictionaries, with the exception that these are instantiated at runtime as well, in order to avoid having potentially huge literals everywhere. Consider the following example:
$ ./build/mylang -s -e 'const ar=range(4); const s=ar[2:]; print(ar, s, s[0]);'
Syntax tree
--------------------------
Block(
  ConstDecl(
    Id("ar")
    Op '='
    LiteralArray(
      Int(0)
      Int(1)
      Int(2)
      Int(3)
    )
  )
  ConstDecl(
    Id("s")
    Op '='
    LiteralArray(
      Int(2)
      Int(3)
    )
  )
  CallExpr(
    Id("print")
    ExprList(
      Id("ar")
      Id("s")
      Int(2)
    )
  )
)
--------------------------
[0, 1, 2, 3] [2, 3] 2
As you can see, the slice operation has been evaluated at parse-time while
initializing the constant s, but both arrays exist at runtime as well.
The subscript operations on const expressions, instead, are converted to
literals. That looks like a good trade-off for performance: small values like
integers, floats, and strings are converted to literals during the const evaluation,
while arrays and dictionaries (potentially big) are left as read-only symbols at
runtime, but still allowing some operations on them (like [index] and len(arr))
to be const-evaluated.
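The trade-off described above can be sketched as a small folding pass over a toy syntax tree. This is a Python illustration of the general technique (constant folding), not MyLang's actual C++ implementation: the tuple-based node shapes, the CONST_BUILTINS table, and the helper names are all assumptions made for the example.

```python
# Builtins this sketch treats as const-evaluable. The names come from the
# examples in this section; the table itself is an assumption.
CONST_BUILTINS = {
    "sum": sum,
    "len": len,
    "range": lambda n: list(range(n)),
    "join": lambda items, sep: sep.join(items),
}


def const_value(node, consts):
    """Return (True, value) when `node` is known at parse time."""
    if node[0] == "lit":
        return True, node[1]
    if node[0] == "id" and node[1] in consts:
        return True, consts[node[1]]
    return False, None


def fold(node, consts):
    """Return `node` with every const-evaluable part replaced by a literal."""
    kind = node[0]
    if kind == "id":
        known, value = const_value(node, consts)
        # Small values are inlined; arrays and dictionaries stay as symbols.
        if known and not isinstance(value, (list, dict)):
            return ("lit", value)
        return node
    if kind == "add":
        left, right = fold(node[1], consts), fold(node[2], consts)
        lk, lv = const_value(left, consts)
        rk, rv = const_value(right, consts)
        return ("lit", lv + rv) if lk and rk else ("add", left, right)
    if kind == "index":
        target, index = fold(node[1], consts), fold(node[2], consts)
        tk, tv = const_value(target, consts)
        ik, iv = const_value(index, consts)
        return ("lit", tv[iv]) if tk and ik else ("index", target, index)
    if kind == "call":
        args = [fold(arg, consts) for arg in node[2]]
        known = [const_value(arg, consts) for arg in args]
        if node[1] in CONST_BUILTINS and all(ok for ok, _ in known):
            return ("lit", CONST_BUILTINS[node[1]](*[v for _, v in known]))
        return ("call", node[1], args)
    return node  # literals are already folded


def declare_const(name, expr, consts):
    """Evaluate a const initializer at parse time and record its value."""
    known, value = const_value(fold(expr, consts), consts)
    if not known:
        raise ValueError(f"{name}: initializer is not a constant expression")
    consts[name] = value


consts = {}
# const x = "hello" + " world" + " " + join(["a","b","c"], ",");
declare_const("x", ("add", ("add", ("add", ("lit", "hello"), ("lit", " world")),
                            ("lit", " ")),
                    ("call", "join", [("lit", ["a", "b", "c"]), ("lit", ",")])),
              consts)
# const ar = range(4);
declare_const("ar", ("call", "range", [("lit", 4)]), consts)

# print(x) keeps the call, but its argument becomes a string literal.
print(fold(("call", "print", [("id", "x")]), consts))
# print(ar, ar[2], len(ar)) keeps `ar` as a symbol and folds the rest.
print(fold(("call", "print", [("id", "ar"),
                              ("index", ("id", "ar"), ("lit", 2)),
                              ("call", "len", [("id", "ar")])]), consts))
```

The key design choice mirrored here is in the "id" branch: scalars are substituted in place, while arrays and dictionaries remain named symbols whose index and len operations can still be folded.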
Type system
MyLang supports, at the moment, only the following (builtin) types:
- None
The type of none, the equivalent of Python's None. Variables just declared without having a value assigned to them, h
