Typed tagless-final interpretations: Lecture notes

The course on typed tagless-final embeddings of domain-specific languages has been presented at the Spring School on Generic and Indexed Programming (SSGIP) http://www.comlab.ox.ac.uk/projects/gip/school.html at Wadham College, Oxford, UK on 22nd to 26th March 2010. This page collects the notes for the course in the form of the extensively commented Haskell and OCaml code.

Introduction

The topic of the course is the embedding of domain-specific languages (DSL) in a host language such as Haskell or OCaml. We will often call the language to embed 'the object language' and the host language 'the metalanguage'. All throughout the course we will repeatedly encounter the following points:

Multiple interpretations:
- writing a DSL program once, and interpret it many times, in standard and non-standard ways;
Extensibility:
- enriching the syntax of the object language, re-using but not breaking the existing interpreters;
Types
Typed implementation language
- getting the typechecker to verify properties of interpreters, such as not getting stuck;
Typed object language
- getting the metalanguage typechecker to enforce properties of DSL programs, such as being well-typed;
Connections with logic
Final
- preferring lower-case
- preferring elimination over introduction
- connecting to denotational semantics

First-order languages (generic programming)

We will be talking about ordinary data types and (generic) operations on them. The expression problem will make its appearance. The first-order case makes it easier to introduce de-serialization and seemingly non-compositional operations.

Initial and final, deep and shallow: the first-order case

Intro1.hs [2K] Algebraic data type/initial representation of expressions; Constructor functions: the intimation of the final representation (or, shallow embedding)

Intro2.hs [3K] Symantics: parameterization of terms by interpreters

Intro3.hs [2K] Initial and Final, Deep and Shallow, First-class

ExtI.hs [<1K] Algebraic data types are indeed not extensible

ExtF.hs [2K] Adding a new expression form to the final view: solving the expression problem

Serialize.hs [4K] Serialization and de-serialization

SerializeExt.hs [4K] De-serializing the extended language

Final embeddings in OCaml

We demonstrate several encodings of extensible first-order languages in OCaml. Objects turn out handy in emulating the composition of type class dictionaries.

final_obj.ml [2K] The traditional application of objects to represent extensible data types. Alas, the set of operations on these data types is not extensible.

final_mod.ml [3K] Tagless-final embedding using modules

final_dic.ml [3K] Tagless-final embedding with objects emulating type-class dictionaries. Both the language and the set of its interpretations are extensible.

Non-compositionality: Fold-unlike processing

Interpreters are well suited for compositional, fold-like operations on terms. The tagless-final representation of terms makes writing interpreters quite convenient. One may wonder about operations on terms that do not look like fold. Can we even pattern-match on terms represented in the tagless-final style? Can we compare such terms for equality? We answer the first question here, deferring the equality test till the part on implementing a type checker for a higher-order language. Our running examples are term transformations, converting an expression into a simpler, more optimal, or canonical form. The result is an uncrippled expression, which we can feed into any of the existing or future interpreters. Our sample term transformations look like simplified versions of the conversion of a boolean formula into a conjunctive normal form.

PushNegI.hs [3K] Pushing the negation down: the initial implementation

PushNegF.hs [3K] Pushing the negation down: the final implementation

PushNegFExt.hs [4K] Pushing the negation down for extended tagless-final terms

FlatI.hs [2K]

FlatF.hs [3K] Flattening of additions, the initial and the final implementations

PushNegFI.hs [3K] The final-initial isomorphism, and its use for implementing arbitrary pattern-matching operations on tagless-final terms.

http://www.comlab.ox.ac.uk/ralf.hinze/SSGIP10/Slides.pdf

Ralf Hinze, in Part 7 of his Spring School course, has derived this 'initial-final' isomorphism rigorously, generally and elegantly from the point of view of Category Theory. In the first-order case, both 'initial' and 'final' are the left and the right views to the same Initial Algebra. The 'final' view is, in the first-order case, ordinary Church encoding.

Interpreters for higher-order languages

Higher-order languages are data types with binding, so to speak. In the first part, only the interpreters were typed; we could get away with our object language being unityped. Now, the object language itself becomes typed, bringing the interesting issues of interpreting a typed language in a type language ensuring type preservation. It is this part that explains the attributes 'typed' and 'tagless' in the title of the course.

Type-preserving embedding of higher-order, typed DSLs

Using simply-typed lambda-calculus with constants as a sample DSL, we demonstrate its various embeddings into Haskell. We aim at a type-preserving embedding and efficient and type-preserving evaluations. The tagless-final embedding not only achieves this goal, it also makes the type-preservation patently clear. Tagless-final evaluators are well-typed Haskell programs with no pattern-matching on variant types. It becomes impossible for the evaluators to get stuck. Since the type preservation of the evaluators is apparent not only to us but also to a Haskell compiler, the evaluators can be efficiently compiled. Tagless-final embeddings are also extensible, letting us add to the syntax of the DSL, preserving and reusing old interpreters.

IntroHOT.hs [3K] The illustration of problems of embedding a typedDSL into a typed metalanguage

Either the Universal type (and hence spurious partiality, type tags and inefficiency), or fancy type systems seem inevitable. The problem stems from algebraic data types' being too broad: they express not only well-typed DSL terms but also ill-typed ones.

Term.agda [2K] http://www.iis.sinica.edu.tw/~scm/2008/typed-lambda-calculus-interprete/

Shin-Cheng Mu: Typed Lambda-Calculus Interpreter in Agda. September 24, 2008

Shin-Cheng Mu solves the problem of the type-preserving tagless interpretation of simply-typed lambda-calculus, relying on dependent types and type functions in full glory.

IntroHOIF.hs [6K] Tagless-initial and Tagless-final evaluators

TTFdB.hs [7K] Typed, tagless, final, with de Bruijn indices: Expressing the type system of simply-typed lambda-calculus in Haskell98. No dependent types are necessary after all. The types of methods in the Symantics type class read like the axioms and inference rules of the implication fragment of minimal logic.

TTF.hs [7K] Typed, tagless, final, in the higher order abstract syntax (HOAS). We illustrate extending the DSL with more constants, types, and expression forms.

TTIF.hs [8K] Initial-final isomorphism, in the higher-order case

De-serialization and type-checking

Since we represent DSL expressions as well-typed Haskell terms, we can place DSL terms in Haskell code or enter at the GHCi prompt. However, we also want to interpret DSL expressions that are read from files or received from communication pipes. We no longer can then rely on GHC to convert DSL expressions from a text format into the typed embedding. We have to do the type-checking of DSL expressions ourselves. Our goal is to type-check an expression once, during de-serialization, and evaluate the result many times. Since a type checker needs to represent types and reason about type equality, we develop type representations and type safe cast. We regard the language of types, too, as a typed DSL, which we embed in Haskell in the tagless-final style.

TypeCheck.hs [12K]\ De-serialization: (Dynamic) Type Checking
In contrast to an earlier version of the type checker, we use de Bruijn indices and obtain a much clearer code. The code is quite similar to Baars and Swierstra's ''Typing Dynamic Typing'' (ICFP02). However, the result of our type-checking is an embedded DSL expression that can be interpreted many times and in many ways, rather than being merely evaluated. The set of possible interpretations is open. Also, our code is written to expose more properties of the type-checker for verification by the Haskell type-checker; for example, that closed source terms are de-serialized into closed target terms.

Typ.hs [8K] Type representation, equality and the type-safe generalized cast

We present an above-the-board version of Data.Typeable, in the tagless-final style. Our implementation uses no GHC internal operations, no questionable extensions, or even a hint of unsafe operations.

http://www.comlab.ox.ac.uk/projects/gip/school/tc.hs

Stephanie Weirich some time ago wrote a very similar type-checker, but in the initial style, using GADTs. The comparison with the tagless-final style here is illuminating.

Applications and Extensions

Tagless

Install / Use

README