Despite my recent fascination with space manufacturing, I’m still focusing on my programming language, which is tentatively called SYMBL. In November and December, I wrote a bootstrap compiler in CoffeeScript, and replaced the bootstrap compiler with a self-hosted compiler. Since then, I’ve been working toward adding static bounds checking.
A lot of my effort has gone into transforming the high-level language syntax into a minimal core language. After parsing, there is a distiller stage that reduces a syntax tree with 20 different types of non-literal nodes to a tree with 10 different types of non-literal nodes. The distiller turns most of the complex language features into the simplified core language:
- Operators are translated into applications of intrinsic functions.
- Strings with subexpressions are translated to applications of the intrinsic string concatenation function.
- Comprehensions are translated to applications of a map function.
- Objects are translated to records (simple objects that just define a mapping from name to value).
- Loops are translated to tail-recursive functions.
The knitter follows the distiller, and turns the core syntax tree into a graph by replacing references to local definitions with direct references to the subtrees denoting the definition’s value, producing cycles for recursive definitions. The resulting graph has only 9 different types of non-literal nodes.
The result is a term graph which has very simple language-independent semantics. Most of the language’s semantics are defined by how the distiller and knitter translate its syntax into the semantics of the term graph, combined with the simple evaluation rules for the term graph.
The distiller and knitter were initially a single stage, but I found that splitting them into two stages produced simpler code: two 250 line modules can be much simpler than a 500 line module! And while this change didn’t reduce the amount of code in the stages themselves, the tests were simplified dramatically.
The distiller/knitter split also matches this pattern of splitting a stage into a homomorphic (structure-preserving) part and a non-homomorphic part. The distiller is a homomorphic map from the syntax tree to the core syntax tree. I’m using homomorphic loosely here: the distiller is not truly homomorphic because repeat and array size shortcuts have a core syntax translation that is dependent on their context. But most of the syntax nodes are translated by the distiller independent of context! This context-independence leads to simple code, and much simpler tests.
There’s a point where simpler stages aren’t worth the additional complexity of more stages, but I think these two additional stages were a good trade-off.