The idea of a programming language being able to implement itself is fascinating. It stirs an intense feeling of curiosity: "What would that even look like?" Since its inception in the early '60s, Lisp has managed to do exactly that.
John McCarthy, In the early '60s, unlocked a collection of remarkable ideas that work really well together and continue to be relevant decades later. First through the Lisp paper, and shortly thereafter with the Lisp 1.5 manual.
One such idea is homoiconicity, a trait where code and data are interchangeable. Typically, we think of code as a sequence of operations acting upon data. This understanding shapes our view of most programming languages today. However, Lisp breaks this mold by treating both code and data as the same—what's famously known as its homoiconic nature. This distinct characteristic effectively blurs the lines between the operator (code) and the operated (data).
This unification of code and data in Lisp is profound, it allows a level of expression where the language can be naturally expressed in itself.
This tiny card is the entire programming language of Lisp! A programming language written in itself. Here is famous quote from Alan Kay about this piece of coding history:
that was the big revelation to me … when I finally understood that the half page of code on the bottom of page 13 of the Lisp 1.5 manual was Lisp in itself. These were "Maxwell's Equations of Software!" This is the whole world of programming in a few lines that I can put my hand over.
Okay, but what sort of magic did Mr. Kay see in this artifact that led him to call it the "Maxwell's Equations of Software"? How was the entire world of programming encapsulated in just a few lines of code?
One fun way to answer that, of course, is by applying the principle that: "In order to understand something, you need to code it".
And to keep this spiritual implementation of the original Lisp fun and fresh, I decided to pick Python as the tool of choice. Most programmers are either not familiar or not comfortable with Lisp's syntax (too many parentheses), but probably quite familiar with Python's syntax. My idea is to rewrite the "Lisp in Lisp" code in Python and try to maintain as much of the spirit of the old code as possible.
Strange yet familiar
Lisp originally, and quite brilliantly, came with two syntactical flavors. A code flavor named M-expression (short for meta) and a data flavor named S-expression (short for symbolic). They're semantically equivalent.
The "Lisp in Lisp" code presented earlier is written as an M-expression (code flavor) and implements an S-expression Lisp (data flavor).
One little trick to get us going is by translating Lisp M-expressions into Python code constructs, such as function calls and conditional statements. Additionally, we can represent Lisp S-expressions using Python lists. Lisp is short for "List Processing" because it only uses one data structure: the list. Therefore, it makes perfect sense to use Python lists to emulate Lisp S-expressions. Below, I present our mini dictionary that would act as our rosetta stone:
These are four ways of expressing one thing. They're all semantically equivalent.
An additional advantage of this mapping approach is that it eliminates the need to implement a parser for our language. This simplifies our code base and allows us to stay true to the original spirit of Lisp without getting distracted by string manipulations.
First iteration
With this context and motivation in place, we can move on to the actual implementation. Lisp requires a set of basic functions to be implemented outside of its scope, they're mostly about the basic building blocks of lists.
Equivalence
atom(x): is x a list?
eq(x,y): is x equal y?
Cutting
car(x): first element of the list
cdr(x): the rest of the list
Stitching
cons(x,y): append an atom to a list
append(x,y): append two lists together
Ignoring a few recursive primitives, and with a little help from Llama3-70b (on Groq), we can quickly get a working interpreter for a subset of the "Lisp in Lisp" code:
Here are a few examples you can play with:
Full code available on github gists
Second iteration
Our first iteration is nearly complete, except for one crucial feature: lambdas. Lambdas are anonymous functions that serve as the primary method for defining and calling functions in Lisp. Without lambdas in Lisp, we cannot implement recursion, and without recursion, our Lisp would not be Turing complete (the minimum threshold to compute all which can be computed).
To incorporate lambdas, we need to add a few functions we previously overlooked, specifically two primitives: assoc(x,y) and pairlis(x,y). assoc(x,y) is a key/value dictionary style lookup but implemented with lists instead (associative lists). parlis is just the zip(x,y) in Python (ziping two lists together).
A literal translation in Python would be:
The original Lisp had to resort to recursion (function calling itself) even for simple, linear scans, since it had no loops. However, assoc and pairlis can be elegantly translated to Python using list comprehensions:
and if you hadn't noticed already I actually cheated a bit in the COND case as this in the original lisp was a evcon which was also translated into a loop. The same trick will be done again with evlis for the LAMBDA case.
We're almost there! There is one last thing, and we're done. The eval function actually takes two arguments in the original Lisp. The first is the expression (s-exp) of course. The second is the environment, which is yet another list (of key/values). Environment maintains variable binding for the LAMBDA case by mapping variable names to their corresponding values. Â
For example, when you define a function with x variable then substitute that function with data, the data gets binded (with pairlis) to the x symbol and is then stored/appended to the environment list. When x is needed, it gets looked up (with assoc) and subtitued back to expression. The specific technique for this binding is known as dynamic scoping.
Here is the full 'Lisp in Lisp' code in Python:
Full code available on github gists
Find me on Twitter @aburjg
Helpful resources:
https://inst.eecs.berkeley.edu/~cs61a/fa14/assets/interpreter/scheme.html
http://languagelog.ldc.upenn.edu/myl/llog/jmc.pdf
https://justine.lol/sectorlisp2
https://hylang.org/ is a mature implementation of this idea.