Lisp Macro Programming

Macros

Macros are the heart and soul of meta-programming in Lisp. They are snippets of code that write or modify either themselves or other parts of the source code, allowing you to create DSLs or domain specific languages, your own special programming dialect. They work by receiving source code itself as their arguments and then transforming that source code accordingly, a process known as macro expansion.

Macros exist in other programming languages such as Dylan, Scala, and Nemerle but are most commonly used in the Lisp family (Common Lisp, Racket, Scheme, etc). This is due to the homoiconic nature of Lisp, which enable the existence of a simple yet comprehensive Macro system.

This is a short example of the potential use of Macros. In Lisp we use the the Polish or prefix notation for code evaluation, this includes mathematical operations.

But what if we wanted to use infix notation (infix notation is the standard way we write equations; "1 + 2", "8 * 7", and so forth)? How would we go about making Lisp do that?

Enter the Macro:

This snippet is telling the Lisp interpreter that whenever it sees the symbol infix-math where the first and last variables are numbers, and the middle variable is a valid mathematical operator, to put the operator at the beginning and evaluate the expression.

Let's see this in action.

As you can see the expressions were entered using infix notation, and Lisp evaluated them correctly. The last example showed what would happen if the operator was not valid.

Now let's take a look at how our macro is being transformed once it is evaluated. We can use the built-in function macroexpand for this purpose.

Take a look at the return value, what did the interpreter do? It basically took the arguments and rearranged them into prefix notation before evaluating them. This is what we told it to do every time it encountered infix-math.

$$\require{color}$$ $(defmacro \ infix-math (x \ op \ y) \\ (if (member op '(+ * - /)) \\ \color{red}\grave(,op \ ,x \ ,y)))$

The arguments are entered in the order of x operator y, and reorders them into operator x y.

$$\ast$$ Macros are expanded during compile-time and not run-time (that is, before the sourced code is evaluated)

Variable Capture

As you can see, defining a macro is almost exactly the same as defining a function. There is something to watch out for however, something called variable capture. This happens when the definition of a macro uses a certain variable, say $$x$$, and the function that calls that specific macro also uses the variable $$x$$.

This is a very simple and immature macro that takes a list and assigns the variables head and tail to the first and rest of the elements if the list exists. There are also expressions to be evaluated in the case the list is valid or not.

Sample Usage:

As you can see, when passed a valid list, our split macro evaluates the first expression and variables head and tail are tied to the car and cdr of the list. When it is passed an empty list, the macro evaluates the second expression.

So what's the problem with this definition? Well, what about if you call split inside a lambda function (like let) and have a local variable $$x$$ defined?

Uh Oh! We defined a local variable $$x$$ and assigned it the value of 100, so why is our function printing (2 3) when we tell it to print $$x$$?

Let's try a macroexpansion to see what the problem is.

If you remember our definition of split, we assigned the value of lst the variable $$x$$ , overwriting the $$x$$ variable of the outer let function. This is known as variable capture.

A simple solution would be to use a really long and hopefully unique name in our macro definition to avoid this problem like thisisauniquevariablename.

(gensym)

(gensym) returns a unique symbol everytime it's called. Let's see it in action.

We can use this to avoid the variable capture that happened in our previous example.

We bind the variable $$x$$ to a symbol generated by (gensym). This is done outside the backtick which means it will be run during compile time. Then we use the value of $$x$$ (the gensym symbol) as a variable name and assign the value of the lst argument to it. This part is run during run-time, as evidenced by the backtick.

Now everytime we want to reference the value of lst, we can use ,x which will call the gensym-generated symbol.

If expanded, this would be equivalent to:

Now to see if we have resolved the variable capture issue.

$$x$$ is now free to be used in the outer functions.

$$\ast$$ split macro and variable capture example are taken from Barski's Land of Lisp.

Best Practices?

One common mistake with beginners writing macros is that they usually have the habit of using them where they shouldn't be used. We can write, for example, an add-elements function that adds all the elements in a list together like so, using our split function:

However we could also just do this:

Besides being shorter and more concise, we didn't have to rely on any custom-defined macro. This makes it easier to maintain, especially if there are other members in the team or if the code will be passed on to another programmer in the future.

Another thing that may happen, this is on how to write macros themselves, is that sometimes within the macro source variables are evaluated more than once.

For example, let's write a macro called if-var. if-var returns the value of the argument passed to it or a nil if it is a nil (think empty list, kind of similar to (nullp)):

Now to see what the possible problem here is, we'll pass it a progn form:

The argument was evaluated twice (we can see that because the string "evaluated", which is part of the argument, was printed twice)! Why did this happen?

If you look back on the macro that was just defined, you can see that ,var appears twice in the code.

$$\require{color}$$ $\boldsymbol{(defmacro \ if-var (var) \\ \grave(if \color{red},var \\ \quad \color{red},var \\ \quad nil))}$

To avoid this, we have to write the code in a way that it only evaluates once.

Now let's rerun it with the same argument:

So here we can see that it only evaluated once! This is because ,var only makes one appearance in the code, it's value is immediately stored in a local variable which is used the rest of the time.

To sum it up

Macros are incredibly powerful and make possible what would otherwise not be possible, or extremely difficult. However there is a tendency to abuse them, to use them where there is no need. A nice rule of thumb is to use them not when it would be easier for the coder, but when the alternative would otherwise be incredibly complex or even outright impossible.