Why Lisp?

I’ve been fascinated by Lisp ever since I first discovered it and it “clicked” for me.

(defun hello (world)
    (format t "Hello, ~a!~$" world""))

When I talk about Lisp, I get animated. It unnerves a lot of people, I think. I can see it in their eyes, that slight sense of unease. Is this guy okay? I mean, it’s just a freakin’ programming language!

That’s where I’d probably differ. It’s not just a programming language. It’s a tool for structured, unstructured thought. To understand what Lisp is, we first have to look at all the things people hate about it.

All Those Parentheses

The first thing people notice about Lisp is that it has an awful lot of parentheses. To programmers more accustomed to Algol-like languages (C, Java, Perl, Javascript and the like), all this punctuation seems unnecessary and off-putting.

But parentheses are, for most people writing most programs, the end of the punctuation. Lisp doesn’t have semicolons. It doesn’t really have mathematical symbols — at least not in the sense that you have to remember special rules for how to use them. It doesn’t have block bracing punctuation either.

Just parentheses.

Some Lisp dialects inject new types of parentheses; Clojure has three distinct sets – (), [], and {}! While these provide some visual flavor, they don’t provide anything new, structurally speaking, over plain old round parentheses.

Some (now extinct?) dialects of Lisp had the MEGA END BRACKET OF BRACKETY COMPLETENESS, also known by the more unassuming moniker of “]”. When encountering a lone close-square-bracket, these Lisp compilers would dutifully close up all of the open parentheses. This was believed to help beleaguered programmers from having to finish everything that they start.

(defun should-equal (msg got want)
  (should msg (equal got want))
  (cond ((not (equal got want))
         (diag "got    ~S~%" got"")
         (diag "wanted ~S~%" want] ; i.e. "))))"

The parentheses don’t distract me. They are not a nuisance. They are the whole point.

You see, Lisp lacks syntax. All those parentheses are there to shore up the structure of the program, since we don’t have a syntax to do it. That probably makes no sense, and that’s okay. Let’s look at a language that does have syntax: Perl.

# here's some Perl
if ($a > $b) {
  print "a!";
} else {
    print "b!";
}

This program, while too trivial to be useful, serves our purposes beautifully. It compares to variables (presumably, numerically) and prints out the name of the variable that is the bigger of the two.

The perl compiler doesn’t actually deal with this syntax; as quickly as it can, it converts the source program text into an abstract syntax tree. Ours looks like something like this:

Diagram of the abstract syntax tree for the above conditional program.

As the program is compiled (or interpreted, in Perl’s case), this abstract syntax tree is expanded, collapsed, coalesced, and cajoled as various bits of the programming language’s implementation are brought to bear on the problem: what the heck did the programmer mean?

In Lisp, we would write the following:

(if (> a b)
    (print "a!")
    (print "b!"))

This is literally just a serialization of the abstract syntax tree!

That’s why I keep saying that Lisp doesn’t have syntax (but it does have parentheses!)

Prefix Notation

Most Algies like to use infix notation for mathematical operators ($$2 + 2 = 4$$, just like you were taught in school), a mix of prefix and postfix notation for increment/decrement operations (++n++ anyone?), prefix for address shenanigans (&val, *ptr), and a weird infix+prefix notation for function calls.

Lisp insists on prefix notation for all of this. This gives lots of people the willies.

Admittedly, prefix notation does take some getting used to. Here’s some exercises to get you going.

    1 + 2 → (+ 1 2)
      ++n → (1+ n)
sqrt (25) → (sqrt 25)

Lisp’s use of this somewhat uncomfortable notation is intentional. In Lisp, everything is a function (or a macro, but we’ll get to those in a moment). That includes the arithmetic addition operation.

More interestingly, all functions in Lisp can be applied, with their arguments assembled at runtime by the program doing the calling.

(defun range (a b)
  (loop for i from a to b collect i))

(apply #'+ (range 1 100))

This is easy enough to implement in C:

int sum_range (int from, int to) {
  int i, sum = 0;
  for (i = from; i < to; i++) {
    sum += i;
  }
  return sum;
}

But how do you do that to just the odd numbers? In Lisp, we can filter the arguments before applying them to the + operation:

(apply #'+ (remove-if #'evenp (range 1 100)))

But in C, we have to build a whole new function!

int sum_odd_range(int from, int to) {
  int i, sum = 0;
  for (i = from; i < to; i++) {
    if (i % 2 == 1) {
      sum += i;
    }
  }
}

And what if we want to sum the squares of the odd numbers in our range?

(defun sq (n)
  (* n n))

(apply #'+              ; add up the...
  (mapcar #'sq          ; squares of...
    (remove-if #'evenp  ; odd numbers...
      (range 1 100))))  ; from 1 to 100

I’m not even going to bother with the C implementation, because I’m about to throw you a curve ball – in Lisp we can do this without muddying up the global namespace with a function to multiply two numbers together.

Behold, the lambda!

(apply #'+                     ; add up the...
  (mapcar (lambda n) (* n n))  ; squares of...
    (remove-if #'evenp         ; odd numbers...
      (range 1 100)))          ; from 1 to 100

You can’t do that in C. You can do it in Go, but it’s a bit less elegant:

func sum(from, to int, func keep(int) bool) int {
  sum := 0
  for i := from; i <= to; i++ {
    if keep(from+i) {
      sum += from+i
    }
  }
  return sum
}

sum(1, 100, func (x int) bool {
  return x % 2 == 1
})

You’re still having to build an edifice (the sum() function) around the fact that Go’s arithmetic addition operator is special – it exists inside the language proper, because of its special syntax.

Macros Hurt My Brain

Everybody know what macros are, right?

#define for_range(i,from,to) for (i = from; i <= to; i++)

The C pre-processor implements a rudimentary (and that word is doing a lot of work here) macro system that supports inline replacement and some token pasting.

You see, in C, when you use a “constant” like AF_INET, the C pre-processor replaces all occurrences of the token “AF_INET” with the numeric value that the constant represents. If you dump the symbols in a compiled C program, you’ll see function names and global variables, but you’ll not see a single, solitary #define’d constant. They evaporate.

Because cpp is just doing search-and-replace against the C program’s source code, you can get pretty tricky with C “macros”. I once wrote a list implementation that relied heavily on pre-processing to introduce new syntax for list iteration:

void do_all_the_things(struct list *things) {
  char *thing;
  for_each(thing, things) {
    printf("A thing: %s\n", thing);
  }
}

Here’s the magic pre-processor define for for_each(x,y):

#define for_each(var,lst) \
  for ((var) = list_item(lst); \
       list_has_next(lst); \
       (lst) = list_next(lst), \
       (var) = list_item(lst))

This only works because the other list* functions are specifically tuned to this usage, and the C pre-processor drops our (legitimate) for loop preamble into the program, right where our (totally illegal syntax) foreach loop preamble was.

People say they don’t like macros, or that they are complicated, or too meta, but then they use them every day. Ctrl-C is a macro. It’s a thing you type on the keyboard, that doesn’t type the thing you typed on the keyboard into the input field. That’s a macro. It’s meta by definition.

Lisp is the only language I’m aware of that has real macros, because Lisp is the only language I’m aware of where the representation of the programs and the representations of the data those program operate on is EXACTLY THE SAME.

Lisp data is often organized into lists of symbols.

Lisp code is always organized into lists of symbols.

That means that Lisp data, operated on by a Lisp program, can produce a second, different Lisp program.

This is the heart of what macros are: a way of programming the compiler.

Here’s a thing you can do in Lisp that you cannot do in any other language: add new (non-)syntax:

(defmacro backwards-and-2 (a b)
  `(and ,b ,a))

With that macro defined, whenever I write the code

(backwards-and-2
  (if-second-thing)
  (if-first-thing))

The macro expander will rewrite that, swapping the expressions and passing them both to the and form:

(and (if-first-thing)
     (if-second-thing))

The Rust community has taken to calling these “zero-cost abstractions” because they evaporate before the code generation phase starts. As far as everyone else is concerned, we ALWAYS wrote the and's the right way round.

The real power in Lisp macros comes when you realize that the macro expander operates with the full power of Lisp itself behind it. That means we can write macros that do all sorts of deep magic on the expressions we wrap them around, before spitting out code that would be error-prone for us to write ourselves.

Consider:

(defun is-html? (tag)
  (or (eq tag 'a)      ; ... etc.
      (eq tag 'abbr)
      (eq tag 'address))))

Wouldn’t it be nicer to say what the HTML tags are, and mechanically synthesize the is-html? predicate?

(defmacro html-tags (tags)
  (let ((eqs (mapcar #'(lambda (tag)
                         `(eq tag ',tag)) tags)))
    `(defun is-html? (tag)
       (or ,@eqs)))

Now, if we call (html-tags ...) with all of the known HTML tags, we get (is-html? ...) for free, several times over!

(html-tags (list
  'a 'abbr 'address)) ; ... etc.

It’s way more compact, but it is equivalent (at runtime) to the hand-coded version. The key difference is that I can add a new HTML element by adding its symbol to the argument list.

We don’t even need to specify the arguments literally! We could read the list of HTML tags from a local file if we wanted!

(defun read-all (path)
  (with-open-file (in path)
    (loop for form = (read in nil nil)
          while form collect form))))
(html-tags (read-all "html.tags"))

Now, if we discover a new HTML element, we pop its tag onto the end of the html.tags file, recompile our program and we magically have new code!

Isn’t lack of syntax grand?

James (@iamjameshunt) works on the Internet, spends his weekends developing new and interesting bits of software and his nights trying to make sense of research papers.

Currently exploring Kubernetes, as both a floor wax and a dessert topping.