SKI Calculus

1 September 2019

I recently implemented Relambda, an esolang interpreter based on SKI combinator calculus. This post explores SKI and its relationship to lambda calculus. A basic understanding of lambda calculus is useful.

SKI Combinator Calculus

SKI combinator calculus is a very simple Turing-complete computational system that, according to Wikipedia, is “not convenient for writing software”. SKI can be seen as a functional counterpart to classical Turing tarpits: it’s an even more minimal lambda calculus, much as Brainfuck is an even more minimal Turing machine.

In SKI, as in lambda calculus, all objects are functions: they can be applied to an argument, and they will return a value. Of course, since all objects are functions, the argument and return value have to be functions themselves. We call these functions combinators.

Applying combinator x to combinator y is denoted (xy). Application is left-associative, and some parentheses can be left out. For instance, xy=(xy), w(xy)z=((w(xy))z), and xy(zw)=((xy)(zw)).

SKI starts with only 3 combinators:

  • I (Identity): when applied, it returns its argument. For instance,
    • II = I
    • IK = K.
  • K (Kestrel, or Constant): when applied to a value x, it returns a function K_x that itself yields x when applied. For instance,
    • (KI)K = (K_I K) = I
    • I(KK)K = (I K_K)K = K_K K = K.
  • S (Starling): when applied to three values x, y, and z, returns Sxyz = xz(yz). For instance,
    • SKKS = KS(KS) = K_SK_S = S
    • SKSI = KI(SI) = K_I(SI) = I

In the last expression above, I didn’t have to evaluate SI because, whatever the result, K_I ignores it and returns I. However, SI is a combinator in its own right, which we’ll denote S_I; in general, Sx = S_x. When applied to another value y, it returns S_xy = S_{x,y}. When that combinator is finally applied, we get the final result: S_{x,y}z = xz(yz). You may know this as currying (we’ve talked about it in the past), and it’s common in functional languages.

So S doesn’t do much when applied once: it just stores its argument. If the result is applied again, it adds the second argument to storage. On the third application, it’s finally ready to perform the Starling transformation. So the rule we gave above for S is really:

Sxyz = ((Sx)y)z = (S_x y)z = S_{x,y}z = xz(yz).

This also reassuringly proves that our rule for S is compatible with our application and parenthesization rules.
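
To make these rules concrete, here is a small reducer sketch in Haskell (entirely my own illustration, under hypothetical names; it is not taken from Relambda). Notice that each rule only fires once its combinator has collected enough arguments, which is exactly the currying behaviour described above.

    -- An SKI term is one of the three combinators, or an application.
    data SKI = S | K | I | App SKI SKI
      deriving (Eq, Show)

    -- One leftmost-outermost reduction step, if any rule applies.
    step :: SKI -> Maybe SKI
    step (App I x)                 = Just x                          -- Ix   = x
    step (App (App K x) _)         = Just x                          -- Kxy  = x
    step (App (App (App S x) y) z) = Just (App (App x z) (App y z))  -- Sxyz = xz(yz)
    step (App f x) = case step f of                                  -- otherwise, reduce a subterm
      Just f' -> Just (App f' x)
      Nothing -> App f <$> step x
    step _ = Nothing

    -- Keep reducing until no rule applies (this need not terminate for every term).
    normalize :: SKI -> SKI
    normalize e = maybe e normalize (step e)

    -- For instance, SKSI normalizes to I, as computed by hand above:
    --   normalize (App (App (App S K) S) I)  ==  I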

We’ll explain in a moment why S, K and I make sense as our base combinators.

I isn’t actually primitive

A quick interlude: can we write I in terms of K and S? That is, we want to get a combinator I’ such that I’x = x for all x.

Well, given any value x, we know that KxK = K_xK = x. (Note that we could use any combinator instead of the rightmost K.) But KxK doesn’t work as an identity combinator, because we have to plop the value right in the middle of it! Application only works by putting values to the right of the applied combinator.

But S lets us reorganize elements in an application. We know that Sxyz = xz(yz); notice how we now have a z in the middle of the expression. Substituting x=K, we get SKyz = Kz(yz) = K_z(yz) = z. So, for any y, SKy = I. In particular we have SKK = SKS = I, so we could do without the I combinator in our base set and create it from S and K. Everything would just be a bit lengthier to write.

SKI and Lambda Calculus

We can clearly do some stuff with SKI, but it’s intuitively unclear how much. As I’ve said above, SKI is Turing complete. But how can we perform more complicated operations? It turns out SKI is pretty easy to map to lambda calculus.

Let’s quickly recap the key elements of lambda calculus:

  • Expressions are made of variables (e.g., x), abstractions (e.g., λy. y), and applications (e.g., xy). Abstractions behave like lambda functions in any functional language (in fact, that’s where lambda functions got their name).
  • There are 3 transformations: α-, β-, and η-reduction. The most important is β-reduction, which tells us how to process an application: it says we can transform ((λx. A) B) into A’, where A’ is obtained by replacing x with B everywhere it occurs in A. We write A’ = A[x → B] (sketched in code just below). α-conversion deals with renaming bound variables to avoid name conflicts, and η-reduction with extensionality; both are of little interest for SKI.
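
Here is a tiny sketch of β-reduction in Haskell (the names Lam, subst and beta are mine, purely for illustration). Substitution is deliberately naive: it assumes bound and free variable names never clash, which is exactly the bookkeeping α-conversion would otherwise handle.

    -- A lambda term is a variable, an abstraction, or an application.
    data Lam = Var String | Abs String Lam | Ap Lam Lam
      deriving Show

    -- subst x b a  computes a[x → b]: replace free occurrences of x in a with b.
    subst :: String -> Lam -> Lam -> Lam
    subst x b (Var v)   | v == x    = b
                        | otherwise = Var v
    subst x b (Abs v a) | v == x    = Abs v a              -- x is shadowed here, stop
                        | otherwise = Abs v (subst x b a)
    subst x b (Ap f a)  = Ap (subst x b f) (subst x b a)

    -- One β-step at the root: (λx. A) B becomes A[x → B].
    beta :: Lam -> Lam
    beta (Ap (Abs x a) b) = subst x b a
    beta t                = t

    -- For instance, beta (Ap (Abs "y" (Var "y")) (Var "z")) gives Var "z".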

The fact that SKI and lambda calculus are equivalent may seem strange at first: SKI does not have abstractions, even though abstractions are precisely what lets us build the loops that make lambda calculus Turing complete, including the cool Y combinator.

SKI → Lambda Calculus

To start with, given an SKI expression, how can we turn it into a lambda calculus expression? If we can achieve this, we will have proven… Absolutely nothing (actually, this proves that lambda calculus is at least as powerful as SKI).

Converting the base operators is straightforward:

  • I becomes, obviously enough, ι := λx. x, an abstraction that just returns its argument.
  • K becomes κ := λx. λy. x: keep the first argument, ignore the second.
  • S becomes σ := λx. λy. λz. xz(yz), which does its Starling thing.

Now if we just turn SKI applications into lambda calculus applications, we have fully mapped SKI into lambda calculus (and in a very rigorous way at that /s).
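
As a sketch, the whole mapping fits in a short function (ski2lam and the constructor names are mine, reusing the shapes of the earlier sketches):

    data SKI = S | K | I | App SKI SKI

    data Lam = Var String | Abs String Lam | Ap Lam Lam
      deriving Show

    ski2lam :: SKI -> Lam
    ski2lam I = Abs "x" (Var "x")                       -- ι = λx. x
    ski2lam K = Abs "x" (Abs "y" (Var "x"))             -- κ = λx. λy. x
    ski2lam S = Abs "x" (Abs "y" (Abs "z"               -- σ = λx. λy. λz. xz(yz)
                  (Ap (Ap (Var "x") (Var "z")) (Ap (Var "y") (Var "z")))))
    ski2lam (App a b) = Ap (ski2lam a) (ski2lam b)      -- applications map to applications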

Example

Ok, let’s try mapping an expression into lambda calculus. Take SKI: it becomes σκι, which β-reduces step by step as

σκι = ((λx. λy. λz. xz(yz)) κ) ι → (λy. λz. κz(yz)) ι → λz. κz(ιz) → λz. (λy. z)(ιz) → λz. z,

which is very good, because we knew that SKI = I, and I maps to λx. x (the name of the bound variable doesn’t matter). This derivation is a little annoying, but really not difficult if you’re careful.

Lambda Calculus → SKI

Now for the harder part. We take a lambda calculus expression, and we want to turn it into SKI. We will still map lambda calculus applications into SKI applications, so that part is easy. The crux is mapping abstraction.

Assume we have an abstraction f = λx. A. The body, A, can be one of four things:

  1. x. In this case, f = ι = λx. x.
  2. e, where e is a variable distinct from x. In this case, f = κe.
  3. an application, e_1 e_2, where e_1 and e_2 are arbitrary expressions. In this case, f = σ(λx. e_1)(λx. e_2). Indeed, for any y, fy = A[x → y] = e_1[x → y] e_2[x → y] = ((λx. e_1)y)((λx. e_2)y) = σ(λx. e_1)(λx. e_2)y. We then apply these rules recursively to e_1 and e_2.

  4. an abstraction, λy. A’. We can recursively apply this set of rules to this inner abstraction to eliminate it, then continue applying our rules on the result.

Assuming the original lambda expression had no free variables, applying this procedure recursively lets us rewrite f in terms of ι, κ and σ only. At that point, we can simply replace ι with I, κ with K, and σ with S to obtain an equivalent SKI expression.
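
Here is the whole abstraction-elimination procedure as a Haskell sketch (Term, convert, elim and pretty are hypothetical names of mine). Terms mix lambda syntax with the target combinators, since intermediate results such as κs contain both:

    -- Leaves are variables or the three combinators; L is an abstraction, A an application.
    data Term = V String | L String Term | A Term Term | TS | TK | TI

    -- Eliminate every abstraction, innermost first (rule 4).
    convert :: Term -> Term
    convert (A a b) = A (convert a) (convert b)   -- applications map to applications
    convert (L x b) = elim x (convert b)
    convert t       = t                           -- variables and S/K/I stay as they are

    -- elim x body  builds the abstraction-free equivalent of λx. body
    -- (body is assumed to contain no abstractions any more).
    elim :: String -> Term -> Term
    elim x (V v) | v == x = TI                              -- rule 1: λx. x  becomes I
    elim x (A a b)        = A (A TS (elim x a)) (elim x b)  -- rule 3: λx. ab becomes S(λx. a)(λx. b)
    elim _ t              = A TK t                          -- rule 2: a body without x becomes K body

    -- Print with the usual left-associative convention.
    pretty :: Term -> String
    pretty TS      = "S"
    pretty TK      = "K"
    pretty TI      = "I"
    pretty (V v)   = v
    pretty (L x b) = "λ" ++ x ++ ". " ++ pretty b
    pretty (A a b) = pretty a ++ wrap b
      where wrap t@(A _ _) = "(" ++ pretty t ++ ")"
            wrap t         = pretty t

    -- For instance, λx. λy. x converts to S(KK)I (which behaves like K, as noted below):
    --   pretty (convert (L "x" (L "y" (V "x"))))  ==  "S(KK)I"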

So we see why the three combinators of SKI are S, K and I: these correspond exactly to the three structures we can find when eliminating abstractions from lambda calculus.

Example

Let’s pick a more interesting example: the lambda function u defined as

u := λs. λx. (ss).

It’s interesting because uu = (λs. λx. (ss))u = λx. (uu), so for any y,

(uu)y = (λx. (uu))y = uu.

So (uu) is an abstraction that always returns itself when applied. Let’s turn it into a combinator that does that!

  1. The body of u is v := λx. (ss), so we start by applying the rules to this inner abstraction (rule 4):
    1. The body of v is (ss), which is an application. Using rule 3, we turn it into σ(λx. s)(λx. s). We now need to apply our rules recursively to the subexpressions.
    2. The first subexpression is λx. s, whose body is a variable other than x. We replace it using rule 2, and get κs.
    3. The other subexpression is also λx. s. In the end, v = σ(κs)(κs).
  2. Back to u. We have u = λs. ((σ(κs))(κs)) (I’ve added parentheses back for clarity). The body is an application! Apply rule 3 and we get u = σ(λs. (σ(κs)))(λs. κs).
  3. The first subexpression is w = λs. (σ(κs)). Another application, rule 3 again, we get w = σ(λs. σ)(λs. (κs)).
  4. Applying rule 2, we get w = σ(κσ)(λs. (κs)).
  5. The second argument, λs. κs, becomes σ(κκ)ι, through rules 3, 2, and 1. This gives us w = σ(κσ)(σ(κκ)ι).
  6. The second subexpression of u is w’ = λs. κs. This turns into w’ = σ(κκ)ι just as in step 5.
  7. We finally get u = σ w w’ = σ(σ(κσ)(σ(κκ)ι))(σ(κκ)ι).
  8. This turns into U = S(S(KS)(S(KK)I))(S(KK)I).

Again, this is tedious but completely mechanical! We can also note that S(KK)I behaves exactly like K (apply it to any x and you get K_x back), which further simplifies U into S(S(KS)K)K, a nicely symmetrical expression.
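
As a sanity check, the abstraction-elimination sketch from earlier reproduces this result mechanically (using that sketch’s hypothetical names):

    pretty (convert (L "s" (L "x" (A (V "s") (V "s")))))   -- yields "S(S(KS)(S(KK)I))(S(KK)I)"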

Conclusion

The main takeaway here is that SKI is functionally equivalent to lambda calculus, at the cost of a potentially expensive transformation, but with much simpler rules; in particular, we do away completely with α- and η-transformations.
