Typeclasses from scratch

In this post, I want to introduce the concept of typeclasses (aka traits). Typeclasses are the primary way to do ad hoc polymorphism in languages like Rust and Haskell, and one option for it in Scala. They are comparable to interfaces in class-based object oriented languages, but have different tradeoffs.

Motivation

In Kotlin, both strings and normal lists are (more or less) immutable. In either case, we can “append” stuff with the plus function,¹plus is an operator function, so it’s also possible to just use + as an infix operator instead of calling it as a function. This has no bearing on anything else I say here. I use the explicit form as it might be slightly clearer. which returns a new item. For instance,

// just an alias to reduce the number of angle brackets later on
typealias SList = List<String>

val newList: SList = listOf("hello", " world").plus("!")
val newString: String = "hello, world".plus("!")

The two references to plus in this snippet refer to two different functions. They have the same name, but are defined on different classes and have different implementations. Their types are also different: one is an (SList, String) -> SList, and the other is a (String, String) -> String.

But they are not that different: both take something, add a string to it, and return the same thing they took in. In other words, they are both (T, String) -> T, with about the same behavior. It would be pretty reasonable to want to abstract over it. For instance, let’s try to write a function that appends a "!" to either a string, or a list of strings. Something like:

fun makeExcited(stringOrList: ???): ??? = stringOrList.plus("!")

Of course, this doesn’t work as-is, because there’s nothing we can write for ??? that will make makeExcited work for both strings and lists.

Interface

An object-oriented way to go about this could be to create a common interface:

interface StringContainer {
    fun plus(s: String): StringContainer
}

Then define wrappers which implement this interface:

class LstSC(val lst: SList) : StringContainer {
    override fun plus(s: String): LstSC = LstSC(lst.plus(s))
}

class StrSC(val str: String) : StringContainer {
    override fun plus(s: String): StrSC = StrSC(str.plus(s))
}

Finally we can define our helper function:

fun makeExcited(cont: StringContainer): StringContainer =
    cont.plus("!")

This sort of works, but it’s actually a little hard to use, because the result of makeExcited is not the same as what was passed in; it’s always a StringContainer. This means we need to (gasp!) use type casts (an as-expression) to get the original value back:

// doesn't work:
// makeExcited(LstSC(listOf("hi"))).lst

(makeExcited(LstSC(listOf("hi"))) as LstSC).lst

[hi, !]

Type casts are not that big of a deal in Kotlin, where all types are known at runtime, and a failed cast will just cause a ClassCastException. But in Rust or in Haskell, casts are very dangerous because the program has no way to check, at runtime, whether they are valid. An invalid cast will generally corrupt memory, crash the program, or do other unpleasant things. So we’d really like the type system to guess the right type for us!

We want to be able to write something like:

val excitedList: LstSC = makeExcited(LstSC(listOf("hi")))
val excitedString: StrSC = makeExcited(StrSC("hi"))

[The other issue, of course, is that defining these wrapper types, as well as wrapping and unwrapping, is really annoying. The typeclass pattern avoids this, but trades it for another annoyance, at least in languages with no built-in support for it.]

Generics

One other idea is to combine this interface with generics, which are now available in most OO languages, like C++, Java, and Kotlin. Generics are effectively our only option to express that the function’s return type is the same as its argument type. We could write something like this:

// doesn't work as-is
fun <T: StringContainer> makeExcited(cont: T): T =
    cont.plus("!")

But plus, which is still defined in the StringContainer interface, still returns a StringContainer, not a T, so this doesn’t work.

In fact, as long as we try to use interfaces, we’ll run into this issue. So what can we do instead? One idea that works well with generics, and meshes well with functional programming, is to just pass the particular implementation of plus to makeExcited:

fun <T> makeExcited(cont: T, plus: (T, String) -> T): T =
    plus(cont, "!")
    
"${makeExcited(listOf("Hello", "world"), SList::plus)} - ${makeExcited("Hello world", String::plus)}"

[Hello, world, !] - Hello world!

Ok, so we’ve mostly achieved our goal. We can now write functions that can apply to any type that has a plus function whose type (T, String) -> T. These types don’t even need to have a common interface defined.

And importantly, the concrete type that the function is applied to doesn’t disappear—we can return the same type we took in, which is is what lets us avoid casts.

The main issue with this is that it’s annoying to have to pass the plus function around. It’s sort of OK if you only have one, but if your interface is made up of 10 functions, you’d need to pass them all…

Typeclasses

The typeclass pattern attempts to solve that issue.

Let’s look at a function that needs a more complicated interface. makeInterrogative will still take a list or a string, and add a question mark if it’s non-empty, returning a default value otherwise. We could write it this way:

fun <T> makeInterrogative(
        cont: T, 
        plus: (T, String) -> T, 
        isEmpty: (T) -> Boolean, 
        default: T
    ): T =
    if (isEmpty(cont)) { default } else { plus(cont, "?") }

As we said, this gets annoying, because you need to pass in three functions everywhere. We could just bundle these up into a single object, and pass that one object instead of three functions. Let’s start by defining an interface:

interface StringFunctions<T> {
    fun plus(cont: T, s: String): T
    fun isEmpty(cont: T): Boolean
    val default: T
}

We then write singleton implementations, one for each type:

object SListSF : StringFunctions<SList> {
    override fun plus(cont: SList, s: String): SList = cont.plus(s)
    override fun isEmpty(cont: SList): Boolean = cont.isEmpty()
    override val default = listOf<String>()
}

object StringSF : StringFunctions<String> {
    override fun plus(cont: String, s: String): String = cont.plus(s)
    override fun isEmpty(cont: String): Boolean = cont.isEmpty()
    override val default = "[none]"
}

Nothing really prevents you from defining two alternative sets of functions for the same type, but the general idea is that you would have just one instance of StringFunctions<T> for each type T. This is what we did here.

And we can now define and call makeInterrogative:

fun <T> makeInterrogative(cont: T, sf: StringFunctions<T>): T =
    if (sf.isEmpty(cont)) { sf.default } else { sf.plus(cont, "?") }
    
"${makeInterrogative(listOf("Hello", "world"), SListSF)} - ${makeInterrogative("", StringSF)}"

[Hello, world, ?] - [none]

And that’s the typeclass pattern! In Haskell, the interface we last defined is called the class or typeclass, and each object is an instance (confusingly, this terminology is similar to OOP’s, but means somewhat different things).

In real languages

In languages like Scala and Haskell (and in Rust, although not exactly), the compiler also:

provides a special way to declare typeclasses,
ensures a single instance of the typeclass exists or is in scope for any given type
implicitly passes that one instance around to functions that need it (which makes typclasses no more cumbersome than standard interfaces).

For instance, in Haskell, we’d write:

{-# LANGUAGE FlexibleInstances #-}
class StringFunctions t where
    plus :: t -> String -> t
    isEmpty :: t -> Bool
    def :: t

instance StringFunctions [String] where
    plus start rest = start ++ [rest]
    isEmpty = null
    def = []
    
instance StringFunctions String where
    plus = (++)
    isEmpty = null
    def = "[none]"

makeInterrogative :: StringFunctions t => t -> t
makeInterrogative x = if isEmpty x then def else plus x "?"

Note how we don’t need to pass in any dictionaries manually.

Conclusion

To recap, the issue is this:

We have several types that share some functions (here, just plus). The functions must have the same signatures, except any number of the parameter types and the return type are allowed to be “holes”, i.e. replaced by another type.
We want to abstract over these shared methods.
We can’t use interfaces because we want to preserve more type information than they will let us.

Typeclasses let us solve that problem.

Of course, this problem mostly occurs with immutable data structures (like in Haskell); another reason to use typeclasses is that they mesh well with static dispatch, which can be made more performant than dynamic dispatch (this is why Rust uses typeclasses).

I don’t think this pattern is all that useful in Kotlin, but implementing can be a good intro to understanding how things work in languages with built-in support for them.

Terbium