Building finite types

4 November 2018

This post is about interesting properties of types in programming languages. It assumes you know about basic functional strongly-typed programming. A bit of Haskell syntax can be helpful but isn’t required.

Building finite types

Imagine a Haskell-like language that only had two primitive types, Void and Unit. Void is a type with no elements, whereas Unit is a type with a single element, ()1. That’s a little light for serious programming. Most programs, for instance, will want a Boolean type.

So let’s give ourselves a little tooling to climb out of the hole. Assume x and y are any two types. Then Either x y is a type whose values are all values of the form Left <X>, where <X> is a value of type x, and Left is just a keyword/tag, and the elements of the form Right <Y>, where <Y> is a value of type y. For instance:

  • Some elements of the type Either Int Bool are the values Left -4, Right False, Right True and Left 2.
  • Some elements of the type Either Float Float are the values Left 0.5, Left 1.0 and Right 1.0 (the latter two being different).

Remember that we don’t have Int or Bool at our disposal! I’m just using them here to hopefully make the examples clearer.

Although it doesn’t change the expressiveness of what we’re describing, we’ll also allow the structure type MyType = <some type definition> which allows us to create a type alias for brevity.

With this out of the way, we can now make a boolean type:

type Boolean = Either Unit Unit
let true = Left ()
let false = Right ()

Here’s the not function for this type:

not :: Boolean -> Boolean
not Left () = Right ()
not Right () = Left ()

In fact, we can generalize this to create types with any finite amount of values (that is, enum types). We simply need to nest Eithers. Here’s how you can create a three-state enum of Queued, Working and Completed (which might represent the state of a task in a work queue):

type StatusEnum = Either Unit (Either Unit Unit)
let Queued = Left ()
let Working = Right (Left ())
let Completed = Right (Right ())

(Of course, none of this is very practical: used in a real system, it’d be as painfully slow to write as it would be to run.)

Notice that infinite types are out of our reach, unless we allow ourselves to write down infinite definitions. Of course, since we’d like our programs, theoretical as they may be, to terminate, we won’t allow this.

One thing functional programmers like, besides writing tutorial on monads, is to create pair types. For instance, they might want to define the type of pairs of booleans and statuses; this could represent the combination of whether a task is high-priority or not and what its current status is.

We can’t directly define pairs with our Either, but fortunately, there’s just a finite number of possible pairs of booleans and statuses! These are (True, Queued), (True, Working), (True, Completed), (False, Queued), (False, Working), (False, Completed), which we can just represent with a 6-state enum.

Counting finite types

By the way, could we have predicted the pair of Boolean and StatusEnum had 6 elements? Sure! The pair of two finite types with $n$ and $m$ elements respectively has $nm$ elements. This is why pairs are often called product types.

What about Either? If x has $n$ elements and y has $m$, how many elements are there in Either x y? Easy too, there are $n+m$ (each element of Either x y is one of the $n$ elements of x or one of the $m$ of y). Accordingly, we call this a sum type.

We have a type constructor that adds the number of elements in its component types (Either), another that multiplies them (the pair, which we’ll soon call Pair). Any other operations?

Well, this is functional programming, so perhaps the most natural type to form from two other types is the type of functions from one to the other. We’ll call this type (->) x y. How many functions are there from x to y? Just like a pair was formed by picking an element from x and one from y, we form a function by picking and element of y for every element of x, i.e. pick one of $m$ elements $n$ times. There’s $m^n$ possible choices, so that’s also the number of functions from x to y.

I don’t think functions are often called exponential types (probably because we’re busy calling them functions), but that would be an appropriate name!

At this point, we have defined three basic operations on types (sum, product, exponentiation). We can establish a correspondence between types and numbers:

Types Numbers
Void $0$
Unit $1$
Either x y $n+m$
Either Unit UnitBool $1+1=2$
Either x Voidx $n+0=n$
Pair x y $nm$
Pair x Unitx $n\times 1=n$
Pair x VoidVoid $n\times 0=0$
(->) x y $m^n$
(->) x UnitUnit $1^n=1$
(->) x VoidUnit $0^n=0$
(->) Unit xx $n^1=n$
(->) Void xUnit $n^0=1$

Much of this table is just a recap of stuff we’ve talked about, but a few points of note:

  • The relation ‘≈’ means “has the same number of elements, and is one-to-one mappable to”. In math speak, they’re isomorphic.
  • While Either and Pair are commutative (you can exchange their arguments and get an isomorphic result), (->) is not.
  • Either and Pair both have neutral elements (Void and Unit respectively), that is, types that make the operation do nothing (for instance, in Either x Voidx).
  • Saying there is one function of type (->) Void x (which I declare to be 1) can seem strange. It’s mostly a matter of convention since such a function can’t be called anyways, but this matches up with the mathematical definition of a function and has other advantages, as I discuss below.

With that said, you might want to spend a little time making sure you understand all of the equalities in the “Types” column of the table. Most of them should be easy to figure out.

Type interaction

If you look at the table above again, you’ll notice (or perhaps you’ve already noticed) that none of these equalities involve mixing Either, Pair and (->). But we can also get interesting equalities from doing that!

For instance, we have an analogy of distributiveness: just as $n\times(m+p) = n\times m+n\times p$,

Pair x (Either y z) ≈ Either (Pair x y) (Pair x z)

There’s a few interesting identities stemming from multiplication too. We know that $p^{n+m}$ = $p^n\times l^m$. What does this mean for functions? If we translate this to types, we get:

(->) (Either x y) z ≈ Pair ((->) x z) ((->) y z)

That is, defining a function from an x or a y to a z is just like having a function from x to z and another from y to z (and selecting based on the type of the argument).

We also have $p^{mn} = (p^n)^m$. Let’s translate this:

(->) (Pair x y) z ≈ (->) x ((->) y z)

You may recognize this as the definition of Currying, which is how languages like Haskell, Ocaml, and Scala (optionally) represent functions of several arguments. If you haven’t seen this before, here’s how it works. Assume you have a function f that takes an x and a y and returns a z. The curryfied version of this function is a function f' that takes just the x and returns a new function that only needs the y to produce the resulting z.

One of the advantages of currying is that it makes it very natural to create partially-applied functions. For instance, imagine a function split of type (->) (Pair Char String) [String], that splits a string on a character. The curryfied version, split', has type (->) Char ((->) String [String]). This makes it easy to define a new function split_on_commas, that takes a string and returns a list split on each comma, as split' ",".

Morphing

Of course, it is no coincidence that we can translate all of these type formulas into formulas about the integer and get valid results.

A structure that supports addition, multiplication, and exponentiation, has neutral elements for the first two, and respects all of the equations mentioned above is an exponential semiring. Of course, natural numbers are also an exponential semiring.

What we’ve shown above is that the operator $\varphi$ that maps types to integers is an exponential semiring morphism; it preserves the exponential semiring structure when going from types to integers, so any equation you can write with types is preseved through $\varphi$.


Even though they have limited application in actual programming, I find these results very pleasing. In the next post, we’ll look at how we can extend our constructs to build infinite types are built and look at some funny things that happen as a result. As always, comments and feedback are more than welcome!

  1. In real-life Haskell, Unit is called (). This is confusing since () is also the only value of that type, so, depending on context, () can refers to either a type or a value. 

Comments