This post is about interesting properties of types in programming languages. It assumes you know about basic functional programming and types. A bit of Haskell syntax can be helpful but isn’t required.
Building finite types
Imagine a Haskell-like language that only had two primitive types, Void
and Unit
. Void
is a type with no elements, and Unit
is a type with a single element, ()
1In real-life Haskell, Unit
is called ()
. This is confusing since ()
is also the only value of that type, so, depending on context, ()
can refers to either a type or a value.. That’s a little light for serious programming. Most programs, for instance, will want a boolean type.
So let’s give ourselves a little tooling to climb out of the hole. Assume x
and y
are any two types. Then Either x y
is a type whose values are all values of the form Left <X>
, where <X>
is a value of type x
(Left
is just here to act as a kind of tag) and the elements of the form Right <Y>
, where <Y>
is a value of type y
. For instance:
- Some elements of the type
Either Int Bool
are the valuesLeft -4
,Right False
,Right True
andLeft 2
. - Some elements of the type
Either Float Float
are the valuesLeft 0.5
,Left 1.0
andRight 1.0
(the latter two being different).
Remember that we don’t yet have Int
or Bool
at our disposal! I’m just using them here to hopefully make the examples clearer.
Although it doesn’t change the expressiveness of what we’re describing, we’ll also allow the structure type MyType = <some type definition>
which allows us to create a type alias for brevity.
With this out of the way, we can now make a boolean type:
type Boolean = Either Unit Unit
let true = Left ()
let false = Right ()
Here’s the not
function for this type:
not :: Boolean -> Boolean
not Left () = Right ()
not Right () = Left ()
In fact, we can generalize this to create types with any finite amount of values (that is, enum types). We simply need to nest Eithers
. Here’s how you can create a three-state enum of Queued
, Working
and Completed
(which might represent the state of a task in a work queue):
type StatusEnum = Either Unit (Either Unit Unit)
let Queued = Left ()
let Working = Right (Left ())
let Completed = Right (Right ())
(Of course, none of this is very practical: used in a real system, it’d be as painfully slow to write as it would be to run.)
Notice that infinite types are out of our reach, unless we allow ourselves to write down infinite definitions. Of course, since we’d like our programs, theoretical as they may be, to terminate, we won’t allow this.
One thing functional programmers like, besides writing tutorial on monads, is to create pair (aka product) types. For instance, they might want to define the type of pairs of booleans and statuses; this could represent the combination of whether a task is high-priority or not and what its current status is.
We can’t directly define pairs with our Either
, but fortunately, there’s just a finite number of possible pairs of booleans and statuses! These are (True, Queued)
, (True, Working)
, (True, Completed)
, (False, Queued)
, (False, Working)
, (False, Completed)
, which we can just represent with a 6-state enum.
Counting finite types
By the way, could we have predicted the pair of Boolean
and StatusEnum
had 6 elements? Sure! The pair of two finite types with $n$ and $m$ elements respectively has $nm$ elements. This is why pairs are often called product types.
What about Either
? If x
has $n$ elements and y
has $m$, how many elements are there in Either x y
? Easy too, there are $n+m$ (each element of Either x y
is one of the $n$ elements of x
or one of the $m$ of y
). Accordingly, we call this a sum type.
We have a type constructor that adds the number of elements in its component types (Either
), another that multiplies them (the pair, which we’ll call Pair
). Any other operations?
Well, this is functional programming, so perhaps the most natural type to form from two other types is the type of functions from one to the other. We’ll call this type (->) x y
. How many functions are there from x
to y
? Just like a pair was formed by picking an element from x
and one from y
, we form a function by picking and element of y
for every element of x
, i.e. pick one of $m$ elements $n$ times. There’s $m^n$ possible choices, so that’s also the number of functions from x
to y
.
I don’t think functions are often called exponential types (probably because we’re busy calling them functions), but that would be an appropriate name!
At this point, we have defined three basic operations on types (sum, product, exponentiation). We can establish a correspondence between types and numbers:
Types | Numbers |
---|---|
Void |
$0$ |
Unit |
$1$ |
Either x y |
$n+m$ |
Either Unit Unit ≈ Bool |
$1+1=2$ |
Either x Void ≈ x |
$n+0=n$ |
Pair x y |
$nm$ |
Pair x Unit ≈ x |
$n\times 1=n$ |
Pair x Void ≈ Void |
$n\times 0=0$ |
(->) x y |
$m^n$ |
(->) x Unit ≈ Unit |
$1^n=1$ |
(->) x Void ≈ Unit |
$0^n=0$ |
(->) Unit x ≈ x |
$n^1=n$ |
(->) Void x ≈ Unit |
$n^0=1$ |
Much of this table is just a recap of stuff we’ve talked about, but a few points of note:
- The relation ‘≈’ means “has the same number of elements, and is one-to-one mappable to”. In math speak, they’re isomorphic.
- While
Either
andPair
are commutative (you can exchange their arguments and get an isomorphic result),(->)
is not. Either
andPair
both have neutral elements (Void
andUnit
respectively), that is, types that make the operation do nothing (for instance, inEither x Void
≈x
).- Saying there is one function of type
(->) Void x
(which I declare to be 1) can seem strange. It’s mostly a matter of convention since such a function can’t be called anyways, but this matches up with the mathematical definition of a function and has other advantages, as I discuss below.
With that said, you might want to spend a little time making sure you understand all of the equalities in the “Types” column of the table. Most of them should be easy to figure out.
Functor interaction
If you look at the table above again, you’ll notice (or perhaps you’ve already noticed) that none of these equalities involve mixing Either
, Pair
and (->)
. But we can also get interesting equalities from doing that!
For instance, we have an analogy of distributiveness: just as $n\times(m+p) = n\times m+n\times p$,
Pair x (Either y z) ≈ Either (Pair x y) (Pair x z)
There’s a few interesting identities stemming from multiplication too. We know that $p^{n+m}$ = $p^n\times l^m$. What does this mean for functions? If we translate this to types, we get:
(->) (Either x y) z ≈ Pair ((->) x z) ((->) y z)
That is, defining a function from an x
or a y
to a z
is just like having a function from x
to z
and another from y
to z
(and selecting based on the type of the argument).
We also have $p^{mn} = (p^n)^m$. Let’s translate this:
(->) (Pair x y) z ≈ (->) x ((->) y z)
You may recognize this as the definition of Currying, which is how languages like Haskell, Ocaml, and Scala (optionally) represent functions of several arguments. If you haven’t seen this before, here’s how it works. Assume you have a function f
that takes an x
and a y
and returns a z
. The curryfied version of this function is a function f'
that takes just the x
and returns a new function that only needs the y
to produce the resulting z
.
One of the advantages of currying is that it makes it very natural to create partially-applied functions. For instance, imagine a function split
of type (->) (Pair Char String) [String]
, that splits a string on a character. The curryfied version, split'
, has type (->) Char ((->) String [String])
. This makes it easy to define a new function split_on_commas
, that takes a string and returns a list split on each comma, as split' ","
.
Morphing
Of course, it is no coincidence that we can translate all of these type formulas into formulas about the integer and get valid results.
A structure that supports addition, multiplication, and exponentiation, has neutral elements for the first two, and respects all of the equations mentioned above is an exponential semiring. Of course, natural numbers are also an exponential semiring.
What we’ve shown above is that the operator $\varphi$ that maps types to integers is an exponential semiring morphism; it preserves the exponential semiring structure when going from types to integers, so any equation you can write with types is preseved through $\varphi$.
Even though they have limited application in actual programming, I find these results very pleasing. In the next post, we’ll look at how we can extend our constructs to build infinite types and look at some funny things that happen as a result. As always, comments and feedback are more than welcome!
-
In real-life Haskell,
Unit
is called()
. This is confusing since()
is also the only value of that type, so, depending on context,()
can refers to either a type or a value. ↩
Comments