## Introduction

The goal of this post is to elucidate what type inference is to the working programmer. Specifically, I want to introduce it to imperative programmers who use a typed language regularly, such as C or Java. I also target those who are avid users of untyped/dynamically typed languages, such as Ruby or Python.

## Types Revisited

Let’s sort of review what a type is informally. A **type** is a way to categorize different values (usually called **instances** of that type) based on properties they have. The properties can be rather arbitrary.

Languages usually have some set of **fundamental** or **base types**, like `int`

, `char`

, `float`

, and so on. Then a language will have some sort of way to make your own type, usually a **record type** or **product type**, which would be a `struct`

in C or C++, or to some extent, a class in Java.

There are also **higher order types** — types that take a type as an “argument”. Perhaps the best example is the type `array`

. An array is really nothing if we don’t know the type of its elements. So in fact, the type should really be `array(T)`

, for some type `T`

. C++ users might be accustomed to their template syntax: `array<T>`

.

The last major sort of type is a **union** or **sum type**. These are the typical `union`

types in C; specifically, a union type represents one of many possible types. If a variable `x`

has the union type `union{int i, string s}`

, then `x`

can be in effect either an `int`

or a `string`

.

## Some Types Behind The Scenes

One of the most important types is one that surprisingly many programmers don’t visit too often: the function type. Many programmers don’t see this type because their language doesn’t actively support the notion of first-class functions, and so such programmers never *really* see then as their own thing.

Whenever one declares a function in C or Java, one is actually declaring the type of the function in a roundabout way. Let’s look at an example in a pseudo-Java/C/Pascal language. Note that `x : T`

means “`x`

is of type `T`

” (this syntax is intentional to make the point clearer).

strlen(s : String) : Int { if (s == ""){ return 0; }else{ // if s = "hello", then rest(s) = "ello" return 1 + strlen(rest(s)); } }

If we ask “what is

`s`

”, we’d say “some `String`

”. How about if we ask what `strlen`

is? I hope one would say “a function which takes a `String`

and returns an `Int`

”. Function types capture this very notion. We could actually re-write the above example asstrlen : String → Int // type declaration strlen(s) { //... }

This reads as “

`strlen`

is a function which accepts a `String`

and returns an `Int`

”. The type of `strlen`

is `String → Int`

.
We can do this with multiple arguments too. Consider the function `charAt`

:

`charAt(s : String, n : Int) : Char`

We can write the type of this function as

`charAt : String × Int → Char`

This can be read as “

`charAt`

is a function which accepts a pair consisting of a `String`

and an `Int`

, and returns a `Char`

”. The symbol ‘`×`

’ denotes that it is a *product*type, as said earlier. However, this product type isn’t something we usually can interact with in C++ or Java directly. Nonetheless, if you wish, you can think of

`charAt`

taking a `struct`

or `class`

of some sort, which consists of only a `String`

and an `Int`

. However, this `struct`

is implicit when we call the function.
### A Bit of Advanced Type Tomfoolery

I won’t go into depth about this, but for those who are interested about how we would label functions with types involving arrays or lists or generics in general, here you are.

Consider this C++-esque template function.

`listRef(l : List<T>, n : Integral) : T`

Here,

`Integral`

is an abstract class whose subclasses are essentially different kinds of integer types, maybe `Int`

, `BigInt`

, `Char`

, and whatnot. How do we represent this in the language of types?`listRef : ∀t. ∀s<Integral. List(t) × s → t`

Let’s go through this step-by-step.

`∀t`

means “for all`t`

”; it is a way of stating what`t`

can be in the following declaration. In some typed languages, this is omitted/implicit.`∀s<Integral`

is similar. It means “for all`s`

which are a subclass (*less than*) of`Integral`

”. It merely states that`s`

must be derived from`Integral`

.`List(t)`

means “a list whose elements are type`t`

”. These kinds of types which take other types as arguments are, as said, higher order types, and are also known as**algebraic data types**(adt). If you’re familiar with Lisp or how a list is constructed internally, you know that a list is really either`Nil`

, an empty list, or a pair/struct whose first element is the list item, and the second element is itself a list. We can actually define the type of a list in our little type specification language:

adt ∀t. List(t) ≔ Nil + Cons(t × List(t))

**Read the**; it represents the previously discussed`+`

as “or” here!*sum*or*union type*. This can be seen as “for all types`t`

, a list of`t`

is either a unique object called`Nil`

, or a pair consisting of`t`

and a list of`t`

, and we call this pair`Cons`

”. Don’t get too worked up if this is a little nonsensical. An important aspect of this algebraic data type is that it’s actually recursive.- The
`→ t`

means the function returns an object of type`t`

.

I won’t say much more about this, but I hope those die-hard fans of extreme object-oriented programming made complete sense of this.

## Type Inference

So far, we really haven’t done anything remarkable. All we’ve done is merely reformulated (I hope!) how we think of types, and gave a weirdo notation for them. The notation is useful for this section.

What is type inference? It’s actually an exceedingly simple concept. And to describe precisely what it is, we’ll just go through an example.

Let’s look at this function again, eliding the type information.

strlen(s) { if (s == ""){ return 0; }else{ return 1 + strlen(rest(s)); } }

and let’s write down

**every**function it calls, every (semantically-important) construct it uses, every variable it uses, and every value it uses. I encourage you to not look below, and to try to list them yourself. After, compare against this list.

First, in this list, `{n}`

is just a place holder for arguments. If we have a function `f(x) = x^2`

, we’d write `f({1})`

only to denote where the arguments of the function are syntactically. There is no other significance.

Why isn’t `return`

included? I’m not sure if I can give a succinct explanation, but it really is just a way of saying “I’m assigning this value to the value of this function with the arguments given.” In other words, it doesn’t really *do* anything. If this explanation doesn’t suit your needs, no worries, it’s not really important.

Anyway, now try labeling a type to each of these constructs. *However*, you must only try to determine the types by **only** looking at the list. You cannot use any information from the actual implementation of the function. If you can’t determine a type, just label it an arbitrary letter (I will use `T₁`

, `T₂`

, etc.). (**Hints/Facts**: Call the boolean type `Bool`

. You may assume the only number type is `Int`

. In our case, assume an `if`

*must* return the same type without depending on the branch.)

Now, we are ready to do the type inference. The basic strategy of doing so is to use logic to determine what types must be. We want to attempt to infer, from the program, what

`T₁`

through `T₇`

are.
Here is the program again. The lines are labeled with Roman numerals to make it easier to reference this and the list above in parallel.

I strlen(s) II { III if (s == ""){ IV return 0; V }else{ VI return 1 + strlen(rest(s)); VII } VIII }

**Line III**- We have
`s == ""`

. By #1, we know that both arguments must have the same type. By #6, we know the right side has type`String`

. Therefore,`s`

must be of type`String`

. So we now have a*constraint*on`s`

, which*must*hold true henceforth:**Constraint 1:**.`s : String`

**Line IV**- The first branch of the
`if`

is, by #4, an`Int`

, and by #2, both branches must have the same type. Therefore, we have a constraint on the expression in line VI:**Constraint 2:**. We have more information in fact. Since the entire function is composed of the`1 + strlen(rest(s)) : Int`

`if`

, it returns either of the branches, so we know the return type of`strlen`

. Therefore, we know the entire type of`strlen`

.**Constraint 3:**.`strlen : String → Int`

**Line VI(a)**- We have addition of
`1`

and`strlen(rest(s))`

. For now, let’s just call the latter quantity`X`

. We know by #8 that both arguments to`+`

must be`Int`

. We also know that`1`

is an`Int`

by #5, so it agrees. This also means that`X`

must be an`Int`

too. Since`+`

returns an`Int`

, we’ve**satisfied constraint 2**, and we now have:**Constraint 4:**. This constraint is satisfied by constraint 3.`strlen(rest(s)) : Int`

**Line VI(b)**- Same line. Using constraint 3 and #10, we know that
`rest(s)`

must be of type`String`

. So`rest : ? → String`

. Easily enough, since we know by constraint 1 that`s : String`

, we know that`rest : String → String`

. This would be constraint 5. However, we are done.

Here is a summary of the types we determined.

`(==) : T₁ × T₁ → Bool`

`T₁`

is`String`

`if {1} then {2} else {3} : Bool × T₂ × T₂ → T₂`

`T₂`

is`Int`

(in our particular case, not true generally)

`s : T₃`

`T₃`

is`String`

`0 : Int`

`1 : Int`

`"" : String`

`rest : T₄`

`T₄`

is`String → String`

`+ : Int × Int → Int`

`strlen : T₅`

`T₅`

is`String → Int`

`{1}({2}) : (T₆ → T₇) × T₆ → T₇`

`T₆`

and`T₇`

have two different meanings in our particular case: (1) for the`strlen`

invocation and (2) for`rest`

invocation. This does not mean`T₆`

and`T₇`

are equal in both of these cases. Technically we’d have a`T₈`

and`T₉`

for each case separately (since function invocation works*generally*).

So we’ve used the type system to make logical deductions. This duality between types and logic has a name: the **Curry-Howard isomorphism** or **Curry-Howard correspondence**.

## The Meaning of It All

So what was the point of all of this? What do we achieve?

First, we can see that we elided types, but our program provided enough information to determine the types of everything completely. If we had some sort of algorithm for this type of thing, we could elide types for almost all of our function definitions, but *our program could still be checked correct, *and* be optimized.* Notice how when we went through those steps to determine the types of everything, we generated constraints and such, and a constraint could never ever be broken. This means there was some sanity checking going on, making our program safer.

If such an algorithm existed, think of the great possibilities:

- We could remove cumbersome type annotations, and make our programs comparable to the length of dynamically typed programs.
- Types could be determined statically — at compile time — which means optimizations can be done (specializing for a particular type) automatically.
- The compile-time checking also allows us to debug. If there’s a constraint not being satisfied, then we probably are doing The Wrong Thing®, like adding an integer to a boolean.

An algorithm would give us these things for practically *free*, with the only limitation being that our programs must obey a few more rules (e.g., not being able to do `x ← 5`

and then later doing `x ← True`

, since `Int`

and `Bool`

don’t match).

Fortunately (!), there is an algorithm to do this, thanks to some visionaries of the lambda calculus Haskell Curry and Robert Feys who began work in inference, J. Roger Hindley who proved some of Curry’s and Fey’s work, the great Robin Milner (1934–2010, R.I.P.) who independently devised a computer algorithm for inference, called Algorithm W, and Luis Damas who proved Milner’s algorithm was correct. The algorithm is called the **Hindley-Milner type inference algorithm**, sometimes simply known as **HM**.

In the previous section, we actually essentially performed HM inference by hand! We first generated type variables and got all the information we **know** about each individual element, then we generated constraints, or **type equations** (but very implicitly; the equations we “generated” were things like “foo must be bar” which is equivalent to saying “foo = bar”). Finally, we used Curry-Howard to run through the type equations, match things up, and **solve the type equations**. This phase of “solving” is formally called **unification**.

## Beyond Hindley-Milner

HM is a solid foundation to inferring types for, especially, lambda calculus-based languages. We can make all sorts of extensions to it, such as adding extensions for polymorphic types, type classes, some dependent typing, etc. But the issue is, as soon as we add something, type checking becomes a lot more difficult and harder to *prove*. It is known that it is *impossible* to devise an algorithm which can infer types in a system in which there are arbitrary subtypes. It is also known it is impossible to add type inference to a system with unconstrained dependent types. So there are definitely limitations from a theoretical standpoint. Nonetheless, from a practical standpoint, HM has been successful in many programming languages, most notable Standard ML, which Milner himself worked on. Other good ones include Haskell, OCaml, Clean, Miranda, and others. We can only hope we’ll be seeing more inference in non-functional languages. They could certainly use it!

[...] This post was mentioned on Twitter by HN Firehose, newsery5 and SrinivasaKettenbruch, '(q u a d). '(q u a d) said: Just published "Type Inference For the Uninformed" http://symbo1ics.com/blog/?p=959 #symbo1ics [...]

“strlen : Int → String // type declaration

…

This reads as “strlen is a function which accepts an Int and returns a String”. The type of strlen is Int → String.”

I believe you meant String → Int in both cases.

Yes, it was a typo of the late-night kind. Thanks, it’s fixed now.

How to best explain to a programmer whose whole career has been expended on popular imperative languages, how a language with type inference can help them work better? – Quora

[...] I’ve answered this question in this blog post: http://symbo1ics.com/blog/?p=959 8:52pm[...]

Most accessible explanation of Hindley-Milner I have found so far. Thank you. I have a question. You say:

This is convenient for the example but isn’t guaranteed to be true in a real program in the wild. I guess in this case if it isn’t true you can’t resolve all the types?

Actually, a true and correct program really should have the same type for each branch. Consider a C fragment program:

Here, the branches of this ternary-if must have the same type.

The confusion comes from the fact that C’s if statement is just that: a statement. If statements themselves don’t truly return values, so really they cannot have types.

In my example in the post, since each if does indeed have a return, I treat the if-statement as if it does return values, and therefore can be typed.