Haskell's kind system - a primer

This post is available in Chinese here, thanks to JulyTreee.

In this post we’ll explore Haskell’s kind system, the similarities between types and kinds, and show how these can be leveraged to write safer and more reusable code.

Here’s what’s in store for today:

NB: this article was written with GHC 8.4.3 in mind.

Types and Kinds

Simply put:

Just like values/terms can be classified into types, types can be classified into kinds.

The values "hello" and "world" are of type String. The values True and False are of type Bool. Similarly, the types String and Bool are of kind *, pronounced “star”.

State type

Just like we can use :t/:type in GHCi to check the type of a term, we can use :k/:kind to check the kind of a type.

λ> :t True
True :: Bool

λ> :k Bool
Bool :: *

In standard Haskell, all inhabited types (types that have at least 1 value) are of kind *. So Int, Int -> String, [Int], Maybe Int, Either Int Int are all of kind * because there’s at least one term for each of these types1.

Not all types are inhabited though. Maybe and Either, for example, are uninhabited. There is no term of type Maybe, not even the infinite loop!

λ> x = undefined :: Maybe
<interactive>:9:18: error
     Expecting one more argument to Maybe

λ> f x = f x :: Maybe
<interactive>:10:14: error:
     Expecting one more argument to Maybe

Well, if Maybe and Either are not inhabited types, what are they then? They’re type constructors.

Data constructors and type constructors

Just like we have data constructors for creating data, we also have type constructors for creating types.

λ> data Person = MkPerson { name :: String, age :: Int }

λ> :t MkPerson
MkPerson :: String -> Int -> Person

MkPerson is a data constructor that, given two values name and age of type String and Int, creates another value of type Person. In other words, MkPerson is a value of type String -> Int -> Person.

λ> data Either a b = Left a | Right b

λ> :k Either
Either :: * -> * -> *

Similarly, Either is a type constructor that, given two types a and b of kind *, creates another type of kind *. In other words, Either is a type of kind * -> * -> *.

Just like data constructors are curried and can be partially applied, so can type constructors.

λ> :t MkPerson
MkPerson :: String -> Int -> Person
λ> :t MkPerson "Diogo"
MkPerson "Diogo" :: Int -> Person
λ> :t MkPerson "Diogo" 29
MkPerson "Diogo" 29 :: Person

MkPerson data constructor

λ> :k Either
Either :: * -> * -> *
λ> :k Either String
Either String :: * -> *
λ> :k Either String Int
Either String Int :: *

Either type constructor

Type signatures and kind signatures

Just like GHC is usually able to correctly infer the types of variables, it is also usually able to correctly infer the kinds of type variables.

-- The inferred type of `x` is `Bool`
x = True

-- The inferred type of `y` is `String -> IO ()`
y = putStrLn
-- The inferred kind of `a` is `*`
data List a = Cons a (List a) | Nil

-- The inferred kind of `f` is `* -> *`
-- The inferred kind of `a` and `b` is `*`
class Functor f where
  fmap :: (a -> b) -> (f a -> f b)

And just like you can manually specify a variable’s type, you can also manually specify a type variable’s kind using the KindSignatures extension.

x :: Bool
x = True

y :: String -> IO ()
y = putStrLn
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE ExplicitForAll #-}

data List (a :: *) = Cons a (List a) | Nil

class Functor (f :: * -> *) where
  fmap :: forall (a :: *) (b :: *). (a -> b) -> (f a -> f b)

The ExplicitForAll extension allows us to define each type variable explicitly, which I did here for clarity.

HOFs and HKTs

Just like we can have higher-order functions (HOFs), functions that take other functions as arguments, we can also have higher-kinded types (HKTs), types constructors that take other type constructors as arguments.

λ> :t map
map :: (a -> b) -> [a] -> [b]

Here, map is a function that takes another function of type (a -> b) and a list of type [a].

λ> data NonEmpty f a = MkNonEmpty { head :: a, tail :: f a }

λ> :k NonEmpty
NonEmpty :: (* -> *) -> * -> *

Similarly, NonEmpty is a type constructor that takes another type constructor of kind (* -> *) and a type of kind *.

When applied to [] and Bool, we obtain the type NonEmpty [] Bool, a list of boolean values that is guaranteed to have at least one value.

λ> :t MkNonEmpty True [False, True]
MkNonEmpty True [False, True] :: NonEmpty [] Bool

We can apply this type constructor to any two types, so long as their kinds match the expected kinds, e.g.: NonEmpty [] Int, NonEmpty Tree String or NonEmpty Vector Char.

Other kinds

Unboxed/unlifted types

Remember when I said all inhabited types are of kind *? Allow me to rephrase that: in standard Haskell, * is the kind of all inhabited boxed (or lifted) types. However, in GHC’s version of Haskell, there are also some inhabited types that don’t fall under this umbrella: unboxed (or unlifted) types.

These are defined in the GHC.Prim module from the ghc-prim package. By convention, all unlifted types end with a #, called the magic hash, enabled by the MagicHash extension. Examples include Char# and Int#. You can even have unboxed tuples (# a, b #) and unboxed sums (# a | b #)!

Each unlifted type has a kind that describes its runtime representation. Is this a pointer to something in the heap? Is it a signed/unsigned word-sized value? The compiler then uses that type’s kind to decide which machine code it needs to produce - this is called “kind-directed compilation”.

Here are some examples:

Unboxed types and kinds

We’ll come back to this TYPE thing later on.

Constraints

There’s also the Constraint kind that covers everything that can appear to the left of an => arrow, including typeclass constraints:

λ> :k Show
Show :: * -> Constraint

λ> :k Show Int
Show Int :: Constraint

λ> :k Functor
Functor :: (* -> *) -> Constraint

> :k Functor IO
Functor IO :: Constraint

With the ConstraintKinds extension, we can use constraints as first-class citizens, and that’s amazingly useful. If you’ll recall, Set famously does not have an instance for the Functor typeclass. The reason being fmap :: (a -> b) -> f a -> f b must work for all types a and b, but Set.map only works for “orderable” types, i.e. it has an Ord constraint.

λ> :t Data.Set.map
Data.Set.map :: Ord b => (a -> b) -> Set a -> Set b

With ConstraintKinds, we can write a more generic Functor typeclass that abstracts over any constraint that may or may not exist.

{-# LANGUAGE ConstraintKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE FunctionalDependencies #-}

import Data.Kind (Constraint)
import qualified Data.Set as Set
import qualified Data.List as List

class GFunctor (c :: * -> Constraint) (f :: * -> *) | f -> c where
  gfmap :: c b => (a -> b) -> (f a -> f b)

instance GFunctor Ord Set.Set where
  gfmap = Set.map

instance GFunctor EmptyConstraint [] where
  gfmap = List.map

Where EmptyConstraint is a typeclass constraint that’s trivially satisfied by all types.

{-# LANGUAGE FlexibleInstances #-}

class EmptyConstraint a
instance EmptyConstraint a

Datatype promotion

In standard Haskell, the data keyword lets us define a custom type/type constructor, followed by a set of data constructors:

data ConnectionStatus = Open | Closed

ConnectionStatus type

But when GHC’s DataKinds extension is enabled, the data keyword creates two additional things: a custom kind, followed by a set of types/type constructors.

{-# LANGUAGE DataKinds #-}

data ConnectionStatus = Open | Closed

In the code above, a new kind ConnectionStatus is created. This kind has exactly two uninhabited types, 'Open and 'Closed.

ConnectionStatus kind

We say the type ConnectionStatus has been promoted to a kind, and Open and Closed to types. Notice how the promoted types 'Open and 'Closed are prefixed with a tick. This tick can almost always be omitted, and only in rare scenarios will you need it for disambiguation.

With our new custom kind, we can define a type variable that can only ever be instantiated to Open or Closed, and nothing else.

{-# LANGUAGE KindSignatures #-}

data Connection (s :: ConnectionStatus) = MkConnection ...

newConnection     :: Address           -> Connection Closed
openConnection    :: Connection Closed -> Connection Open
closeConnection   :: Connection Open   -> Connection Closed
connectionAddress :: Connection s      -> Address
λ> :k Connection Int
<interactive>:1:12: error:
     Expected kind ConnectionStatus, but Int has kind *

By tagging the connection with its status, we can statically enforce rules such as “closeConnection cannot be called on an already closed connection”.

This doesn’t work just for our own custom data types though. In the presence of DataKinds, Bool, for example, gets promoted to a kind as well.

λ> data F (b :: Bool) = MkF deriving Show

λ> MkF :: F 'True
MkF

λ> MkF :: F 'False
MkF

λ> MkF :: F 'Open
<interactive>:30:10: error:
     Expected kind Bool, but  'Open has kind ConnectionStatus

GHC.TypeLits

GHC provides two other very convenient kinds out of the box, hidden in the GHC.TypeLits module in base.

The first is Symbol, which is the kind for type-level strings. It lets us use string literals like "hello" as a type.

Symbol kind

With it, we can tag a type with a string literal.

{-# LANGUAGE KindSignatures, DataKinds #-}

import GHC.TypeLits (Symbol)
import Data.Ratio ((%))

newtype Money (currency :: Symbol) = Money Rational

fivePence :: Money "GBP"
fivePence = Money (5 % 100)

twoEuros :: Money "EUR"
twoEuros = Money 2

Having the currency represented at the type-level, rather than at the term-level (like in data Money = Money String Rational), lets us statically ensure that monies of different currencies don’t get mixed up, e.g. we can’t add EUR and GBP together.

add :: Money c -> Money c -> Money c
add (Money x) (Money y) = Money (x + y)
λ> add fivePence fivePence
Money (1 % 10)

λ> add fivePence twoEuros
<interactive>:8:15: error:
     Couldn't match type "EUR" with "GBP"

If we had the currency at the term-level instead, we’d have to do runtime checks and every function would have to signal the possibility of failure by returning Maybe, Either, MonadError e m, etc.

Little interesting fact: this is more than just a contrived example. This is exactly how the safe-money library represents dense monetary values!

When we don’t need to perform arithmetic operations on money, storing the value as a Rational is a bit overkill. For these scenarios, the library provides a different type, Discrete, which is a simple wrapper around an Integer.

Here’s the catch though: for reasons explained in safe-money’s introductory blog post, when representing monetary values as integers, keeping track of the currency is not enough. We need to keep track of the scale as well. When dealing with whole US dollars, the scale would be 1 to 1. When dealing with cents, the scale would be 100 to 1 (because there are 100 cents in a dollar).

In order to represent the scale as a type, we need Nat, the kind for type-level natural numbers, also exported from GHC.TypeLits. Similar to Symbol, it lets us use numeric literals as types.

{-# LANGUAGE KindSignatures, DataKinds #-}

import GHC.TypeLits (Symbol, Nat)

newtype Discrete (currency :: Symbol) (scale :: (Nat, Nat))
  = Discrete Integer

oneDollar :: Discrete "USD" '(1, 1)
oneDollar = Discrete 1

oneDollarThirtyCents :: Discrete "USD" '(100, 1)
oneDollarThirtyCents = Discrete 130

Yes, you saw that right. In scale :: (Nat, Nat), (,) is the tuple type promoted to a kind via DataKinds. And in '(100, 1), '(,) is the tuple data constructor promoted to a type constructor.

Kind polymorphism

Parametric polymorphism is ubiquitous in Haskell, it helps us abstract over types. With the PolyKinds extension, we can go even higher and abstract over kinds as well!

Let’s look at an example.

data Proxy a = MkProxy

Here, Proxy contains a type variable a. It has a single data constructor with no arguments, which we can then tag with a type variable. Proxy is very useful for passing around types when we have no values at hand2.

λ> intRepresentative = MkProxy :: Proxy Int

λ> stringRepresentative = MkProxy :: Proxy String

The problem is, by default, GHC will assume a is of kind *, which means Proxy doesn’t reeaally work for all types - just lifted ones.

λ> :k Proxy
Proxy :: * -> *

λ> maybeRepresentative = MkProxy :: Proxy Maybe
<interactive>:9:38: error:
     Expecting one more argument to Maybe
      Expected a type, but Maybe has kind * -> *

Normally, we’d have to create a bunch of Proxy-like types for every possible kind, like this:

data Proxy a = MkProxy
data Proxy1 (a :: * -> *) = MkProxy1
data Proxy2 (a :: * -> * -> *) = MkProxy2
data HKProxy1 (a :: (* -> *) -> * -> *) = MkHKProxy1

stringRepresentative   = MkProxy    :: Proxy String
maybeRepresentative    = MkProxy1   :: Proxy1 Maybe
eitherRepresentative   = MkProxy2   :: Proxy2 Either
nonEmptyRepresentative = MkHKProxy1 :: HKProxy1 NonEmpty

This is obviously unsustainable. But with the PolyKinds extension enabled, we can create a Proxy that works for all types a and for all kinds k.

{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE DataKinds #-}

data Proxy (a :: k) = MkProxy

-- Now `a` can be anything at all
maybeRepresentative    = MkProxy :: Proxy Maybe
nonEmptyRepresentative = MkProxy :: Proxy NonEmpty
functionRepresentative = MkProxy :: Proxy (->)
helloRepresentative    = MkProxy :: Proxy "hello"
showRepresentative     = MkProxy :: Proxy Show
functorRepresentative  = MkProxy :: Proxy Functor
openRepresentative     = MkProxy :: Proxy 'Open

In fact, we could simply omit the kind signature altogether and GHC would still infer a :: k.

λ> :set -XPolyKinds
λ> data Proxy a = MkProxy

λ> :k Proxy
Proxy :: k -> *

Another instance of kind polymorphism being used in the wild can be found in Servant, a library for writing type-safe web APIs. It has a type named :> that’s used to combine components of an API into a complete description of an endpoint. These components can vary in type and in kind.

{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE TypeOperators #-}

data (:>) (a :: k) (b :: *)
infixr 4 :>

This is how you’d describe an endpoint that responds to POST /books. We use :> to combine the types "books" :: Symbol, ReqBody '[JSON] Book :: * and Post '[JSON] () :: *.

type BooksAPI = "books" :> ReqBody '[JSON] Book :> Post '[JSON] ()

Levity polymorphism

In some cases, it may be useful to abstract over lifted and unlifted types. For example, consider the error “function”, which takes an error message and throws an exception:

{-# LANGUAGE ExplicitForAll #-}

error :: forall a. String -> a

error is polymorphic in its return type, so it can be used (almost) anywhere.

increment :: Int -> Int
increment x = error "oops"

But because a is (by default) assumed to be a type of kind *, we can’t use it in a place where an unlifted type is expected!

incrementUnlifted :: Int# -> Int#
incrementUnlifted x = error "oops"
<interactive>:22-40: error
    Couldn't match a lifted type with an unlifted type

Armed with our new knowledge, our first attempt might be to turn on PolyKinds and introduce a kind variable k, but that won’t work either.

error :: forall k (a :: k). String -> a
/Playground.hs:15:40: error:
     Expected a type, but a has kind k
     In the type signature: error :: forall k (a :: k). String -> a

The reason this doesn’t compile is that it doesn’t make sense for a :: k to be anything other than an inhabited type. Say k was Symbol and a was "hello". If the "hello" type is uninhabited, what value could such a function possibly return?

To make the error function work for all inhabited types, both lifted and unlifted, we need something called levity polymorphism.

The trick resides in the kind TYPE r we saw earlier. This kind is parameterised over r :: RuntimeRep, which describes a type’s runtime representation and can be one of the following:

data RuntimeRep = VecRep VecCount VecElem   -- ^ a SIMD vector type
                | TupleRep [RuntimeRep]     -- ^ An unboxed tuple of the given reps
                | SumRep [RuntimeRep]       -- ^ An unboxed sum of the given reps
                | LiftedRep       -- ^ lifted; represented by a pointer
                | UnliftedRep     -- ^ unlifted; represented by a pointer
                | IntRep          -- ^ signed, word-sized value
                | WordRep         -- ^ unsigned, word-sized value
                | Int64Rep        -- ^ signed, 64-bit value (on 32-bit only)
                | Word64Rep       -- ^ unsigned, 64-bit value (on 32-bit only)
                | AddrRep         -- ^ A pointer, but /not/ to a Haskell value
                | FloatRep        -- ^ a 32-bit floating point number
                | DoubleRep       -- ^ a 64-bit floating point number

We already saw how TYPE 'IntRep is the kind of unlifted integers, TYPE 'FloatRep is the kind of unlifted floats, etc. But there’s something here that stands out: TYPE 'LiftedRep. This is the kind for all lifted types - in fact, the * kind we’ve been using all along is nothing more than a synonym for TYPE 'LiftedRep!

So it turns out we can use TYPE r not only to abstract over all unlifted types, but also over lifted ones.

{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE ExplicitForAll #-}

import GHC.Exts (TYPE, RuntimeRep)

error :: forall (r :: RuntimeRep) (a :: TYPE r). String -> a

And now we can use error in both increment and incrementUnlifted.

Levity polymorphism has its restrictions - there are some places where it can’t be used. To learn where and why, watch Richard Eisenberg’s amazing talk on the subject.

Wrap-up

By now you might be reasonably wondering: what’s the point of all this? Why have a kind system, why even care about kinds? Well, in languages where types cannot be classified at all, like Java or C#, all type variables <T> are of kind *. You can’t have a type variable of kind * -> *, like in Functor<List>. Without being able to talk about kinds, we can’t define abstractions like functors and monads, or even the NonEmpty f a data type we saw earlier. Furthermore, with levity polymorphism, we can write abstractions that work with both boxed and unboxed types, like this generic version of the Num a typeclass, as opposed to Java where generic type variables can only ever be boxed types.

Having a kind system also opens the door to a world of type-level programming (which I hope to get more into in my next blog post), enabling things like extensible records, Servant’s type-level web APIs, and just safer APIs in general, like Connection s or Money c.

Exploring the frontier of research, Simon Peyton Jones recently discussed adding linearity to Haskell, which I imagine would mean the function type (->) a b would have an additional type variable l of a new kind Linearity, composed of the types Omega and One.

Another question you might be asking is, if terms are classified into types, and types into kinds, are kinds classified into something else? Well, in GHC 7 and earlier, kinds were classified into sorts. All kinds had the unique sort BOX, which was internal to GHC and invisible to us, the developers.

But in version 8, GHC went a different way. As you might have noticed, in this article I tried to expose the similarities between types and kinds: both can be higher-order, polymorphic, inferred, curried, etc. This is not a coincidence! With the advent of TypeInType, an extension introduced in GHC 8, types and kinds (and sorts) became one and the same! Types can now be classified by other types. 3 is of type Int, Int is of type *, and * is of type *. This unification is meant to pave the way for full dependent types in Haskell3.

Also, in more recent literature, you might see the kind * being referred to as Type (do not confuse with TYPE r). These are synonyms for now, and the plan is to gradually phase out * in favour of Type.

  1. A notable exception is data Void, which cannot be constructed. There are no values of this type per se, and it’s only inhabited by terms that never complete successfully, such as undefined or the infinite loop f x = f x

  2. Proxy is already defined in the Data.Proxy module in base, we’re defining our own here to show the logic behind its kind. To read more about Data.Proxy and its use cases, check out Kwang Seo’s blog post

  3. If you want to learn more about dependent types, I recommend reading Type-Driven Development with Idris or the recently released The Little Typer

Comments

Want to leave a comment? Visit this post's issue page on GitHub.