Typeclasses in perspective
Typeclasses are typically taught by drawing parallels with method overloading (another form of ad hoc polymorphism), and Java-like interfaces (a form of subtyping polymorphism).
Even though typeclasses are a concept in their own right, it’s only natural to want to compare them to other familiar constructs. This can lead to a lot of confusion, and the line that separates these things can be blurry.
So in this post I’ll try to focus on their differences and show that their similarities are only superficial. These are the points I’ll be covering:
Code samples are included in Haskell and Scala (using the Ammonite repl).
NB: this post assumes the reader is already at least vaguely familiar with what typeclasses are. If not, here (Haskell) and here (Scala) are great starting points.
Extensibility
Say we’re trying to abstract over all the types that can be JSON encoded.
In Java or C#, one might reasonably start by defining an interface like this:
Java
interface Encodable {
Json toJson();
}
And implementing it for a class Person
is quite straightforward:
Java
class Person implements Encodable {
public String getName() { ... }
public Integer getAge() { ... }
public Json toJson() { ... }
}
So now all we have to do is make every encodable type extend our interface.
But that’s not always possible - we can’t make types that we don’t control extend our interface.
Integer
, for example, is defined in the standard library, we can’t just go and change it!
Typeclasses, on the other hand, give us the freedom to add this behaviour to types outside our control.
Haskell
class Encodable a where
toJson :: a -> Json
instance Encodable Int where
toJson i = _
Scala
trait Encodable[A] {
def toJson(a: A): Json
}
object Encodable {
implicit val encodableInt = new Encodable[Int] {
def toJson(i: Int) = ???
}
}
Conditional implementation
Let’s keep going with our example. Say we now want to serialize lists.
We know we can’t change the built-in List<A>
type, but let’s pretend we can for a second.
Java
class List<A> implements Encodable {
public Json toJson() { ... }
}
Turns out, it’s impossible to implement the toJson
method for List<A>
, because not all lists are encodable.
We can only encode lists of other encodable things, like List<Person>
, List<String>
or List<Boolean>
.
But serializing a List<DbConnection>
, or a List<Function<String, String>>
is nonsense!
What we want to express is this: a List<A>
is encodable so long as A
is encodable.
And a Pair<A, B>
is encodable so long as A
and B
are encodable.
But the Java type system is not expressive enough, it doesn’t let us express this.
Typeclasses make this trivial.
Haskell
instance Encodable a => Encodable [a] where
toJson list = _
instance (Encodable a, Encodable b) => Encodable (a, b) where
toJson (a, b) = _
Scala
implicit def encodableList[A : Encodable] =
new Encodable[List[A]] {
def toJson(list: List[A]) = ???
}
implicit def encodablePair[A : Encodable, B : Encodable] =
new Encodable[(A, B)] {
def toJson(pair: (A, B)) = ???
}
Clearly, subtyping polymorphism just won’t do in this instance. Let’s try something different.
Most languages implement a simple form of ad hoc polymorphism generally called function/method/operator overloading. With it, we can write functions that work on some types, and behave slightly differently for each one.
Java
Json toJson(String s) { ... }
Json toJson(Integer i) { ... }
<A> Function<List<A>, Json> toJson(Function<A, Json> fa) { ... }
<A, B> Function<Pair<A, B>, Json> toJson(Function<A, Json> fa, Function<B, Json> fb) { ... }
This seems to work well enough! We can now convert built-in types to json, and we can conditionally convert lists to json if and only if their elements are also convertible. We do have to bear the burden of:
- explicitly composing
toJson
s together (e.g.toJson(this::toJson, this::toJson)
), - explicitly passing these functions around whenever we need to abstract over “encodable things”.
We can sorta kinda maybe live with it, as long as we keep our type-fu to a minimum. Dealing with even slightly more complex types, like lists of pairs, will not be fun.
Return-type polymorphism
But can we do the same with fromJson
, a function that parses a Json
?
Let’s give it a go.
Java
Optional<String> fromJson(Json s) { ... }
Optional<Integer> fromJson(Json s) { ... }
Sadly, this doesn’t compile. Java and most C-like languages lack what’s called return-type polymorphism. Overloads can have different parameters, but they cannot differ (solely) in their return types.
With typeclasses, all we have to do is take Encodable
and flip the arrow:
Haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Text (Text)
class Decodable a where
fromJson :: Json -> Maybe a
instance Decodable Int where
fromJson json = Just 5
instance Decodable Text where
fromJson json = Just "hello"
Scala
trait Decodable[A] {
def fromJson(j: Json): Option[A]
}
object Decodable {
def fromJson[A](j: Json)(implicit A: Decodable[A]) =
A.fromJson(j)
implicit val decodableInt =
new Decodable[Int] {
def fromJson(j: Json) = Some(5)
}
implicit val decodableString =
new Decodable[String] {
def fromJson(j: Json) = Some("hello")
}
}
Type inference will in most cases be able to disambiguate between overloads (more so in Haskell than in Scala), but on the off-chance it can’t, there’s nothing a little type annotation can’t fix.
Haskell
λ> fromJson json :: Maybe Int
Just 5
λ> fromJson json :: Maybe Text
Just "hello"
Scala
@ import Decodable._
@ fromJson[Int](json)
res0: Some[Int] = Some(5)
@ fromJson[String](json)
res1: Some[String] = Some("hello")
Another issue with using method overloading as a means of abstraction is that it quickly falls over when our abstraction has more than 1 method.
As an exercise, try to think how Decodable
would look like if defined as a Java-like interface.
Relations between types
Both typeclasses we’ve looked at so far have one type parameter, a
.
In these examples, typeclasses are used to describe that type: a
can be turned into Json
, or a
can be obtained from a Json
.
However, typeclasses can have multiple type parameters. These not only describe types, they describe relations between types.
We can, for example, define a typeclass that describes two types a
and b
, such that any value
of type a
can be safely, losslessly converted to a value of type b
.
Haskell
{-# LANGUAGE MultiParamTypeClasses #-}
import Data.Int
class Cast a b where
cast :: a -> b
instance Cast Int Int64 where
cast x = fromIntegral x
instance Cast a b => Cast [a] [b] where
cast xs = map cast xs
Scala
trait Cast[A, B] {
def cast(a: A): B
}
object Cast {
implicit val castIntLong = new Cast[Int, Long] {
def cast(x: Int) = x.toLong
}
implicit def castList[A, B](implicit AB: Cast[A, B]) =
new Cast[List[A], List[B]] {
def cast(xs: List[A]) = xs.map(AB.cast)
}
}
A classic example is a typeclass that describes how two things can be multiplied to produce a third thing (where all these 3 things can be of different types):
Haskell
class Mult a b c where
mult :: a -> b -> c
instance Mult Int Int Int where mult x y = x * y
instance Mult Euros Int Euros where {- ... -}
instance Mult Int Euros Euros where {- ... -}
instance Mult Metres Metres SquareMetres where {- ... -}
instance Mult MilesPerHour Hours Miles where {- ... -}
instance Mult Matrix Matrix Matrix where {- ... -}
instance Mult Matrix Vector Vector where {- ... -}
instance Mult Matrix Int Matrix where {- ... -}
With Haskell’s functional dependencies, we can further refine Mult
by stating that the result type c
is uniquely determined
from the operand types a
and b
. From the Haskell wiki:
When you know the types of the things that you’re multiplying, the result type should be determined from that information alone.
Haskell
{-# LANGUAGE FunctionalDependencies #-}
class Mult a b c | a b -> c where
mult :: a -> b -> c
This greatly improves type inference.
In Scala, this can be emulated by making C
an abstract type member instead.
Scala
trait Mult[A, B] {
type C
def mult(a: A, b: B): C
}
Dispatch
Lastly, there’s one more glaring difference between interfaces and typeclasses that must be pointed out.
Interfaces, and subtyping polymorphism in general, rely on dynamic dispatch. This means that the process of selecting which method to call happens at runtime, based on information available only at runtime.
Typeclasses rely on static dispatch. Which methods/functions get called is fully resolved at compile-time.
Why does this matter? Let’s try implementing a function that increments every element of a collection, using the Seq[A]
Scala “interface”.
Scala
def addOne(xs: Seq[Int]): Seq[Int] = xs.map(_ + 1)
val result: Seq[Int] = addOne(List(1,2,3))
Now here’s a different, equally valid, implementation and invocation of the same function:
Scala
def addOne(xs: Seq[Int]): Seq[Int] =
if (Random.nextBoolean) xs.map(_ + 1).to[Vector]
else xs.map(_ + 1).to[Queue]
val result: Seq[Int] = addOne( if (Random.nextBoolean) List(1,2,3) else Stream(1,2,3) )
Notice how we can’t really tell which map
method is going to be called. It could be List.map
, Stream.map
, etc.
This will have to be figured out at runtime - hence, dynamic dispatch.
Two things worth highlighting:
- the caller doesn’t know what exactly he’s getting back. It could be a vector, or a queue, an array, etc.
- the callee decides what to return at runtime.
Now let’s redefine addOne
, but this time with the Functor
typeclass instead:
Haskell
addOne :: Functor f => f Int -> f Int
addOne xs = fmap (+1) xs
λ> addOne [1,2,3]
[2,3,4]
Scala
@ import $ivy.`org.typelevel::cats-core:1.1.0`, cats._, cats.implicits._
@ def addOne[F[_]: Functor](xs: F[Int]): F[Int] = xs.map(_ + 1)
@ val result = addOne(List(1, 2, 3))
result: List[Int] = List(2, 3, 4)
This time, the tables have turned:
- the caller decides and knows exactly what he’s getting back at compile-time,
- the callee doesn’t get a say in what’s returned.
This is because the type parameter f
/F
must be bound to a specific functor at compile-time.
In this example, we implicitly bound f
/F
to []
/List
- everything’s been fully resolved, there’s nothing left to decide at runtime.
The key takeaway here is this: with dynamic dispatch, we’re able to do more things at runtime, but there’s less we can say about our program’s behaviour at compile-time.
Flexibility comes at the expense of analytical power. Constraints liberate, liberties constrain.
Credits
Thanks to George Wilson for his amazing talk which mentions a couple of the points I’ve reiterated here, and got me thinking what other differences there were between typeclasses and other forms of polymorphism.
Comments