My Profile Photo

code ninja . blog

where I show and discuss things I'm working on...

Why Monads?

Recently I had the unlucky experience of trying to explain to some friends over Slack what monads actually do. Trying to show how they are quite useful outside of “they allow Haskell to have side-effects” proved difficult.

I went away for a while and started really putting some thought into how to best explain it. I’ve gotten so used to using them that I never thought too hard about why. This post is about why, and - hopefully - at the same time help others who have a difficult time grokking them….

Why Are Monads Hard?

There are pages and pages of the Internet dedicated to trying to explain monads. They really are quite simple once you understand them, so either…

  • a conceptual barrier that’s difficult to cross;
  • all the documentation and blog posts suck; or
  • the use-case is so foreign to most programmers that they don’t understand the problem that monads solve, making the solution nigh impossible to grok.

How Is This Post Different?

My focus is going to be on the final bullet point: defining the use-case. I’m going to make the problem so clear that it should be obvious when a monad is warranted.

By the end, the reader will have re-invented monads - out of necessity - to solve the problem posed. The reader should understand the problem they solve, how they solve it, when to use them, and how to use them effectively.


For the code in this post I’ll be using Scala and the Ammonite-REPL. Any line of code beginning with @ is executed from within Ammonite.

@ import scala.util.Random

Premise: Pure vs. Effect

We’ll begin with a premise…

All code produces values, and those values fall into one of two categories: pure or effect. 1

Consider the following line of code:

@ 3
Int = 3

This is a pure value; there was no side-effect necessary to compute it.

@ Random.nextInt(100)
Int = 42

This is an effect value. Before its value is known, a side-effect had to take place: either it read from somewhere (e.g. /dev/random) or updated some internal state.

Next, let’s stipulate that…

Using the result of an effect implies that the value returned must also - by definition - be the result of an effect.

That stipulation may be a bit much to accept. But, consider: any computation we perform after getting our random number was dependent on the side-effect used to produce it.

Instead of side-effects, think about Try values. If it’s possible for a computation to fail and return an error, then every value derived from it must also be possible to fail (and therefore also be a Try). 2

The Problem

Now that we’ve setup our premise, let’s dive into what the problem is. First, let’s start with a simple function:

def pair[T](x: T) = (x, x)

Let’s see what happens when we call this function with a pure value:

@ pair(3)
(Int, Int) = (3, 3)

And now let’s call it with an effect value:

@ pair(Random.nextInt(100))
(Int, Int) = (23, 23)

It appears to have worked fine in both instances, so what’s wrong?

We’ve identified that any value returned by Random.nextInt is the result of an effect. We’ve also stipulated that any value derived from it (the value returned by pair) must also be an effect value.

The problem is that we have no type-safe way of letting the compiler know that Random.nextInt returns an effect value. Because of this, we can incorrectly be allowed to use it in a pure context: our pair function.

A Naïve Solution

A quick solution would be to make a new type that wraps any code and identify it as an effect:

case class Effect[A](private val value: A)

And now we can call a function and identify the return value an Effect.

@ val x = Effect(Random.nextInt(100))
x: Effect[Int] = Effect(48)

But, now let’s call our pair function with it:

@ pair(x)
(Effect[Int], Effect[Int]) = (Effect(48), Effect(48))

While this is correct, it’s not really what’s desired. We want the return value to be an Effect that results in a pair, not a pair of effects.

A Better Solution

Let’s start by redefining our Effect class, adding a flatMap method to it that allows us to pass the value of our Effect through a function that will return a new Effect.

case class Effect[A](private val value: A) {
  def flatMap[B](f: A => Effect[B]) = f(value)

Remember our earlier stipulation:

Using the result of an effect implies that the value returned must also - by definition - be the result of an effect.

And to help ourselves, let’s create a way to “map” a pure function and get back a function that returns an Effect:

def fmap[A,B](f: A => B): (A => Effect[B]) = {
  x => Effect(f(x))


@ Effect(Random.nextInt(100)).flatMap(fmap(pair))
Effect[(Int, Int)] = Effect((9, 9))

What Was Accomplished?

Let’s recap the problem:

  • There are two categories of values: pure and effect.
  • There’s no type-safe way to distinguish between them.

To solve this problem we’ve defined a functional, type-safe wrapper around effect values, and guaranteed that any result derived from an effect value is also treated as an effect. This prevents us from ever accidentally using an effect value in a pure context.

Just to show that this solution could be used for another, completely different category of values, let’s treat that previous paragraph as a mad lib:

…around asynchronous values, and guaranteed that any result derived from an asynchronous value is also treated as asynchronous. This prevents us from ever accidentally using an asynchronous value in a synchronous context.

Or how about…

…around optional values, and guaranteed that any result derived from an optional value is also treated as optional…

So, What Is A Monad?

A monad is just a formal definition (rooted in category theory) of what was coded above. In short:

A monad is used to categorize values and compose functions within that category in a type-safe manner.

Our Effect class is a monad that wraps values that were derived through the use of side-effects. But, the use of monads is not exclusive to that use-case. These are also examples of monads:

Once a value is optional (Option), any new, derived value from it must also be optional. If a function executes asynchronously (Task), then any function that uses the result must also - by definition - be considered asynchronous.

The formal definition of a Monad is any type that implements two functions:

  1. inject a value into the monad; and
  2. map that value and returns a new monad instance

And, those two functions must be coded in such a way as to adhere to these three laws (Haskell syntax):

  1. Right identity: m >>= return ≡ m
  2. Left identity: return a >>= f ≡ f a
  3. Associativity: (m >>= f) >>= g ≡ m >>= (\x -> f x >>= g)

And translated into English…

  1. The “constructor” for a monad acts as the identity function when passed to flatMap.
  2. The result of calling flatMap with a function, should be the same as calling the function with the value injected.
  3. As long as the order of operations is unchanged, where the flatMap calls are bound should not affect the result.

Let’s make sure that our Effect class implements these laws correctly:

@ def inc(x: Int) = x+1
@ def sq(x: Int) = x*x

// right identity
@ Effect(1).flatMap(Effect.apply) == Effect(1)
Boolean: true

// left identity
@ Effect(1).flatMap(fmap(inc)) == fmap(inc)(1)
Boolean: true

// associativity 
@ Effect(1).flatMap(fmap(inc)).flatMap(fmap(sq))
Effect[Int] = Effect(4)

@ Effect(1).flatMap(x => fmap(inc)(x).flatMap(fmap(sq)))
Effect[Int] = Effect(4)

Congratulations, you just re-invented monads!

What About IO?

Ignoring Option, one of the first experiences programmers get with monads is the IO monad via Haskell, Cats, or some similar monadic library.

What makes the IO monad different is that the values it wraps are a description of code to evaluate (e.g. a “thunk”). That’s it!

There’s truly nothing mystical or different other than the category of value being stored, which is a body of code to evaluate as opposed to the result of executing that code.

The only thing about the IO monad that is “special” is that - eventually - it needs to be evaluated. In Haskell this is implicitly done by the main function (which is why it returns IO). Using the cats-effect library in Scala, you’re responsible for doing it yourself.

Regardless, the problem being solved is (conceptually) the same, and the solution is exactly the same.


I hope this helps you to understand monads a bit more, what problem(s) they are used to solve, and - more importantly - giving you the knowledge to know when and how to use them effectively!

  1. There are many different categories of values (e.g. optional, either, asynchronous, …), but for our problem we’re going to focus only on this particular category… 

  2. If coming from a language like Go, think about functions that return error values. If you just blissfully ignore them it’s a bug. You need to either handle them (and do something else that can’t fail), or propogate the error, which instantly means your return value is now in the category of “may fail”.