Recently I had the unlucky experience of trying to explain to some friends over Slack what monads actually do. Trying to show how they are quite useful outside of “they allow Haskell to have sideeffects” proved difficult.
I went away for a while and started really putting some thought into how to best explain it. I’ve gotten so used to using them that I never thought too hard about why. This post is about why, and  hopefully  at the same time help others who have a difficult time grokking them….
Why Are Monads Hard?
There are pages and pages of the Internet dedicated to trying to explain monads. They really are quite simple once you understand them, so either…
 a conceptual barrier that’s difficult to cross;
 all the documentation and blog posts suck; or
 the usecase is so foreign to most programmers that they don’t understand the problem that monads solve, making the solution nigh impossible to grok.
How Is This Post Different?
My focus is going to be on the final bullet point: defining the usecase. I’m going to make the problem so clear that it should be obvious when a monad is warranted.
By the end, the reader will have reinvented monads  out of necessity  to solve the problem posed. The reader should understand the problem they solve, how they solve it, when to use them, and how to use them effectively.
Setup
For the code in this post I’ll be using Scala and the AmmoniteREPL. Any line of code beginning with @
is executed from within Ammonite.
@ import scala.util.Random
Premise: Pure vs. Effect
We’ll begin with a premise…
All code produces values, and those values fall into one of two categories: pure or effect. ^{1}
Consider the following line of code:
@ 3
Int = 3
This is a pure value; there was no sideeffect necessary to compute it.
@ Random.nextInt(100)
Int = 42
This is an effect value. Before its value is known, a sideeffect had to take place: either it read from somewhere (e.g. /dev/random
) or updated some internal state.
Next, let’s stipulate that…
Using the result of an effect implies that the value returned must also  by definition  be the result of an effect.
That stipulation may be a bit much to accept. But, consider: any computation we perform after getting our random number was dependent on the sideeffect used to produce it.
Instead of sideeffects, think about Try
values. If it’s possible for a computation to fail and return an error, then every value derived from it must also be possible to fail (and therefore also be a Try
). ^{2}
The Problem
Now that we’ve setup our premise, let’s dive into what the problem is. First, let’s start with a simple function:
def pair[T](x: T) = (x, x)
Let’s see what happens when we call this function with a pure value:
@ pair(3)
(Int, Int) = (3, 3)
And now let’s call it with an effect value:
@ pair(Random.nextInt(100))
(Int, Int) = (23, 23)
It appears to have worked fine in both instances, so what’s wrong?
We’ve identified that any value returned by Random.nextInt
is the result of an effect. We’ve also stipulated that any value derived from it (the value returned by pair
) must also be an effect value.
The problem is that we have no typesafe way of letting the compiler know that Random.nextInt
returns an effect value. Because of this, we can incorrectly be allowed to use it in a pure context: our pair
function.
A Naïve Solution
A quick solution would be to make a new type that wraps any code and identify it as an effect:
case class Effect[A](private val value: A)
And now we can call a function and identify the return value an Effect
.
@ val x = Effect(Random.nextInt(100))
x: Effect[Int] = Effect(48)
But, now let’s call our pair
function with it:
@ pair(x)
(Effect[Int], Effect[Int]) = (Effect(48), Effect(48))
While this is correct, it’s not really what’s desired. We want the return value to be an Effect
that results in a pair, not a pair of effects.
A Better Solution
Let’s start by redefining our Effect
class, adding a flatMap
method to it that allows us to pass the value of our Effect
through a function that will return a new Effect
.
case class Effect[A](private val value: A) {
def flatMap[B](f: A => Effect[B]) = f(value)
}
Remember our earlier stipulation:
Using the result of an effect implies that the value returned must also  by definition  be the result of an effect.
And to help ourselves, let’s create a way to “map” a pure function and get back a function that returns an Effect
:
def fmap[A,B](f: A => B): (A => Effect[B]) = {
x => Effect(f(x))
}
Hurrah!
@ Effect(Random.nextInt(100)).flatMap(fmap(pair))
Effect[(Int, Int)] = Effect((9, 9))
What Was Accomplished?
Let’s recap the problem:
 There are two categories of values: pure and effect.
 There’s no typesafe way to distinguish between them.
To solve this problem we’ve defined a functional, typesafe wrapper around effect values, and guaranteed that any result derived from an effect value is also treated as an effect. This prevents us from ever accidentally using an effect value in a pure context.
Just to show that this solution could be used for another, completely different category of values, let’s treat that previous paragraph as a mad lib:
…around asynchronous values, and guaranteed that any result derived from an asynchronous value is also treated as asynchronous. This prevents us from ever accidentally using an asynchronous value in a synchronous context.
Or how about…
…around optional values, and guaranteed that any result derived from an optional value is also treated as optional…
So, What Is A Monad?
A monad is just a formal definition (rooted in category theory) of what was coded above. In short:
A monad is used to categorize values and compose functions within that category in a typesafe manner.
Our Effect
class is a monad that wraps values that were derived through the use of sideeffects. But, the use of monads is not exclusive to that usecase. These are also examples of monads:
 Option
 Try
 Future
 Task (monix)
 IO (catseffect)
Once a value is optional (Option
), any new, derived value from it must also be optional. If a function executes asynchronously (Task
), then any function that uses the result must also  by definition  be considered asynchronous.
The formal definition of a Monad is any type that implements two functions:
 inject a value into the monad; and
 map that value and returns a new monad instance
And, those two functions must be coded in such a way as to adhere to these three laws (Haskell syntax):
 Right identity:
m >>= return ≡ m
 Left identity:
return a >>= f ≡ f a
 Associativity:
(m >>= f) >>= g ≡ m >>= (\x > f x >>= g)
And translated into English…
 The “constructor” for a monad acts as the identity function when passed to
flatMap
.  The result of calling
flatMap
with a function, should be the same as calling the function with the value injected.  As long as the order of operations is unchanged, where the
flatMap
calls are bound should not affect the result.
Let’s make sure that our Effect
class implements these laws correctly:
@ def inc(x: Int) = x+1
@ def sq(x: Int) = x*x
// right identity
@ Effect(1).flatMap(Effect.apply) == Effect(1)
Boolean: true
// left identity
@ Effect(1).flatMap(fmap(inc)) == fmap(inc)(1)
Boolean: true
// associativity
@ Effect(1).flatMap(fmap(inc)).flatMap(fmap(sq))
Effect[Int] = Effect(4)
@ Effect(1).flatMap(x => fmap(inc)(x).flatMap(fmap(sq)))
Effect[Int] = Effect(4)
Congratulations, you just reinvented monads!
What About IO?
Ignoring Option
, one of the first experiences programmers get with monads is the IO
monad via Haskell, Cats, or some similar monadic library.
What makes the IO
monad different is that the values it wraps are a description of code to evaluate (e.g. a “thunk”). That’s it!
There’s truly nothing mystical or different other than the category of value being stored, which is a body of code to evaluate as opposed to the result of executing that code.
The only thing about the IO
monad that is “special” is that  eventually  it needs to be evaluated. In Haskell this is implicitly done by the main
function (which is why it returns IO
). Using the catseffect library in Scala, you’re responsible for doing it yourself.
Regardless, the problem being solved is (conceptually) the same, and the solution is exactly the same.
fin.
I hope this helps you to understand monads a bit more, what problem(s) they are used to solve, and  more importantly  giving you the knowledge to know when and how to use them effectively!

There are many different categories of values (e.g. optional, either, asynchronous, …), but for our problem we’re going to focus only on this particular category… ↩

If coming from a language like Go, think about functions that return
error
values. If you just blissfully ignore them it’s a bug. You need to either handle them (and do something else that can’t fail), or propogate the error, which instantly means your return value is now in the category of “may fail”. ↩