Semigroups and Monoids

Published on February 16, 2020 Semigroups and Monoids

If you ask someone who has interacted with me in the last five years to describe me, they may say: Brandon loves monoids. I do love monoids, and although I do think there are enough existing materials on the subject on the internet, I figured I should probably add my take to the mix.

As engineers, we study algebraic structures (like semigroups and monoids) for a few reasons:

Mathematics gives us an objective solution to "clean code" and API design — discovering the algebraic structure underlying the problem gives rise to a minimally complex and maximally expressive interface.
These structures give names to incredibly abstract notions. Notions that we otherwise, as humans, would have a hard time discussing. When something has a name, our brains can reason about them. Shared vocabulary means more productivity for teams. Moreover, using these proper names introduces a ~hundred years of mathematics and computer science content for further study.

Semigroups and Monoids are the "20%" of algebraic objects that get you "80%" of the power. These are a functional programmer's basic building blocks: The ability to detect, digest, and discover them levels you up as an engineer!

Since I want this post to be maximally relevant to the audiences I think I'll reach, I'm preparing all code examples in OCaml, Haskell, and Swift throughout this post.

Algebraic structures in programs

Algebraic structures in typed programming languages are defined by signatures/protocols/type-classes/interfaces. Instances of these structures are declared by conformances/instances of these signatures. In addition to those instances that don't type-check, the set of instances is further restricted by laws or equivalences between two programs which should always be true. For example, a structure with a commutativity law aka $\forall x,y. x \oplus y = y \oplus x$ permits an implementation for $\oplus$ for integer multiplication but rejects matrix multiplication.

Semigroup

A semigroup is a type coupled with a closed binary associative operation that acts on that type, $(T, \oplus)$ . Addition over integers, $(Int, +)$ , multiplication over integers, $(Int, \times)$ , and concat over non-empty lists, $(NonEmptyList, ++)$ are all semigroups. Likewise for cache composition and sequencing animations.

The associativity law demands $\forall x, y, z. (x \oplus y) \oplus z = x \oplus (y \oplus z)$ . This is the case for all the examples shown above. A counter-example for illustration purposes: Subtraction over integers, $(Int, -)$ . Proof: Take $x=1,y=2,z=3$ , $(1 - 2) - 3$ evaluates to $-4$ , but $1 - (2-3)$ evaluates to $+2$ !

Since it's hard to type $\oplus$ in our programming development environments, we typically use <> to denote this operation. If the operation happens to be commutative as well, we use + -- though in ML langs we need to make an exception.

OCaml

Haskell

Swift

module type Semigroup = sig
  type t
  (* We don't use <> in the ML langs because <> is traditionally "is not equal" *)
  val (+) : t -> t -> t
end

Instances of semigroups are instances of the corresponding signature/protocol/type class:

OCaml

Haskell

Swift

module Sum : Semigroup = struct
  type t = int
  let (+) = Int.(+)
end

Algebraic properties give us magical powers. Associativity gives the programmer and the runtime the freedom to re-associate chunks of work.

As programmers, we get to choose grouping together operations in whichever way we feel is most appropriate for the situation.

OCaml

Haskell

Swift

let xy = x + y in
let all = xy + z in
(* or *)
let all = x + y + z in
()
(* ... *)

On the other hand, the machine can choose to schedule this work whenever it pleases. As a consequence, semigroups can hook into many pieces of machinery in other libraries and frameworks, for example, we can use the associativity to imbue parallelism into our work for free!

OCaml

Haskell

Swift

(* Work to do: x + y + z + w *)
let xy = x + y in (* thread1 *)
let zw = z + w in (* thread2 *)
xy + zw

Associativity is a very common property, so whenever you find yourself with a binary operation — it's worth asking: Is this associative — is this a semigroup?

Monoids

A monoid extends semigroups with an identity, $\epsilon$ . So a monoid is a type, a closed binary associative operation, and an identity: $(T, \oplus, \epsilon)$ . Many of the examples above for semigroups are also monoids: Addition of integers uses $0$ as an identity. Multiplication of integers' identity is $1$ . We can construct an identity cache to make cache composition a monoid.

To be a valid identity, the following law must hold: $\forall x. x \oplus \epsilon = \epsilon \oplus x = x$ , in other words, combining with the identity on the left or the right is the same as doing nothing at all. There is no $\epsilon$ which obeys that law that makes $(NonEmptyList, ++, \epsilon)$ a monoid. However, $(List, ++, [])$ is a monoid because concatenating the empty list on the left and right over any other list is the same as not concatenating at all.

Since it's hard to type $\epsilon$ in our programming development environments, we typically use empty to denote this operation.

OCaml

Haskell

Swift

module type Monoid = sig
  include Semigroup
  val empty : t
end

An example instance:

OCaml

Haskell

Swift

module ListM : Monoid = struct
  include ListS (* a semigroup *)
  let empty = []
end

The power of an identity is that there always exists a default or a known empty. Monoids let us "drop the option":

OCaml

Haskell

Swift

let annoyinglyNeedsOption =
  if computation() then Some (x + y) else None
(* to *)
let expressiveNoNeedOption =
  if computation() then x + y else M.empty

Monoids are the building blocks of composition. And composition leads to clean, simple, and expressive code. Moreover, when you and your colleagues can speak about these abstract notions concretely you get a huge productivity boost!

Thank you Tiziano Coroneo, @_lksz_, Kaan Dedeoglu, Avery Morin, and Hillel Wayne for pointing out some mistakes in earlier versions of this post!