How the Free Monad and Functors Represent Syntax

How the Free Monad and Functors Represent Syntax
This is post is the result of a Socratic conversation I had with GPT-4o about what "syntax represented by the free monad for a functor that provides a signature." meant. After the question and answering, I asked it to write an outline of all that I learned and used that outline as the basis for asking it to write a blog post.

The text was written by GPT-4o, but the post required a fair bit of curation to make it include the insights and ah-ha moments I had during the conversation. It would otherwise be devoid of any of the juicy details.
🔰
The target audience are developers that already had mild exposure to functional programming.

Introduction

I was watching a talk by Nicolas Wu from Imperial College London about the evolution of effects when I saw a slide that read:

"syntax represented by the free monad for a functor that provides a signature."

It was a densely packed statement, and I wanted to unpack what it really meant.

At its core, this phrase describes a powerful way to build expressions in a domain-specific language (DSL) while keeping them flexible and composable. We need a way to represent computations as structured expressions that can be dynamically extended, transformed, or interpreted in multiple ways. Instead of working with rigid trees that require manually modifying types whenever we introduce a new operation, we want a system that allows us to:

  • Dynamically add and remove operations from an expression tree without restructuring the entire program.
  • Defer execution so that transformations and optimizations can be applied before evaluation.
  • Easily modify, analyze, or interpret expressions in different ways without changing the underlying representation.
  • Keep a uniform type structure, so modifications don’t require constant changes in type signatures.

In this post, we’ll break down how the Free Monad enables these properties, how functors provide structure, and why this technique is useful when working with DSLs. By the end, you’ll not only understand what this phrase means but also why it’s a powerful tool for structuring computations.

Functors and Monads as Building Blocks

A functor is a data type that allows you to apply a function to its contents while preserving its structure. "Preserving its structure" means that while the values inside the container may change, the overall shape of the container remains the same. For example, when mapping over a list, each element is modified, but the number of elements and their ordering do not change. When mapping over a tree, the transformation applies to each node, but the tree's branching structure remains intact. This makes functors a useful abstraction for applying functions over structured data without altering their fundamental organization. This is an extremely useful abstraction because it allows transformations of data inside containers without affecting the container itself. Many common data types in functional programming are functors.

For example:

Lists: Mapping over a list applies a function to every element.

fmap (+1) [1,2,3] -- Produces [2,3,4]

Maybe: Applying a function inside Just preserves the structure, while Nothing remains unchanged.

fmap (*2) (Just 5) -- Produces Just 10
fmap (*2) Nothing  -- Produces Nothing

Trees: If you define a tree structure, you can apply a function to every value while keeping the tree’s shape.

data Tree a = Leaf a | Node (Tree a) (Tree a) deriving Functor
fmap (*3) (Node (Leaf 1) (Leaf 2)) -- Produces Node (Leaf 3) (Leaf 6)

A monad, on the other hand, extends functors by introducing sequencing through bind (>>=), which is analogous to flatMap in other languages. The key idea is that monads allow transformations that themselves return monadic values, meaning they produce results that are still wrapped inside a monad. When you use fmap, you apply a function that transforms values inside the monad, but the result is still wrapped in another monadic container. For example, mapping over Just 5 with a function that returns another Just would give Just (Just 10), introducing an extra layer of wrapping. Similarly, mapping over a list with a function that returns another list results in nested lists.

This is where bind (>>=) comes in. Instead of simply applying a function and keeping the extra layer of wrapping, bind ensures that we unwrap the extra monadic container, apply the function, and return the same monadic type we started with. This flattening process makes monads composable, allowing operations to be chained together seamlessly. Without bind, every transformation would create additional layers of nested structures, making composition difficult and unwieldy.

For example, using Maybe as a monad:

Just 5 >>= (\x -> Just (x * 2)) -- Produces Just 10
Nothing >>= (\x -> Just (x * 2)) -- Produces Nothing

If we try to do the same with lists, binding ensures that we don’t get nested lists:

[1,2,3] >>= (\x -> [x, x+1]) -- Produces [1,2,2,3,3,4]

This flattening behavior is why monads are useful for composing computations that return wrapped values.

What is a Free Monad?

A Free Monad is a way to represent computations in a structure that allows composition without enforcing an immediate interpretation. It is called "free" because it is the most general monadic structure that satisfies the monad laws without adding any extra computational behavior beyond what is required by the monad interface itself.

One way to think about it is as an implementation of a monad that only satisfies the monad interface but does nothing else. Normally, when defining a monad, you also define specific behavior for how computations should be sequenced and interpreted. The Free Monad, however, defers execution and simply builds a tree-like structure of computations without specifying how they should be executed.

For example, a Free Monad is defined as:

data Free f a = Pure a | Free (f (Free f a))
  • Pure a represents a computation that has finished and holds a value.
  • Free (f (Free f a)) represents an operation wrapped in a functor f, where the next computation is another Free Monad.

This structure allows computations to be expressed without committing to an evaluation strategy. Instead of executing immediately, expressions are built up into a composable structure that can later be interpreted by an evaluator. This makes Free Monads particularly useful for defining domain-specific languages (DSLs) and effect systems, where different interpreters can evaluate the structure in different ways.

ExprF: The Functor That Defines Syntax

Now that we understand functors and monads, we can introduce ExprF, a functor that defines the building blocks of a domain-specific language (DSL). Our goal, as mentioned in the introduction, is to create a way to build expressions in a DSL that makes it easy to dynamically add or remove operations from the syntax tree. ExprF plays a crucial role in this by defining the operations that can appear in our DSL, acting as the "signature" for our syntax. Instead of holding direct values, ExprF describes the structure of computations, allowing us to construct expressions in a modular and extensible way:

 data ExprF next
   = Const Int          -- Represents a constant value
   | Add next next      -- Represents an addition operation
   | Multiply next next -- Represents multiplication
   deriving Functor

Here, ExprF doesn't define a whole computation—it defines a single step. The next parameter represents what comes after the current operation, allowing us to build trees of expressions. However, on its own, ExprF is just a functor; it allows transformations but doesn’t define how computations should be sequenced.

This is where the Free Monad comes in. It provides a way to compose these operations into a full expression tree while keeping things flexible. This connects back to the original phrase: "provides the signature" (the allowed operations in the DSL) provides the means to represent full syntax trees dynamically.

The Problem with Manually Constructing ASTs

If we were to manually construct an AST without the Free Monad, we would run into deep nesting issues. Each time we extend an expression, its type becomes increasingly complex, making it harder to modify dynamically. For example, consider the following structure:

data ExprF a = Const Int | Add a a | Multiply a a
expr = Add (Add (Const 1) (Const 2)) (Const 3)

This approach means every modification requires changes to the type signature, making it rigid and inflexible. As we continue expanding the AST, every additional layer introduces further nesting in the type signature, making it more cumbersome to manipulate dynamically.

Let's illustrate this further with a step-by-step example:

data ExprF a = Const Int | Add a a | Multiply a a

Starting with a simple constant:

expr1 :: ExprF Int
expr1 = Const 1

Adding another operation introduces an additional nesting layer:

expr2 :: ExprF (ExprF Int)
expr2 = Add (Const 1) (Const 2)

Extending it further increases the complexity:

expr3 :: ExprF (ExprF (ExprF Int))
expr3 = Add (Add (Const 1) (Const 2)) (Const 3)

Visually, this looks like:

    Add
   /   \
  Add   3
 /   \
1     2

With each added operation, the type of the expression grows deeper, making modifications and extensions cumbersome. This is problematic in a DSL, where we want to dynamically construct expressions without constantly restructuring type signatures.

How the Free Monad Enables Composition

Instead of having types that grow deeper and more complex with every operation, we now have a flexible structure where each operation is wrapped in `Free`, allowing it to be composed seamlessly. This lets us build, modify, and interpret syntax trees in a more modular way, without being locked into a specific structure upfront. By using this approach, we enable more dynamic and extensible computation models, making DSLs significantly more expressive and adaptable.

The Free Monad provides a wrapper that keeps syntax flexible. Instead of each node explicitly containing deeper expressions, Free introduces a consistent structure:

data Free f a = Pure a | Free (f (Free f a))

This allows us to define expressions in a way that doesn’t require deep nesting. Using Free, we can build our AST dynamically without affecting the type signature:

expr :: Free ExprF Int
expr = Free (Add (Free (Multiply (Pure 2) (Pure 3))) (Free (Add (Pure 4) (Pure 5))))

Now, let's visualize how Free intersperses throughout the syntax tree, wrapping each operation. Unlike the manually constructed AST where operations were deeply nested, Free ensures that the structure remains flexible and uniform.

          Free
           |
          Add
         /    \
      Free    Free
       |        |
    Multiply   Add
    /      \   /   \
  Pure 2  Pure 3  Pure 5

Each Free node serves as a wrapper that ensures the compositional structure of our syntax tree remains consistent. Instead of hard-coding how expressions should nest, Free allows us to dynamically construct expressions without altering their type signatures.

This wrapping approach makes it much easier to modify or extend an expression. For example, if we wanted to introduce a Subtract operation at any point, we could do so without breaking the structure of our tree. Every operation exists in the same uniform format, making it easier to traverse, analyze, or transform the AST.

This also allows for deferred evaluation—since expressions are wrapped in Free, we don't commit to a specific execution strategy up front. Instead, we can manipulate, optimize, or even interpret the structure in multiple ways before execution. This is especially useful in DSLs, where expressions might be transformed before being compiled or executed.

By introducing Free, we've essentially decoupled syntax from execution, making our DSL more extensible and easier to manage. The composition enabled by Free ensures that each operation can be expressed in a modular way, allowing us to dynamically construct and manipulate expressions without the constraints imposed by deeply nested ASTs.

The Role of fmap in Free

Since the Free Monad is built on a functor, fmap allows us to transform expressions one layer at a time without modifying the entire structure. But why do we do it one layer at a time?

The reason is that fmap is designed to operate on the structure locally, ensuring that modifications happen incrementally rather than requiring a full traversal of the tree. When constructing an expression, this means that we can modify only the immediate substructure of an operation while leaving the rest of the expression unchanged. This is useful both during expression building and evaluation:

  • During expression building, it allows us to add new operations dynamically without needing to restructure the entire tree.
  • During evaluation, it ensures that transformations, optimizations, or rewrites can be applied selectively rather than requiring a complete overhaul of the expression.

For example, if we apply fmap show to an ExprF structure, it converts numbers to strings but preserves the underlying tree structure:

fmap show (Add 1 2) -- Produces Add "1" "2"

This localized transformation ensures that modifications can be applied gradually instead of requiring a full traversal. This is particularly powerful in DSLs where syntax trees may undergo multiple passes—such as optimization, interpretation, or compilation—before execution. By enabling fine-grained transformations, fmap ensures that expressions remain composable and modifiable at different stages of processing, making the Free Monad approach an effective way to manage structured computations.

This localized transformation ensures that modifications can be applied incrementally rather than requiring a full traversal of the tree. By modifying expressions one layer at a time, we gain greater control over how transformations are applied, making it easier to build, modify, and optimize expressions without disrupting their structure.

For example, consider an expression tree where we start with Add 1 2 and later decide to double the result by wrapping it in a Multiply operation. Instead of reconstructing the entire tree from scratch, we can apply this modification incrementally:

expr1 = Free (Add (Pure 1) (Pure 2))
expr2 = Free (Multiply expr1 (Pure 2))

This results in the following transformation:

      Multiply
      /      \
    Add       2
   /    \
  1      2

Rather than requiring a full restructuring, we simply wrap the previous expression in a new operation. This approach enables localized updates, making syntax trees easier to extend and modify dynamically.

This approach is particularly beneficial in DSLs where expressions undergo multiple passes before execution. By enabling gradual rewrites, we ensure that syntax trees remain flexible and adaptable at different stages of processing—whether during parsing, transformation, or evaluation. This is what makes fmap so essential: it allows changes to be localized, preserving the overall integrity of the syntax tree while allowing for incremental and controlled modifications. Ultimately, this ability to transform expressions in a structured yet adaptable way is what makes the Free Monad such a powerful tool for representing and manipulating syntax.

Bringing It Together

Now that we have both Free and ExprF, we can see how they interact. The Free Monad enables composition, ensuring that expressions can be built incrementally and modified dynamically. The functor provides the syntax rules, defining what operations are available. Together, they allow us to define syntax trees that are both structured and flexible.

By separating concerns—functors handle transformation, and Free Monads handle composition—we gain a powerful tool for defining extensible DSLs. This distinction allows us to express computations structurally while deferring interpretation, meaning we can analyze, optimize, or compile expressions without being tied to a single execution model.

Understanding "Syntax Represented by a Free Monad"

When we talk about syntax in this context, we mean expressions that can be modified before evaluation. Instead of being rigid structures that must be defined in advance, these expressions are flexible, allowing for transformations, modifications, and optimizations before they are executed.

The Free Monad, specifically Free ExprF a, gives us a way to represent these expressions while deferring their evaluation. Rather than immediately computing a result, expressions wrapped in Free remain in a structured form that can be analyzed, modified, or composed before anything is actually executed.

This is a key distinction: Free is what allows syntax to be composable and transformable, meaning that we can build up complex expressions dynamically, rearrange them as needed, and even apply different interpretation strategies. Without Free, we would be forced into a much more rigid structure where every new operation must be manually integrated into an ever-growing syntax tree.

At the same time, it is important to recognize that Free and Functor serve different roles. Free is responsible for composition, meaning it provides the mechanism that allows different parts of a syntax tree to be stitched together dynamically. Functor, on the other hand, is responsible for transformation, meaning it enables us to rewrite, optimize, or manipulate the structure of an expression one layer at a time. Together, they allow us to construct, modify, and interpret expressions in a way that remains flexible and extensible.

Why Functors Provide the Signature

In our approach to structuring syntax, the functor ExprF serves as the foundation that defines which operations are allowed. It provides the basic building blocks of the syntax tree by describing individual computation steps such as addition, multiplication, and constants. Without this structure, we would have no formal way to specify the kinds of expressions that can exist in our DSL.

For example, ExprF defines operations like:

data ExprF next
  = Const Int        -- Represents a constant value
  | Add next next    -- Represents an addition operation
  | Multiply next next  -- Represents multiplication
  deriving Functor

Here, ExprF acts as a blueprint for constructing syntax trees. Each operation (Add, Multiply, etc.) can take subexpressions as arguments, allowing expressions to be composed. However, on its own, ExprF only defines these operations one layer at a time—it does not describe how they should be combined into full expressions.

This is where the Free Monad comes in. Free ExprF a elevates ExprF from being just a single-layer structure to a fully composable syntax representation. The functor provides the "syntax rules"—it specifies what operations exist—but Free allows us to chain and sequence these operations into complete expressions.

Without a functor, we would not be able to modularly modify or interpret syntax. The functor acts as the interface for transformations, while the Free Monad ensures that these transformations remain composable and extendable. This separation of concerns is what makes Free Monads an elegant way to structure and manipulate syntax dynamically.

Why Free Monads Enable Composition

The Free Monad provides a way to dynamically modify expressions while maintaining a consistent structure. When using >>= (bind) in Free, we can modify an expression tree in a way that avoids deeply nested types, keeping syntax trees easier to extend and manipulate. Instead of embedding modifications directly within an operation, >>= allows us to defer decisions about computation structure, meaning we can sequence or transform expressions dynamically without restructuring the entire tree.

For example, if we want to modify an existing expression tree, we don’t need to manually traverse and replace individual nodes. Instead, we use >>= to bind a function that describes how the transformation should proceed, and the Free Monad handles the composition seamlessly:

modifyTree :: Free ExprF Int -> Free ExprF Int
modifyTree expr = expr >>= \x -> Free (Multiply (Pure x) (Pure 2))

This approach ensures that any transformation applied to an expression remains modular and extensible. Unlike a traditional AST where changes might require altering multiple levels of nesting, Free Monads allow modifications to be expressed in a way that integrates smoothly with existing expressions.

By structuring our syntax with Free Monads, we separate concerns effectively: the Functor (ExprF) handles the structure of individual operations, while the Free Monad (Free ExprF a) enables composition. This distinction is what makes Free Monads so useful for defining extensible DSLs and effect systems. We can add, remove, or modify operations dynamically while keeping the syntax tree manageable and flexible, ensuring that the entire computation remains composable from start to finish.

The downsides

While using Free Monads and Functors to represent syntax provides flexibility and composability, this approach is not without its drawbacks.

One significant downside is performance overhead. Since Free Monads introduce an additional layer of indirection, deeply nested expressions wrapped in Free can lead to inefficiencies in evaluation. Each additional Free wrapper requires an extra step during traversal, which can result in slower execution compared to a more direct representation.

Another challenge is readability and debugging. While the Free Monad provides a structured way to represent syntax, the added abstraction can make it harder to understand what an expression is actually doing at a glance. The presence of Free at every level of an expression tree can make debugging more cumbersome, as errors may not be as straightforward to trace compared to a manually constructed AST.

There is also the issue of boilerplate code. Writing interpreters for Free Monads requires defining multiple layers—one for the syntax (ExprF), one for the Free wrapper, and then an interpreter to evaluate or transform expressions. This added complexity can be a barrier to adoption, especially for developers unfamiliar with functional programming abstractions.

Alternative Approaches

Several alternatives exist for structuring DSLs and effectful computations:

  • Tagless Final: Instead of using Free Monads, a tagless final approach represents syntax using type classes, making interpretation more direct and often more efficient.
  • Generalized Algebraic Data Types (GADTs): GADTs provide a way to encode syntax trees with more precise type information, reducing some of the boilerplate associated with Free Monads.
  • Freer Monads (Church-encoded Free Monads): These are an optimization of Free Monads that help reduce some of the performance overhead while maintaining composability.

Each of these approaches has trade-offs in terms of flexibility, performance, and ease of interpretation. While Free Monads provide a powerful way to represent composable syntax trees, they should be chosen with an understanding of their trade-offs and alternatives.

Conclusion

Now, we can finally decode the original phrase: "Syntax represented by the free monad for a functor that provides a signature." This statement describes a structured way to represent computations in a domain-specific language. The functor, ExprF, defines the syntax rules, specifying what operations are valid in our expression language. Meanwhile, the Free Monad, Free ExprF a, provides a flexible structure that enables composable, deferred construction of expressions. By separating the definition of individual operations from their composition, we gain a powerful system for constructing, modifying, and interpreting expressions dynamically.

This approach allows us to build complex syntax trees while maintaining the ability to add new operations without restructuring existing code. Instead of committing to an execution model upfront, we can transform, analyze, and interpret expressions in multiple ways before evaluation. This flexibility is what makes the Free Monad such a powerful abstraction—it enables modular and extensible syntax representations, making DSLs easier to manage and evolve over time.