The OCaml module system can be intimidating, and it typically involves the use of many keywords—for example, open
and include
, which allow importing definitions into a module. Since version OCaml 4.08
, the open
primitive has been generalized to allow the opening of arbitrary module expressions. In this article, we’ll explore how to use this generalization to reproduce a common practice in other languages, what I somewhat pompously call import strategies, to describe patterns like import {a, b as c} from K
, without relying on a (sub-)language dedicated specifically to importing.
OCaml, modules, and import schemes
The generalization of open
is documented in the paper "Extending
OCaml's
open
",
presented at the OCaml Workshop
2017,
and implemented—in OCaml version 4.08
, in pull requests
#1506
and
#2147
(likely followed
by a few fixes after merging). This generalization greatly increases
the flexibility of the open
construct, making it possible to use
various tricks to finely control the import of module components
into another module.
Some of the techniques shown here are directly adapted from the paper, which, beyond discussing implementation strategies, also explores various use cases—though some of them fall outside the scope of this article, as they don't relate to import strategies.
It's likely that many of the tricks presented here won't become
idiomatic in OCaml codebases. In my view, their main purpose is to
highlight the increased flexibility of the open
primitive, without
relying on a dedicated syntactic extension for importing
components—while also showcasing a few somewhat far-fetched
encodings... just for the sake of demonstration.
Importing Module Components
When describing an OCaml program, one constructs collections of
modules that export components (types, submodules, exceptions, and
functions). It is therefore crucial to finely control their
accessibility from other modules. For this, we have two primitives —
open
and include
— whose difference is subtle. To clearly describe
the differences between these two primitives, we will base ourselves
on this (somewhat artificial) module, which we will use in the
following examples:
module A = struct
type a = T_a
let a_a = 1
let a_b = T_a
module B = struct
type b = T_b
let b_a = T_b
let b_b = T_a
end
end
module A : sig
type a = T_a
val a_a : int
val a_b : a
module B : sig
type b = T_b
val b_a : b
val b_b : a
end
end
As you may notice, the implementation—and de facto, the signature —
of the module is not very interesting; it will only serve to
illustrate my point. What we want is to use, in a file c.ml
(which
will denote the module C
), the components described in A
.
The first approach, and the most obvious one, is to use their full
names (a fully qualified call) by using the module path. For
example, let's create a pair of int
and A.a
:
let c_a = (A.a_a, A.B.b_b)
However, using fully qualified paths can sometimes be tedious (or
even completely unreadable). That’s why we’ll look at how to import
components from module A
into module C
. However, since the goal
of this article is not to be a tutorial on open
and include
, but
to explore generalized opens to describe import schemes, we won’t go
into detail about these two features, which are already thoroughly
documented in the language
manual — in the
sections dedicated to
modules,
overriding through
opens,
and generalized opens
Module Opening
The open
primitive allows importing the components of a module
into another module, without re-exporting them in the current
module. For example, let's use open
to rewrite the function c_a
:
open A
let c_a = (a_a, B.b_b)
val c_a : int * A.a
As we can see from the (inferred) signature, it does not export
the components of module A
and references the type a
by its full
path. It would also be possible to open A.B
using open A.B
or even
open B
(since A
is already open).
In the same way that it's possible to open modules in the implementation, it's also possible to open modules in the signature, allowing shorter paths when referring to types (or modules).
Local module opening
The opening cases we previously observed were global to the module in which they were invoked, which can be somewhat restrictive when you want to open multiple modules, exposing, for example, arithmetic operators. Fortunately, it's possible to open at the expression level, in two different ways:
-
let open Module in ...
opens the module locally — lexically scoped — within the following expression block. This is very useful when you want to open a module only within a function; -
Module.(...)
opens the module — also lexically scoped — only within the parentheses. This is very useful when you want to open a module just within a single expression. For example, imagine theFloat
module exposes a submoduleInfix
providing the usual arithmetic operators; you could describe a floating-point operation like this:let x = Float.Infix.(1.2 + 3.14 + 1.68)
.
The absence of local opening can be very limiting. For example, the
language F#
only allows global
openings,
which makes defining operators in a dedicated module cumbersome. This
encourages the use of operator
overloading
(or even Statically Resolved Type
Parameters),
sometimes resulting in considerable complexity.
Module inclusion
The include
primitive is very similar to the open
primitive except
that it — as its name suggests — includes the contents of the targeted
module into the body of the module where it is called. For example, if
we had used include
instead of open
in our previous example, let’s
observe the impact on the inferred signature:
include A
let c_a = (a_a, B.b_b)
type a = A.a = T_a
val a_a : int
val a_b : a
module B = A.B
val c_a : int * A.a
Even though the signature varies slightly from the one we defined
earlier — there are some subtleties regarding the propagation of
type and module equalities, described in the section "Recovering
the type of a
module",
related to
strengthening
— we can see that the contents of module A
have been included,
added to module C
. Unlike opening, it is not possible to do
local inclusion, which makes perfect sense because inclusion at a
local level would have exactly the same effect as opening.
From my personal experience, I generally identify two specific use
cases for using include
:
-
Extending an existing module (for example, adding a function to the
List
module within my project, or to extend the standard library); -
Including submodules within a parent module. For example, it is quite common that within a module, there are operators (or binding operators) which are often confined in dedicated submodules (usually
Infix
andSyntax
). For API reasons, re-exporting them at the parent module level can be a good idea. This is, in fact, intensively used in Preface.
Inclusions are a powerful tool for extension, but also for code sharing, and there is much to say because it often involves substitution, destructive substitution, or strengthening, which would require writing a dedicated article!
4.08
Opening VS Inclusion before OCaml Before the merge of the proposal for generalizing opens, there was
a significant difference in the usage of open
versus include
:
the parameter each primitive accepted.
-
open
took a module path, for example:A
or evenA.B.C
; -
include
took a module expression, for example: paths likeA
orA.B.C
, but also functor applications likeF(X)
, modules constrained by signatures(M : S)
, or directly the module bodystruct ... end
.
This difference in flexibility involved rather cumbersome workarounds
to achieve the same level of expressiveness for open
compared to
include
. Indeed, to allow open
to work with functor applications
or constraints, it was necessary to use intermediate modules.
In the case of using a path, the two calls are — in terms of expressiveness — identical, because a path can also be a module expression:
include A.B.C
open A.B.C
However, as soon as we try to open somewhat more complex cases,
natively supported by include
, we quickly had to introduce
intermediate modules:
include F(X)
include (M : S)
include struct
let x = 10
end
module A = F(X)
open A
module B : S = M
open B
module C = struct
let x = 10
end
open C
Even if at first glance it may not seem dramatic, introducing
intermediate modules requires not exporting them in the signature of
the module that opens them (in its mli
or signature). Additionally,
although open
and include
are often presented symmetrically, their
asymmetry in the parameters they take was regrettable. Fortunately,
since version 4.08
(released in June 2019), thanks to the
generalization of openings, open
now accepts an arbitrary module
expression, just like include
, allowing us to use it to mimic those
import schemes mentioned at the beginning of this article.
A first benefit
The fact that the open
primitive can take arbitrary module
expressions offers a first benefit — probably trivial if you like
writing your module signatures: the definition of local
expressions. Indeed, opening a module does not export its contents,
so it is possible to very easily define non-exported top-level values
by defining them inside an open struct ... end
expression. For
example:
open struct
let x = 10
let y = 20
end
let z = x + y
val z : int
The functions x
and y
are confined within an open block, so
they are not exported, which can be very useful when you want to
define values (types or modules) locally. Moreover, since a
structure can be constrained by a signature, it is also possible,
for example, to encapsulate shared mutable state within the local
open, preventing it from escaping the scope of its opening. Here
are two examples where it is impossible to modify the reference cell
without going through the exported combinators, the first using a
constraint, the second using nested local opens:
open (
struct
let cell = ref 0
let incr () = cell := !cell + 1
let decr () = cell := !cell - 1
end :
sig
val incr : unit -> unit
val decr : unit -> unit
end)
open struct
open struct
let cell = ref 0
end
let incr () = cell := !cell + 1
let decr () = cell := !cell - 1
end
Even though the classic approach used by an OCaml developer is to use signatures for encapsulation concerns, when used as such, the generalized module opening allows you to hide, in a relatively elegant way, certain boilerplate elements necessary for building a module (which should expose a public API via a signature).
Now that we have looked at some examples of using generalized opening, let's see how it makes having a dedicated language for import schemes questionably useful.
Import Schemes
Since modularity has become a cornerstone of software design —
JavaScript, for example, has layered many
proposals
where the modularization and import strategy depends on the framework
or build system used — languages like
Python and
Haskell have introduced features similar
to OCaml's open
primitive to import components into the current
module. Generally, these import directives form a small language of
their own, governed by specific rules and grammar. Since open
has
been generalized in OCaml, it is possible to encode much of the
usual import patterns — even though some, like those proposed by
Haskell, may be somewhat verbose to encode.
For the example, we will use the following module as the import target:
module Foo : sig
val x : int
val y : string
val z : char
end
However, there is an essential nuance regarding the notion of
qualified import: in Haskell, to use a module, it must be
imported, whereas in OCaml, it is the build system (compilation
scheme) that indicates whether a module is present or not. In our
various examples, we will assume that the module Foo
, which we
described earlier, is present in our compilation scheme. Therefore,
for qualified imports — where terms are always prefixed by their
module path — no additional ceremony is necessary. It is important to
keep in mind that the tricks presented below can be combined to build
very specific (and probably unrealistic) import schemes, demonstrating
that with a bit of verbosity, the language approach still allows
more flexibility than a rigid import DSL.
Unqualified Import
The first directive simply imports the definitions from Foo
into the
current module, namely the functions x
, y
, and z
:
import * from Foo
open Foo
There is no subtlety here; importing all the terms exposed by Foo
simply consists of opening it. There isn't much more to say, as we are
not leveraging any particular language subtlety here.
Renamed Qualification
Another common directive consists of renaming the module, for example,
importing Foo
under the name Bar
so that Bar.x
, Bar.y
, and
Bar.z
are accessible in the module. For this, one can use
type-level module
aliases.
import Foo as Bar
open struct module Bar = Foo end
We use the open struct ... end
construct to keep our alias hidden
from the module's public API. This ensures that the alias doesn’t leak
outside the module. However, if the module has an explicit signature,
this is less critical, as simply omitting Bar
from the signature
will prevent it from being exposed.
Presence of the renamed module
Using a module alias leaves the module Foo
accessible, and in
some cases, we might want to make it inaccessible. The simplest
solution is to simply empty the module and, to clarify the error
related to its undesired use, we can add an
alert
indicating that the module has been removed:
open struct
module Bar = Foo
module Foo = struct end [@@alert erased]
end
Making the use of the module Foo
within the current module
impossible by raising an alert. However, since it is common practice
to provide module signatures — and thereby control the public API — you
will more often encounter renamings like: module Bar = Foo
.
Furthermore, I am not convinced that restricting access to the
original module is truly problematic.
Nested renaming
One might imagine renaming like this: import Foo as Bar.Baz
, but
OCaml does not allow full path descriptions of the form module Bar.Baz = Foo
. Instead, you need to describe the module nesting
hierarchy explicitly, like this, making the functions Bar.Baz.x
,
Bar.Baz.y
, and Bar.Baz.z
available in the current module:
open struct
module Bar = struct
module Baz = Foo
end
end
Which, I admit, is a bit verbose, but if for some obscure reason you want to rename an existing module using a composed path, you can do so by explicitly declaring the module hierarchy.
Selective Import
Sometimes, importing the entire module is overkill, and we only
want a few components exposed by it. That’s why it’s possible to
import just a part of a module. In this example, we want to import
only x
and y
:
import {x, y} from Foo
open struct
let x = Foo.x
let y = Foo.y
end
Although this approach is also somewhat verbose, it imports only the
functions x
and y
into the current module. It’s possible to
simplify this syntax by using tuples and local openings:
open struct
let (x, y) = Foo.(x, y)
end
Alternatively, it is also possible to constrain the opening using a signature, which requires specifying the types of the functions to export:
open (Foo : sig
val x : int
val y : string end)
Several proposals
(#10013
and
#11558
)
have been made to enable the use of
let-punning,
which would make the syntax less verbose. However, the first proposal
dropped punning for module members, and the second is still at the
issue stage.
Selective import with renaming
Since the first two proposals leave the user in control of naming
(it’s just function redefinition), renaming can be trivially
integrated. In this example, we expose x
and new_y_name
, which
calls Foo.y
:
import {x, y as new_y_name} from Foo
open struct
let (x, new_y_name) = Foo.(x, y)
end
Unsurprisingly, renaming is quite straightforward. However, if we
wanted to use the signature-based approach, it would require a bit
more trickery by combining an open
with an include
:
open (struct
include Foo
let new_y_name = y
end : sig
val x : int
val new_y_name : string end)
However, this last proposal is so verbose that it becomes somewhat irrational — especially compared to the previous one — and I imagine it’s the kind of code you’ll never see — or at least never want to see — in a regular codebase. That said, even though it’s heavy, I find it still quite clearly demonstrates how it is possible to compose the different constructions and tools we’ve seen earlier.
Import by Exclusion
Haskell has a somewhat special import modifier that I hesitated to
mention for a long time because I had no idea how to implement it. But
once again, thanks to the invaluable help of
@octachron and
@xhtmlboi, who both gave me roughly the
same solution, here it is. This modifier allows importing an entire
module except for a list of components. In this example, x
and
y
will be imported because we import the whole Foo
module,
except for the function z
.
import Foo hidding (z)
OCaml does not natively support constructing intersections or differences of modules. The solution proposed by @octachron and @xhtmlboi relies on function rewriting combined with the use of an alert, in a way somewhat similar to what we did to exclude a renamed module. However, before examining their solution, let's take a brief detour into the empty variant.
Empty Variant
In OCaml, it is possible to define a sum type with no constructors
using the empty
variant,
which essentially allows you to describe unrepresentable values. To
define it, you simply create a sum type with an empty branch (which,
important, is not the bottom
type, denoted ⊥
):
type empty = |
To convince yourself that the compiler can reject cases containing a
value of type empty
, you can easily experiment with pattern
matching. In the example below, the compiler raises no warnings
because the patterns are exhaustive. Since it's impossible to
construct a value of type empty
(except by cheating, for instance
using sorcery like the infamous Obj.magic
function), we can refute
handling the error case:
let f : ('a, empty) result -> 'a = function
| Ok x -> x
But in our use case, it’s not the refutation that interests us, rather
the fact that it’s impossible to describe a value of type empty
,
which we can leverage to exclude certain functions.
Function Suppression
The solution proposed to me is to make the functions we want to remove from the module impossible to call. To do this, we will first create a placeholder function that we will use to override an existing function:
type empty = |
let expelled : empty -> unit = fun _ -> ()
At first glance, our expelled
function is impossible to call because
it requires a value of type empty
, which cannot be
produced. Therefore, we can include the module we want to refine
and then override the functions we want to exclude with our
expelled
function, associating them with an alert to clarify the
error triggered by using a removed function:
open struct
include Foo
let (z [@alert expelled]) = expelled
end
And there you have it, we can be pretty sure that using z
will
cause a compilation error, and compiling a module that uses it will
raise a warning. However, the solution is far from perfect because it
does not remove the component from the module. To be honest, I've
very rarely found myself missing this feature natively. From my
perspective, selective importation usually suffices quite well.
Type Anchoring
Before concluding this article,
@octachron pointed out to me the
partial asymmetry between open
and include
when it comes to
anonymous modules (i.e., module expressions struct ... end
). This is
an issue I had already encountered theoretically, having attended the
May 2023 event of OCaml Users in Paris where
Clément Blaudeau gave his talk
Retrofitting OCaml
Modules,
which was a summary of his paper OCaml modules: formalization,
insights and
improvements.
Since open
does not export the opened components, without
associating an explicit signature, some expressions cannot be
typed. For example:
open struct type t = A end
let x = A
In this example, the type t
(and its constructor A
) is present in
the current scope; however, since it is not exported, it is impossible
to correctly type x
. If the module had a signature, one could easily
realize that there is no acceptable type for x
and that one should
either change the open
directive or avoid exporting x
. This issue
is known as type anchoring, which is extensively described in the
paper cited at the beginning of this section.
Conclusion
I sincerely believe that in daily OCaml use, we very rarely encounter
such needs. The goal of this article was essentially to show how to
use certain primitives related to the module system, alongside the
generalized open
feature, to demonstrate that having expressive and
composable primitives allows one to reproduce, sometimes trivially
(and sometimes less so), common import patterns found in other
programming languages. There are probably other fun encodings —
likely based on functors — so don’t hesitate to share them with me
so I can update this article!
To conclude, I would add that even though I proudly boasted that programming this way in OCaml is uncommon, the existence of packages like ppx_import or ppx_open shows that some syntactic sugar wouldn’t hurt—especially for selective imports.