It is not surprising that the question of Clojure's long startup time is raised time and time again on forums and social media. This topic concerns users who want to use Clojure as a scripting language, or who have a particular workflow which requires restarting the Clojure program often. Compared to languages like Python, Ruby, and JavaScript all of which have less than 100 millisecond startup time, Clojure's seconds or even tenth of seconds seem incredibly slow and wasteful.
The nature of Clojure loading is not exactly a secret. A few years ago Nicholas Kariniemi profiled Clojure startup process and described his findings in the post "Why is Clojure bootstrapping so slow?". One of the points Nick brings forward is that JVM is not to blame for Clojure's slowness — after all, a HelloWorld in Java executes in 50 ms. This is only partly true, JVM's startup time is not a problem, but the JVM as a platform makes Clojure starting time worse. JVM is optimized towards running long-lived programs and invests a lot of time upfront when initializing the classes. That's why Clojure on JVM boots much slower than ClojureScript on NodeJS.
The silver lining in this situation, and the main thing that keeps Clojure alive even with its ludicrously long bootstrap, is the REPL, of course. So what that you have to wait seconds until the project starts — you'll get your revenge later, when the results in the REPL come back instantly while others are busy recompiling and reloading. With some dexterity, you might not restart your REPL for days. E.g., Leiningen and Boot both allow to dynamically inject new dependencies into a running REPL (in Leiningen you'll need clj-refactor or Alembic, Boot can do that by itself). If you are worried about polluting your REPL state, you can adopt a workflow like Reloaded. Even when writing new scripts, I prefer starting a REPL, connecting to it from Emacs, and then interactively produce pieces of the script as I go.
So, there aren't many reasons to restart Clojure programs as you would do with Python, for example. Yet the questions stand — what exactly is behind Clojure's slow start and can we do anything about it? Today we'll try to answer the first one in an exhaustive and reproducible way.
We are going to profile Clojure's "time to REPL" for three common use cases: bare Clojure using Clojure CLI, Leiningen, and Boot. In all three cases, we'll be using the minimum possible number of dependencies (nothing besides what comes with or is required by the build tool). Clojure version is 1.9.0 in all cases.
For profiling, we'll take async-profiler which has the ability to begin profiling from the very start of JVM process. In the previous post, I announced a Clojure wrapper around async-profiler — we won't need it here because it is not suited for startup profiling. The default async-profiler is precisely what we need here.
We will not add any "accelerating" JVM flags like -XX:+TieredCompilation -XX:TieredStopAtLevel=1 -Xverify:none
. While I use those day-to-day, the choice
to enable them is opinionated and has certain tradeoffs.
This post is about profiling, not benchmarking, so the parameters of my testing rig are less critical. For what it's worth, I ran these experiments on MacBook Pro 2015 with 3.1 GHz Core i7.
To reproduce the experiments in this post do the following steps.
Get async-profiler. First, clone it
from Github. Then
you'll have to compile the native agent with make
. Note that JAVA_HOME
environment variable must be set when running make. On MacOS you can do:
$ JAVA_HOME=$(/usr/libexec/java_home) make
If all goes well, you'll find a directory build
, with two files: jattach
and
libasyncProfiler.so
. We'll only be needing the last one. You may move this
file into a more convenient place or keep it here; just keep in mind that
whenever in the post we refer to libasyncProfiler.so
, it means the valid path
to the file.
Clone FlameGraph repository. We'll
need it for generating flamegraphs from the profiling data. We'll invoke
specifically flamegraph.pl
, again, substitute it with a valid path when
running this command.
This guide describes how to install Clojure CLI. Boot and Leiningen can be downloaded from their respective websites.
Check if you have the correct Clojure version:
$ clj ~
Clojure 1.9.0
user=>
Now let's measure the time from start to instant exit:
$ time clj -e '(System/exit 0)'
clj -e '(System/exit 1)' 1.80s user 0.12s system 181% cpu 1.057 total
The time is around 1 second. It is much less than dozen-of-seconds load times you might have experienced with real-world projects. So, Clojure is not the primary time hog after all. But it is still too long for something so seemingly simple. Let's find out what is happening exactly.
$ cd /tmp
$ clj -J-agentpath:/path/to/agent/libasyncProfiler.so=start,event=cpu,file=raw-clojure.txt,collapsed -e '(System/exit 0)'
$ /path/to/FlameGraph/flamegraph.pl --colors=java --minwidth 2 raw-clojure.txt > raw-clojure.svg
Now, you can view the resulting flamegraph in the browser by visiting file:///tmp/raw-clojure.svg. Or just see my image below.
Note: I trimmed my flamegraphs a little so that they fit easier on the screen. Frames like (require ...) are manually collapsed.
If you haven't learned yet how to read flamegraphs, now is the excellent opportunity to do so. The vertical axis shows stacktraces going from bottom to top. On the horizontal axis, the width corresponds to how often the frame was seen on the stack during the execution (≈ time spent in the method). Relative position of frames on X-axis is arbitrary and doesn't mean anything. You can click the frames to zoom in/out on particular stacktraces.
You can explore the flamegraph on your own; meanwhile, I will list the main points of interest, together with some explanations. If you can't find a stackframe I'm referring to, use Search button in the top right corner and enter the sought frame.
clojure/lang/RT.load
(73.03%) that comes into
clojure/core__init.<clinit>
(64.29%). That corresponds to the initialization
of
clojure.core namespace.
The time spent there is so long because the namespace is that large, it is
already ~7700 LOC and counting.
clojure/core__init.load
(59.36%). This is
where Vars are assigned their values and metadata, and top-level
non-defining forms are executed. We can still distinguish several big
chunks.
clojure/core$fn__8055.invoke
(7.14%) corresponds
to
this form which
loads
core_instant18 namespace
(actually, I couldn't tell it from the flamegraph, I had to go to
decompile the generated class). I'm not sure why the namespace takes
so long to load though — it seems tiny, with just a single
extend-protocol
call. Needs further investigation.(load ...)
(11.70%) and (require ...)
(5.42%) loads and
initializes many medium-to-big namespaces like clojure.pprint
,
clojure.java.io
, clojure.core.protocols
, etc. Those namespaces
contain many definitions and plenty of top-level execution which adds
to the loading time significantly.java/lang/ClassLoader.loadClass
(23.40%) is where the JVM loads the
monster of a class that clojure/core__init.class
is (by the way, its
size is 316Kb).clojure/lang/Var.invoke
(18.23%), loads
clojure.core.server
namespace, which through clojure.main
triggers loading
of clojure.spec.alpha
. Spec is the latest
addition to the language, and while it has many useful applications, it is a
big piece of machinery that takes a while to load.We see that the most significant contributor to Clojure load time is the size of its base library. There are several potential ways to help that:
We are using Leiningen 2.8.1.
$ lein version
Leiningen 2.8.1 on Java 1.8.0_102 Java HotSpot(TM) 64-Bit Server VM
To profile Leiningen startup, we need a more complicated setup. First, we'll
have to create a new Lein project; otherwise, there is no way to specify the
exact Clojure version for lein repl
.
$ lein new profile-lein
$ cd profile-lein
$ sed -i 's/1\.8\.0/1.9.0/g' project.clj
$ lein repl
...
Clojure 1.9.0
Now we can measure the startup time with Lein:
$ echo '(System/exit 0)' | time lein repl
...
lein repl 6.88s user 0.61s system 124% cpu 6.003 total
The elapsed time of 6 seconds is much more than Clojure's original second.
Let's find out what exactly happens in there. There are some changes to be made
to project.clj
, here's what it should look like:
(defproject profile-lein "0.1.0-SNAPSHOT"
:jvm-opts ["-agentpath:/path/to/agent/libasyncProfiler.so=start,event=cpu,file=lein-child.txt,collapsed"]
:dependencies [[org.clojure/clojure "1.9.0"]])
And then launch the REPL like this:
$ echo '(System/exit 0)' | LEIN_JVM_OPTS='-agentpath:/path/to/agent/libasyncProfiler.so=start,event=cpu,file=lein-host.txt,collapsed' lein repl
You've probably noticed that we specified the async-profiler agent string twice.
It is not accidental. Leiningen launches two Clojure processes: one for itself,
and one for the actual project. We need to profile both of them, so we do, and
we save the output to different files, lein-host.txt
and lein-child.txt
.
Then, thanks to the beautiful textual nature of collapsed stacktraces, we can
straightforwardly merge them[1]:
$ cat lein-host.txt lein-child.txt > lein.txt
$ /path/to/FlameGraph/flamegraph.pl --colors=java --minwidth 2 lein.txt > lein.svg
I have heavily redacted this flamegraph to make it easier to navigate. Your flames would be higher.
clojure/main.<clinit>
(26.40%),
where the base Clojure is loaded. One thing is different: it contains twice
the sample count compared to when we profile bare Clojure. This is because
Leiningen loads Clojure twice.java/lang/Thread.run
(10.61%) on the right contains initialization of
Leiningen namespaces and tools.nrepl middleware. Significant part goes to
loading clojure-complete (clojure/lang/AFn.call
, 6.17%).clojure/main.main
(62.00%).
clojure/lang/RestFn.applyTo
(18.41%) consists mostly of
tools.nrepl.sever
initialization.clojure/core$require.invokeStatic
(13.88%) in the middle eventually
comes to loading pomegranate, a
library for resolving dependencies through Maven.leiningen/core/project$read.invoke
(3.61%) goes through
leiningen/core/project$load_plugins.invoke
(3.32%) towards
cemerick/.../aether$resolve_dependencies
(3.18%).
Looks like here Leiningen resolves plugins through Pomegranate.leiningen/core/main$task_args.invoke
(5.57%) also seems to be executing
nREPL-related code.leiningen/core/main$apply_task.invoke
(20.51%) ultimately
calls Reply, Clojure's alternative
REPL client.With this knowledge we can approximate that Leiningen's bootstrap process consists of:
I think there is room for improvement here, especially on the REPL side. More thorough investigation of tools.nrepl and Reply might reveal potential candidates for optimization.
Check Boot version. If Clojure is not 1.9.0, change it in ~/.boot/boot.properties
.
$ boot -V
...
BOOT_CLOJURE_VERSION=1.9.0
BOOT_VERSION=2.7.2
Now measure the start time:
$ time boot repl -s
boot repl -s 14.71s user 0.60s system 288% cpu 5.303 total
5.3 seconds, almost the same as Leiningen.
$ BOOT_JVM_OPTIONS='-agentpath:/path/to/agent/libasyncProfiler.so=start,event=cpu,file=boot.txt,collapsed' boot repl -s
$ /path/to/FlameGraph/flamegraph.pl --colors=java --minwidth 2 boot.txt > boot.svg
Boot.main
(19.35%) loads Boot's own namespaces. Boot has quite a
lot of features, thus plenty of code to load.clojure/lang/RT.<clinit>
(36.89%). Unlike Leiningen, Boot doesn't start two
JVM processes, but the number of samples loading Clojure is on par with Lein.
Why is it so? It is due to Boot's feature
called pods. Boot runs in a
single JVM process, but it still loads Clojure twice, one for its own
purposes (build tooling), and one for the application context..../ClojureRuntimeShimImpl.require
(21.00%)
seems to be initializing more Boot namespaces.clojure/lang/AFn.call
(19.75%) on the right resolves and adds dependencies
(4.46%), initializes tools.nrepl (11.31%), and prettifies REPL exceptions with
pretty (2.75%).To summarize, Boot spends those 5.3 seconds on:
I think the easiest way to reduce Boot loading time would be to trim the number of functions that need to be loaded from the start. From there, there are no immediately apparent optimizations.
As it turns out, Clojure start time is a complicated and multi-dimensional topic. Clojure projects are slow to start not only because of JVM — JVM itself starts in ~50 ms — but because of JVM specifics the classes are loaded slowly. Clojure projects are slow to start not only because of Clojure — Clojure itself starts in ~1 second — but because of Clojure specifics, the namespaces, especially not AOT-compiled one, are loaded slowly. And so on.
In this post, we have observed three common ways to use Clojure and discovered where the initial lag comes from. We haven't even started on long loading times of big projects with huge dependency trees. But now you know how to profile the startup of your application regardless of the build tool you are using. And meanwhile your REPL is loading, go make yourself a cup of tea. You earned it.