Clojure on Metal
December 21, 2019
Once a year, when December rolls around, we dust off our Clojure skills for some fun Advent of Code programming challenges. For a week or so, anyway.
For me, though, solving the problems is just an excuse to become a better engineer. Sure, you get that dopamine hit when you solve the problem, but the real rush comes when you get your correct solution running FAST.
It's not uncommon for me to spend twice as long "cleaning up" my solution as it took to solve the problem. The results of these efforts are almost always blog-worthy, and I have some fun stories to share with you this year!
The first such adventure happened on the very first problem: Advent of Code 2019: Day 1.
If you have any interest in solving Advent of Code problems yourself, please be advised that this post contains spoilers! But only for day 1. Which I'm sure you could solve very quickly. Go on, then! I promise this post will be here when you get back.
The Problem: Advent of Code 2019: Day 1
I solved this problem using the map-reduce paradigm:
- Map over the input to calculate the fuel requirements
- Reduce all the requirements into a single sum
In Clojure, the function to calculate the fuel requirements looks like:
(defn calc-fuel [mass]
  (-> mass (/ 3) (Math/floor) (int) (- 2)))
Take the mass, divide it by three, round down, cast it to an int, and subtract 2. And really this was just an excuse to test out the syntax highlighting in the blog.
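As a minimal sketch of the map-reduce shape described above: calc-fuel is the function from this post, while part1 here is an assumed reconstruction for illustration and may not match the code in the repository. The sample masses are the worked examples from the puzzle statement.

```clojure
;; calc-fuel as defined in this post.
(defn calc-fuel [mass]
  (-> mass (/ 3) (Math/floor) (int) (- 2)))

;; Assumed sketch of Part 1: map each mass to its fuel requirement,
;; then reduce the requirements into a single sum.
(defn part1 [masses]
  (reduce + (map calc-fuel masses)))

(part1 [12 14 1969 100756])
;; => 34241
```

Keeping calc-fuel as a pure function of one mass is what makes the map step trivial: the reduce only ever sees finished fuel values.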
The full solution can be found up on GitHub.
Running
Initially, I used the lein run command to run the solution:
$ lein trampoline run -m advent-2019.day01
The trampoline task makes it run faster. From the lein docs:
For long-running lein run processes, you may wish to save memory with the higher-order trampoline task, which allows the Leiningen JVM process to exit before launching your project's JVM.
On my computer, day01 takes about 1.5s to run:
$ time lein trampoline run -m advent-2019.day01
Day 01, Part 1: 3337766
Day 01, Part 2: 5003788
real 0m1.571s
...
This is pretty slow for what the program does, but the above command takes the time to compile our code before running it. If we compile our code ahead of time, we don't have to sit through the compilation!
Running Faster
Compiling our solution to a JAR means that the time it takes to run no longer includes the time to compile the code:
$ lein with-profile day01 uberjar
Compiling advent-2019.core
Compiling advent-2019.day01
Created /[...]/advent-2019/target/advent2019-0.1.0-SNAPSHOT.jar
Created /[...]/advent-2019/target/advent2019-day01.jar
$ time java -jar ./target/advent2019-day01.jar
Day 01, Part 1: 3337766
Day 01, Part 2: 5003788
real 0m0.519s
...
You can see we've shaved about 1 second off, reducing our execution time by about 67%.
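The with-profile day01 invocation above implies a Leiningen profile that names the entry point and the JAR. The fragment below is only a guess at what such a profile could look like, inferred from the build output; the real project.clj lives in the repository and may differ. The keys used (:main, :aot, :uberjar-name) are standard Leiningen options, while the dependency version is an assumption.

```clojure
;; project.clj (hypothetical sketch, not the author's actual file)
(defproject advent2019 "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.10.1"]] ; assumed version
  :profiles
  {:day01 {:main advent-2019.day01           ; namespace whose -main is the entry point
           :aot [advent-2019.day01]          ; compile to class files ahead of time
           :uberjar-name "advent2019-day01.jar"}})
```

For java -jar to find a main class, the advent-2019.day01 namespace would also need (:gen-class) in its ns form.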
But we can do better. We need more speed.
Running ON METAL
At this point, having followed GraalVM's development on and off for a couple of years, I wanted to see what it could do for Clojure code. I've always been of the mindset that, in Clojure, you have to sacrifice runtime speed in exchange for ease of development and high-level thinking. I was hopeful that, by compiling Java bytecode down to machine code, we'd see substantial gains in both execution time and memory consumption.
So I set out to see if we could run this program on "the metal." If we could go fast.
For those of you following along at home:
First, follow BrunoBonacci's excellent instructions to get GraalVM installed and on your path to replace your default Java installation.
If GraalVM is installed correctly, you should see:
$ java -version
openjdk version "11.0.5" 2019-10-15
OpenJDK Runtime Environment (build 11.0.5+10-jvmci-19.3-b05-LTS)
OpenJDK 64-Bit GraalVM CE 19.3.0 (build 11.0.5+10-jvmci-19.3-b05-LTS, mixed mode, sharing)
Then, we'll rebuild our JAR using GraalVM, and compile the new JAR down to machine code:
$ lein do clean, with-profile day01 uberjar
$ native-image --report-unsupported-elements-at-runtime --initialize-at-build-time -jar ./target/advent2019-day01.jar -H:Name=./target/day01
We're on the metal now! Our executable is over twice as large as the original JAR:
$ du -sh ./target/*
4.9M ./target/advent2019-day01.jar
10M ./target/day01
…but it contains all of the Java and Clojure we need to run our program (in addition to the program itself) without use of the JVM. Large binaries are the tradeoff we make for superior performance and memory utilization.
It is time. We measure the speed:
$ time ./target/day01
Day 01, Part 1: 3337766
Day 01, Part 2: 5003788
real 0m0.004s
...
Our program is now two whole orders of magnitude faster than running a JAR on the JVM. When compared with running our program using lein run, it runs at 39,275% of the speed, or roughly 393 times as fast.
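The ratios behind those claims can be reproduced with a little arithmetic. This is a small sketch using the real times reported above; the names lein-run, jar, native, speedup, and as-percent are made up for illustration.

```clojure
;; Wall-clock ("real") times in seconds, copied from the runs above.
(def lein-run 1.571)
(def jar      0.519)
(def native   0.004)

(defn speedup
  "How many times as fast the new run is, compared with the old one."
  [old new]
  (/ old new))

(defn as-percent
  "Expresses a ratio as a percentage."
  [ratio]
  (* 100 ratio))

(speedup jar native)                   ; roughly 130, about two orders of magnitude
(speedup lein-run native)              ; roughly 393
(as-percent (speedup lein-run native)) ; roughly 39,275
```

A 4 ms reading sits close to the resolution of the time command, so these ratios are best read as rough magnitudes.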
Memory Utilization
Instead of making vague hand-wavy claims about memory utilization, let's run the numbers using /usr/bin/time -v, which is different from the Bash built-in time. We are interested in the "Maximum resident set size", but I have included the other timing again because it's just so interesting.
# JAVA
$ /usr/bin/time -v java -jar ./target/advent2019-day01.jar
...
Command being timed: "java -jar ./target/advent2019-day01.jar"
User time (seconds): 1.41
System time (seconds): 0.21
Percent of CPU this job got: 263%
...
Maximum resident set size (kbytes): 319272
...
# BARE METAL
$ /usr/bin/time -v ./target/day01
...
Command being timed: "./target/day01"
User time (seconds): 0.00
System time (seconds): 0.00
Percent of CPU this job got: 33%
...
Maximum resident set size (kbytes): 9916
...
Looks like we have a one-two-three punch when it comes to bare metal performance, counting roughly in orders of magnitude (OOM):
- One OOM better CPU utilization
- Two OOM better memory utilization
- Three OOM better real time elapsed
I cannot take credit for this image.
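For anyone who wants the exact figures behind the list above, an order of magnitude is just the base-10 logarithm of a ratio. The sketch below uses only numbers measured in this post; orders-of-magnitude is a hypothetical helper name. The list rounds these values generously.

```clojure
(defn orders-of-magnitude
  "Base-10 logarithm of the ratio between a baseline and an improved value."
  [baseline improved]
  (Math/log10 (double (/ baseline improved))))

;; Figures taken from the measurements in this post.
(orders-of-magnitude 263 33)       ; CPU share in percent, about 0.9
(orders-of-magnitude 319272 9916)  ; max resident set size in kbytes, about 1.5
(orders-of-magnitude 0.519 0.004)  ; real time in seconds, JAR vs. native, about 2.1
(orders-of-magnitude 1.571 0.004)  ; real time in seconds, lein run vs. native, about 2.6
```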
OK, but what does Clojure think?
Let's be honest: we're mostly measuring things around program execution, rather than the time the computer spends actually performing calculations. Most, if not all, of the speedups we measured above come from the fact that we're not using the JVM anymore, and don't need to wait for it to warm up, read bytecode, and spit out an answer.
So let's see what Clojure thinks.
Clojure provides a (time) macro for us. It measures the wall-clock time the computer spends evaluating an expression. It's pretty easy to use:
Before
(defn -main []
  (let [input (get-input "day01.txt" true)]
    (println "Day 01, Part 1:" (part1 input))
    (println "Day 01, Part 2:" (part2 input))))
After
(defn -main []
  (let [input (get-input "day01.txt" true)]
    (time (println "Day 01, Part 1:" (part1 input)))
    (time (println "Day 01, Part 2:" (part2 input)))))
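For the curious, time evaluates an expression, prints the elapsed wall-clock time, and returns the expression's value. The same idea can be sketched as a plain function built on System/nanoTime; timed is a hypothetical helper name here, not part of clojure.core and not something the solution uses.

```clojure
(defn timed
  "Hypothetical helper: calls the zero-argument function f and returns
  its result together with the elapsed wall-clock time in milliseconds."
  [f]
  (let [start   (System/nanoTime)
        result  (f)
        elapsed (/ (- (System/nanoTime) start) 1e6)] ; nanoseconds to milliseconds
    {:result result :elapsed-ms elapsed}))

;; Usage: wrap the work in an anonymous function.
(:result (timed #(reduce + (range 1000))))
;; => 499500
```

Returning the elapsed time as data instead of printing it makes it easier to compare runs programmatically.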
When we recompile our program and run it with Java, we get:
$ java -jar ./target/advent2019-day01.jar
Day 01, Part 1: 3337766
"Elapsed time: 5.3791 msecs"
Day 01, Part 2: 5003788
"Elapsed time: 9.0699 msecs"
When we run it on bare metal, we get:
$ ./target/day01
Day 01, Part 1: 3337766
"Elapsed time: 0.4455 msecs"
Day 01, Part 2: 5003788
"Elapsed time: 0.5274 msecs"
Oh. So it's still an order of magnitude faster than Java even forsaking all the advantages we get for not using the JVM.
Nice.
Conclusion
It took a bit of time and research getting compilation working, but I was excited to finally see if it made a difference for running Clojure. Man oh man, what a difference it made.
As a result of running these experiments, I've had to change my mindset about Clojure. There are still many compelling arguments online for using a language like Rust if you need ultra-fast performance on the metal, but for the vast majority of applications, I feel that Clojure no longer requires that one compromise performance in exchange for all its other amazing benefits.
Basically:
- It is faster to develop software using Clojure
- It is faster and more memory-efficient to run (compiled) Clojure than it is to run JARs on the JVM
- Clojure suffers from far less entropy and churn than the JavaScript and Python ecosystems, which further reduces development costs
- But, like our large binary sizes, there's a high upfront cost to learning Clojure
As engineers, learning is part of the job. I'm of the opinion that taking the time to learn while off the clock is a much more efficient way to become a better developer, because you're in charge of how you spend your time.
Learning Clojure is probably one of the best things you can do to become a better developer, even if you don't end up using it in your day job. Once you've made the decision to learn it, there really aren't any downsides left!