Hi Friends, welcome back!
Today we’re going to be continuing our work on 2019’s Advent of Code, Day 04. If you remember from our last post, our program started out awfully slow thanks to some ambiguous typing. We were able to speed it up pretty substantially by providing type hints.
Now, in the story I told yesterday, we went straight from slow code to fast code by jumping straight into type hinting. When I was puzzling this out by myself, the first thing I actually did was attempt to parallelize the application to speed it up. You don’t know what you don’t know, and I didn’t know about type hinting in Clojure, nor did I have reflection warnings turned on.
What I was delighted to discover is that parallelization in Clojure is actually very straightforward and doesn’t doesn’t require dealing with locking or mutexes or passing stuff off to threads.
There are a couple “gotchas,” which we’ll go over, but you can parallelize many operations simply by replacing a few key function calls with their counterparts from
The biggest resources for me when researching this topic were:
- @nonrecursive’s excellent “Know your Reducers” tutorial
- Rich Hickey’s Anatomy of a Reducer up on the official website
Let’s look at our original code, and see what we can do to speed it up.
Original Code (pre-reducers)
(ns advent-2019.day04 (:gen-class)) (defn get-digit [^Character x] (Character/digit x 10)) (defn get-digits "Transforms a string or a number into a list of individual digits" [^Integer number] (map get-digit (str number))) (defn valid-number? [number] (let [digits (get-digits number)] (and (not (apply distinct? digits)) (apply <= digits)))) (defn part1 [numbers] (count (filter valid-number? numbers))) (defn -main  (let [[start end] [402328 864247] numbers (range start (inc end))] (time (println "Day 04, part 1:" (part1 numbers)))))
Remember this guy? When we left off, we were running in about half a second:
“Elapsed time: 503.0648 msecs”
Here’s how we’re going to speed it up:
clojure.core.reducersinto our namespace
- Replace our sequence functions with the reducers
- Deal with the “gotchas”
This is pretty straightforward:
(ns advent-2019.day99 (:gen-class))
(ns advent-2019.day04 (:require [clojure.core.reducers :as r]) (:gen-class))
No surprises there!
Replace sequence functions with reducers
This is where the magic happens. @nonrecursive does a great job of describing why reducers do such a good job speeding things up. From the link above:
Reducers are designed to perform better (often dramatically better) than seq functions under some circumstances by performing eager computation, eliminating intermediate collections and allowing parallelism.
filter are “seq functions” — they operate on a sequence, in order.
By moving to
r/filter, we can start to take advantages of the eager computations and parallelism mentioned above.
filter, deal with the gotcha
Let’s first replace our
(defn part1 [numbers] (count (filter valid-number? numbers)))
(defn part1 [numbers] (count (r/filter valid-number? numbers)))
Run the program and what do we get?
Syntax error (UnsupportedOperationException) compiling at (/tmp/form-init13929748832015658567.clj:1:74). count not supported on this type: reducers$folder$reify__332
And here’s our first “gotcha:” turns out that we can’t
count the result of a reducer because of the whole bit about “eliminating intermediate collections.”
Rich has this to say:
The premise of the reducers library is that the minimum definition of collection is something that is reducible
The functions consist only of the core logic of their operations
That logic does not include any notion of collection, nor order
filtering and kin can ‘skip’ inputs by simply returning the incoming result
So the reason we can’t “count” our results is that we haven’t actually accumulated the results of our
r/filter into a data structure we could count.
That’s an easy fix, though!
(defn part1 [numbers] (->> numbers ; Turn this into a threading macro for readability (r/filter valid-number?) r/foldcat ; Add this line! count))
r/foldcat, we accumulate our results so that sequence functions like
count can be invoked on them.
Let’s see how much faster it is!
“Elapsed time: 508.2671 msecs”
Turns out, there’s another gotcha with reducers: some sequences are not ”foldable.” Vectors, however, are foldable sequences, so we can see some pretty substantial gains by simply casting our range to a vector, like so:
(defn part1 [numbers] (->> numbers vec ; Add this line! (r/filter valid-number?) r/foldcat count))
“Elapsed time: 238.0544 msecs”
And with that one change, our code runs almost twice as fast!
Taking what Rich was saying into mind, we should refactor our code now to take advantage of the benefits imparted by composing reducers. By reducing intermediate collections, we allow Clojure to do more work in parallel before bringing everything back together.
Instead of getting the digits for our number inside of the
valid-number? function, let’s refactor it to accept digits, and chain
get-digits into our reducer chain:
(defn valid-number? [number] (let [digits (get-digits number)] (and (not (apply distinct? digits)) (apply <= digits)))) (defn part1 [numbers] (->> numbers vec (r/filter valid-number?) r/foldcat count))
(defn valid-number? [digits] ; No more (let) binding here (and (not (apply distinct? digits)) (apply <= digits))) (defn part1 [numbers] (->> numbers vec (r/map get-digits) ; Add this line! (r/filter valid-number?) r/foldcat count))
How’d we end up doing?
“Elapsed time: 226.0884 msecs”
Just a little faster, only about 5%. Before I discovered type hinting, however, these changes yielded substantially better results!
If you remember from our previous post, type hinting enabled us to get from 9 seconds of runtime down to half a second. But if you remember from the top of this post, I actually implemented folding and reducers before I knew type hinting existed.
When I was stumbling through this on my own, I was able to get the time down from 9s to 1.5s using reducers. I think in that case, we were able to take better advantage of the parallelism because the reflection was adding so much extra work to do.
So while “twice as fast” doesn’t make for quite as good a headline as “six times faster,” the fact remains that we’re still seeing substantial performance improvements from a single import and a couple small tweaks to our existing code. Nice!
At this point, if we hope to further improve performance (and I do love improving performance), we will need to employ profiling tools to see where we spend the most time (where our code is the “hottest”). If we can identify those areas and make them more performant, our code will get even faster.
So, in a later post, we’ll explore how to profile our functions in Clojure to find out where we spend the most time for some targeted performance improvements. Stay tuned!