Reducing Runtime with Reducers
December 22, 2019
Hi Friends, welcome back!
Today weâre going to be continuing our work on 2019âs Advent of Code, Day 04. If you remember from our last post, our program started out awfully slow thanks to some ambiguous typing. We were able to speed it up pretty substantially by providing type hints.
Now, in the story I told yesterday, we went straight from slow code to fast code by jumping straight into type hinting. When I was puzzling this out by myself, the first thing I actually did was attempt to parallelize the application to speed it up. You donât know what you donât know, and I didnât know about type hinting in Clojure, nor did I have reflection warnings turned on.
What I was delighted to discover is that parallelization in Clojure is actually very straightforward and doesnât doesnât require dealing with locking or mutexes or passing stuff off to threads.
There are a couple âgotchas,â which weâll go over, but you can parallelize many operations simply by replacing a few key function calls with their counterparts from clojure.core.reducers
.
The biggest resources for me when researching this topic were:
- @nonrecursiveâs excellent âKnow your Reducersâ tutorial
- Rich Hickeyâs Anatomy of a Reducer up on the official website
Letâs look at our original code, and see what we can do to speed it up.
Original Code (pre-reducers)
(ns advent-2019.day04
(:gen-class))
(defn get-digit [^Character x] (Character/digit x 10))
(defn get-digits
"Transforms a string or a number into a list of individual digits"
[^Integer number] (map get-digit (str number)))
(defn valid-number? [number]
(let [digits (get-digits number)]
(and (not (apply distinct? digits))
(apply <= digits))))
(defn part1 [numbers]
(count (filter valid-number? numbers)))
(defn -main []
(let [[start end] [402328 864247]
numbers (range start (inc end))]
(time (println "Day 04, part 1:" (part1 numbers)))))
Remember this guy? When we left off, we were running in about half a second:
âElapsed time: 503.0648 msecsâ
Hereâs how weâre going to speed it up:
- Require
clojure.core.reducers
into our namespace - Replace our sequence functions with the reducers
- Deal with the âgotchasâ
Requiring Reducers
This is pretty straightforward:
Before
(ns advent-2019.day99
(:gen-class))
After
(ns advent-2019.day04
(:require [clojure.core.reducers :as r])
(:gen-class))
No surprises there!
Replace sequence functions with reducers
This is where the magic happens. @nonrecursive does a great job of describing why reducers do such a good job speeding things up. From the link above:
Reducers are designed to perform better (often dramatically better) than seq functions under some circumstances by performing eager computation, eliminating intermediate collections and allowing parallelism.
By themselves, map
and filter
are âseq functionsâ â they operate on a sequence, in order.
By moving to r/map
and r/filter
, we can start to take advantages of the eager computations and parallelism mentioned above.
Replace filter
, deal with the gotcha
Letâs first replace our filter
function:
Before
(defn part1 [numbers]
(count (filter valid-number? numbers)))
After
(defn part1 [numbers]
(count (r/filter valid-number? numbers)))
Run the program and what do we get?
Syntax error (UnsupportedOperationException) compiling at (/tmp/form-init13929748832015658567.clj:1:74).
count not supported on this type: reducers$folder$reify__332
And hereâs our first âgotcha:â turns out that we canât count
the result of a reducer because of the whole bit about âeliminating intermediate collections.â
Rich has this to say:
The premise of the reducers library is that the minimum definition of collection is something that is reducible
The functions consist only of the core logic of their operations
That logic does not include any notion of collection, nor order
filtering and kin can âskipâ inputs by simply returning the incoming result
So the reason we canât âcountâ our results is that we havenât actually accumulated the results of our r/filter
into a data structure we could count.
Thatâs an easy fix, though!
(defn part1 [numbers]
(->> numbers ; Turn this into a threading macro for readability
(r/filter valid-number?)
r/foldcat ; Add this line!
count))
By invoking r/foldcat
, we accumulate our results so that sequence functions like first
and count
can be invoked on them.
Letâs see how much faster it is!
âElapsed time: 508.2671 msecsâ
Itâs⊠not?
Turns out, thereâs another gotcha with reducers: some sequences are not âfoldable.â Vectors, however, are foldable sequences, so we can see some pretty substantial gains by simply casting our range to a vector, like so:
(defn part1 [numbers]
(->> numbers
vec ; Add this line!
(r/filter valid-number?)
r/foldcat
count))
âElapsed time: 238.0544 msecsâ
And with that one change, our code runs almost twice as fast!
Composing with r/map
Taking what Rich was saying into mind, we should refactor our code now to take advantage of the benefits imparted by composing reducers. By reducing intermediate collections, we allow Clojure to do more work in parallel before bringing everything back together.
Instead of getting the digits for our number inside of the valid-number?
function, letâs refactor it to accept digits, and chain get-digits
into our reducer chain:
Before
(defn valid-number? [number]
(let [digits (get-digits number)]
(and (not (apply distinct? digits))
(apply <= digits))))
(defn part1 [numbers]
(->> numbers
vec
(r/filter valid-number?)
r/foldcat
count))
After
(defn valid-number? [digits]
; No more (let) binding here
(and (not (apply distinct? digits))
(apply <= digits)))
(defn part1 [numbers]
(->> numbers
vec
(r/map get-digits) ; Add this line!
(r/filter valid-number?)
r/foldcat
count))
Howâd we end up doing?
âElapsed time: 226.0884 msecsâ
Just a little faster, only about 5%. Before I discovered type hinting, however, these changes yielded substantially better results!
If you remember from our previous post, type hinting enabled us to get from 9 seconds of runtime down to half a second. But if you remember from the top of this post, I actually implemented folding and reducers before I knew type hinting existed.
When I was stumbling through this on my own, I was able to get the time down from 9s to 1.5s using reducers. I think in that case, we were able to take better advantage of the parallelism because the reflection was adding so much extra work to do.
So while âtwice as fastâ doesnât make for quite as good a headline as âsix times faster,â the fact remains that weâre still seeing substantial performance improvements from a single import and a couple small tweaks to our existing code. Nice!
At this point, if we hope to further improve performance (and I do love improving performance), we will need to employ profiling tools to see where we spend the most time (where our code is the âhottestâ). If we can identify those areas and make them more performant, our code will get even faster.
So, in a later post, weâll explore how to profile our functions in Clojure to find out where we spend the most time for some targeted performance improvements. Stay tuned!