Reducing Runtime with Reducers

December 22, 2019

Hi Friends, welcome back!

Today we’re going to be continuing our work on 2019’s Advent of Code, Day 04. If you remember from our last post, our program started out awfully slow thanks to some ambiguous typing. We were able to speed it up pretty substantially by providing type hints.

Now, in the story I told yesterday, we went straight from slow code to fast code by jumping straight into type hinting. When I was puzzling this out by myself, the first thing I actually did was attempt to parallelize the application to speed it up. You don’t know what you don’t know, and I didn’t know about type hinting in Clojure, nor did I have reflection warnings turned on.

What I was delighted to discover is that parallelization in Clojure is actually very straightforward and doesn’t doesn’t require dealing with locking or mutexes or passing stuff off to threads. There are a couple “gotchas,” which we’ll go over, but you can parallelize many operations simply by replacing a few key function calls with their counterparts from clojure.core.reducers.

The biggest resources for me when researching this topic were:

@nonrecursive’s excellent “Know your Reducers” tutorial
Rich Hickey’s Anatomy of a Reducer up on the official website

Let’s look at our original code, and see what we can do to speed it up.

Original Code (pre-reducers)

(ns advent-2019.day04
  (:gen-class))

(defn get-digit [^Character x] (Character/digit x 10))

(defn get-digits
  "Transforms a string or a number into a list of individual digits"
  [^Integer number] (map get-digit (str number)))

(defn valid-number? [number]
  (let [digits (get-digits number)]
    (and (not (apply distinct? digits))
         (apply <= digits))))

(defn part1 [numbers]
  (count (filter valid-number? numbers)))

(defn -main []
  (let [[start end] [402328 864247]
        numbers (range start (inc end))]
    (time (println "Day 04, part 1:" (part1 numbers)))))

Remember this guy? When we left off, we were running in about half a second:

“Elapsed time: 503.0648 msecs”

Here’s how we’re going to speed it up:

Require clojure.core.reducers into our namespace
Replace our sequence functions with the reducers
Deal with the “gotchas”

Requiring Reducers

This is pretty straightforward:

Before

(ns advent-2019.day99
  (:gen-class))

After

(ns advent-2019.day04
  (:require [clojure.core.reducers :as r])
  (:gen-class))

No surprises there!

Replace sequence functions with reducers

This is where the magic happens. @nonrecursive does a great job of describing why reducers do such a good job speeding things up. From the link above:

Reducers are designed to perform better (often dramatically better) than seq functions under some circumstances by performing eager computation, eliminating intermediate collections and allowing parallelism.

By themselves, map and filter are “seq functions” — they operate on a sequence, in order. By moving to r/map and r/filter, we can start to take advantages of the eager computations and parallelism mentioned above.

Replace `filter`, deal with the gotcha

Let’s first replace our filter function:

Before

(defn part1 [numbers]
  (count (filter valid-number? numbers)))

After

(defn part1 [numbers]
  (count (r/filter valid-number? numbers)))

Run the program and what do we get?

Syntax error (UnsupportedOperationException) compiling at (/tmp/form-init13929748832015658567.clj:1:74).
count not supported on this type: reducers$folder$reify__332

And here’s our first “gotcha:” turns out that we can’t count the result of a reducer because of the whole bit about “eliminating intermediate collections.” Rich has this to say:

The premise of the reducers library is that the minimum definition of collection is something that is reducible

The functions consist only of the core logic of their operations

That logic does not include any notion of collection, nor order

filtering and kin can ‘skip’ inputs by simply returning the incoming result

So the reason we can’t “count” our results is that we haven’t actually accumulated the results of our r/filter into a data structure we could count. That’s an easy fix, though!

(defn part1 [numbers]
  (->> numbers   ; Turn this into a threading macro for readability
       (r/filter valid-number?)
       r/foldcat ; Add this line!
       count))

By invoking r/foldcat, we accumulate our results so that sequence functions like first and count can be invoked on them.

Let’s see how much faster it is!

“Elapsed time: 508.2671 msecs”

It’s… not?

Turns out, there’s another gotcha with reducers: some sequences are not ”foldable.” Vectors, however, are foldable sequences, so we can see some pretty substantial gains by simply casting our range to a vector, like so:

(defn part1 [numbers]
  (->> numbers
       vec  ; Add this line!
       (r/filter valid-number?)
       r/foldcat
       count))

“Elapsed time: 238.0544 msecs”

And with that one change, our code runs almost twice as fast!

Composing with `r/map`

Taking what Rich was saying into mind, we should refactor our code now to take advantage of the benefits imparted by composing reducers. By reducing intermediate collections, we allow Clojure to do more work in parallel before bringing everything back together.

Instead of getting the digits for our number inside of the valid-number? function, let’s refactor it to accept digits, and chain get-digits into our reducer chain:

Before

(defn valid-number? [number]
  (let [digits (get-digits number)]
    (and (not (apply distinct? digits))
         (apply <= digits))))

(defn part1 [numbers]
  (->> numbers
       vec
       (r/filter valid-number?)
       r/foldcat
       count))

After

(defn valid-number? [digits]
  ; No more (let) binding here
  (and (not (apply distinct? digits))
       (apply <= digits)))

(defn part1 [numbers]
  (->> numbers
       vec
       (r/map get-digits) ; Add this line!
       (r/filter valid-number?)
       r/foldcat
       count))

How’d we end up doing?

“Elapsed time: 226.0884 msecs”

Just a little faster, only about 5%. Before I discovered type hinting, however, these changes yielded substantially better results!

If you remember from our previous post, type hinting enabled us to get from 9 seconds of runtime down to half a second. But if you remember from the top of this post, I actually implemented folding and reducers before I knew type hinting existed.

When I was stumbling through this on my own, I was able to get the time down from 9s to 1.5s using reducers. I think in that case, we were able to take better advantage of the parallelism because the reflection was adding so much extra work to do.

So while “twice as fast” doesn’t make for quite as good a headline as “six times faster,” the fact remains that we’re still seeing substantial performance improvements from a single import and a couple small tweaks to our existing code. Nice!

At this point, if we hope to further improve performance (and I do love improving performance), we will need to employ profiling tools to see where we spend the most time (where our code is the “hottest”). If we can identify those areas and make them more performant, our code will get even faster.

So, in a later post, we’ll explore how to profile our functions in Clojure to find out where we spend the most time for some targeted performance improvements. Stay tuned!

Original Code (pre-reducers)

Requiring Reducers

Before

After

Replace sequence functions with reducers

Replace filter, deal with the gotcha

Before

After

Composing with r/map

Before

After

Replace `filter`, deal with the gotcha

Composing with `r/map`