Reflecting on Reflection
December 21, 2019
The other day I was working on Day 4 of Advent of Code. I was pretty delighted with myself for solving this one in a handful of minutes when days 2 and 3 took significantly longer. Once I had my code returning the correct value for all the sample inputs, I went ahead and ran it on the main input.
āElapsed time: 9002.5386 msecsā
Nine seconds. My computer crunched numbers for 9 whole seconds before it spit out the answer.
I didnāt like that. Iāll be the first to admit that Iām no Clojure expert, but my solution wasnāt inelegant.
(defn get-digits
"Transforms a string or a number into a list of individual digits"
[number] (map #(Character/digit % 10) (str number)))
(defn valid-number? [number]
(let [digits (get-digits number)]
(and (not (apply distinct? digits))
(apply <= digits))))
(defn part1 [[start end]]
(count (filter valid-number? (range start (inc end)))))
(defn -main []
(time (println "Day 04, part 1:" (part1 [402328 864247]))))
Nice and short! And yet, when run, my computer ground its gears for nine whole seconds (on an Intel i7-8700K, no less).
Well, at least I had access to part 2 of the problem now, and it looks like part 2 needed access to the same range of integers. Maybe generating that giant list of integers was adding a lot of time? I decided to exclude it from the timing calculations:
(defn part1 [numbers]
(count (filter valid-number? numbers)))
(defn -main []
(let [[start end] [402328 864247]
numbers (range start (inc end))]
(time (println "Day 04, part 1:" (part1 numbers)))))
āElapsed time: 9057.4244 msecsā
Wait, itās slower? Butā¦ oh yeah, itās not actually generating all those numbers at once. From the range page on clojuredocs.org:
Returns a lazy seq of nums from start (inclusive) to end
Because range
is lazy, itās not actually building a list of numbers before we start timing the solution to our problem.
At this point, I was curious to see how much faster it would be to run our code on metal rather than on the JVM, so I compiled the code using the GraalVM native-image
tool, and ran it again:
$ ./target/day04
Exception in thread "main" java.lang.ClassNotFoundException: java.lang.Character
at com.oracle.svm.core.hub.ClassForNameSupport.forName(ClassForNameSupport.java:60)
at java.lang.Class.forName(DynamicHub.java:1197)
at clojure.lang.RT.classForName(RT.java:2211)
at clojure.lang.RT.classForName(RT.java:2220)
at advent_2019.day99$get_digits$fn__475.invoke(day99.clj:6)
at clojure.core$map$fn__5866.invoke(core.clj:2755)
at clojure.lang.LazySeq.sval(LazySeq.java:42)
at clojure.lang.LazySeq.seq(LazySeq.java:51)
at clojure.lang.RT.seq(RT.java:535)
...
A stack trace? At this point, I knew something in my code was to blame, and I took to the internet to see if I could find a clue.
Many of the search results turning up with Clojure and native-image
troubleshooting tips mentioned adding a check for warn-on-reflection, so I thought it would be a good idea to turn on and try to compile my code again.
Reflection warning, advent_2019/day04.clj:6:18 - call to static method digit on java.lang.Character canāt be resolved (argument types: unknown, long).
Whoa! Now weāre on to something! Apparently my code was relying on reflection to determine function argument types.
But wait a minute ā Clojure is dynamically typed. How else are you going to determine types? Is there a way to not rely on reflection in Clojure?
As a matter of fact, there is! It turns out that Clojure supports type hinting, which was news to me! From the official Clojure reference:
Clojure supports the use of type hints to assist the compiler in avoiding reflection in performance-critical areas of code.
Immediately followed by a very helpful example:
(defn len [x]
(.length x))
(defn len2 [^String x]
(.length x))
user=> (time (reduce + (map len (repeat 1000000 "asdf"))))
"Elapsed time: 3007.198 msecs"
4000000
user=> (time (reduce + (map len2 (repeat 1000000 "asdf"))))
"Elapsed time: 308.045 msecs"
4000000
Wait a minuteā¦ an order of magnitude faster?? And all I need to do is add some type hints?
Letās refactor our code a little bit:
Before
(defn get-digits
"Transforms a string or a number into a list of individual digits"
[number] (map #(Character/digit % 10) (str number)))
After
(defn get-digit [^Character x] (Character/digit x 10))
(defn get-digits
"Transforms a string or a number into a list of individual digits"
[^Integer number] (map get-digit (str number)))
Having made the change, letās run our code again:
āElapsed time: 517.1315 msecsā
Holy smokes! From 9.05s down to 0.52s! Thatās over 17x faster!
Turns out that, in this case, the reflection was really expensive. The get-digits
function is critical to our solution, and is called thousands of times. By āavoiding reflection in performance-critical areas of our code,ā we saw a pretty significant improvement in run time!
Since making this discovery, I have added the following to my project.clj
file:
:global-vars {*warn-on-reflection* true}
Now, the compiler (and my REPL of choice as well!) will bark at me every time I make ambiguous typing choices. Doing so allowed me to find type ambiguities in my Day 02 and 03 solutions as well!
In the end, with a little type hinting, it turns out my solution wasnāt so bad after all š
ā¦but we can still make it faster. Stay tuned for next time!