Stream Performance in Elixir

by Jeremy D. Frens on May 15, 2015

I’ve been learning Elixir over the past month. I was happy to discover the Stream module which implements lazy enumerations. I was intrigued by them even before Ruby introduced them in Ruby 2.0, and it was one of the appeals of learning Clojure.

Consider this Elixir code:

File.stream!("/usr/share/dict/words")
|> Enum.map(&strip_and_reverse/1)
|> Enum.each(&IO.puts/1)

This code reads the file line by line, then maps a strip_and_reverse function over every line (which strips off a newline on the end and reverses the word on that line), and finally puts every reversed word onto standard output.¹ Since I’m using functions from Enum, this code will execute eagerly. Enum.map and Enum.each will wait until the previous step is finished before starting. So we’ll have some enumerator created before both of those calls, and that can be costly for 235,886 words.

The streaming/lazy version looks pretty similar:

File.stream!("/usr/share/dict/words")
|> Stream.map(&strip_and_reverse/1)
|> Stream.each(&IO.puts/1)
|> Stream.run

The Stream.run at the end of the pipeline is needed to trigger the computation. The Stream functions will wait until their result is actually demanded, and Stream.run demands the whole result. However, the Stream functions are so lazy that they work on only one line at a time. One line is read from the file, then it’s stripped and reversed, then that result is put to the standard output. Then the pipeline is triggered for the second line. And so it goes.

Intermediate enumerations of any sort are never needed, and the code will never have all 235,886 words in memory at any time. At least in theory.

I wrote two more variations: one with an eager map and a lazy each, the other with a lazy map and an eager each.

I expected the stream version to kick butt, but, no:

Version	time (s)
all stream	4.408
eager map	4.171
eager puts	4.322
all eager	3.986

Not only does the all eager version win, it’s pretty significant. Not a butt kicking, but still pretty impressive. I don’t know if I have any good excuses for this, but some things to consider:

What if I processed more words?
What if I processed longer words?
What if I did more processing on each word?

Despite my hate for IO.puts, I was reluctant to remove it since it actually did something, but without it, the times are much closer:

Version	time (s)
all stream	1.471
eager map	1.443
eager puts	1.470
all eager	1.467

Those look so close to not really matter, but I really wanted the stream version to kick some butt.

The other benefit from a stream is that the amount of work done is governed by the last operation. Consider these two blocks of code:

# eager
File.stream!("/usr/share/dict/words")
|> Enum.map(&strip_and_reverse/1)
|> Enum.take(20)
|> Enum.each(&IO.puts/1)

# stream
File.stream!("/usr/share/dict/words")
|> Stream.map(&strip_and_reverse/1)
|> Stream.take(20)
|> Stream.each(&IO.puts/1)
|> Stream.run

Both of them will only print out the first 20 words from the file. However, the eager version works strictly left to right (or top-down depending on how you read the code). So the entire stream of words from the file are stripped and reversed, then it picks the first twenty to print. The lazy version is controlled right to left, so the take(20) controls the map and asks it for only 20 stripped-and-reversed words; in turn, the map asks the file for just 20 words. Prepare yourself:

Version	time (s)
eager first 20	1.390
stream first 20	0.001

Butt kicking acheived.

I’d like to try out the variations I listed above, increasing data sizes and algorithm complexity. I’m particularly curious what effect concurrency and the parallelism from multicores would have on this problem, but Elixir/Erlang favor server-level concurrency, not data level. But that all sounds good for a future blog post.

The script I used for my experiments is available if you’d like to play around with it. I’d appreciate any feedback if I did anything that was really stupid. Or even mildly stupid.

Footnotes

I am embarassed by the IO.puts in this code, but not much. ↩

elixir lazy