Side Effects Matter

by Jeremy D. Frens on May 15, 2016
part of the Ruby, Elixir, Variable Bindings, and Concurrency series

In the previous two articles on closures and scope in Ruby and on closures and scope in Elixir, we saw how Elixir bypassed any variable reassignments. What looks like a reassignment is really just a binding to a new variable in a new scope.

I talked very briefly about how side effects cause headaches in Ruby.¹ I want to elaborate on this so that we can better understand why Erlang and Elixir work hard to eliminate them.

Who’s doing what to whom?

First, our Elixir functions:

h = fn x -> x + 10 end
x = 5
f = fn -> x = x * 2 ; x end

h is completely self-contained and couldn’t possibly pollute any other environment. f is also closed off from polluting other environments because of Elixir semantics: it can read x, but the rebinding to x in the function creates a whole new x with its own scope. So as written, h will always compute the same thing given the same argument; f will always return 10.

Now consider again the Ruby equivalents:

h = ->(x) { x + 10 }
x = 5
f = ->() { x = x * 2 ; x }

Suppose you had to pass these functions into some library code. Which one would concern you the most? h is safe. When you pass it into a library, that library can call h as many times as it likes, and it will never affect anything in your code. Similarly, there’s nothing you can do in your code that will affect the library as it calls h.

But when you pass f to a library, you’re at the mercy of that library, and the library is at your mercy. Does the library call f two or three times? What happens if we change x after we pass f into some library’s object? How does all this change when the library is upgraded? Will your tests recognize this change?
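Here is a minimal sketch of the call-count question. The methods library_v1 and library_v2 are hypothetical stand-ins for a library before and after an upgrade; they are assumptions for illustration, not code from any real gem.

```ruby
# Hypothetical library code (an assumption for illustration, not a real gem).
# Version 1 invokes the callback once.
def library_v1(callback)
  callback.()
end

# Version 2, after an "upgrade", happens to invoke the callback twice.
def library_v2(callback)
  callback.()
  callback.()
end

x = 5
f = ->() { x = x * 2 ; x }

library_v1(f)
after_v1 = x   # => 10

x = 5
library_v2(f)
after_v2 = x   # => 20: same f, same starting x, different library version
```

Nothing in our own code changed between the two runs, yet x ends up with a different value.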

To make matters worse, these problems are usually non-local in both time and space. f could sit in an object for two hours and then get called: non-local in time. f (the problem) is in your code, but the symptom shows up in some library: non-local in space. Non-local problems are always a pain to debug.
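As a rough sketch of "non-local in time", imagine a library object that stores f and only runs it later. The DeferredJobs class below is hypothetical and exists only for this illustration.

```ruby
# Hypothetical library object that stores callbacks and runs them later.
class DeferredJobs
  def initialize
    @jobs = []
  end

  # Remember the callback; nothing is executed yet.
  def later(job)
    @jobs << job
    self
  end

  # Run every stored callback and collect the return values.
  def run_all
    @jobs.map(&:call)
  end
end

x = 5
f = ->() { x = x * 2 ; x }

jobs = DeferredJobs.new
jobs.later(f)

x = 13                   # some unrelated code, much later, rebinds x
results = jobs.run_all   # => [26], not the 10 we expected when we handed f over
x                        # => 26
```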

Concurrency

Side effects are even more important when we consider concurrency. Environments are shared across threads in Ruby:

x = 5
thr = Thread.new { f.() } ; thr.join
x # => 10
x = 13
thr = Thread.new { f.() } ; thr.join
x # => 26

f brings the environment from the original thread over to the new thread. Now timing becomes a serious issue:

x = 5
thr = Thread.new { f.() } ; xx = x ; thr.join
xx # => 5
x # => 10

The interesting value here is in xx.

In my experiments, the thread doesn’t have enough time to execute f before xx = x is computed, so x is still 5 as recorded by xx. But after the thread is done, x does, in fact, get updated to 10.

If we delay that capture just a bit, xx records a different value:

x = 5
thr = Thread.new { f.() } ; sleep 2 ; xx = x ; thr.join
xx # => 10
x # => 10

Now the new thread finishes before we record the value of x, and we see 10 in xx.

We still have the same non-locality problems. f might not be called directly from the thread; there might be several library layers to get through first. But now, we also have to consider the order in which these threads execute. This order might not always be consistent, and the exact timing might be really hard to reproduce when debugging or testing.
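When debugging, it can help to force each ordering on demand. The sketch below uses Ruby's built-in Queue purely as a gate to hold the thread back; this is my own device for reproducing the two outcomes, not a fix for the underlying problem.

```ruby
x = 5
f = ->() { x = x * 2 ; x }

# Ordering A: the thread is held at the gate until xx has been captured.
gate = Queue.new
thr = Thread.new do
  gate.pop   # blocks until the main thread pushes a value
  f.()
end
xx_a = x     # guaranteed to run before f
gate << :go
thr.join
x_a = x

# Ordering B: wait for the thread to finish before capturing.
x = 5
thr = Thread.new { f.() }
thr.join
xx_b = x
x_b = x

[xx_a, x_a]  # => [5, 10]
[xx_b, x_b]  # => [10, 10]
```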

Side-effecting objects

I’ve probably pushed h and f as functions in Ruby a little too far. How many functions like these do Rubyists write on a regular basis? We write classes to generate objects! And now I’ve convinced us all to stop writing anonymous functions. Problem solved!

Not so fast.

First of all, code blocks suffer from all of the same problems as anonymous functions. However, due to the way we tend to write our blocks, I don’t think it is much of a problem. I leave it as an exercise for the reader to rewrite all the Ruby examples using blocks instead of stabby lambdas.
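To get that exercise started, here is one possible block version of f. The method twice is a hypothetical helper standing in for any method that yields to its block more than once.

```ruby
# Hypothetical helper: yields to its block two times.
def twice
  yield
  yield
end

x = 5
twice { x = x * 2 }   # the block closes over x, just like the lambda did
x                     # => 20
```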

Besides, objects are a problem.

“But they don’t capture any scope!” you exclaim. They do, but we don’t describe it that way, and it’s much more explicit. We store state in instance variables.

class F
  attr_accessor :x
  def initialize(x) ; @x = x      ; end
  def call          ; @x = @x * 2 ; end
end

f = F.new(5)
f.x # => 5
f.() # => 10
f.x # => 10
f.x = -13
f.() # => -26

For all intents and purposes, f.x behaves for the object f just like the global x did for function f. We get all of the same problems.

I can anticipate a few objections:

“But I would make F#x= a private method!” Just so that some other method in the class can call it? Hiding it behind a chain of method calls only makes things harder (non-local in space).
“I wouldn’t even write the mutator.” And you’re not changing @x directly in some other method?
“That’s right: no mutator and no assignments to @x!” Thank you for proving my point: no side effects leads to better code!
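A minimal sketch of what that last answer might look like, assuming we are free to change the interface so that call returns a new object instead of updating the old one. The name PureF is hypothetical.

```ruby
# Hypothetical side-effect-free variant of F: no mutator, no reassignment of @x.
class PureF
  attr_reader :x

  def initialize(x)
    @x = x
    freeze   # any later assignment to @x raises an error (FrozenError on Ruby 2.5+)
  end

  # Instead of changing this object, build a new one with the doubled value.
  def call
    PureF.new(@x * 2)
  end
end

f = PureF.new(5)
g = f.()
f.x  # => 5, however many times call has been invoked
g.x  # => 10
```

Whoever holds f can call it as often as they like without changing what anyone else sees.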

Objects and concurrency

Let’s switch to a less contrived example:

class Inc
  def initialize(value) ; @value = value      ; end
  def value             ; @value              ; end
  def inc!              ; @value = @value + 1 ; end
end

x = Inc.new(0)
x.inc!
x.inc!
x.value # => 2

Reassigning a value seems more natural in this example.

We could use this concurrently:

x = Inc.new(0)
t1 = Thread.new { x.inc! }
t2 = Thread.new { x.inc! }
t1.join
t2.join
x.value # => 1 or 2

That’s right. The answer can be 1. Consider this sequence of events:

t1 reads 0 from @value.
t2 reads 0 from @value.
t1 increments 0 to 1.
t2 increments 0 to 1.
t1 sets @value to 1.
t2 sets @value to 1.

Two calls to inc!, and only one increment.

In fact, there are only two sequences of those instructions that work: t1 does all its work, then t2; alternatively, t2 could go first. Those are the only two sequences that give you the right answer of 2. Any other timing results in the wrong answer.
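That claim can be checked by brute force. The sketch below models each thread as three steps (read, compute, write), which is a simplification of what a real Ruby thread does, then replays every possible interleaving. The helper names are my own.

```ruby
STEPS = [:read, :compute, :write].freeze

# All ways to merge two step lists while preserving each list's own order.
def interleavings(a, b)
  return [a + b] if a.empty? || b.empty?
  interleavings(a.drop(1), b).map { |rest| [a.first] + rest } +
    interleavings(a, b.drop(1)).map { |rest| [b.first] + rest }
end

# Replay one schedule against a shared value that starts at 0.
def final_value(schedule)
  value = 0
  local = {}   # each simulated thread's private working copy
  schedule.each do |thread, step|
    case step
    when :read    then local[thread] = value
    when :compute then local[thread] += 1
    when :write   then value = local[thread]
    end
  end
  value
end

t1 = STEPS.map { |step| [:t1, step] }
t2 = STEPS.map { |step| [:t2, step] }
results = interleavings(t1, t2).map { |schedule| final_value(schedule) }

results.size      # => 20 possible interleavings
results.count(2)  # => 2  (t1 entirely before t2, or t2 entirely before t1)
results.count(1)  # => 18 (every other schedule loses an increment)
```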

Now, I ran this over 1000 times on my laptop, and I got 2 every time. I think I know why, which eventually led me to this code:

class Inc
  def initialize(value) ; @value = value ; end
  def value             ; @value         ; end
  def inc!
    v = @value
    sleep 0
    @value = v + 1
  end
end

x = Inc.new(0)
t1 = Thread.new { 10.times { x.inc! } }
t2 = Thread.new { 10.times { x.inc! } }
t1.join
t2.join
x.value # => ???

Two changes:

inc! breaks up the read from the computation-and-write with a sleep.
Both threads increment 10 times.

The result should be 20.

If I take out the sleep completely, I always get 20.
With the sleep, even with the duration set to zero…
- I usually get 10.
- Sometimes I get 20.
- A couple times I got 15.
- I got 17 and 18 at least once each.²

This is known as a race condition. It’s a race to see which thread gets to read, compute, and write first.
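If you want to repeat the experiment yourself, a small tally loop like the following might help. The class name SleepyInc and the trial count are my own choices, and the numbers you see will vary from run to run and from machine to machine.

```ruby
# Same shape as the Inc class with the sleep, renamed to keep it separate.
class SleepyInc
  def initialize(value) ; @value = value ; end
  def value             ; @value         ; end
  def inc!
    v = @value
    sleep 0          # invites a switch to the other thread
    @value = v + 1
  end
end

TRIALS = 50          # arbitrary; raise it for a better picture
tally = Hash.new(0)  # final value => number of trials that produced it

TRIALS.times do
  x = SleepyInc.new(0)
  threads = 2.times.map { Thread.new { 10.times { x.inc! } } }
  threads.each(&:join)
  tally[x.value] += 1
end

tally   # contents differ between runs
```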

Race conditions: my legal obligation

I am required by the Committee for Those Who Write about Race Conditions to use The One True Example: bank accounts.

Now imagine that Inc were called SavingsAccount, and inc! were called deposit. Now imagine that you and your significant other are making deposits at the same time.

Two deposits, one adjustment.

You won’t be using that bank for very long after that mistake.
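Here is a rough, single-threaded sketch of that lost deposit. The block passed to deposit is a hypothetical hook that runs between the read and the write, standing in for an unlucky context switch; a real account class would not have it.

```ruby
# Hypothetical account class, following the Inc example with renamed methods.
class SavingsAccount
  attr_reader :balance

  def initialize(balance)
    @balance = balance
  end

  # The optional block runs between the read and the write,
  # simulating another thread sneaking in at the worst moment.
  def deposit(amount)
    current = @balance
    yield if block_given?
    @balance = current + amount
  end
end

account = SavingsAccount.new(100)

# Your deposit reads the balance; your partner's whole deposit
# happens before you write your result back.
account.deposit(50) { account.deposit(25) }

account.balance  # => 150, not 175: the 25 deposit has vanished
```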

Race conditions like this one are the bane of concurrency.

To be continued…

I’m going to end this article here, leaving Ruby frozen in carbonite and Elixir without a hand. Next time we’ll look at some options we have in Ruby to fix this problem (probably using Ewoks), and we’ll see how Elixir does it oh so much better (probably using those totally awesome AT-ST walkers).³

Footnotes

¹ Get it? Side effects cause headaches. Headaches themselves are side effects of every medication! I crack me up.
² I believe that without the sleep, inc! is such simple code that it always executes in one chunk. But with the sleep, even with the duration set to 0, I believe it triggers a context switch to the other thread: t1 reads, then sleeps; t2 reads, then sleeps; then t1 continues.
³ I mean, AT-ATs and AT-STs are just so impractically awesome, right? I need to watch Star Wars again…
