Saturday, 2 August 2014

Spurious wakeups in Java and how to avoid them.

Spurious wakeups

If you haven't heard this word now would be a good time for it. A very interesting topic. I was not aware of this problem some time back  and when I did some research on it turns out the concept does exist and must be taken care of in production level code. It is in fact a well thought design decision (Read more in Interesting Explanation section).

What are Spurious wakeups?

As per Wiki

Spurious wakeup describes a complication in the use of condition variables as provided by certain multithreading APIs such as POSIX Threads and the Windows API.

Even after a condition variable appears to have been signaled from a waiting thread's point of view, the condition that was awaited may still be false. One of the reasons for this is a spurious wakeup; that is, a thread might be awoken from its waiting state even though no thread signaled the condition variable. For correctness it is necessary, then, to verify that the condition is indeed true after the thread has finished waiting.

Why do they occur?

The pthread_cond_wait() function in Linux is implemented using the futex system call. Each blocking system call on Linux returns abruptly with EINTR when the process receives a signal. ... pthread_cond_wait() can't restart the waiting because it may miss a real wakeup in the little time it was outside the futex system call. This race condition can only be avoided by the caller checking for an invariant. A POSIX signal will therefore generate a spurious wakeup.

Interesting Explanation

Just think of it... like any code, thread scheduler may experience temporary blackout due to something abnormal happening in underlying hardware / software. Of course, care should be taken for this to happen as rare as possible, but since there's no such thing as 100% robust software it is reasonable to assume this can happen and take care on the graceful recovery in case if scheduler detects this (eg by observing missing heartbeats).

Now, how could scheduler recover, taking into account that during blackout it could miss some signals intended to notify waiting threads? If scheduler does nothing, mentioned "unlucky" threads will just hang, waiting forever - to avoid this, scheduler would simply send a signal to all the waiting threads.

This makes it necessary to establish a "contract" that waiting thread can be notified without a reason. To be precise, there would be a reason - scheduler blackout - but since thread is designed (for a good reason) to be oblivious to scheduler internal implementation details, this reason is likely better to present as "spurious".

How to avoid a spurious wakeup?

A thread can also wake up without being notified, interrupted, or timing out, a so-called spurious wakeup. While this will rarely occur in practice, applications must guard against it by testing for the condition that should have caused the thread to be awakened, and continuing to wait if the condition is not satisfied. In other words, waits should always occur in loops, like this one:

     synchronized (obj) {
         while (<condition does not hold>)
         ... // Perform action appropriate to condition

Important Links

No comments:

Post a Comment

t> UA-39527780-1 back to top