Tuesday, July 31, 2007

volatile and threading

Until recently I hadn't had much experience writing multi-threaded programs in C++, so when I tried to, I found I was really confused about how multi-threaded programs mix with volatile variables. So I did a little research, and the quick summary is: this topic is confusing. It looks like if you put locks around global variables shared between threads, you shouldn't care about the volatile modifier. That is definitely the case under POSIX threads and most likely when using other threading libraries as well. If you don't use locks and instead rely on atomic operations, it seems that you have to declare shared global variables volatile, but as far as portability goes it is a grey area.

Longer story is below:

Suppose we have a piece of code which waits for a certain external condition to happen. The code could look like this:

bool gEvent = false;

void waitLoop() {
    while (!gEvent) {
        sleep(1);
    }
    ...
}
Let's assume that this is a single-threaded program and the external condition we are waiting for is a Unix signal. The signal handler is very simple - it simply sets gEvent to true:
void wakeUp() {
    gEvent = true;
}
The problem with the code above is that the compiler may optimize out the check of the condition inside waitLoop(), incorrectly assuming from local analysis of the code that gEvent never changes. The fix is to declare gEvent with the volatile modifier, which basically tells the compiler that the variable can change at any time and that it is unsafe to perform any optimization based on analysis of the local code:
volatile bool gEvent = false;
Let's take another example. The code is the same, but this time it is a multi-threaded program where one thread waits for another. So waitLoop() runs inside one thread and wakeUp() is eventually called from another. Is the code still correct? Probably yes, if we keep the volatile modifier and if operations which read or write the gEvent variable can be considered atomic. The latter assumption seems to be correct for most (all?) platforms.
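For concreteness, here is a minimal sketch of how the two-thread version could be wired up with boost::thread. The main() function, the thread objects and the artificial delay in wakeUp() are my additions for illustration; everything else mirrors the code above:

#include <unistd.h>         // sleep()
#include <boost/thread.hpp>

volatile bool gEvent = false;

void waitLoop() {
    // Poll the flag until another thread sets it.
    while (!gEvent) {
        sleep(1);
    }
    // ... react to the event ...
}

void wakeUp() {
    sleep(3);        // pretend some work happens first
    gEvent = true;   // a single write, assumed to be atomic on this platform
}

int main() {
    boost::thread waiter(&waitLoop);
    boost::thread waker(&wakeUp);
    waker.join();
    waiter.join();
    return 0;
}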

But what if we cannot treat operations which read or write the shared variable as atomic? For example, it might be an instance of a more complex type, say a class which contains other information besides just whether the event has happened or not:
struct EventInfo {
    EventInfo(bool happened = false, const string& source = "")
        : fHappened(happened), fSource(source)
    {}

    bool fHappened;
    string fSource;
};

volatile EventInfo gEventInfo;

void waitLoop() {
    while (!gEventInfo.fHappened) {
        sleep(1);
    }
    const string& eventSource = gEventInfo.fSource;
    ...
}

void wakeUp() {
    gEventInfo = EventInfo(true, "wakeUp");
}
This code is still OK for a single-threaded program where wakeUp() is a signal handler, but it is unsafe for a multi-threaded program where wakeUp() runs in a separate thread, as operations on gEventInfo cannot be treated as atomic anymore.

So how do we fix it? We should surround the places where code reads or writes gEventInfo with locks to make sure only one thread accesses gEventInfo at a time. I'll use the Boost thread library in the example.
boost::mutex gMutex;

void waitLoop() {
    string eventSource;

    for (bool eventHappened = false; !eventHappened; ) {
        {
            boost::mutex::scoped_lock lock(gMutex);
            eventHappened = gEventInfo.fHappened;
            eventSource = gEventInfo.fSource;
        }
        sleep(1);
    }
    ...
}

void wakeUp() {
    boost::mutex::scoped_lock lock(gMutex);

    gEventInfo = EventInfo(true, "wakeUp");
}
Comparing this code with the earlier examples, it looks like we still need to declare the gEventInfo variable as volatile, but it turns out we don't really need to. A quote from Threads Cannot be Implemented as a Library [PDF]:
In practice, C and C++ implementations that support Pthreads generally proceed as follows:
  1. Functions such as pthread_mutex_lock() that are guaranteed by the standard to “synchronize memory” include hardware instructions (“memory barriers”) that prevent hardware reordering of memory operations around the call.
  2. To prevent the compiler from moving memory operations around calls to functions such as pthread_mutex_lock(), they are essentially treated as calls to opaque functions, about which the compiler has no information. The compiler effectively assumes that pthread_mutex_lock() may read or write any global variable. Thus a memory reference cannot simply be moved across the call. This approach also ensures that transitive calls, e.g. a call to a function f() which then calls pthread_mutex_lock(), are handled in the same way more or less appropriately, i.e. memory operations are not moved across the call to f() either, whether or not the entire user program is being analyzed at once.
So at least if you are using POSIX threads (boost::threads under Linux uses them), your code is probably safe without the use of volatile as long as you use locks around global variables shared between several threads. A good question is whether this example code is portable to other platforms; after all, boost::threads supports threading libraries other than POSIX which may have other rules for mutexes and locks. I haven't researched this yet as for now I don't really care about other platforms.
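As a rough illustration of what the quote describes, here is how the same pattern could look written directly against the Pthreads API. The names mirror the Boost example above and error checking is omitted for brevity; gEventInfo is left non-volatile because the mutex calls already act as the barriers the paper talks about:

#include <pthread.h>
#include <unistd.h>   // sleep()
#include <string>
using std::string;

struct EventInfo {
    EventInfo(bool happened = false, const string& source = "")
        : fHappened(happened), fSource(source)
    {}

    bool fHappened;
    string fSource;
};

EventInfo gEventInfo;                                  // note: no volatile
pthread_mutex_t gMutex = PTHREAD_MUTEX_INITIALIZER;

void waitLoop() {
    string eventSource;

    for (bool eventHappened = false; !eventHappened; ) {
        pthread_mutex_lock(&gMutex);
        eventHappened = gEventInfo.fHappened;
        eventSource = gEventInfo.fSource;
        pthread_mutex_unlock(&gMutex);
        sleep(1);
    }
    // ... use eventSource ...
}

void wakeUp() {
    pthread_mutex_lock(&gMutex);
    gEventInfo = EventInfo(true, "wakeUp");
    pthread_mutex_unlock(&gMutex);
}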

Some interesting links on this topic:
  • A Memory model for C++: FAQ - briefly mentions the reasons why the volatile keyword is insufficient to ensure synchronization between threads and has links to papers for further reading.
  • http://www.artima.com/cppsource/threads_meeting.html - Not much to read there, but I love this quote: "Not all the dragons were so easily defeated, unfortunately. Among the issues guaranteed to waste at least 20 minutes of group time with little or nothing to show ... What does volatile mean?" (this is in the context of multi-threaded programs). If C++ experts cannot agree on this ...
  • Another person gets confused over the use of volatile and threads. An interesting discussion on comp.programming.threads.

Wednesday, July 25, 2007

boost::thread and boost::mutex tutorial

For most Boost libraries, the documentation requires you to read everything from start to finish before you can write any code. Compare that with most CPAN modules, where you can usually start using a module after quickly scanning the synopsis and maybe the description parts of its POD documentation. POD documentation as a rule has good examples right at the top of the page. Boost's documentation usually doesn't.

So I was looking for basic usage examples for the boost::thread and boost::mutex classes, and initially I couldn't find any because I was using the wrong search keywords. In the end I figured out how to use boost::thread and boost::mutex in my application the hard way, by reading the Boost documentation without relying on any examples. But afterwards I did find a very good article on this topic with many simple examples: The Boost.Threads Library on Dr.Dobb's. So I'm posting the link here for Google. It is in the top 10 hits for some relevant keywords but not for others (for example, boost thread mutex tutorial), which is why I missed it initially. If my blog post helps any Boost.Threads newbie get started, then I would consider the time spent writing this post not wasted.
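For anyone landing here from a search, here is roughly the kind of minimal example I was originally looking for. It is only a sketch (the worker function and the shared std::cout guarding are made up for illustration); the Dr.Dobb's article has the real tutorial:

#include <iostream>
#include <boost/bind.hpp>
#include <boost/thread.hpp>

boost::mutex gCoutMutex;

void worker(int id) {
    // The lock is released automatically when 'lock' goes out of scope.
    boost::mutex::scoped_lock lock(gCoutMutex);
    std::cout << "hello from thread " << id << std::endl;
}

int main() {
    // boost::thread takes a nullary functor, so bind the argument.
    boost::thread t1(boost::bind(&worker, 1));
    boost::thread t2(boost::bind(&worker, 2));
    t1.join();
    t2.join();
    return 0;
}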

Wednesday, July 18, 2007

Starting new blog

I used to have a blog at use.perl.org, but I was just too lazy to write often. One problem is that it seems you need a certain discipline to keep doing that. Also, I just didn't like that site's blogging engine. It looked too simple and offered little control.

At a certain point I tried to switch over to a blog on my personal site, but instead of actually blogging I got carried away designing a "perfect" system for my blog. I spent hours evaluating different blog software, and I had very exotic requirements, like being able to use SCM software to store my posts. That implied I needed blog software which uses raw files to store posts. I ended up hacking together something monstrous that was a combination of Blosxom, darcs and make. And it wasn't that convenient to use either. In the end I probably spent much more time setting all this up than actually blogging.

So now I want to start from scratch: pick some blogging engine which doesn't get in the way and discipline myself to actually write periodically. From my experience learning new programming languages, you learn much faster when you have an actual project you are trying to implement in the new language. In a similar vein, I'd expect it would be much easier to find new topics for my blog each day if I have a certain new fun project on my mind. And this new project is going to be teaching myself OCaml. Let's see how it goes.