Wednesday, June 30, 2004

Java and MT


Java's memory model is very aggressive, and you have to be very careful when accessing memory from multiple threads. You of course have to synchronize access to shared memory locations, but you also have to synchronize in cases where it looks like you don't need to. There are several cases where you must use synchronized:
  • To provide a mutual exclusion barrier that prevents one thread from modifying a data structure while another is reading it.
  • To provide a memory barrier that prevents memory operation reordering from doing something you didn't want to happen.
  • To make the memory you're accessing behave as if it were volatile, so that the runtime optimizer doesn't throw away your request to read a memory location.
Here's a good web page that discusses this.
A good rule to use is that when in doubt, synchronize.
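As a rough sketch of the first case in the list above (mutual exclusion), here is a small example in which two threads append to the same list and every access goes through one lock. The class name SharedLog, its methods, and the loop counts are made up for illustration; they are not from any particular library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class: a list shared between threads,
// with every access guarded by the same lock object.
class SharedLog {
    private final Object lock = new Object();
    private final List<String> entries = new ArrayList<>();

    // Writers take the lock so that two adds never interleave.
    void add(String entry) {
        synchronized (lock) {
            entries.add(entry);
        }
    }

    // Readers take the same lock so they never see a half-updated list.
    int size() {
        synchronized (lock) {
            return entries.size();
        }
    }

    // Starts two writer threads, waits for both, and returns the final size.
    static int runDemo() {
        SharedLog log = new SharedLog();
        Runnable writer = () -> {
            for (int i = 0; i < 1000; i++) {
                log.add("entry");
            }
        };
        Thread a = new Thread(writer);
        Thread b = new Thread(writer);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return log.size();
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```

Without the synchronized blocks, the two writers could corrupt the ArrayList or lose entries, since ArrayList is not safe for concurrent modification.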
Reordering by the hardware can only hit you on a multiple-CPU machine, but the problems that I've been running into recently happen on my single-CPU machine. (Note that everything after this is speculation based on behavior I've seen.) They show up with something like this:
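// m_x was not declared in the original snippet; assumed to be a shared lock object.
final Object m_x = new Object();
int m_y = 0; // shared: written by Thread1, read by Thread2

// Runs in the first thread: writes m_y while holding the lock.
void Thread1() {
    synchronized (m_x) {
        m_y = 1;
    }
}

// Runs in the second thread: reads m_y with no synchronization at all.
void Thread2() {
    while (true)
        System.out.println(m_y);
}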
Even after the code in Thread1 has executed in its thread, the code in Thread2 keeps printing 0. I believe this is because the runtime optimizer doesn't bother to look at the value of m_y after the first access. This is similar to what a compile-time optimizer can do, which you'd fix with volatile, except that a compile-time optimizer couldn't do anything in this situation.
In Java, though, the runtime optimizer can arrange things so that the first access gets the value from memory and later accesses don't bother reading it again.
This strange behavior goes away when you put synchronized(m_x) around the access to m_y in Thread2. I believe this tells the runtime optimizer that the value is likely to have been changed by another thread.
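Here is a minimal sketch of that fix, assuming m_x is a plain lock object shared by both threads. The class name VisibilityFix, the helper methods, and the extra field m_flag are hypothetical. The m_flag field shows a second option: declaring the field volatile also guarantees that a reader sees the latest write, without taking a lock.

```java
// Hypothetical example class; m_x and m_y follow the snippet above.
class VisibilityFix {
    private final Object m_x = new Object(); // assumed shared lock object
    private int m_y = 0;                     // only accessed while holding m_x
    private volatile int m_flag = 0;         // alternative: volatile field, no lock needed

    // Writer side, as in Thread1: update m_y under the lock.
    void write() {
        synchronized (m_x) {
            m_y = 1;
        }
        m_flag = 1; // volatile write, visible to later volatile reads
    }

    // Reader side: take the same lock before reading m_y.
    int readY() {
        synchronized (m_x) {
            return m_y;
        }
    }

    // Reader side for the volatile field: a plain read is enough.
    int readFlag() {
        return m_flag;
    }

    // Starts the writer and spins until the reader sees both updates.
    static int[] runDemo() {
        VisibilityFix fix = new VisibilityFix();
        Thread writer = new Thread(fix::write);
        writer.start();
        while (fix.readY() == 0 || fix.readFlag() == 0) {
            Thread.yield(); // give the writer a chance to run
        }
        return new int[] { fix.readY(), fix.readFlag() };
    }

    public static void main(String[] args) {
        int[] seen = runDemo();
        System.out.println(seen[0] + " " + seen[1]);
    }
}
```

The spin loop in runDemo ends because both reads are properly synchronized with the writer. With an unsynchronized read of a non-volatile field, the loop would be allowed to spin forever, which matches the behavior described above.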