July 19, 2015 · python concurrency

Waiting until a Thread is ready in Python

Concurrency in Python can be tricky, and users of the threading module know how easy it is to get wrong. One thing that has bitten me more than once is how easy it is to introduce a race condition when using threading.Thread.

The Race Condition

The following example is a fairly common race condition developers run into when using threading.Thread - we start a program in a thread, then operate on it before it is fully ready, causing our program to explode:

from threading import Thread

class ChildProgram:
    def __init__(self):
        self.connection = None

    def start(self):
        # some expensive setup...
        self.connection = True

    def some_logic(self):
        # do something with self.connection
        pass

program = ChildProgram()

thread = Thread(target=program.start)
thread.start()

# the following can explode because program
# may not have finished setting up yet!
program.some_logic()

What happened?

  1. We define a ChildProgram to run in a separate Thread.
  2. Then we call thread.start(), which calls program.start(), which has some expensive setup logic.
  3. Finally, we call program.some_logic(), expecting our ChildProgram to be completely ready, and our program explodes.

But why?

It turns out thread.start() is a non-blocking operation: when we called it, our program did not wait until program.start() finished. Instead, it continued on, so we called program.some_logic() before all of our setup logic had completed, while program.start() was still executing concurrently in another thread.
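To make the non-blocking behavior visible, here is a minimal sketch. SlowProgram and the 0.3 second sleep are assumptions for illustration, standing in for ChildProgram and its expensive setup; thread.join() is used only so the demo can finish cleanly.

```python
import time
from threading import Thread


class SlowProgram:
    """Hypothetical stand-in for ChildProgram with a slow setup step."""

    def __init__(self):
        self.connection = None

    def start(self):
        time.sleep(0.3)  # simulate the expensive setup
        self.connection = True


program = SlowProgram()
thread = Thread(target=program.start)
thread.start()  # returns right away; setup keeps running in the thread

# read the attribute immediately, while setup is most likely still running
seen_after_start = program.connection

thread.join()  # block until program.start() has returned
seen_after_join = program.connection

print(seen_after_start, seen_after_join)
```

On a typical run the first value is None and the second is True, though as with any race the first read is not strictly guaranteed.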

So how can we be sure our ChildProgram is fully ready before operating on it from our main thread?

Using the Event Synchronization Primitive

The correct way to solve this problem is by utilizing a threading.Event, which allows:

...one thread to signal an event and other threads wait for it.

In our case, we want the main thread to wait until the child thread signals that it is ready.

Fortunately, this is trivial:

from threading import Event, Thread

class ChildProgram:
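    def __init__(self, ready):
        # Event used to tell the main thread that setup is done
        self.ready = ready
        self.connection = None

    def connect(self):
        # let's make the connection (expensive);
        # SomeConnection stands in for your real connection class
        self.connection = SomeConnection()

        # then fire the ready event
        self.ready.set()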

    def some_logic(self):
        # do something with self.connection
        pass

ready = Event()
program = ChildProgram(ready)

# configure & start thread
thread = Thread(target=program.connect)
thread.start()

# block until ready
ready.wait()

# now we can safely use program
program.some_logic()

Now that we've set up our ready Event, we can be sure that our ChildProgram is fully initialized and that program.some_logic() is safe to call.
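One caveat worth sketching: ready.wait() with no arguments blocks forever if the child thread dies before it calls ready.set(). Event.wait() accepts an optional timeout and returns False when it expires, so the main thread can fail loudly instead of hanging. The connect function, the sleep, and the 5 second limit below are illustrative assumptions, not part of the example above.

```python
import time
from threading import Event, Thread

ready = Event()


def connect():
    # hypothetical setup routine standing in for ChildProgram.connect()
    time.sleep(0.1)  # simulate the expensive setup
    ready.set()      # signal the main thread that setup is done


thread = Thread(target=connect)
thread.start()

# wait() returns True if the event was set, False if the timeout expired
is_ready = ready.wait(timeout=5.0)
if not is_ready:
    raise RuntimeError("child thread did not become ready within 5 seconds")

thread.join()
```

Picking the timeout is a judgment call: it should comfortably exceed the slowest setup you expect, while still being short enough that a stuck thread gets noticed.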

Don't rely on is_alive()

One might be tempted to call Thread.is_alive() to determine if the program in the thread is ready to go. This, however, would be a mistake because:

...this method returns True just before the run() method starts...

This means that is_alive() will return True even if the program you've started in the thread is not fully ready to accept work. In other words, if the code you are running in the thread takes a while to set up, then is_alive() alone cannot tell you whether that program is ready to be used.

Check out the example updated to use is_alive():

thread = Thread(target=program.start)
thread.start()

# block until thread is alive
while not thread.is_alive():
    pass

# the following can still explode because program
# may not have finished setting up yet!
program.some_logic()

The above is not reliable because is_alive() will return True before our ChildProgram has finished setting up.
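A small sketch can show the gap between "alive" and "ready". SlowProgram and the 0.3 second sleep are assumptions for illustration, standing in for ChildProgram and its expensive setup.

```python
import time
from threading import Thread


class SlowProgram:
    """Hypothetical stand-in for ChildProgram with a slow setup step."""

    def __init__(self):
        self.connection = None

    def start(self):
        time.sleep(0.3)  # simulate the expensive setup
        self.connection = True


program = SlowProgram()
thread = Thread(target=program.start)
thread.start()

# the thread is running, but setup is most likely still in progress
alive_early = thread.is_alive()
connection_early = program.connection

thread.join()  # wait for program.start() to return
alive_late = thread.is_alive()

print(alive_early, connection_early, alive_late, program.connection)
```

Here is_alive() reports on the thread's lifetime only: it is already True while the connection is typically still None, and it turns False once start() has returned, which is exactly when the program is finally set up.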

Cheers.