I've decided to kick off a return to blogging with a series on multi-threaded development in Python (CPython, to be specific). Yes, we all know there's a GIL in the water, but multi-threading is still an extremely useful concurrency strategy in Python for I/O-bound activities ... which tends to characterize most of my use cases for concurrency. But there are lots of things that make multi-threaded programming tricky, and there aren't quite as many resources out there for Python (as there are for Java, say).
Disclaimer: I am not an expert at multi-threaded programming in Python (or any other language). Most of this has been trial & error, some help from the Google, and a lot of foundation from the excellent book on the subject by Brian Goetz: Java Concurrency in Practice (despite the title, the principles in the book apply to Python too). If you know of a better way or better explanation, please leave a comment so we can all benefit.
After getting over some of the challenges of mutable state and atomicity of operations, I think one of the things that probably bit me next in Python specifically was handling of asynchronous exceptions (like KeyboardInterrupt and in some cases SystemExit) -- and specifically how one goes about actually stopping a multi-threaded application. Too many times I would end up with a script that would just hang when I hit CTRL-C (and I'd have to explicitly kill it). So let's start there.
Asynchronous Exceptions
The KeyboardInterrupt exception starts life as an OS signal; specifically, Python's default handler for SIGINT (see the signal module) translates the signal into the KeyboardInterrupt exception. The rule is that on platforms where the signal module is present, this exception will be raised in the main thread. The SystemExit exception is a little different: it is raised in whichever thread calls sys.exit(), and when that is a worker thread it only ends that thread, so if you want it to stop the whole program it has to be dealt with in the main thread. On other platforms, apparently KeyboardInterrupt may be raised anywhere (see the "Caveats" section of the thread module reference documentation for more info); for the sake of focus here, we will assume that you are working on a platform with the signal module present.
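It is easy enough to check for yourself what SystemExit does when it is raised outside the main thread. Here is a minimal sketch (the names exit_in_worker and run_demo are made up for illustration); on CPython, as far as I can tell, calling sys.exit() in a worker ends only that worker:

```python
import sys
import threading


def exit_in_worker(log):
    """Hypothetical worker that calls sys.exit() partway through."""
    log.append('worker started')
    sys.exit(1)  # raises SystemExit in *this* thread only
    log.append('never reached')


def run_demo():
    log = []
    t = threading.Thread(target=exit_in_worker, args=(log,), name='worker')
    t.start()
    t.join()  # returns once the worker thread has ended
    log.append('main still running')
    return log


print(run_demo())
```

The list that comes back should contain 'worker started' and 'main still running' but not 'never reached': the worker stopped, the main thread did not.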
Let's start out with a simple example of a multi-threaded program that you cannot abort with CTRL-C:
import time
import threading

def dowork():
    while True:
        time.sleep(1.0)

def main():
    t = threading.Thread(target=dowork, args=(), name='worker')
    t.start()
    # Block until the thread completes.
    t.join()

if __name__ == '__main__':
    main()
The problem here is that the worker thread will not exit when the main thread receives the KeyboardInterrupt "signal". So even if the KeyboardInterrupt does get raised in the main thread (which is sitting in the t.join() call), there's nothing to make the activity in the worker thread stop. As a result, you'll have to go kill that Python process manually, sorry.
Stopping Worker Threads
Solution 1: Kill with Prejudice
So the quick fix here is to make the worker thread a daemon thread. From the threading reference documentation:
A thread can be flagged as a “daemon thread”. The significance of this flag is that the entire Python program exits when only daemon threads are left.
So in practice here, if you stop your main thread, your daemon thread will just stop in the middle of whatever it was doing & exit. In many cases this abrupt termination of any worker threads may be appropriate; however, there may also be cases where you actually want to manage what happens when your threads terminate; maybe they need to commit (or rollback) a transaction, save their state, etc. For this a more thoughtful approach is required.
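As a rough sketch of the daemon approach, here is the same worker with the flag set. The start_daemon_worker helper and the short sleep in the main thread are my own placeholders; the only real change from the first example is the t.daemon = True line, which has to happen before start():

```python
import threading
import time


def dowork():
    while True:
        time.sleep(0.1)


def start_daemon_worker():
    t = threading.Thread(target=dowork, args=(), name='worker')
    # Must be set before start(); marks the thread as one that should
    # not keep the process alive.
    t.daemon = True
    t.start()
    return t


worker = start_daemon_worker()
time.sleep(0.3)  # stand-in for whatever the main thread really does
print(worker.daemon, worker.is_alive())
# The script ends here even though dowork() never returns.
```

Note that the worker gets no chance to clean up: it is simply abandoned when the interpreter exits.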
Solution 2: Instruct Politely
The alternative to just killing them is to instruct the thread to stop using some agreed-upon system. You are probably aware (or have guessed) by now that there is no Thread.stop() method in Python (and the one in Java is deprecated and generally considered a Bad Idea™). So what you must do is to implement a "thread interruption policy", which in our case is basically a signaling mechanism that the main thread can use to tell the worker thread to stop. Python provides a threading.Event class that is for exactly this type of inter-thread signaling.
The threading.Event objects are very simple two-state (on/off) flags that can be used without any additional locking to pass "messages" between threads. Here is a basic strategy for using a threading.Event to communicate a 'shutdown' message to a worker thread:
- You share a "shutdown" threading.Event instance between the threads (i.e. you either pass it to the threads or put it in a mutually accessible place).
- You set the event from the main thread when you receive the appropriate signal. Here we're focused on KeyboardInterrupt, but presumably users could also take some action within your application (e.g. a "stop" button) to stop it:
    shutdown_event.set()
- You check it (frequently) in another thread and take the appropriate action once it has been set:
    while not shutdown_event.is_set():
        do_some_work()
    do_some_cleanup()
It is probably worth pointing out here that this system is really just some conventions that you've established between your main thread and the workers. If the workers don't periodically check the shutdown event, then they won't stop their work -- and CTRL-C still won't work.
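One way to keep that check frequent without busy-looping is to let the worker pause on the event itself: Event.wait(timeout) returns as soon as the event is set, whereas time.sleep() always runs its full interval. A minimal sketch, where the worker function, the log list, and the interval are placeholders of my own:

```python
import threading


def worker(shutdown_event, log):
    while not shutdown_event.is_set():
        log.append('work')  # stand-in for do_some_work()
        # Pause between units of work, but wake up immediately if the
        # main thread sets the event in the meantime.
        shutdown_event.wait(timeout=0.05)
    log.append('cleanup')  # stand-in for do_some_cleanup()


def run_demo():
    shutdown_event = threading.Event()
    log = []
    t = threading.Thread(target=worker, args=(shutdown_event, log))
    t.start()
    shutdown_event.set()  # ask the worker to stop
    t.join(timeout=2.0)
    return log, t.is_alive()
```

However many 'work' entries end up in the log, the last entry should be 'cleanup', and the thread should no longer be alive once run_demo() returns.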
Putting it Together
After applying the threading.Event model to our example, we are able to have our CTRL-C respected relatively quickly (as quickly as the worker thread gets around to checking the event).
import time
import threading

shutdown_event = threading.Event()

def dowork():
    # Keep working until the main thread asks us to stop.
    while not shutdown_event.is_set():
        time.sleep(1.0)

def main():
    """ Start some threads & stuff. """
    t = threading.Thread(target=dowork, args=(), name='worker')
    t.start()
    try:
        # Join with a timeout so the main thread can still see CTRL-C.
        while t.is_alive():
            t.join(timeout=1.0)
    except (KeyboardInterrupt, SystemExit):
        shutdown_event.set()

if __name__ == '__main__':
    main()
Working around uninterruptible Thread.join()
You may have noticed that we changed how we called Thread.join(). Calling the join() method on a thread without a timeout will block until that thread returns/completes. As I understand it, join() waits on a lock internally, and in Python 2 that wait cannot be interrupted with KeyboardInterrupt. (Since Python 3.2, lock acquisition can be interrupted by signals on POSIX platforms, so the behavior there may differ.) You can work around this portably, though, by essentially checking in a loop until the thread does exit:
while t.is_alive():
    t.join(timeout=0.1)
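If you join threads in more than one place, the loop can be wrapped in a small helper. This is only a sketch: join_interruptibly and its poll_interval parameter are names I made up, not part of the threading module.

```python
import threading
import time


def join_interruptibly(thread, poll_interval=0.1):
    """Wait for `thread` to finish, waking up regularly in between.

    Each join() returns after at most poll_interval seconds, which gives
    the main thread a chance to have KeyboardInterrupt raised in it.
    """
    while thread.is_alive():
        thread.join(timeout=poll_interval)


def short_task():
    time.sleep(0.2)  # stand-in for real work


t = threading.Thread(target=short_task, name='worker')
t.start()
join_interruptibly(t)
print(t.is_alive())
```

A smaller poll_interval makes CTRL-C feel more responsive at the cost of waking the main thread more often.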
Other Events and Exceptions
You may notice that in the complete example I am also catching SystemExit for the sake of completeness. In a more complex app, you would need to make sure that other exceptions were also handled so that they would result in the shutdown message going to the worker threads.
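One way to cover the "other exceptions" case is a try/finally around the main thread's work, so the shutdown event gets set however that work ends. A sketch under my own assumptions: run_with_shutdown, main_work, and workers are hypothetical names, and the workers are expected to check the event as described earlier.

```python
import threading


def run_with_shutdown(main_work, shutdown_event, workers, poll_interval=0.1):
    """Run main_work(); always tell the workers to stop afterwards."""
    try:
        main_work()
    finally:
        # Reached on normal return, KeyboardInterrupt, SystemExit, or any
        # other exception raised by main_work().
        shutdown_event.set()
        for t in workers:
            # Join with a timeout so this wait stays interruptible too.
            while t.is_alive():
                t.join(timeout=poll_interval)
```

The original exception still propagates after the finally block runs, so it can be logged or handled further up.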
You could also choose to register a signal handler (in your main thread) for other OS signals and raise an appropriate exception (e.g. SystemExit) or take other actions. The important point here is that these would all need to be handled in your main thread and communicated by some sort of convention to the worker thread(s).
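For example, a handler for SIGTERM might look something like the sketch below. The function names are hypothetical, and SIGTERM is just one signal you might care about; the point is only that the handler turns the signal into a SystemExit that the main thread's existing except clause already knows how to deal with.

```python
import signal


def handle_sigterm(signum, frame):
    # Python runs signal handlers in the main thread, so this exception
    # surfaces wherever the main thread happens to be executing.
    raise SystemExit('received signal %d' % signum)


def install_handlers():
    # Call this from the main thread before starting any workers;
    # signal.signal() raises ValueError when called from another thread.
    signal.signal(signal.SIGTERM, handle_sigterm)
```

With install_handlers() called at the top of main(), a plain kill of the process should then follow the same shutdown path as CTRL-C.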
In Summary
Dealing with these "asynchronous events" in multi-threaded applications can be a little confusing (and sometimes a little frustrating when your app refuses to exit). Understanding the key points here will hopefully help make this a bit clearer:
- Signals are handled by the main thread. This means that KeyboardInterrupt is always raised in the main thread.
- The program exits once only daemon threads are left, so daemon threads are cut off abruptly when the main thread (and any other non-daemon thread) has finished.
- For cases where you need more control over thread termination, use threading.Event objects to "signal" threads to exit.
- Be aware that Thread.join() calls without a timeout will block and may not be interruptible (notably on Python 2)! Use a while-loop with a join timeout instead.