Blocking your UI thread with PlaySound
For better or worse, the Windows UI model ties a window to a particular thread. That has led to a programming paradigm where work is divided between "UI threads" and "I/O threads". To keep your application responsive, it's critically important not to perform any blocking operations on your UI thread and to do them on the "I/O threads" instead.
One thing that people don't always realize is that even asynchronous APIs block. This isn't surprising - a single processor core can only do one thing at a time (to be pedantic, processor cores can and do perform more than one thing at a time, but the C (or C++) language is defined to run on an abstract machine that enforces various strict ordering semantics, so the C (or C++) compiler will do what is necessary to ensure that the language's ordering semantics are met[1]).
So what does an "async" API really do, given that most APIs are written in languages that don't contain native concurrency support[2]? Well, usually it packages up the parameters to the API and queues them to a worker thread (this is what the CLR does for many of the "async" CLR operations - they're not really asynchronous, they're just synchronous calls made on some other thread).
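To make the "package up the parameters and queue them" pattern concrete, here is a minimal C++ sketch. WorkerQueue and add_async are hypothetical names invented for this illustration; this is not how the CLR or any Windows API is actually implemented. The point to notice is that post() runs on the caller's thread: taking the lock and copying the parameters into the queue is the synchronous part of the "async" call.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical single-threaded worker queue, for illustration only.
class WorkerQueue {
public:
    WorkerQueue() : worker_([this] { run(); }) {}

    ~WorkerQueue() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        ready_.notify_one();
        worker_.join();  // the worker drains any queued items before exiting
    }

    // Runs on the CALLER's thread: this is the synchronous part of the call.
    void post(std::function<void()> work) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(std::move(work));
        }
        ready_.notify_one();
    }

private:
    // Runs on the worker thread: pulls items off the queue and executes them.
    void run() {
        for (;;) {
            std::function<void()> work;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                ready_.wait(lock, [this] { return stopping_ || !items_.empty(); });
                if (items_.empty()) return;  // only reachable once stopping_ is set
                work = std::move(items_.front());
                items_.pop();
            }
            work();  // the "real" operation, executed synchronously over here
        }
    }

    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::function<void()>> items_;
    bool stopping_ = false;
    std::thread worker_;  // declared last so the members above exist before it starts
};

// Hypothetical "async" API: packages its parameters and queues the real work.
std::future<int> add_async(WorkerQueue& queue, int a, int b) {
    auto result = std::make_shared<std::promise<int>>();
    std::future<int> future = result->get_future();
    queue.post([result, a, b] { result->set_value(a + b); });
    return future;
}
```

A caller would create one WorkerQueue, call add_async(queue, 2, 3), and later call get() on the returned future to collect the result. The addition itself is just an ordinary synchronous call made on some other thread.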
For some asynchronous APIs (like ReadFile and WriteFile) you CAN implement real asynchronous semantics - under the covers, the ReadFile API adds a read request to a worker queue and starts the I/O associated with reading the data from disk. When the hardware interrupt occurs indicating that the read is complete, the I/O subsystem removes the read request from the worker queue and completes it[3].
The critical thing to realize is that even for the APIs that do support real asynchronous activity there's STILL synchronous processing going on - you still need to package up the parameters for the operation and add them to a queue somewhere, and that can stall the processor. For most operations it doesn't matter - the time to queue the parameters is sufficiently small that you can perform it on the UI thread.
And sometimes it isn't small enough. It turns out that my favorite API, PlaySound, is a great example of this. PlaySound provides asynchronous behavior with the SND_ASYNC flag, but it does a fair amount of work before dispatching the call to a worker thread. Unfortunately, some of the processing done in the application thread can take many milliseconds (especially if this is the first call to winmm.dll).
I originally wrote down the operations that were performed on the application's thread, but then I realized that doing so would cement the behavior for all time, and I don't want to do that. So the following will have to suffice:
In general, PlaySound does the processing necessary to determine the filename (or WAV image) in the application thread and posts the real work (rendering the sound) to a worker thread. That processing is likely to involve synchronous I/Os and registry reads. It may involve searching the path looking for a filename. For SND_RESOURCE, it will also involve reading the resource data from the specified module.
Because of this processing, it's possible for the PlaySound(..., SND_ASYNC) operation to take several hundred milliseconds (and we've seen it take as long as several seconds if the current directory is located on an unreliable network). As a result, even the SND_ASYNC version of the PlaySound API should be avoided on UI threads[4].
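One way to follow that advice is to make the PlaySound call from a thread other than the UI thread. The sketch below is only an illustration under stated assumptions: play_sound_blocking and play_sound_off_ui_thread are hypothetical helper names, the sound file name is made up, and since the call is already off the UI thread I've assumed SND_SYNC is acceptable there. On non-Windows builds a stand-in replaces the real call so the sketch still compiles.

```cpp
#include <cassert>
#include <future>
#include <string>
#include <utility>

#ifdef _WIN32
#include <windows.h>
#include <mmsystem.h>  // PlaySoundW; link with winmm.lib
#endif

// Hypothetical helper: plays a WAV file and returns when PlaySound returns.
bool play_sound_blocking(const std::wstring& path) {
#ifdef _WIN32
    // Assumption: SND_SYNC is fine here because we are not on the UI thread.
    return PlaySoundW(path.c_str(), nullptr,
                      SND_FILENAME | SND_NODEFAULT | SND_SYNC) != FALSE;
#else
    return !path.empty();  // stand-in so the sketch builds on other platforms
#endif
}

// Hypothetical helper: moves the whole call, including the filename lookup
// and any synchronous I/O it performs, onto another thread.
std::future<bool> play_sound_off_ui_thread(std::wstring path) {
    return std::async(std::launch::async,
                      [p = std::move(path)] { return play_sound_blocking(p); });
}
```

One caveat with this particular approach: the destructor of a future returned by std::async waits for the task to finish, so the UI code needs to keep the returned future alive (for example, in a member variable) rather than discarding it. Discarding it would put the blocking right back on the UI thread.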
[1] I bet most of you didn't know that the C language definition strictly defines an abstract machine on which the language operates.
[2] Yes, I know about the OpenMP extensions to C/C++; they don't change this scenario.
[3] I know that this is a grotesque simplification of the actual process.
[4] For those that are now scoffing: "What a piece of junk - why on earth would you even bother doing the SND_ASYNC if you're not going to really be asynchronous", I'll counter that the actual rendering of the audio samples for many sounds takes several seconds. The SND_ASYNC flag moves all the actual audio rendering off the application's thread to a worker thread, so it can result in a significant improvement in performance.
Comments
Anonymous
May 15, 2007
Nitpick (apologies): Re your point about C++'s abstract machine (I know too little about C99 to comment on it) -- it's single-threaded, and as far as I'm aware defines no semantics for multi-threaded code. The C++ standardisation committee are working on addressing this -- see the papers on concurrency under http://www.open-std.org/jtc1/sc22/wg21/docs/papers/
Anonymous
May 15, 2007
Not a real nitpick, it's valid. Someone else pointed this out as well. The C/C++ abstract machine actually is silent about concurrency, which means that it's implementation defined.
Anonymous
May 15, 2007
Way off topic I know, but is there any further information about this C/C++ abstract machine I can read up on?
Anonymous
May 15, 2007
s/, unlikes/, entirely unlikes/ If Doug Adams were still with us he'd ignore me for that.
Anonymous
May 21, 2007
While I'm not a Windows programmer by any means, isn't this entire issue a matter of how you define what you mean by _ASYNC? You're basically saying that there are a certain set of operations that the function call will do synchronously, and a certain set that aren't. Wouldn't the problem be solved quite easily* by performing the registry and file-space operations asynchronously?
* Yes, this is yet again an oversimplification since the ever-present backwards compatibility giant looms in the background.
- First time poster Sash. PS: Great work with your articles! Although I'm not a Windows programmer, I'm a big fan of software design in general, and your work is almost always enlightening.
Anonymous
May 21, 2007
> but is there any further information about this C/C++ abstract machine I can read up on?
- Get yourself a copy of the standards (C99 and C++03) ... were $18 for a PDF each, maybe a little more now. Start at www.ansi.org.
- Hang about on the standardisation newsgroups (comp.std.c and comp.std.c++).