Jaa


What’s wrong with this code, part 26 – the answer

Yesterday I posted a code snippet from inside a real piece of code inside the client side of a client/server utility in a Microsoft product.

 static DWORD WINAPI _PlayBeep(__in void* pv)
{
    UNREFERENCED_PARAMETER(pv);
    PlaySound(L".Default", NULL, SND_SYNC | SND_ALIAS);
    return 0;
}

LRESULT WndProc(...)
{
    :
    :
    case WM_KEYDOWN:
        if (!_AcceptInputKeys(wParam, lParam))
        {
            QueueUserWorkItem(_PlayBeep, NULL, 0);
        }
        break;
}

The bug here was that the call to PlaySound is synchronous.  Like many Win32 APIs, PlaySound is made thread-safe by taking a critical section around the code that does the work inside the call to PlaySound (since the semantics of PlaySound are that only one PlaySound call plays at any time, this is a reasonable thing to do).  Because the call to PlaySound specifies SND_SYNC, that processing is done on the thread that calls PlaySound and thus blocks the _PlayBeep call until the sound completes.  The “.Default” sound is approximately 1 second long, so the call to _PlayBeep takes about a second to complete.

Unfortunately keyboards repeat at a lot faster than 1 keystroke/second and each keystroke causes a call to QueueUserWorkItem.  Because there’s no thread available to process the request, the thread pool logic creates a new thread to process the next work item for each keystroke.  This quickly backs up and eventually you run into the 500 thread limit in the default process threadpool.  So you’re going to have the machine play “Ding’s” for hours before this cleans itself up.

But in this case the effects were was even worse. 

Remember when I told you that it was a client/server utility?  In this case, it meant that it used async RPC to communicate with the server.  And async RPC uses the default thread pool for on certain applications.  When the response to a server request came in, there were no RPC threads running so RPC tried to create a thread in the default threadpool.  Which failed because the default threadpool was full.

So the client/server utility stopped working.  It took about 15 seconds of hammering on invalid keys to get the utility into this state, NOT pretty.

 

The first fix was to change the WM_KEYDOWN to WM_KEYUP (so that you actually had to release the keys instead of letting the typematic feature repeat for you).  That didn’t work because it still could be reproduced, it just took longer.

The final fix was:

 static DWORD WINAPI _PlayBeep()
{
    PlaySound(L".Default", NULL, SND_ASYNC | SND_ALIAS);
    return 0;
}

LRESULT WndProc(...)
{
    :
    :
    case WM_KEYDOWN:
        if (!_AcceptInputKeys(wParam, lParam))
        {
            _PlayBeep();
        }
        break;
}

So instead of queueing a work item, the team changed the code to simply call PlaySound with SND_ASYNC and they were done.  When you specify SND_ASYNC, the PlaySound call queues the request to an internal-to-playsound worker thread, and it cancels the old sound before it starts playing a new sound (thus the sounds don’t back up).

 

The object lesson here is: Don’t queue long running items to the thread pool because it can lead to unexpected results.  And even a 1 second “Ding” can count as “long running” if it can be queued rapidly enough.

Comments

  • Anonymous
    June 29, 2009
    Under "things you should not do" I'd also add "playing Ding in response to an unknown key in a server UI utility".

  • Anonymous
    June 29, 2009
    Also filed under "reasons to not reinvent the wheel." There was a perfectly good async call already. Start thinking in microseconds, and a second becomes an eternity.

  • Anonymous
    June 30, 2009
    The comment has been removed

  • Anonymous
    June 30, 2009
    Maurits, Notepad is quite old, and was designed for general audience. On the other hand, the server utility we're discussing doesn't need those bells and whistles.

  • Anonymous
    June 30, 2009
    As a followup then, when would you ever want to use SND_SYNC?  Or, put another way, what are the potential costs of this fix?

  • Anonymous
    June 30, 2009
    I guess as a specific case of "long running items", don't do any UI work whatsoever from threadpool threads, right?  At most you should Post (not Send) a message to a window or thread whose message queue has already been created (you know because the thread created its message queue before starting the work item).