Jaa


Sometimes it's better just to waitfor

You may have noticed that the sample blocking detection script in KB articles 251004 and 271509 uses WAITFOR in a loop in order to run the sp_blocker_pssNN stored procedure. Ever wonder why that is? Why wouldn’t we schedule the execution of the procedure externally using one of the many facilities readily available for doing this sort of thing (e.g., sleep.exe, schtasks.exe, SQL Agent, the Windows AT command, etc.)? Doesn’t keeping a connection open to SQL Server use up a certain amount of resources in and of itself? Could merely running the blocking detection script in this fashion cause blocking given that the resources on the system are always finite?

One reason the script works this way is to ensure that the sp_blocker_pssNN proc can run on a consistent schedule. Once we give up our UMS worker, there’s no guarantee that we’ll get another. If a system is under heavy load, disconnecting a key user such as one running diagnostic code is the last thing you would want to do. When you attempt to reconnect, all workers may be busy doing other things (including, perhaps, being blocked), and the system may have already hit the max worker thread limit. The very problem you’re trying to diagnose – blocking – could keep you from connecting and troubleshooting the issue. By staying in a single batch and using WAITFOR to efficiently cycle the proc, we keep resource usage to a minimum while still doing the best we can to ensure that we’ll be able to get the diagnostics we’re after.

Question: If a user runs a batch of T-SQL from a client app and, although the batch has completed, remains connected, can he be assured that he’ll get a UMS worker when he submits the next batch? Answer: No. Merely being connected doesn’t ensure that you retain a UMS worker. Once it carries out a user’s work request, a worker is free to service those by other users. The only way to reserve a worker indefinitely is to run indefinitely, and that’s exactly what the blocking detection script does – it runs indefinitely in the most efficient way possible in order to keep system impact to a minimum while collecting the required information.

Normally, you wouldn’t want to have a single user intentionally take over a UMS worker using a technique like this. You should just submit work to the server as you need to and let UMS decide how to schedule it. Commandeering a worker in this fashion interferes with the server's ability to service multiple users with a single UMS worker – an important concept in keeping the server scalable. The blocking detection script takes the novel approach it does because it has to work properly when the system is under extreme stress. It has to function when the system may be running out of available UMS workers because those are exactly the kinds of situations in which being able to collect blocking-related info is especially critical.

Fortunately, SQL Server 2005 offers a far better solution to this conundrum: the Dedicated Administrator Connection. Besides addressing the "unavailable worker" scenario, the DAC provides a robust way to connect to the server when it might otherwise be unresponsive due to other reasons as well. Note that you can have only one DAC connection active at a time, and there are limits on the commands it supports, but it does present a great way to get into the server to troubleshoot problems when you might not be otherwise able to. Check the Books Online for details.

Comments