Windows Azure Storage Client Library: Potential Deadlock When Using Synchronous Methods

Update 11/06/11: The bug is fixed in the Windows Azure SDK September release .


Summary

In certain scenarios, using the synchronous methods provided in the Windows Azure Storage Client Library can lead to deadlock. Specifically, scenarios where the system is using all of the available ThreadPool threads while performing synchronous method calls via the Windows Azure Storage Client Library. Examples or such behavior are ASP.NET requests served synchronously, as well as a simple work client where you queue up N number of worker threads to execute. Note if you are manually creating threads outside of the managed ThreadPool or executing code off of a ThreadPool thread then this issue will not affect you.

When calling synchronous methods, the current implementation of the Storage Client Library blocks the calling thread while performing an asynchronous call. This blocked thread will be waiting for the asynchronous result to return. As such, if one of the asynchronous results requires an available ThreadPool thread (e.g. Memory­Stream.­Begin­Write, File­Stream.­­BeginWrite, etc.) and no ThreadPool threads are available; its callback will be added to the ThreadPool work queue to wait until a thread becomes available for it to run on. This leads to a condition where the calling thread is blocked until that asynchronous result (callback) unblocks it, but the callback will not execute until threads become unblocked; in other words the system is now deadlocked.

Affected Scenarios

This issue could affect you if your code is executing on a ThreadPool thread and you are using the synchronous methods from the Storage Client Library. Specifically, this issue will arise when the application has used all of its available ThreadPool threads. To find out if your code is executing on a ThreadPool thread you can check System.Threading.Thread.CurrentThread.IsThreadPoolThread at runtime. Some specific methods in the Storage Client Library that can exhibit this issue include the various blob download methods (file, byte array, text, stream, etc.)

Example

For example let’s say that we have a maximum of 10 ThreadPool threads in our system which can be set using ThreadPool.SetMaxThreads. If each of the threads is currently blocked on a synchronous method call waiting for the async wait handle to be set which will require a ThreadPool thread to set the wait handle, we are deadlocked since there are no available threads in the ThreadPool that can set the wait handle.

Workarounds

The following workarounds will avoid this issue:

  • Use asynchronous methods instead of their synchronous equivalents (i.e. use BeginDownloadByteArray/ EndDownloadByteArray rather than DownloadByteArray). Utilizing the Asynchronous Programming Model is really the only way to guarantee performant and scalable solutions which do not perform any blocking calls. As such, using the Asynchronous Programming Model will avoid the deadlock scenario detailed in this post.
  • If you are unable to use asynchronous methods, limit the level of concurrency in the system at the application layer to reduce simultaneous in flight operations using a semaphore construct such as System.Threading.Semaphore or Interlocked.Increment/Decrement. An example of this would be to have each incoming client perform an Interlocked.Increment on an integer and do a corresponding Interlocked.Decrement when the synchronous operation completes. With this mechanism in place you can now limit the number of simultaneous in flight operations below the limit of the ThreadPool and return “Server Busy” to avoid blocking more worker threads. When setting the maximum number of ThreadPool threads via ThreadPool.SetMaxThreads be cognizant of any additional ThreadPool threads you are using in the app domain via ThreadPool.QueueUserWorkItem or otherwise so as to accommodate them in your scenario. The goal is to limit the number of blocked threads in the system at any given point. Therefore, make sure that you do not block the thread prior to calling synchronous methods, since that will result in the same number of overall blocked threads. Instead when application reaches its concurrency limit you must ensure that no more additional ThreadPool threads become blocked.

Mitigations

If you are experiencing this issue and the options above are not viable in your scenario, you might try one of the options below. Please ensure you fully understand the implications of the actions below as they will result in additional threads being created on the system.

  • Increasing the number of ThreadPool threads can mitigate this to some extent; however deadlock will always be a possibility without a limit on simultaneous operations.
  • Offload work to non ThreadPool threads (make sure you understand the implications before doing this, the main purpose of the ThreadPool is to avoid the cost of constantly spinning up and killing off threads which can be expensive in code that runs frequently or in a tight loop).

Summary

We are currently investigating long term solutions for this issue for an upcoming release of the Windows Azure SDK. As such if you are currently affected by this issue please follow the workarounds contained in this post until a future release of the SDK is made available. To summarize, here are some best practices that will help avoid the potential deadlock detailed above.

  • Use Asynchronous methods for applications that scale – simply stated synchronous does not scale well as it implies the system must lock a thread for some amount of time. For applications with low demand this is acceptable, however threads are a finite resource in a system and should be treated as such. Applications that desire to scale should use simultaneous asynchronous calls so that a given thread can service many calls.
  • Limit concurrent requests when using synchronous APIs - Use semaphores/counters to control concurrent requests.
  • Perform a stress test - where you purposely saturate the ThreadPool workers to ensure your application responds appropriately.

References

MSDN ThreadPool documentation: https://msdn.microsoft.com/en-us/library/y5htx827(v=VS.90).aspx

Implementing the CLR Asynchronous Programming Model: https://msdn.microsoft.com/en-us/magazine/cc163467.aspx

Developing High-Performance ASP.NET Applications: https://msdn.microsoft.com/en-us/library/aa719969(v=VS.71).aspx

Joe Giardino