Blob Download Bug in Windows Azure SDK 1.5

Update: We have now released a fix for this issue. The download blob methods in this version throw an IOException if the connection is closed while downloading the blob, which is the same behavior seen in versions 1.4 and earlier of the StorageClient library.  

We strongly recommend that users of SDK version 1.5.20830.1814 upgrade their applications immediately to the new version, 1.5.20928.1904. You can determine whether you have the affected version of the SDK by going to Programs and Features in the Control Panel and checking the version of the Windows Azure SDK. If version 1.5.20830.1814 is installed, please follow these steps to upgrade:

  1. Click “Get Tools & SDK” on the Windows Azure SDK download page. You do not need to uninstall the previous version first.
  2. Update your projects to use the copy of Microsoft.WindowsAzure.StorageClient.dll found in C:\Program Files\Windows Azure SDK\v1.5\bin\

 

We found a bug in the StorageClient library in Windows Azure SDK 1.5 that impacts the DownloadToStream, DownloadToFile, DownloadText, and DownloadByteArray methods for Windows Azure Blobs. 

If a client performs a synchronous blob download using SDK 1.5 and its connection is closed, the client can receive a partial download of the blob. The problem is that the client does not get an exception when the connection is closed, so it thinks the full blob was downloaded. For example, if the blob was 15MB and the connection was closed after the client had downloaded just 1MB, the client would have only 1MB (instead of 15MB) and would think that it had the whole blob. Instead, the client should have gotten an exception. The problem occurs only when the connection to the client is closed, and only for synchronous downloads; asynchronous downloads are not affected.
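As a defensive measure, a caller can compare the number of bytes it actually received against the length it expected. The following is only a minimal sketch of that idea and is not part of the StorageClient library: the DownloadCheck class and both of its methods are hypothetical names, and where the expected length comes from is left to the caller.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the StorageClient library.
public static class DownloadCheck
{
    // Copies everything readable from source to target and returns the byte count.
    // A synchronous Read that returns 0 is treated as end of stream, so a
    // silently truncated stream looks the same as a complete one here.
    public static long CopyAndCount(Stream source, Stream target)
    {
        byte[] buffer = new byte[64 * 1024];
        long total = 0;
        int read;
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            target.Write(buffer, 0, read);
            total += read;
        }
        return total;
    }

    // Throws an IOException if the received length differs from the expected length.
    public static void EnsureComplete(long expectedLength, long actualLength)
    {
        if (actualLength != expectedLength)
        {
            throw new IOException(string.Format(
                "Incomplete download: expected {0} bytes, received {1}.",
                expectedLength, actualLength));
        }
    }
}
```

In the 15MB example above, calling EnsureComplete(15L * 1024 * 1024, 1L * 1024 * 1024) would throw, which turns the silent partial download into a visible failure. This assumes the caller has a trustworthy expected length, for example one recorded when the blob was uploaded.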

The issue was introduced in version 1.5 of the Azure SDK when we changed the synchronous download methods to call the synchronous Read API on the web response stream. We found that once the response headers have been received, the synchronous read method on the .NET response stream does not throw an exception when the connection is lost before the blob content has been fully received. Because no exception is thrown, the download method behaves as if the entire download has completed and returns successfully even though only partial content has been downloaded.

The problem only occurs when all of the following are true:

  • A synchronous download method is used
  • The client receives at least the response headers, and the connection to the client is then closed before the client receives the entire content

Notably, one scenario where this can occur is if the request timeout happens after the headers have been received, but before all of the content can be transferred. For example, if the client set the timeout to 30 seconds for download of a 100GB blob, then it’s likely that this problem would occur, because 30 seconds is long enough for the response headers to be received along with part of the blob content, but is not long enough to transfer the full 100GB of content.
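To make the arithmetic concrete: transferring 100GB within 30 seconds would require a sustained rate of more than 3GB per second. The sketch below estimates how long a transfer needs at a given throughput. The TimeoutEstimate class, its ForTransfer method, and the safety factor are hypothetical illustrations, and the throughput figure is an assumption you would need to measure for your own environment.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class TimeoutEstimate
{
    // Returns the time needed to move sizeBytes at bytesPerSecond,
    // multiplied by safetyFactor to leave headroom for slow periods.
    public static TimeSpan ForTransfer(long sizeBytes, double bytesPerSecond, double safetyFactor)
    {
        if (bytesPerSecond <= 0)
        {
            throw new ArgumentOutOfRangeException("bytesPerSecond");
        }

        double seconds = sizeBytes / bytesPerSecond * safetyFactor;
        return TimeSpan.FromSeconds(seconds);
    }
}
```

For example, at an assumed 10MB per second, ForTransfer(100L * 1024 * 1024 * 1024, 10.0 * 1024 * 1024, 1.0) comes to 10240 seconds, or roughly 2.8 hours, which is far beyond a 30 second timeout.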

This does not impact asynchronous downloads, because asynchronous reads from a response stream throw an IOException when the connection is closed.  In addition, calls to OpenRead() are not affected as they also use the asynchronous read methods.

We will be releasing an SDK hotfix for this soon and apologize for any inconvenience this may have caused. Until then, we recommend that customers use SDK 1.4 or the async methods to download blobs in SDK 1.5. Customers who have already started using SDK 1.5 can work around this issue by replacing calls to DownloadToStream, DownloadToFile, DownloadText, and DownloadByteArray with BeginDownloadToStream/EndDownloadToStream. This ensures that an IOException is thrown if the connection is closed, as in SDK 1.4. The following example shows how to do that:

CloudBlob blob = new CloudBlob(uri);
blob.DownloadToStream(myFileStream); // WARNING: Can result in partial successful downloads

// NOTE: Use async method to ensure an exception is thrown if connection is
// closed after partial download
blob.EndDownloadToStream(
    blob.BeginDownloadToStream(myFileStream, null /* callback */, null /* state */));

If you rely on the text, file, or byte array versions of download, the extension methods below are provided for your convenience. Each one wraps the stream-based download to work around this problem.

using System.IO;
using System.Text;
using Microsoft.WindowsAzure.StorageClient;

public static class CloudBlobExtensions
{
    /// <summary>
    /// Downloads the contents of a blob to a stream.
    /// </summary>
    /// <param name="target">The target stream.</param>
    public static void DownloadToStreamSync(this CloudBlob blob, Stream target)
    {
        blob.DownloadToStreamSync(target, null);
    }

    /// <summary>
    /// Downloads the contents of a blob to a stream.
    /// </summary>
    /// <param name="target">The target stream.</param>
    /// <param name="options">An object that specifies any additional options for the 
    /// request.</param>
    public static void DownloadToStreamSync(this CloudBlob blob, Stream target, 
        BlobRequestOptions options)
    {
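        // Pass the caller's options through; the async Begin/End pair throws an
        // IOException if the connection is closed before the download completes.
        blob.EndDownloadToStream(blob.BeginDownloadToStream(target, options, null, null));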
    }

    /// <summary>
    /// Downloads the blob's contents.
    /// </summary>
    /// <returns>The contents of the blob, as a string.</returns>
    public static string DownloadTextSync(this CloudBlob blob)
    {
        return blob.DownloadTextSync(null);
    }

    /// <summary>
    /// Downloads the blob's contents.
    /// </summary>
    /// <param name="options">An object that specifies any additional options for the 
    /// request.</param>
    /// <returns>The contents of the blob, as a string.</returns>
    public static string DownloadTextSync(this CloudBlob blob, BlobRequestOptions options)
    {
        Encoding encoding = Encoding.UTF8;

        byte[] array = blob.DownloadByteArraySync(options);

        return encoding.GetString(array);
    }

    /// <summary>
    /// Downloads the blob's contents to a file.
    /// </summary>
    /// <param name="fileName">The path and file name of the target file.</param>
    public static void DownloadToFileSync(this CloudBlob blob, string fileName)
    {
        blob.DownloadToFileSync(fileName, null);
    }

    /// <summary>
    /// Downloads the blob's contents to a file.
    /// </summary>
    /// <param name="fileName">The path and file name of the target file.</param>
    /// <param name="options">An object that specifies any additional options for the 
    /// request.</param>
    public static void DownloadToFileSync(this CloudBlob blob, string fileName, 
        BlobRequestOptions options)
    {
        using (var fileStream = File.Create(fileName))
        {
            blob.DownloadToStreamSync(fileStream, options);
        }
    }

    /// <summary>
    /// Downloads the blob's contents as an array of bytes.
    /// </summary>
    /// <returns>The contents of the blob, as an array of bytes.</returns>
    public static byte[] DownloadByteArraySync(this CloudBlob blob)
    {
        return blob.DownloadByteArraySync(null);
    }

    /// <summary>
    /// Downloads the blob's contents as an array of bytes. 
    /// </summary>
    /// <param name="options">An object that specifies any additional options for the 
    /// request.</param>
    /// <returns>The contents of the blob, as an array of bytes.</returns>
    public static byte[] DownloadByteArraySync(this CloudBlob blob, 
        BlobRequestOptions options)
    {
        using (var memoryStream = new MemoryStream())
        {
            blob.DownloadToStreamSync(memoryStream, options);

            return memoryStream.ToArray();
        }
    }
}

Usage Examples:

string text = blob.DownloadTextSync();
byte[] bytes = blob.DownloadByteArraySync();
blob.DownloadToFileSync(fileName);

Joe Giardino