Compressing messages in WCF part one - Fixing the GZipMessageEncoder bug

WCF's built-in compression options are limited in .NET 4.0. However, the WCF samples include a GZip example that shows how to write your own MessageEncoder that wraps the output of another encoder and applies GZip to the messages. If network bandwidth is a constraint in your environment, compressing the messages that go across the wire can help considerably. In this series, we will look at how to use the GZip message encoder and what effect it has on performance.

Download the WCF/WF Samples from here: https://www.microsoft.com/downloads/en/details.aspx?FamilyID=35ec8682-d5fd-4bc3-a51a-d8ad115a8792&displaylang=en

The first thing to do is examine the code for GZipMessageEncoder itself. Download and install the WCF/WF samples to the directory of your choice, then navigate to the WCF/Extensibility/MessageEncoder/Compression/CS directory and open the solution. Right-click the solution in Solution Explorer and choose "Set Startup Projects". Select the "Multiple startup projects" radio button and use the dropdowns to change the action for the client and service projects to "Start". You should then be able to hit F5. The service and client windows should come up and run, exchanging a couple of messages.

The GZipMessageEncoder works by wrapping another encoder. The sample uses buffered messages, which means that the entire message is stored in a single contiguous byte[]. We can examine the effect of compression on the buffered message by altering the code to write out the sizes before and after compression. To do this, open the GZipMessageEncoderFactory.cs file. Navigate to the GZipMessageEncoder class and find the WriteMessage method that returns an ArraySegment<byte>. Alter the code as shown below:

//One of the two main entry points into the encoder. Called by WCF to encode a Message into a buffered byte array.
public override ArraySegment<byte> WriteMessage(Message message, int maxMessageSize, 
    BufferManager bufferManager, int messageOffset)
{
    //Use the inner encoder to encode a Message into a buffered byte array
    ArraySegment<byte> buffer = innerEncoder.WriteMessage(message, maxMessageSize, 
        bufferManager, 0);
    //Compress the resulting byte array
    System.Diagnostics.Debug.WriteLine("Original size: {0}", buffer.Count);
    buffer = CompressBuffer(buffer, bufferManager, messageOffset);
    System.Diagnostics.Debug.WriteLine("Compressed size: {0}", buffer.Count);
    return buffer;
}

This writes the size of the buffer to the debug output before and after compression, so we can see how well our messages are being compressed. Hit F5 again to run, then bring up the Output window in Visual Studio. You should see something like this:

Original size: 751
Compressed size: 1024
Original size: 426
Compressed size: 512
Original size: 2714
Compressed size: 1024
Original size: 2382
Compressed size: 1024

There are a couple of problems here. First, small messages actually get bigger: 751 bytes becomes 1024, and 426 becomes 512. Second, the compressed sizes are all exact powers of two.

The second problem may well explain the first: if the reported size is being rounded up to a power of two, a small message will appear to grow. Let's examine the CompressBuffer code to see what's wrong.

//Helper method to compress an array of bytes
static ArraySegment<byte> CompressBuffer(ArraySegment<byte> buffer, BufferManager bufferManager, 
    int messageOffset)
{
    MemoryStream memoryStream = new MemoryStream();
    
    using (GZipStream gzStream = new GZipStream(memoryStream, CompressionMode.Compress, true))
    {
        gzStream.Write(buffer.Array, buffer.Offset, buffer.Count);
    }

    byte[] compressedBytes = memoryStream.ToArray();
    int totalLength = messageOffset + compressedBytes.Length;
    byte[] bufferedBytes = bufferManager.TakeBuffer(totalLength);

    Array.Copy(compressedBytes, 0, bufferedBytes, messageOffset, compressedBytes.Length);

    bufferManager.ReturnBuffer(buffer.Array);
    ArraySegment<byte> byteArray = new ArraySegment<byte>(bufferedBytes, messageOffset, 
        bufferedBytes.Length - messageOffset);

    return byteArray;
}

The problem is in the ArraySegment construction near the end of the method, in the bufferedBytes.Length - messageOffset argument. The bufferedBytes variable is a buffer taken from the BufferManager, which gives you a buffer at least as large as what you asked for, usually rounding up to the next power of two. So when we use bufferedBytes.Length to compute the number of bytes in the ArraySegment, we get the size of the pooled buffer rather than the size of the compressed data. To fix it, replace bufferedBytes.Length - messageOffset with compressedBytes.Length. Run the test again to see the improvement:

Original size: 751
Compressed size: 592
Original size: 426
Compressed size: 377
Original size: 2714
Compressed size: 874
Original size: 2382
Compressed size: 670

This looks much better! For those of you who are curious, I've already reported this bug to the samples team and it should be cleared up in the next release.
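
For reference, here is a minimal sketch of the helper with the one-line fix applied, wrapped in a small console program so it can be run outside the sample. The CompressBufferSketch class, the Main method, the pool limits, the test payload, and the offset of 16 are all illustrative assumptions and are not part of the sample; only the body of CompressBuffer follows the code discussed above. It assumes a reference to the assembly that provides System.ServiceModel.Channels.BufferManager.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.ServiceModel.Channels;

static class CompressBufferSketch
{
    // Helper with the fix applied: the segment length is the number of
    // compressed bytes, not the length of the (possibly larger) pooled buffer.
    public static ArraySegment<byte> CompressBuffer(ArraySegment<byte> buffer,
        BufferManager bufferManager, int messageOffset)
    {
        MemoryStream memoryStream = new MemoryStream();

        // leaveOpen: true keeps memoryStream usable after the GZipStream is
        // disposed; disposing the GZipStream flushes the last compressed block.
        using (GZipStream gzStream = new GZipStream(memoryStream, CompressionMode.Compress, true))
        {
            gzStream.Write(buffer.Array, buffer.Offset, buffer.Count);
        }

        byte[] compressedBytes = memoryStream.ToArray();
        int totalLength = messageOffset + compressedBytes.Length;

        // TakeBuffer may hand back an array longer than totalLength.
        byte[] bufferedBytes = bufferManager.TakeBuffer(totalLength);
        Array.Copy(compressedBytes, 0, bufferedBytes, messageOffset, compressedBytes.Length);

        // The uncompressed buffer is no longer needed, so return it to the pool.
        bufferManager.ReturnBuffer(buffer.Array);

        return new ArraySegment<byte>(bufferedBytes, messageOffset, compressedBytes.Length);
    }

    static void Main()
    {
        // Hypothetical pool limits, chosen only for this demo.
        BufferManager bufferManager = BufferManager.CreateBufferManager(1024 * 1024, 64 * 1024);

        // Put a repetitive payload into a pooled buffer, as an inner encoder would.
        byte[] payload = System.Text.Encoding.UTF8.GetBytes(new string('a', 2000));
        byte[] pooled = bufferManager.TakeBuffer(payload.Length);
        Array.Copy(payload, pooled, payload.Length);
        ArraySegment<byte> original = new ArraySegment<byte>(pooled, 0, payload.Length);

        ArraySegment<byte> compressed = CompressBuffer(original, bufferManager, 16);

        Console.WriteLine("Original size: {0}", original.Count);
        Console.WriteLine("Compressed size: {0}", compressed.Count);
        Console.WriteLine("Pooled buffer length: {0}", compressed.Array.Length);
    }
}
```

Comparing the last two printed values shows the gap between the compressed byte count and the length of the array that the pool actually handed back, which is the gap the original code was reporting as message size.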

Comments

  • Anonymous
    August 09, 2012
    Thank you for this!  Just what I needed.

  • Anonymous
    October 31, 2012
    What if one wants to use it in a custom binding that encrypts the message? How do you compress it before it is encrypted? (I know that there is a security issue with doing this.)

  • Anonymous
    April 10, 2014
    Hi there, is there a way to support dynamic compression, i.e. check whether the client accepts gzip? Regards, Jeroen

  • Anonymous
    April 10, 2014
    Hi Jeroen, Yes there is. Actually it's built into WCF now. Take a look here: msdn.microsoft.com/.../aa751889(v=vs.110).aspx Scroll down to the heading "Compression and the Binary Encoder".

  • Anonymous
    January 14, 2015
    Dustin,
    Question 1: Why is
    byte[] bufferedBytes = bufferManager.TakeBuffer(totalLength);
    not
    byte[] bufferedBytes = bufferManager.TakeBuffer(compressedBytes.Length);
    Question 2: Why is
    Array.Copy(compressedBytes, 0, bufferedBytes, messageOffset, compressedBytes.Length);
    not
    Array.Copy(compressedBytes, 0, bufferedBytes, 0, compressedBytes.Length);
    Question 3: Why is
    ArraySegment<byte> byteArray = new ArraySegment<byte>(bufferedBytes, messageOffset, bufferedBytes.Length - messageOffset);
    not
    ArraySegment<byte> byteArray = new ArraySegment<byte>(bufferedBytes, 0, compressedBytes.Length);
    Thanks for any feedback.

  • Anonymous
    January 14, 2015
    @David It might help to look at this example as well: msdn.microsoft.com/.../ms195359(v=vs.110).aspx My understanding is that the messageOffset is telling you where the message should start inside the buffer. There is no content in that part of the buffer that you have to copy over, but WCF is asking you to leave space for something there. I'm not sure if that's for a header or for some other use.

  • Anonymous
    January 14, 2015
    @Dustin I initially thought that as well. The more I thought about it, though, the less I understood: if the basis of the message handling is the BufferManager (to avoid a lot of array allocations that have to be GC'd), why wouldn't there just be another buffer for things of that nature rather than intermixing it with a message content buffer? The help file description of the messageOffset parameter was not very enlightening.
    I dug downstream into the TextMessageEncoder and the BinaryMessageEncoder to see if they stuffed anything in that space, and they don't appear to.
    referencesource.microsoft.com/.../TextMessageEncoder.cs.html
    referencesource.microsoft.com/.../BinaryMessageEncoder.cs.html
    I also found interesting the notable use of 0 for messageOffset in
    public ArraySegment<byte> WriteMessage(Message message, int maxMessageSize, BufferManager bufferManager)
    {
        ArraySegment<byte> arraySegment = WriteMessage(message, maxMessageSize, bufferManager, 0);
        return arraySegment;
    }
    found in referencesource.microsoft.com/.../MessageEncoder.cs.html
    So that leaves upstream .... I think.

  • Anonymous
    January 14, 2015
    @David - Ya, it's definitely weird. It's one of those things that I think made sense to someone a long time ago but they didn't document it.