WCF Binary XML and dictionaries

One of the encodings which come with WCF (since its first version, in .NET Framework 3.0) is a fast and lightweight encoding for XML documents: the WCF Binary XML format (“officially” called .NET Binary Format: XML Data Structure, https://msdn.microsoft.com/en-us/library/cc219210(v=PROT.10).aspx). It is essentially a new way of representing XML without the “normal” angle-bracket notation (<element attr="value">the content</element>), resulting in a smaller document. For example, each “end element” node is represented by a single byte in the binary format, or even by none, since some text records carry the information that they are followed by an end element. That alone already makes the data considerably more compact.

However, where the binary format really shines is when we use its dictionaries. Essentially, a dictionary string is a string which can be identified by a small integer. Written inline, the name ‘Person’ is encoded in 7 bytes (1 for the string length and 6 for the letters). If the string belongs to one of the dictionaries agreed upon by the two parties (one which encodes the XML in the binary format, the other which reads the encoded binary back into XML), it normally takes only 1 or 2 bytes to represent.

An example makes this clearer. The following “.NET Binary XML” document represents the XML <Envelope></Envelope>:

40 08 45 6E 76 65 6C 6F 70 65 01

The first byte (0x40) indicates that this is a “ShortElement” record, followed by the length of the element name (0x08), followed by the UTF-8 encoding of the local name (0x45 … 0x65). Finally, the last byte (0x01) represents an end element.

If the string Envelope is part of a dictionary, however, the same document can be rewritten by replacing the 9 bytes used for the element name (the length byte plus the 8 characters) with its dictionary id. Assuming that the dictionary id of “Envelope” is 0x02, this is how the same document would be encoded:

42 02 01

The first byte was changed to indicate that this is a new type of record: a “ShortDictionaryElement”. It’s followed by the dictionary id of the string it represents.
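
To make the two encodings concrete, here is a minimal sketch that builds both byte sequences by hand. The class and method names (BinaryXmlSketch, EncodeInline, EncodeWithDictionaryId) are made up for this illustration and are not part of WCF, and the sketch only handles names and ids that fit in a single byte.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper for illustration only; not a WCF API.
static class BinaryXmlSketch
{
    const byte EndElement = 0x01;
    const byte ShortElement = 0x40;
    const byte ShortDictionaryElement = 0x42;

    // Encodes <name></name> with the element name written inline.
    public static byte[] EncodeInline(string name)
    {
        byte[] utf8 = Encoding.UTF8.GetBytes(name);
        if (utf8.Length > 0x7F)
            throw new ArgumentException("This sketch only handles names shorter than 128 bytes.");

        var bytes = new List<byte> { ShortElement, (byte)utf8.Length };
        bytes.AddRange(utf8);
        bytes.Add(EndElement);
        return bytes.ToArray();
    }

    // Encodes the same empty element, with the name replaced by a dictionary id.
    public static byte[] EncodeWithDictionaryId(byte dictionaryId)
    {
        if (dictionaryId > 0x7F)
            throw new ArgumentException("This sketch only handles single-byte ids.");

        return new byte[] { ShortDictionaryElement, dictionaryId, EndElement };
    }
}
```

Calling BinaryXmlSketch.EncodeInline("Envelope") gives the 11 bytes of the first document, and BinaryXmlSketch.EncodeWithDictionaryId(0x02) gives the 3 bytes of the second one.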

Two dictionaries

Replacing the strings with their dictionary ids is without question a great size reduction strategy for the binary encoding. However, somehow both parties need to understand that 02 means “Envelope”. The parties can agree on a set of defined words which will be used in most of the messages, and those words will become the dictionary. Or the sender can include, along with the message to the receiver, a list of words which will define the dictionary. The former is what I call a “static” dictionary, since both parties must agree, ahead of time, on all strings (and their ids) which comprise the dictionary. The latter is what I call a “dynamic” dictionary, since the dictionary table is created on the fly, for each message (or session of messages).

WCF uses both kinds of dictionaries. Since most of the (non-REST) communication in WCF is done using SOAP (and more specifically, the binary encoding binding element only supports SOAP version 1.2), there are many strings which are present in all messages, such as “Envelope”, “mustUnderstand”, “Header”, “Body”, “Action”, some namespaces and so on. There are also strings for WS-* protocols, such as WS-Security, WS-RM, WS-Trust, WS-Transaction, and so on. WCF already “knows” about those strings, so they don’t need to be transmitted with the messages; they can simply be encoded with their ids. This dictionary is used whenever the BinaryMessageEncodingBindingElement is used. The static dictionary used by WCF is detailed at https://msdn.microsoft.com/en-us/library/cc219187(v=PROT.10).aspx, where a table lists all strings in that dictionary along with their corresponding dictionary ids.

The second kind of dictionary is used for contract-specific data. That includes data contract names and namespaces, data member names, service contract names and namespaces, operation names and so on. Take the type Character below, for example. Every time an instance of Character is serialized, the writer will output the names of all the fields and the name of the type (or member, if it’s declared as a member of another type). So if more than one instance of Character is likely to be serialized, replacing those names with dictionary ids will help reduce the message size, even if we have to send those strings prior to the message itself.

    [DataContract(Name = "Character", Namespace = "https://my.namespace.com/cartoonCharacters")]
    public class Character
    {
        [DataMember]
        public string Name;
        [DataMember]
        public int Age;
        [DataMember]
        public DateTime DateOfBirth;
        [DataMember]
        public string Type;
    }

The dynamic dictionary is used by WCF only when the binary encoding is used with the TCP or Named Pipe transport. That’s because, for simple messages, the overhead of transmitting the strings along with the document would increase the message size instead of reducing it (unless the strings which are “dictionarized” are used multiple times, using a dictionary doesn’t pay off). So WCF uses the dynamic dictionary for sessionful protocols only, in which multiple messages can be sent over the same connection. Each string in the dictionary then only needs to be sent once, and subsequent messages received on the same connection can simply reference the strings which were sent with previous ones. With HTTP it is not used: each request can go to a different node behind a load balancer, so each request would need to send all strings again. The “official” specification of the format WCF uses to send the strings which form the dynamic dictionary can be found at https://msdn.microsoft.com/en-us/library/cc219190(v=PROT.10).aspx.

One more thing: the binary format itself doesn’t define whether an id in a dictionary node comes from a static or a dynamic dictionary. It simply defines that the ids are to be interpreted as keys from a dictionary. In order to differentiate between the two dictionary types, WCF uses the convention that even ids (0, 2, 4, …) represent strings from the static dictionary (the ids in the documentation for the WCF static dictionary linked above are all even), while odd ids (1, 3, 5, …) represent strings from the dynamic dictionary. That’s purely a WCF implementation detail, as the binary format itself doesn’t mandate such a distinction.
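
The even/odd convention is easy to express in code. The sketch below uses hypothetical names (DictionaryIdConvention and its methods are not WCF APIs). It also assumes that the n-th string of a session, counting from 0, appears on the wire as id 2n + 1; that is my reading of how zero-based session keys line up with the odd ids, so treat it as an assumption rather than a quote from the specification.

```csharp
using System;

// Hypothetical helper for illustration only; not a WCF API.
static class DictionaryIdConvention
{
    // Even wire ids refer to the static dictionary, odd ones to the dynamic dictionary.
    public static bool IsStaticId(int wireId)
    {
        return (wireId & 1) == 0;
    }

    // Assumption: session key n (0, 1, 2, ...) is written as wire id 2n + 1.
    public static int SessionKeyToWireId(int sessionKey)
    {
        if (sessionKey < 0) throw new ArgumentOutOfRangeException(nameof(sessionKey));
        return sessionKey * 2 + 1;
    }

    // Inverse of SessionKeyToWireId; only meaningful for odd wire ids.
    public static int WireIdToSessionKey(int wireId)
    {
        if (wireId < 0 || IsStaticId(wireId))
            throw new ArgumentException("Not a dynamic dictionary id.", nameof(wireId));
        return (wireId - 1) / 2;
    }
}
```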

XmlBinaryWriter / XmlBinaryReader

Unless you’re keen on manipulating bytes directly, you’ll likely end up using the WCF XML binary writer / reader to produce / consume documents in the XML binary format. The classes XmlDictionaryWriter and XmlDictionaryReader (from the namespace System.Xml, in the System.Runtime.Serialization assembly) each contain a few overloads to create an XML writer / reader using the binary format. The listing below shows two of those overloads:

    // Signatures only; the rest of each class is omitted.
    public class XmlDictionaryWriter
    {
        public static XmlDictionaryWriter CreateBinaryWriter(Stream stream, IXmlDictionary dictionary, XmlBinaryWriterSession writerSession);
    }

    public class XmlDictionaryReader
    {
        public static XmlDictionaryReader CreateBinaryReader(Stream stream, IXmlDictionary dictionary, XmlDictionaryReaderQuotas quotas, XmlBinaryReaderSession readerSession);
    }

The IXmlDictionary parameter is the one used for the static dictionary; both the writer and the reader must be given the same dictionary. The XmlBinaryWriterSession is passed when creating the writer, and this instance receives calls to add new dictionary strings which don’t belong to the IXmlDictionary passed to the writer. Those strings, if successfully added to the session, need to be transmitted alongside the message so that the reader can recreate the dynamic dictionary on its side, by adding those values to the XmlBinaryReaderSession instance it passes to the factory method in XmlDictionaryReader.
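
As a small illustration of the static dictionary parameter alone, the sketch below writes an empty element through a binary writer and reads it back with a reader that shares the same XmlDictionary. The class name StaticDictionaryRoundTrip is made up for this example, and it uses the shorter overloads that take no session, on the assumption that no dynamic dictionary is needed here.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical helper for illustration only.
static class StaticDictionaryRoundTrip
{
    public static string WriteAndReadBack()
    {
        // Both sides must build the dictionary with the same strings in the same order.
        XmlDictionary dictionary = new XmlDictionary();
        XmlDictionaryString envelope = dictionary.Add("Envelope");

        // Write <Envelope></Envelope>, passing the name as a dictionary string.
        MemoryStream stream = new MemoryStream();
        XmlDictionaryWriter writer = XmlDictionaryWriter.CreateBinaryWriter(stream, dictionary);
        writer.WriteStartElement(envelope, XmlDictionaryString.Empty);
        writer.WriteEndElement();
        writer.Flush();
        byte[] encoded = stream.ToArray();

        // Read it back; the reader resolves the id through the same dictionary.
        XmlDictionaryReader reader = XmlDictionaryReader.CreateBinaryReader(
            new MemoryStream(encoded), dictionary, XmlDictionaryReaderQuotas.Max);
        reader.Read();
        return reader.LocalName;
    }
}
```

If the reader were created with a different dictionary (or none), it would have no way to map the id back to the name, which is why the two parties have to agree on the dictionary ahead of time.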

DataContractSerializer and the dynamic dictionary

Besides being used within a WCF client/server interaction, the dynamic dictionary can also be used when we’re doing stand-alone serialization (think storage of an “object” in a database for later retrieval, for example). That is especially the case if there are many instances of the same type, which would benefit from “dictionarizing” the member names. I haven’t seen any examples out there, so I’ll post one way it can be done here.

<warning>The code below works, but to keep it small for informational purposes it leaves out security and validity checks. If you’re planning on using it on any production system, please add some validation, especially in the class MyReaderSession below.</warning>

Basically, we need to find out which strings have been added to the writer session, and then we add those strings to the “serialized” object itself (it needs to go prior to the serialized part, since in order to deserialize the object we need to know the contents of the dynamic dictionary). The class XmlBinaryWriterSession doesn’t give a direct way to find out which strings have been added to it, but it’s a simple matter of deriving from that class and overriding the TryAdd method:

    using System.Collections.Generic;
    using System.IO;
    using System.Text;
    using System.Xml;

    class MyWriterSession : XmlBinaryWriterSession
    {
        // Strings accepted by the session, in the order their keys were assigned.
        List<string> dictionaryStrings = new List<string>();

        public override bool TryAdd(XmlDictionaryString value, out int key)
        {
            bool result = base.TryAdd(value, out key);
            if (result)
            {
                this.dictionaryStrings.Add(value.Value);
            }

            return result;
        }

        // Writes the string table: total size, then (length, UTF-8 bytes) per string.
        public void SaveDictionaryStrings(Stream stream)
        {
            MemoryStream ms = new MemoryStream();
            foreach (string dicString in this.dictionaryStrings)
            {
                byte[] stringBytes = Encoding.UTF8.GetBytes(dicString);
                this.WriteMB31(ms, stringBytes.Length);
                ms.Write(stringBytes, 0, stringBytes.Length);
            }

            this.WriteMB31(stream, (int)ms.Position);
            ms.Position = 0;
            ms.CopyTo(stream);
        }

        // Writes a non-negative integer 7 bits at a time, low bits first;
        // the high bit of each byte says whether another byte follows.
        private void WriteMB31(Stream stream, int value)
        {
            byte b = (byte)(value & 0x7F);
            while (value > 0x7F)
            {
                stream.WriteByte((byte)(b | 0x80));
                value = value >> 7;
                b = (byte)(value & 0x7F);
            }

            stream.WriteByte((byte)(b & 0x7F));
        }
    }

We can then use that new session to initialize a binary writer which will be then passed to the DataContractSerializer. The method MyWriterSession.SaveDictionaryStrings in the derived writer session class uses the same format as used by the dynamic dictionary in WCF to save the string table to the output stream. And we copy the serialized object to the result stream, right after the string table.

    MemoryStream ms = new MemoryStream();
    MyWriterSession writerSession = new MyWriterSession();
    XmlDictionaryWriter xdw = XmlDictionaryWriter.CreateBinaryWriter(ms, null, writerSession);
    DataContractSerializer dcs = new DataContractSerializer(typeof(List<Character>));
    dcs.WriteObject(xdw, listOfCharacters);
    xdw.Flush();
    MemoryStream result = new MemoryStream();
    writerSession.SaveDictionaryStrings(result);
    ms.Position = 0;
    ms.CopyTo(result);
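
To see what the string table at the front of the result looks like, here is a small stand-alone sketch of the same layout: a total size, then a length-prefixed UTF-8 entry per string. The name StringTableSketch is hypothetical, and to keep it short it assumes every length fits in a single byte (under 128), so it is not a replacement for the MB31 logic.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical helper for illustration only; single-byte lengths assumed.
static class StringTableSketch
{
    public static byte[] Build(IEnumerable<string> strings)
    {
        // First lay out the entries: one length byte, then the UTF-8 bytes.
        MemoryStream entries = new MemoryStream();
        foreach (string s in strings)
        {
            byte[] utf8 = Encoding.UTF8.GetBytes(s);
            if (utf8.Length > 0x7F)
                throw new ArgumentException("This sketch only handles strings shorter than 128 bytes.");
            entries.WriteByte((byte)utf8.Length);
            entries.Write(utf8, 0, utf8.Length);
        }

        if (entries.Length > 0x7F)
            throw new ArgumentException("This sketch only handles tables shorter than 128 bytes.");

        // Then prefix the entries with their total size in bytes.
        MemoryStream table = new MemoryStream();
        table.WriteByte((byte)entries.Length);
        entries.WriteTo(table);
        return table.ToArray();
    }
}
```

For the two strings “Name” and “Age”, the table is 09 04 4E 61 6D 65 03 41 67 65: nine bytes of entries, a 4-byte string, then a 3-byte string.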

During deserialization, the reader session first needs to be recreated from the string table at the start of the stream. The helper class below accomplishes that, creating an XmlBinaryReaderSession object which can be passed directly to the CreateBinaryReader method.

    using System.IO;
    using System.Text;
    using System.Xml;

    class MyReaderSession
    {
        public static XmlBinaryReaderSession CreateReaderSession(Stream stream)
        {
            XmlBinaryReaderSession result = new XmlBinaryReaderSession();
            int nextId = 0;
            int bytesRead;
            // The table starts with its total size in bytes (MB31-encoded).
            int sessionSize = ReadMB31(stream, out bytesRead);
            while (sessionSize > 0)
            {
                // Each entry is an MB31 length followed by that many UTF-8 bytes.
                int stringSize = ReadMB31(stream, out bytesRead);
                sessionSize -= bytesRead + stringSize;
                byte[] stringBytes = new byte[stringSize];
                int offset = 0;
                while (offset < stringBytes.Length)
                {
                    // Stream.Read may return fewer bytes than requested.
                    int read = stream.Read(stringBytes, offset, stringBytes.Length - offset);
                    if (read <= 0) throw new EndOfStreamException();
                    offset += read;
                }

                string dicString = Encoding.UTF8.GetString(stringBytes);
                // Keys are assigned in table order: 0, 1, 2, ...
                result.Add(nextId++, dicString);
            }

            return result;
        }

        private static int ReadMB31(Stream stream, out int bytesRead)
        {
            bytesRead = 0;
            int result = 0;
            int shift = 0;
            int b;
            do
            {
                b = stream.ReadByte();
                if (b < 0) throw new EndOfStreamException();
                bytesRead++;
                // The low 7 bits are payload; the high bit only flags a continuation.
                result |= (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);

            return result;
        }
    }
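
The multi-byte integer encoding is the part most worth checking by hand, because the continuation bit has to be masked off and every byte, including the last one, has to contribute its 7 bits. Below is a minimal stand-alone sketch of the encoding and decoding over byte arrays; Mb31Sketch is a hypothetical name for this illustration, not a WCF type.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not a WCF API.
static class Mb31Sketch
{
    // Emits 7 bits per byte, low bits first; the high bit marks "more bytes follow".
    public static byte[] Encode(int value)
    {
        if (value < 0) throw new ArgumentOutOfRangeException(nameof(value));

        var bytes = new List<byte>();
        while (value > 0x7F)
        {
            bytes.Add((byte)((value & 0x7F) | 0x80));
            value >>= 7;
        }

        bytes.Add((byte)value);
        return bytes.ToArray();
    }

    // Reverses Encode: strips the continuation bit and shifts each group into place.
    public static int Decode(byte[] bytes)
    {
        int result = 0;
        int shift = 0;
        foreach (byte b in bytes)
        {
            result |= (b & 0x7F) << shift;
            shift += 7;
            if ((b & 0x80) == 0)
            {
                return result;
            }
        }

        throw new FormatException("Truncated multi-byte integer.");
    }
}
```

For example, 300 encodes as AC 02: the low 7 bits (0x2C) with the continuation bit set, followed by the remaining bits (0x02). Values up to 127 take a single byte, and 128 becomes 80 01.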

And to close the sample, here’s the code to deserialize the object:

    XmlBinaryReaderSession readerSession = MyReaderSession.CreateReaderSession(serializedStream);
    XmlDictionaryReader xdr = XmlDictionaryReader.CreateBinaryReader(serializedStream, null, XmlDictionaryReaderQuotas.Max, readerSession);
    DataContractSerializer dcs = new DataContractSerializer(typeof(List<Character>));
    List<Character> result = (List<Character>)dcs.ReadObject(xdr);

In a simple test I did, with the Character class shown above and a list with 6 instances, I got a reduction in size of the serialized object of about 28% when using dictionaries, compared with simply using the binary format without dictionaries. Different data types will yield different improvements, so try it out to see if the gains offset the additional complexity in your code (which isn’t much).

More code and other utilities

Understanding the WCF XML binary format is complicated. I have a few tools which I use to help me understand those, which I posted at https://github.com/carlosfigueira/WCFSamples/tree/master/WCFBinaryTools. Among those tools are a parser for the format, along with two programs that use it – one command-line tool, which outputs the binary nodes to the console, and one GUI-based, which shows the nodes in a tree view. There’s also a sample to use the dynamic dictionary with the DataContractSerializer, which is quite useful if you’re serializing collections of a type (since the members will be repeated for each member of the collection).

Comments

  • Anonymous
    June 11, 2012
    Thanks Carlos, that is great information. Something remains unclear to me. If you look at the byte stream, how can a parser tell whether there is a string table in front of the document, or whether the document starts directly without any string table? Since the first byte of the string table is the length in bytes of the whole table, a parser could still interpret this first byte as a record code described in the MC-NBFX standard rather than as the start of a string table. How can a parser tell the difference? Thank you