Serialization and Deserialization
Windows Communication Foundation (WCF) includes a new serialization engine, the DataContractSerializer. The DataContractSerializer translates between .NET Framework objects and XML, in both directions. This topic explains how the serializer works.
When serializing .NET Framework objects, the serializer understands a variety of serialization programming models, including the new data contract model. For a full list of supported types, see Types Supported by the Data Contract Serializer. For an introduction to data contracts, see Using Data Contracts.
To read and write XML, the serializer uses the XmlReader and XmlWriter classes. It also supports the XmlDictionaryReader and XmlDictionaryWriter classes, which let it produce optimized XML in some cases, such as when using the WCF binary XML format.
WCF also includes a companion serializer, the NetDataContractSerializer. The NetDataContractSerializer:
- Is not secure. For more information, see the BinaryFormatter security guide.
- Is similar to the BinaryFormatter and SoapFormatter serializers because it also emits .NET Framework type names as part of the serialized data.
- Is used when the same types are shared on the serializing and the deserializing ends.
Both DataContractSerializer and NetDataContractSerializer derive from a common base class, XmlObjectSerializer.
Warning
The DataContractSerializer serializes strings containing control characters with a hexadecimal value below 20 as XML entities. This may cause a problem with a non-WCF client when sending such data to a WCF service.
Creating a DataContractSerializer Instance
Constructing an instance of the DataContractSerializer is an important step. After construction, you cannot change any of the settings.
Specifying the Root Type
The root type is the type of which instances are serialized or deserialized. The DataContractSerializer has many constructor overloads, but, at a minimum, a root type must be supplied using the type parameter.
A serializer created for a certain root type cannot be used to serialize (or deserialize) another type, unless the type is derived from the root type. The following example shows two classes.
[DataContract]
public class Person
{
// Code not shown.
}
[DataContract]
public class PurchaseOrder
{
// Code not shown.
}
<DataContract()> _
Public Class Person
' Code not shown.
End Class
<DataContract()> _
Public Class PurchaseOrder
' Code not shown.
End Class
This code constructs an instance of the DataContractSerializer that can be used only to serialize or deserialize instances of the Person class.
DataContractSerializer dcs = new DataContractSerializer(typeof(Person));
// This can now be used to serialize/deserialize Person but not PurchaseOrder.
Dim dcs As New DataContractSerializer(GetType(Person))
' This can now be used to serialize/deserialize Person but not PurchaseOrder.
Specifying Known Types
If the types being serialized involve polymorphism that is not already handled using the KnownTypeAttribute attribute or some other mechanism, a list of possible known types must be passed to the serializer's constructor using the knownTypes parameter. For more information about known types, see Data Contract Known Types.
The following example shows a class, LibraryPatron, that includes a collection of a specific type, LibraryItem. The second class defines the LibraryItem type. The third and fourth classes (Book and Newspaper) inherit from the LibraryItem class.
[DataContract]
public class LibraryPatron
{
[DataMember]
public LibraryItem[] borrowedItems;
}
[DataContract]
public class LibraryItem
{
// Code not shown.
}
[DataContract]
public class Book : LibraryItem
{
// Code not shown.
}
[DataContract]
public class Newspaper : LibraryItem
{
// Code not shown.
}
<DataContract()> _
Public Class LibraryPatron
<DataMember()> _
Public borrowedItems() As LibraryItem
End Class
<DataContract()> _
Public Class LibraryItem
' Code not shown.
End Class
<DataContract()> _
Public Class Book
Inherits LibraryItem
' Code not shown.
End Class
<DataContract()> _
Public Class Newspaper
Inherits LibraryItem
' Code not shown.
End Class
The following code constructs an instance of the serializer using the knownTypes parameter.
// Create a serializer for the inherited types using the knownTypes parameter.
Type[] knownTypes = new Type[] { typeof(Book), typeof(Newspaper) };
DataContractSerializer dcs =
new DataContractSerializer(typeof(LibraryPatron), knownTypes);
// All types are known after construction.
' Create a serializer for the inherited types using the knownTypes parameter.
Dim knownTypes() As Type = {GetType(Book), GetType(Newspaper)}
Dim dcs As New DataContractSerializer(GetType(LibraryPatron), knownTypes)
' All types are known after construction.
Specifying the Default Root Name and Namespace
Normally, when an object is serialized, the default name and namespace of the outermost XML element are determined according to the data contract name and namespace. The names of all inner elements are determined from data member names, and their namespace is the data contract's namespace. The following example sets Name and Namespace values in the constructors of the DataContractAttribute and DataMemberAttribute classes.
[DataContract(Name = "PersonContract", Namespace = "http://schemas.contoso.com")]
public class Person2
{
[DataMember(Name = "AddressMember")]
public Address theAddress;
}
[DataContract(Name = "AddressContract", Namespace = "http://schemas.contoso.com")]
public class Address
{
[DataMember(Name = "StreetMember")]
public string street;
}
<DataContract(Name:="PersonContract", [Namespace]:="http://schemas.contoso.com")> _
Public Class Person2
<DataMember(Name:="AddressMember")> _
Public theAddress As Address
End Class
<DataContract(Name:="AddressContract", [Namespace]:="http://schemas.contoso.com")> _
Public Class Address
<DataMember(Name:="StreetMember")> _
Public street As String
End Class
Serializing an instance of the Person2 class produces XML similar to the following.
<PersonContract xmlns="http://schemas.contoso.com">
<AddressMember>
<StreetMember>123 Main Street</StreetMember>
</AddressMember>
</PersonContract>
However, you can customize the default name and namespace of the root element by passing the values of the rootName and rootNamespace parameters to the DataContractSerializer constructor. Note that the rootNamespace does not affect the namespace of the contained elements that correspond to data members. It affects only the namespace of the outermost element.
These values can be passed as strings or instances of the XmlDictionaryString class to allow for their optimization using the binary XML format.
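As a rough illustration of the rootName and rootNamespace parameters, the following C# sketch serializes an object with a custom outermost element. The RootDemoPerson type, the RootNameDemo class, and the "Customer" name and namespace URI are hypothetical and chosen only for this example.

```csharp
using System.IO;
using System.Runtime.Serialization;
using System.Text;

// Hypothetical contract used only for this sketch.
[DataContract(Name = "PersonContract", Namespace = "http://schemas.contoso.com")]
public class RootDemoPerson
{
    [DataMember(Name = "NameMember")]
    public string name;
}

public static class RootNameDemo
{
    // Serializes one object with a custom root name and namespace
    // and returns the resulting XML text.
    public static string SerializeWithCustomRoot()
    {
        var person = new RootDemoPerson { name = "Jay Hamlin" };

        // The second and third arguments replace the default root
        // element name ("PersonContract") and its namespace.
        var dcs = new DataContractSerializer(
            typeof(RootDemoPerson), "Customer", "http://www.contoso.com/customers");

        using (var ms = new MemoryStream())
        {
            dcs.WriteObject(ms, person);
            return Encoding.UTF8.GetString(ms.ToArray());
        }
    }
}
```

In the returned XML, the outermost element should be named Customer and sit in the custom namespace, while the NameMember element remains in the data contract's own namespace.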
Setting the Maximum Objects Quota
Some DataContractSerializer constructor overloads have a maxItemsInObjectGraph parameter. This parameter determines the maximum number of objects the serializer serializes or deserializes in a single ReadObject method call. (The method always reads one root object, but this object may have other objects in its data members. Those objects may have other objects, and so on.) The default is 65536. Note that when serializing or deserializing arrays, every array entry counts as a separate object. Also, note that some objects may have a large memory representation, and so this quota alone may not be sufficient to prevent a denial of service attack. For more information, see Security Considerations for Data. If you need to increase this quota beyond the default value, it is important to do so on both the sending (serializing) and receiving (deserializing) sides, because the quota applies both when reading and when writing data.
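The effect of the quota can be sketched as follows. This example assumes the DataContractSerializerSettings class (available in .NET Framework 4.5 and later) rather than a long constructor overload; its MaxItemsInObjectGraph property corresponds to the maxItemsInObjectGraph parameter. The QuotaDemo class and its method are hypothetical names.

```csharp
using System.IO;
using System.Runtime.Serialization;

public static class QuotaDemo
{
    // Returns true if serializing an int array of the given length
    // fails because the object graph exceeds the configured quota.
    public static bool ExceedsQuota(int itemCount, int quota)
    {
        var settings = new DataContractSerializerSettings
        {
            MaxItemsInObjectGraph = quota
        };
        var dcs = new DataContractSerializer(typeof(int[]), settings);

        try
        {
            using (var ms = new MemoryStream())
            {
                // Every array entry counts against the quota.
                dcs.WriteObject(ms, new int[itemCount]);
            }
            return false;
        }
        catch (SerializationException)
        {
            // Thrown when the quota is exceeded.
            return true;
        }
    }
}
```

With a small quota and a large array, such as ExceedsQuota(1000, 10), the call is expected to report a failure; with a generous quota, such as ExceedsQuota(10, 1000), it should succeed.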
Round Trips
A round trip occurs when an object is deserialized and re-serialized in one operation. Thus, it goes from XML to an object instance, and back again into an XML stream.
Some DataContractSerializer constructor overloads have an ignoreExtensionDataObject parameter, which is set to false by default. In this default mode, data can be sent on a round trip from a newer version of a data contract through an older version and back to the newer version without loss, as long as the data contract implements the IExtensibleDataObject interface. For example, suppose version 1 of the Person data contract contains the Name and PhoneNumber data members, and version 2 adds a Nickname member. If IExtensibleDataObject is implemented, when sending information from version 2 to version 1, the Nickname data is stored, and then re-emitted when the data is serialized again; therefore, no data is lost in the round trip. For more information, see Forward-Compatible Data Contracts and Data Contract Versioning.
Security and Schema Validity Concerns with Round Trips
Round trips may have security implications. For example, deserializing and storing large amounts of extraneous data may be a security risk. There may also be security concerns about re-emitting data that cannot be verified, especially if digital signatures are involved. For example, in the previous scenario, the version 1 endpoint could be signing a Nickname value that contains malicious data. Finally, there may be schema validity concerns: an endpoint may want to always emit data that strictly adheres to its stated contract and not any extra values. In the previous example, the version 1 endpoint's contract says that it emits only Name and PhoneNumber, and if schema validation is being used, emitting the extra Nickname value causes validation to fail.
Enabling and Disabling Round Trips
To turn off round trips, do not implement the IExtensibleDataObject interface. If you have no control over the types, set the ignoreExtensionDataObject parameter to true to achieve the same effect.
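The round-trip behavior described in this section can be sketched in C# as follows. PersonV1, PersonV2, and RoundTripDemo are hypothetical types standing in for the two contract versions, and the sketch assumes the DataContractSerializerSettings class, whose IgnoreExtensionDataObject property corresponds to the ignoreExtensionDataObject parameter.

```csharp
using System.IO;
using System.Runtime.Serialization;
using System.Text;

// Hypothetical "version 2" contract with the extra Nickname member.
[DataContract(Name = "Person", Namespace = "http://schemas.contoso.com")]
public class PersonV2
{
    [DataMember] public string Name;
    [DataMember] public string Nickname;
}

// Hypothetical "version 1" contract: it has no Nickname member,
// but it can hold unknown data in ExtensionData.
[DataContract(Name = "Person", Namespace = "http://schemas.contoso.com")]
public class PersonV1 : IExtensibleDataObject
{
    [DataMember] public string Name;
    public ExtensionDataObject ExtensionData { get; set; }
}

public static class RoundTripDemo
{
    // Writes a version 2 object, reads it as version 1, and writes
    // the version 1 object back out. Returns the final XML text.
    public static string RoundTripThroughV1(bool ignoreExtensionData)
    {
        var v2Serializer = new DataContractSerializer(typeof(PersonV2));
        var v1Serializer = new DataContractSerializer(
            typeof(PersonV1),
            new DataContractSerializerSettings
            {
                IgnoreExtensionDataObject = ignoreExtensionData
            });

        using (var input = new MemoryStream())
        using (var output = new MemoryStream())
        {
            v2Serializer.WriteObject(input, new PersonV2 { Name = "Jay", Nickname = "JJ" });
            input.Position = 0;

            var v1 = (PersonV1)v1Serializer.ReadObject(input);
            v1Serializer.WriteObject(output, v1);
            return Encoding.UTF8.GetString(output.ToArray());
        }
    }
}
```

Calling RoundTripThroughV1(false) should return XML that still carries the Nickname value, whereas RoundTripThroughV1(true) should drop it.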
Object Graph Preservation
Normally, the serializer does not care about object identity, as in the following code.
[DataContract]
public class PurchaseOrder
{
[DataMember]
public Address billTo;
[DataMember]
public Address shipTo;
}
[DataContract]
public class Address
{
[DataMember]
public string street;
}
<DataContract()> _
Public Class PurchaseOrder
<DataMember()> _
Public billTo As Address
<DataMember()> _
Public shipTo As Address
End Class
<DataContract()> _
Public Class Address
<DataMember()> _
Public street As String
End Class
The following code creates a purchase order.
// Construct a purchase order:
Address adr = new Address();
adr.street = "123 Main St.";
PurchaseOrder po = new PurchaseOrder();
po.billTo = adr;
po.shipTo = adr;
' Construct a purchase order:
Dim adr As New Address()
adr.street = "123 Main St."
Dim po As New PurchaseOrder()
po.billTo = adr
po.shipTo = adr
Notice that the billTo and shipTo fields are set to the same object instance. However, the generated XML duplicates the information, and looks similar to the following XML.
<PurchaseOrder>
<billTo><street>123 Main St.</street></billTo>
<shipTo><street>123 Main St.</street></shipTo>
</PurchaseOrder>
However, this approach has the following characteristics, which may be undesirable:
- Performance. Replicating data is inefficient.
- Circular references. If objects refer to themselves, even through other objects, serializing by replication results in an infinite loop. (The serializer throws a SerializationException if this happens.)
- Semantics. Sometimes it is important to preserve the fact that two references are to the same object, and not to two identical objects.
For these reasons, some DataContractSerializer constructor overloads have a preserveObjectReferences parameter (the default is false). When this parameter is set to true, a special method of encoding object references, which only WCF understands, is used. When set to true, the XML code example now resembles the following.
<PurchaseOrder ser:id="1">
<billTo ser:id="2"><street ser:id="3">123 Main St.</street></billTo>
<shipTo ser:ref="2"/>
</PurchaseOrder>
The "ser" namespace refers to the standard serialization namespace, http://schemas.microsoft.com/2003/10/Serialization/. Each piece of data is serialized only once and given an ID number, and subsequent uses result in a reference to the already serialized data.
Important
If both "id" and "ref" attributes are present in the data contract XMLElement, then the "ref" attribute is honored and the "id" attribute is ignored.
It is important to understand the limitations of this mode:
- The XML the DataContractSerializer produces with preserveObjectReferences set to true is not interoperable with any other technologies, and can be accessed only by another DataContractSerializer instance, also with preserveObjectReferences set to true.
- There is no metadata (schema) support for this feature. The schema that is produced is valid only for the case when preserveObjectReferences is set to false.
- This feature may cause the serialization and deserialization process to run slower. Although data does not have to be replicated, extra object comparisons must be performed in this mode.
Caution
When the preserveObjectReferences mode is enabled, it is especially important to set the maxItemsInObjectGraph value to the correct quota. Due to the way arrays are handled in this mode, it is easy for an attacker to construct a small malicious message that results in large memory consumption limited only by the maxItemsInObjectGraph quota.
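To make the difference between the two modes concrete, the following C# sketch serializes a purchase order whose two address fields point to the same instance, deserializes it, and checks whether the copy still shares one instance. RefAddress, RefPurchaseOrder, and ObjectReferenceDemo are hypothetical names, and the sketch assumes the DataContractSerializerSettings class, whose PreserveObjectReferences property corresponds to the preserveObjectReferences parameter.

```csharp
using System.IO;
using System.Runtime.Serialization;

// Hypothetical contracts mirroring the PurchaseOrder example.
[DataContract]
public class RefAddress
{
    [DataMember] public string street;
}

[DataContract]
public class RefPurchaseOrder
{
    [DataMember] public RefAddress billTo;
    [DataMember] public RefAddress shipTo;
}

public static class ObjectReferenceDemo
{
    // Round-trips a purchase order and reports whether billTo and
    // shipTo still refer to one shared object afterward.
    public static bool SharedInstanceSurvives(bool preserveObjectReferences)
    {
        var settings = new DataContractSerializerSettings
        {
            PreserveObjectReferences = preserveObjectReferences
        };
        var dcs = new DataContractSerializer(typeof(RefPurchaseOrder), settings);

        var adr = new RefAddress { street = "123 Main St." };
        var po = new RefPurchaseOrder { billTo = adr, shipTo = adr };

        using (var ms = new MemoryStream())
        {
            dcs.WriteObject(ms, po);
            ms.Position = 0;

            // The Stream overload is used here only because the data
            // is trusted (it was produced a few lines above).
            var copy = (RefPurchaseOrder)dcs.ReadObject(ms);
            return ReferenceEquals(copy.billTo, copy.shipTo);
        }
    }
}
```

With the mode enabled, the deserialized copy should keep a single shared address; with the default setting, the two fields should come back as separate but equal objects.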
Specifying a Data Contract Surrogate
Some DataContractSerializer constructor overloads have a dataContractSurrogate parameter, which may be set to null. Otherwise, you can use it to specify a data contract surrogate, which is a type that implements the IDataContractSurrogate interface. You can then use the interface to customize the serialization and deserialization process. For more information, see Data Contract Surrogates.
Serialization
The following information applies to any class that inherits from the XmlObjectSerializer, including the DataContractSerializer and NetDataContractSerializer classes.
Simple Serialization
The most basic way to serialize an object is to pass it to the WriteObject method. There are three overloads, one each for writing to a Stream, an XmlWriter, or an XmlDictionaryWriter. With the Stream overload, the output is XML in the UTF-8 encoding. With the XmlDictionaryWriter overload, the serializer optimizes its output for binary XML.
When using the WriteObject method, the serializer uses the default name and namespace for the wrapper element and writes it out along with the contents (see the previous "Specifying the Default Root Name and Namespace" section).
The following example demonstrates writing with an XmlDictionaryWriter.
Person p = new Person();
DataContractSerializer dcs =
new DataContractSerializer(typeof(Person));
XmlDictionaryWriter xdw =
XmlDictionaryWriter.CreateTextWriter(someStream, Encoding.UTF8);
dcs.WriteObject(xdw, p);
Dim p As New Person()
Dim dcs As New DataContractSerializer(GetType(Person))
Dim xdw As XmlDictionaryWriter = _
XmlDictionaryWriter.CreateTextWriter(someStream, Encoding.UTF8)
dcs.WriteObject(xdw, p)
This produces XML similar to the following.
<Person>
<Name>Jay Hamlin</Name>
<Address>123 Main St.</Address>
</Person>
Step-By-Step Serialization
Use the WriteStartObject, WriteObjectContent, and WriteEndObject methods to write the start element, write the object contents, and close the wrapper element, respectively.
Note
There are no Stream overloads of these methods.
This step-by-step serialization has two common uses. One is to insert contents such as attributes or comments between WriteStartObject and WriteObjectContent, as shown in the following example.
dcs.WriteStartObject(xdw, p);
xdw.WriteAttributeString("serializedBy", "myCode");
dcs.WriteObjectContent(xdw, p);
dcs.WriteEndObject(xdw);
dcs.WriteStartObject(xdw, p)
xdw.WriteAttributeString("serializedBy", "myCode")
dcs.WriteObjectContent(xdw, p)
dcs.WriteEndObject(xdw)
This produces XML similar to the following.
<Person serializedBy="myCode">
<Name>Jay Hamlin</Name>
<Address>123 Main St.</Address>
</Person>
Another common use is to avoid using WriteStartObject and WriteEndObject entirely, and to write your own custom wrapper element (or even skip writing a wrapper altogether), as shown in the following code.
xdw.WriteStartElement("MyCustomWrapper");
dcs.WriteObjectContent(xdw, p);
xdw.WriteEndElement();
xdw.WriteStartElement("MyCustomWrapper")
dcs.WriteObjectContent(xdw, p)
xdw.WriteEndElement()
This produces XML similar to the following.
<MyCustomWrapper>
<Name>Jay Hamlin</Name>
<Address>123 Main St.</Address>
</MyCustomWrapper>
Note
Using step-by-step serialization may result in schema-invalid XML.
Deserialization
The following information applies to any class that inherits from the XmlObjectSerializer, including the DataContractSerializer and NetDataContractSerializer classes.
The most basic way to deserialize an object is to call one of the ReadObject method overloads. There are three overloads, one each for reading with an XmlDictionaryReader, an XmlReader, or a Stream. Note that the Stream overload creates a textual XmlDictionaryReader that is not protected by any quotas, and should be used only to read trusted data.
Also note that the object the ReadObject method returns must be cast to the appropriate type.
The following code constructs an instance of the DataContractSerializer and an XmlDictionaryReader, then deserializes a Person instance.
DataContractSerializer dcs = new DataContractSerializer(typeof(Person));
FileStream fs = new FileStream(path, FileMode.Open);
XmlDictionaryReader reader =
XmlDictionaryReader.CreateTextReader(fs, new XmlDictionaryReaderQuotas());
Person p = (Person)dcs.ReadObject(reader);
Dim dcs As New DataContractSerializer(GetType(Person))
Dim fs As New FileStream(path, FileMode.Open)
Dim reader As XmlDictionaryReader = _
XmlDictionaryReader.CreateTextReader(fs, New XmlDictionaryReaderQuotas())
Dim p As Person = CType(dcs.ReadObject(reader), Person)
Before calling the ReadObject method, position the XML reader on the wrapper element or on a non-content node that precedes the wrapper element. You can do this by calling the Read method of the XmlReader or its derivation, and testing the NodeType, as shown in the following code.
DataContractSerializer ser = new DataContractSerializer(typeof(Person),
"Customer", @"http://www.contoso.com");
FileStream fs = new FileStream(path, FileMode.Open);
XmlDictionaryReader reader =
XmlDictionaryReader.CreateTextReader(fs, new XmlDictionaryReaderQuotas());
while (reader.Read())
{
switch (reader.NodeType)
{
case XmlNodeType.Element:
if (ser.IsStartObject(reader))
{
Console.WriteLine("Found the element");
Person p = (Person)ser.ReadObject(reader);
Console.WriteLine("{0} {1}",
p.Name, p.Address);
}
Console.WriteLine(reader.Name);
break;
}
}
Dim ser As New DataContractSerializer(GetType(Person), "Customer", "http://www.contoso.com")
Dim fs As New FileStream(path, FileMode.Open)
Dim reader As XmlDictionaryReader = XmlDictionaryReader.CreateTextReader(fs, New XmlDictionaryReaderQuotas())
While reader.Read()
Select Case reader.NodeType
Case XmlNodeType.Element
If ser.IsStartObject(reader) Then
Console.WriteLine("Found the element")
Dim p As Person = CType(ser.ReadObject(reader), Person)
Console.WriteLine("{0} {1}", _
p.Name, p.Address)
End If
Console.WriteLine(reader.Name)
End Select
End While
Note that you can read attributes on this wrapper element before handing the reader to ReadObject.
When using one of the simple ReadObject overloads, the deserializer looks for the default name and namespace on the wrapper element (see the preceding section, "Specifying the Default Root Name and Namespace") and throws an exception if it finds an unknown element. In the preceding example, the <Person> wrapper element is expected. The IsStartObject method is called to verify that the reader is positioned on an element that is named as expected.
There is a way to disable this wrapper element name check; some overloads of the ReadObject method take the Boolean parameter verifyObjectName, which is set to true by default. When set to false, the name and namespace of the wrapper element are ignored. This is useful for reading XML that was written using the step-by-step serialization mechanism described previously.
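A minimal C# sketch of this scenario follows: the object content is written inside a custom wrapper element, then read back with verifyObjectName set to false. The WrappedPerson type, the WrapperDemo class, and the MyCustomWrapper element name are hypothetical.

```csharp
using System.IO;
using System.Runtime.Serialization;
using System.Text;
using System.Xml;

// Hypothetical contract used only for this sketch.
[DataContract]
public class WrappedPerson
{
    [DataMember] public string Name;
}

public static class WrapperDemo
{
    // Writes the object content inside a custom wrapper element, then
    // reads it back while skipping the wrapper name check.
    public static string ReadFromCustomWrapper()
    {
        var dcs = new DataContractSerializer(typeof(WrappedPerson));

        using (var ms = new MemoryStream())
        {
            // ownsStream: false keeps the stream open after the writer is disposed.
            using (var xdw = XmlDictionaryWriter.CreateTextWriter(ms, Encoding.UTF8, false))
            {
                xdw.WriteStartElement("MyCustomWrapper");
                dcs.WriteObjectContent(xdw, new WrappedPerson { Name = "Jay Hamlin" });
                xdw.WriteEndElement();
            }

            ms.Position = 0;
            using (var reader = XmlDictionaryReader.CreateTextReader(
                ms, new XmlDictionaryReaderQuotas()))
            {
                // false: do not verify the wrapper element's name and namespace.
                var p = (WrappedPerson)dcs.ReadObject(reader, false);
                return p.Name;
            }
        }
    }
}
```

Passing true (the default) instead would be expected to fail here, because the wrapper element is not named after the data contract.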
Using the NetDataContractSerializer
The primary difference between the DataContractSerializer and the NetDataContractSerializer is that the DataContractSerializer uses data contract names, whereas the NetDataContractSerializer outputs full .NET Framework assembly and type names in the serialized XML. This means that the exact same types must be shared between the serialization and deserialization endpoints. It also means that the known types mechanism is not required with the NetDataContractSerializer, because the exact types to be deserialized are always known.
However, several problems can occur:
- Security. Any type found in the XML being deserialized is loaded. This can be exploited to force the loading of malicious types. Using the NetDataContractSerializer with untrusted data should be done only if a Serialization Binder is used (using the Binder property or constructor parameter). The binder permits only safe types to be loaded. The Binder mechanism is identical to the one that types in the System.Runtime.Serialization namespace use.
- Versioning. Using full type and assembly names in the XML severely restricts how types can be versioned. The following cannot be changed: type names, namespaces, assembly names, and assembly versions. Setting the AssemblyFormat property or constructor parameter to Simple instead of the default value of Full allows for assembly version changes, but not for generic parameter types.
- Interoperability. Because .NET Framework type and assembly names are included in the XML, platforms other than the .NET Framework cannot access the resulting data.
- Performance. Writing out the type and assembly names significantly increases the size of the resulting XML.
This mechanism is similar to binary or SOAP serialization used by .NET Framework remoting (specifically, the BinaryFormatter and the SoapFormatter).
Using the NetDataContractSerializer is similar to using the DataContractSerializer, with the following differences:
- The constructors do not require you to specify a root type. You can serialize any type with the same instance of the NetDataContractSerializer.
- The constructors do not accept a list of known types. The known types mechanism is unnecessary if type names are serialized into the XML.
- The constructors do not accept a data contract surrogate. Instead, they accept an ISurrogateSelector parameter called surrogateSelector (which maps to the SurrogateSelector property). This is a legacy surrogate mechanism.
- The constructors accept a parameter called assemblyFormat of the FormatterAssemblyStyle type that maps to the AssemblyFormat property. As discussed previously, this can be used to enhance the versioning capabilities of the serializer. This is identical to the FormatterAssemblyStyle mechanism in binary or SOAP serialization.
- The constructors accept a StreamingContext parameter called context that maps to the Context property. You can use this to pass information into types being serialized. This usage is identical to that of the StreamingContext mechanism used in other System.Runtime.Serialization classes.
- The Serialize and Deserialize methods are aliases for the WriteObject and ReadObject methods. These exist to provide a more consistent programming model with binary or SOAP serialization.
For more information about these features, see Binary Serialization.
The XML formats that the NetDataContractSerializer and the DataContractSerializer use are normally not compatible. That is, attempting to serialize with one of these serializers and deserialize with the other is not a supported scenario.
Also, note that the NetDataContractSerializer does not output the full .NET Framework type and assembly name for each node in the object graph. It outputs that information only where it is ambiguous. That is, it outputs at the root object level and for any polymorphic cases.