Customizing .NET XML serialization process
.NET XML serialization is convenient and useful. It generates a dynamic assembly to perform serialization and deserialization, and you can even create the XML serialization assembly in advance with SGen.exe to improve startup performance.
However, in some rare cases you might want to fine-tune the generated XML serialization assembly. This isn’t officially supported, but because an XML serialization assembly pre-created by SGen.exe is allowed, we have an extensibility point that makes this possible. In this post I’m going to show you how to do exactly this.
There are three key steps involved:
- Modifying the source code.
- Naming the serialization assembly.
- Setting ParentAssemblyId.
Let’s walk through these steps in detail.
1. Modifying the source code
This step is obvious, but the question is how to get the source code that would be used to compile the XML serialization assembly. This is well supported: just add the following snippet to your configuration file and run your application.
<configuration>
  <system.diagnostics>
    <switches>
      <add name="XmlSerialization.Compilation" value="1" />
    </switches>
  </system.diagnostics>
</configuration>
By setting this switch you’re telling XML serialization to preserve the source code it uses to generate the dynamic assembly, and by running the application you’re having XML serialization generate that source code for you. The generated assembly and the source code can be found in the %temp% folder. The file names look like a random combination of characters, such as x5n5uhgy.dll and x5n5uhgy.cs.
Now you have the source file, and it’s straightforward to fine-tune it to suit your needs.
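As an illustration of "running your application", here is a minimal sketch of a console program that makes XML serialization generate its code. It assumes the .NET Framework and the configuration switch shown above; the Order type and the GenerateSerializerDemo class are hypothetical names invented for this example.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical type, used only for this illustration.
public class Order
{
    public int Id { get; set; }
    public string Customer { get; set; }
}

public static class GenerateSerializerDemo
{
    // Serializes a value to an XML string. Constructing the XmlSerializer
    // is the step that causes the serialization code to be generated.
    public static string ToXml<T>(T value)
    {
        var serializer = new XmlSerializer(typeof(T));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, value);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        Console.WriteLine(ToXml(new Order { Id = 1, Customer = "Contoso" }));
        // With the switch enabled, the preserved files should appear here.
        Console.WriteLine("Look for generated files in: " + Path.GetTempPath());
    }
}
```

After the program has run once, sorting the temp folder by modification time is probably the quickest way to spot the newly written .cs and .dll pair.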
2. Naming the serialization assembly
To build the serialization assembly you also need a project. Just create a class library project in Visual Studio and add the source file to it. When naming the assembly, you need to follow the naming convention. Let’s say the type being serialized lives in YourApplication.exe (or YourLibrary.dll); then the serialization assembly should be named YourApplication.XmlSerializers.dll (or YourLibrary.XmlSerializers.dll). You get the idea.
Now you build the project, get the serialization assembly, and can’t wait to test it. Sometimes it just works, in which case you’re done.
But sometimes, to your frustration, it doesn’t work: you find that XML serialization is still generating and using a dynamic assembly even though you’ve provided your own. Why?
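The naming convention can be double-checked with a few lines of code. The following is only a sketch: the SerializerAssemblyName class is a hypothetical helper, and it assumes the serialization assembly is deployed in the same folder as its parent assembly.

```csharp
using System;
using System.IO;
using System.Reflection;

public static class SerializerAssemblyName
{
    // Returns the file name the convention described above calls for,
    // e.g. "YourLibrary" becomes "YourLibrary.XmlSerializers.dll".
    public static string ExpectedFileName(Assembly parent)
    {
        return parent.GetName().Name + ".XmlSerializers.dll";
    }

    public static void Main(string[] args)
    {
        // First argument: path to the parent assembly.
        // Without an argument the program inspects itself.
        Assembly parent = args.Length > 0
            ? Assembly.LoadFrom(args[0])
            : typeof(SerializerAssemblyName).Assembly;

        // Assumption: the serializer assembly sits next to the parent assembly.
        string expected = Path.Combine(
            Path.GetDirectoryName(parent.Location), ExpectedFileName(parent));
        Console.WriteLine(expected + (File.Exists(expected) ? " (found)" : " (missing)"));
    }
}
```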
3. Setting ParentAssemblyId
To answer this question, let’s look at the following scenario. You created an assembly, MyDataContracts.dll, and pre-created a serialization assembly, MyDataContracts.XmlSerializers.dll, for it. Later you modified the type being serialized and recompiled MyDataContracts.dll. Now there is a mismatch between the type inside MyDataContracts.dll and the serialization code inside MyDataContracts.XmlSerializers.dll, and you can imagine what would happen if the old MyDataContracts.XmlSerializers.dll were used with the new MyDataContracts.dll.
XmlSerializerVersionAttribute, defined in System.Xml, was introduced to solve this problem: its purpose is to prevent an outdated pre-generated XML serialization assembly from being used. The attribute is applied to the generated assembly and records the identity of its parent assembly (the one containing the type being serialized). At run time, XML serialization checks whether the identity saved in the generated assembly matches the parent assembly. If the attribute doesn’t exist, or the ID or version doesn’t match, the pre-generated serialization assembly is ignored and a dynamic assembly is generated anyway.
If you look at the source code generated by XML serialization, you’ll find one line similar to the below:
[assembly: System.Xml.Serialization.XmlSerializerVersionAttribute(ParentAssemblyId = @"60d7b267-090e-4055-93a9-01d5489fa2ea,", Version = @"4.0.0.0")]
In most cases the answer to the previous question is that the assembly fails the ParentAssemblyId check. This happens when you rebuild the parent assembly, which changes its ParentAssemblyId even if you didn’t change a single line of code!
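To see what a pre-generated assembly actually claims about its parent, the attribute can be read back with reflection. This is a minimal sketch rather than anything XML serialization itself runs; SerializerVersionReader is a hypothetical class name, and the path to the serialization assembly is expected as the first command-line argument.

```csharp
using System;
using System.Reflection;
using System.Xml.Serialization;

public static class SerializerVersionReader
{
    // Returns the XmlSerializerVersionAttribute applied to the assembly,
    // or null when the assembly does not carry one.
    public static XmlSerializerVersionAttribute Read(Assembly serializerAssembly)
    {
        object[] attributes = serializerAssembly.GetCustomAttributes(
            typeof(XmlSerializerVersionAttribute), false);
        return attributes.Length > 0
            ? (XmlSerializerVersionAttribute)attributes[0]
            : null;
    }

    public static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.Error.WriteLine("usage: SerializerVersionReader <path to serialization assembly>");
            return;
        }

        XmlSerializerVersionAttribute attribute = Read(Assembly.LoadFrom(args[0]));
        if (attribute == null)
        {
            Console.WriteLine("No XmlSerializerVersionAttribute found.");
            return;
        }

        Console.WriteLine("ParentAssemblyId = " + attribute.ParentAssemblyId);
        Console.WriteLine("Version          = " + attribute.Version);
    }
}
```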
A version mismatch is obvious and easy to fix, but for a ParentAssemblyId mismatch, how do you get the right value? Using Reflector on System.Xml.dll you can find a method named GenerateAssemblyId, which is the one XML serialization uses to generate the ParentAssemblyId.
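static string GenerateAssemblyId(Type type)
{
    // Collect the version ID (a GUID) of every module in the parent assembly.
    Module[] modules = type.Assembly.GetModules();
    ArrayList list = new ArrayList();
    for (int i = 0; i < modules.Length; i++)
    {
        list.Add(modules[i].ModuleVersionId.ToString());
    }
    // Sort the IDs so the result does not depend on module order.
    list.Sort();
    // Join them, each followed by a comma (hence the trailing comma in the attribute above).
    StringBuilder builder = new StringBuilder();
    for (int j = 0; j < list.Count; j++)
    {
        builder.Append(list[j].ToString());
        builder.Append(",");
    }
    return builder.ToString();
}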
You can write a simple console application and use this method to calculate the right ParentAssemblyId. Once you’ve got the new ParentAssemblyId, set it in the XmlSerializerVersionAttribute and recompile the project.
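Such a console application might look like the sketch below. It follows the same logic as GenerateAssemblyId but takes the path to the parent assembly as a command-line argument; ParentAssemblyIdTool is a hypothetical name, and the code is an illustration rather than the framework's own implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;
using System.Text;

public static class ParentAssemblyIdTool
{
    // Builds the ID the same way GenerateAssemblyId does: the sorted
    // module version IDs of the assembly, each followed by a comma.
    public static string Compute(Assembly parent)
    {
        var ids = new List<string>();
        foreach (Module module in parent.GetModules())
        {
            ids.Add(module.ModuleVersionId.ToString());
        }
        ids.Sort();

        var builder = new StringBuilder();
        foreach (string id in ids)
        {
            builder.Append(id).Append(",");
        }
        return builder.ToString();
    }

    public static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.Error.WriteLine("usage: ParentAssemblyIdTool <path to parent assembly>");
            return;
        }

        // Print the value to paste into XmlSerializerVersionAttribute.
        Console.WriteLine(Compute(Assembly.LoadFrom(args[0])));
    }
}
```

Because the value changes whenever the parent assembly is rebuilt, the tool has to be run against the exact binary that will be deployed.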
With that, XML serialization will happily load your pre-generated assembly!
Comments
Anonymous
February 16, 2016
Hi there. I found your article useful for understanding some concepts, but I am not sure how to set the "ParentAssemblyId". Let me explain what I am doing and what I am looking for. Let's say I have a DLL (call it MyCompany.dll). I can successfully generate the corresponding MyCompany.xmlserializer.dll file either through the appropriate setting in the Visual Studio project or by running a script from the build's PostBuild event. However, if I inspect the GUID of MyCompany.dll and the ParentAssemblyId of MyCompany.xmlserializer.dll, they are always different. My understanding is that once MyCompany.xmlserializer.dll is generated (either through the project setting or by the PostBuild event), the "ParentAssemblyId" should be the same as the GUID of MyCompany.dll. If my understanding is wrong, could you advise how I can set the "ParentAssemblyId" to the GUID of MyCompany.dll (either by running some code or by setting a property, configuration, etc.)? You mentioned the following in your article, but how do I manually set the ParentAssemblyId when the PostBuild event on the Visual Studio project is what generates the MyCompany.xmlserializer.dll file? (You can write a simple console application and use this method to calculate the right ParentAssemblyId. Once you’ve got the new ParentAssemblyId, set it in the XmlSerializerVersionAttribute and recompile the project.) Additionally, once the IDs are set and the assemblies are deployed/hosted on IIS, if we restart the IIS server or the application pool associated with the WCF/web service, would XmlSerializer generate a new GUID for MyCompany.dll? If so, the GUID of MyCompany.dll would differ from the "ParentAssemblyId" of MyCompany.xmlserializer.dll, which would force CSC.exe to compile it again and cause a performance issue.
In that case, how would someone keep the GUID and ParentAssemblyId the same all the time? It seems impossible to keep them in sync, as IIS gets restarted for maintenance all the time in our organisation. Your help will be much appreciated. Thanks in advance.