Web Application Memory Leak Caused by XML Operations - GetElementsByTagName()
Symptom
=============
In an ASP.NET web application, if you call GetElementsByTagName() many times on an XML document that is stored in ASP.NET Application state, CLR memory usage increases continuously and eventually leads to an OOM (out of memory) condition.
Root Cause
=============
This problem occurs because the GetElementsByTagName method returns an XmlNodeList collection that registers listeners (instances of XmlNodeChangedEventHandler) on the NodeInserted and NodeRemoved events. For example, after you call GetElementsByTagName ten times, the NodeInserted and NodeRemoved events each have ten listeners. Therefore, when you call the method many times, many XmlNodeChangedEventHandler objects are created, and they are released only when the XmlDocument itself is released. A document kept in Application state lives as long as the application, so the handlers keep accumulating.
Analysis
=============
In a user-mode memory dump, most of the memory is consumed by XmlNodeChangedEventHandler and XmlElementList objects. The XmlElementList objects can be ignored here, because they are created together with the XmlNodeChangedEventHandler objects. The number of XmlNodeChangedEventHandler objects is almost exactly twice the number of XmlElementList objects, which means that two listeners (one on NodeInserted and one on NodeRemoved) serve each XmlElementList.
0:000> !DumpHeap -stat
Using our cache to search the heap.
Address MT Size Gen
0x79bff564 1 12 System.Runtime.Remoting.Activation.ActivationListener
……
……
0x16b111c4 92,970 1,859,400 System.Xml.XmlText
0x0221236c 767 2,005,896 System.Char[]
0x0221209c 56,987 3,424,876 System.Object[]
0x79b94690 163,304 15,341,816 System.String
0x17c4f0c4 4,159,551 183,020,244 System.Xml.XmlElementList
0x16adcc14 8,319,114 232,935,192 System.Xml.XmlNodeChangedEventHandler
Total 13,363,835 objects, Total size: 456,945,968
If your code never adds listeners to the XmlDocument object itself, these handlers are most likely caused by GetElementsByTagName() calls. Memory usage also keeps increasing as time goes on.
However, this is not a bug in GetElementsByTagName(). The Microsoft implementation of this function conforms to the W3C DOM Level 1 specification. NodeLists and NamedNodeMaps in the DOM are "live", that is, changes to the underlying document structure are reflected in all relevant NodeLists and NamedNodeMaps. In other words, according to the spec, GetElementsByTagName is supposed to return a "live list" in which changes to the underlying DOM show up in the returned NodeList.
For details please refer to https://msdn.microsoft.com/en-us/library/system.xml.xmlelement(VS.80).aspx
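The "live list" behavior can be observed directly. The following is a minimal sketch, not taken from the original investigation: the class name LiveListDemo, the method CountsAroundAppend, and the sample XML are made up for illustration. It reads the Count of a list returned by GetElementsByTagName, appends one matching element to the document, and reads Count again on the same list object.

```csharp
using System;
using System.Xml;

// Hypothetical demo class; names and sample XML are illustrative assumptions.
public static class LiveListDemo
{
    // Returns the list's Count before and after one <item> element is appended.
    public static int[] CountsAroundAppend()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<root><item/><item/></root>");

        // The returned list is "live": it listens for changes to the document.
        XmlNodeList live = doc.GetElementsByTagName("item");
        int before = live.Count;

        // Modify the document after the list has been obtained.
        doc.DocumentElement.AppendChild(doc.CreateElement("item"));

        // The same list object now reflects the newly appended element.
        int after = live.Count;

        return new[] { before, after };
    }

    public static void Main()
    {
        int[] counts = CountsAroundAppend();
        Console.WriteLine("before={0}, after={1}", counts[0], counts[1]);
        // prints "before=2, after=3"
    }
}
```

The list can only stay current because it is notified when nodes are inserted or removed, which is what the registered event handlers are for.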
Solution
=============
To avoid this problem, replace GetElementsByTagName with SelectNodes or SelectSingleNode. Alternatively, do not keep the XmlDocument in memory for a long time.
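As a rough sketch of the replacement, the helpers below use XPath queries instead of GetElementsByTagName. The class SelectNodesDemo, the helper names CountByName and FirstText, and the sample XML are hypothetical. The sketch assumes element names without namespace prefixes and names that come from trusted code, because the name is concatenated into the XPath expression; documents that use namespaces would need an XmlNamespaceManager.

```csharp
using System;
using System.Xml;

// Hypothetical helpers; names and sample XML are illustrative assumptions.
public static class SelectNodesDemo
{
    // Counts all elements with the given (unprefixed) name via XPath.
    public static int CountByName(XmlDocument doc, string name)
    {
        // "//name" selects matching elements anywhere in the document.
        XmlNodeList nodes = doc.SelectNodes("//" + name);
        return nodes.Count;
    }

    // Returns the text of the first matching element, or null if none exists.
    public static string FirstText(XmlDocument doc, string name)
    {
        XmlNode node = doc.SelectSingleNode("//" + name);
        return node == null ? null : node.InnerText;
    }

    public static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<catalog><book><title>A</title></book>" +
                    "<book><title>B</title></book></catalog>");

        Console.WriteLine(CountByName(doc, "book"));  // prints "2"
        Console.WriteLine(FirstText(doc, "title"));   // prints "A"
    }
}
```

Unlike the list returned by GetElementsByTagName, the result of SelectNodes should not be expected to track later changes to the document, so code that relied on the live behavior needs to run the query again after modifying the document.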
Regards,
ZhiXing Lv
Comments
- Anonymous
March 20, 2012
Thank you! We ran into this recently after updating to .NET 4 and had been tracking down the side-effects of this as opposed to the core issue. This post was extremely helpful in resolving our issue. Much appreciated! Kind regards, Matthew
- Anonymous
July 30, 2014
Thanks a lot, this was a very helpful post.