Notes from debugging a managed memory leak
Recently, I spent a while digging into a managed memory leak. This is a pretty well-worn blog topic, but I figured I would add my two cents to it anyway, as I found a few things that I didn't notice in the existing blogs.
First, Rico wrote up the basic approach back in 2004, so you should start by reading this - https://blogs.msdn.com/ricom/archive/2004/12/10/279612.aspx. This will give you an intro to using sos.dll in WinDbg.
What I would like to add:
#1: How to decide if you have a leak in the first place.
Since GCs happen non-deterministically, it can be hard to know if you actually have a managed leak. For example, if you look at memory usage at the end of a user scenario, you will likely see numbers all over the map depending on when the last GC happened. The best technique I found for this is to stop after gen-2 collections and compare memory usage there. This isn't perfect, since gen-2 collections can still happen at any point in your code, but it still gives you a better estimate than stopping after user scenarios.
To stop after the next gen-2 GC: !findroots -gen 2
Note that this command is new for the CLRv4 version of sos.dll (also available in Silverlight). I am assuming that you could achieve similar functionality with a well-placed breakpoint in older CLRs, but I am not familiar enough with the inner workings of the GC to tell you where.
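Once you have a handful of heap sizes written down, one per gen-2 stop, a quick trend check can help decide whether the growth is real. The following is a minimal sketch, not part of sos.dll or any debugger tooling: the function names, the 1% threshold, and the sample numbers are all made up for illustration, and it assumes you recorded the managed heap size by hand at each stop.

```python
from statistics import mean


def heap_growth_per_gc(samples):
    """Least-squares slope of heap size against gen-2 GC index.

    `samples` is a list of heap sizes (any consistent unit), one per
    gen-2 stop, in the order they were recorded.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    xs = range(len(samples))
    x_bar = mean(xs)
    y_bar = mean(samples)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, samples))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


def looks_like_leak(samples, min_growth_fraction=0.01):
    """Hypothetical rule of thumb: flag a leak when the heap grows by more
    than `min_growth_fraction` of its mean size per gen-2 GC."""
    return heap_growth_per_gc(samples) > min_growth_fraction * mean(samples)


if __name__ == "__main__":
    # Made-up numbers (in MB), purely to show the shape of the input.
    steady = [100, 102, 99, 101, 100]
    growing = [100, 110, 120, 130, 140]
    print(looks_like_leak(steady), looks_like_leak(growing))
```

A slope is less sensitive to one unlucky sample than comparing only the first and last numbers, which matters here because the stops still land at arbitrary points in your code.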
#2: Use CLRProfiler to visualize the leaks
This may have been specific to my scenario, but I didn't have a lot of success with !gcroot. I had more success understanding the problem by loading up a .log file in CLRProfiler (https://www.microsoft.com/downloads/details.aspx?FamilyID=a362781c-3870-43be-8926-862b40aa0cd0&DisplayLang=en). One thing I learned here: do _not_ use '-xml' when saving out the log, as CLRProfiler doesn't understand the XML format.
To save the log out: !TraverseHeap c:\users\greggm\desktop\myheap.log
#3: !gcroot doesn't show roots in CCWs
When native code calls into managed code through COM, the native code gets a CCW (COM callable wrapper). If native code leaks its CCW, the managed object will be leaked too, but !gcroot will not tell you why.
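To make the mechanism concrete, here is a toy model of it. This is only an analogy written in Python, not how the CLR implements CCWs: the `Wrapper` and `ManagedObject` classes and the `demo` function are hypothetical names, and the wrapper simply holds a strong reference to its target for as long as its COM-style reference count is above zero.

```python
import gc
import weakref


class ManagedObject:
    """Stand-in for a managed object handed out to native code."""


class Wrapper:
    """Toy stand-in for a CCW: keeps its target alive while ref_count > 0."""

    def __init__(self, target):
        self._target = target
        self.ref_count = 1  # the reference handed to the native caller

    def add_ref(self):
        self.ref_count += 1
        return self.ref_count

    def release(self):
        self.ref_count -= 1
        if self.ref_count == 0:
            self._target = None  # drop the strong reference to the target
        return self.ref_count


def demo():
    obj = ManagedObject()
    probe = weakref.ref(obj)  # lets us observe whether obj is still alive
    ccw = Wrapper(obj)
    del obj                   # no other "managed" references remain

    ccw.add_ref()             # native code takes a second reference...
    ccw.release()             # ...but releases only once: count is stuck at 1
    gc.collect()
    alive_while_leaked = probe() is not None

    ccw.release()             # the missing release: count reaches 0
    gc.collect()
    alive_after_release = probe() is not None
    return alive_while_leaked, alive_after_release


if __name__ == "__main__":
    print(demo())
```

In the model, nothing on the "managed" side points at the object while it is being kept alive, which is the same reason a root search over managed references comes up empty.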
Comments
Anonymous
November 04, 2009
If !gcroot doesn't show you the roots from the CCWs, is there a way of discovering them?
Anonymous
November 05, 2009
You would need to look at where they are referenced from native code. In other words, check where the managed objects are getting marshalled to native code, and determine whether the native side is leaking the returned reference.