Tracking down managed memory leaks (how to find a GC leak)

If you think you've got memory leaks, or if you're just wondering what kind of stuff is on your heap, you can follow the very same steps that I do and get fabulous results your friends will envy.  OK, well, maybe not, but they're handy anyway.

These steps will help you to go from a suspected memory leak to the specific object references that are keeping your objects alive.  See the Resources at the end for the location of all these tools.

Step 1: Run your process and put it in the state you are curious about

Be sure to choose a scenario you can reproduce if that's at all possible; otherwise you'll never know if you're making headway in clearing out the memory leaks.

Step 2: Use tasklist to find its process ID

C:\>tasklist

Image Name                   PID Session Name     Session#    Mem Usage
========================= ====== ================ ======== ============
System Idle Process            0 RDP-Tcp#9               0         16 K
System                         4 RDP-Tcp#9               0        112 K
smss.exe                     624 RDP-Tcp#9               0        252 K
...etc...
ShowFormComplex.exe         4496 RDP-Tcp#9               0     20,708 K
tasklist.exe                3636 RDP-Tcp#9               0      4,180 K

From here we can see that my process ID is 4496.
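If you'd rather not eyeball the list, here is a minimal Python sketch that pulls the PID out of captured tasklist text. The function name `find_pids` and the sample text are hypothetical, and the sketch assumes the PID is the first field after the image name, as in the listing above.

```python
def find_pids(tasklist_output, image_name):
    """Return the PIDs of every row whose image name equals image_name."""
    pids = []
    wanted = image_name.lower()
    for line in tasklist_output.splitlines():
        # Image names may contain spaces ("System Idle Process"), so match
        # the name as a prefix that must be followed by whitespace.
        if not line.lower().startswith(wanted):
            continue
        rest = line[len(image_name):]
        if not rest[:1].isspace():
            continue
        fields = rest.split()
        # The PID is assumed to be the first field after the image name.
        if fields and fields[0].isdigit():
            pids.append(int(fields[0]))
    return pids


# Sample rows copied from the tasklist output above.
SAMPLE = """\
System Idle Process 0 RDP-Tcp#9 0 16 K
System 4 RDP-Tcp#9 0 112 K
smss.exe 624 RDP-Tcp#9 0 252 K
ShowFormComplex.exe 4496 RDP-Tcp#9 0 20,708 K
tasklist.exe 3636 RDP-Tcp#9 0 4,180 K
"""

print(find_pids(SAMPLE, "ShowFormComplex.exe"))
```

The result is a list because several instances of the same image can be running at once, in which case you still have to pick the one you care about.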

Step 3: Use VADump to get a summary of the process

C:\>vadump -sop 4496
Category                             Total   Private Shareable    Shared
                           Pages    KBytes    KBytes    KBytes    KBytes
      Page Table Pages        35       140       140         0         0
      Other System            15        60        60         0         0
      Code/StaticData       4596     18384      4020      3376     10988
      Heap                   215       860       860         0         0
      Stack                   30       120       120         0         0
      Teb                      4        16        16         0         0
      Mapped Data            129       516         0        24       492
      Other Data             157       628       624         4         0

      Total Modules         4596     18384      4020      3376     10988
      Total Dynamic Data     535      2140      1620        28       492
      Total System            50       200       200         0         0
Grand Total Working Set     5181     20724      5840      3404     11480

Here we can see that the process is mostly code (18384k)

The vast majority of the resources that the CLR uses are under "Other Data" -- this is because the GC Heap is directly allocated with VirtualAlloc -- it doesn't go through a regular Windows heap.  The same goes for the so-called "loader heaps" which hold type information and jitted code.  Most of the conventional "Heap" allocations are from whatever unmanaged code is running.  In this case it's a winforms application with piles of controls so there's storage associated with those things.

There isn't much "Other Data" here so the heap situation is probably pretty good but let's see where we stand on detailed CLR memory usage.
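To put numbers on "mostly code", the summary rows can be turned into shares of the working set. This is only a sketch of the arithmetic: the figures are copied from the vadump output above, and the 4 KB page size is an assumption (it is what makes the Pages and KBytes columns agree here).

```python
PAGE_KB = 4  # assumption: 4 KB pages, consistent with the Pages/KBytes columns

# (pages, total KBytes) copied from the vadump summary rows above
SUMMARY = {
    "Total Modules": (4596, 18384),
    "Total Dynamic Data": (535, 2140),
    "Total System": (50, 200),
}
GRAND_TOTAL_KB = 20724


def share_percent(kbytes, total_kbytes=GRAND_TOTAL_KB):
    """Percentage of the working set taken by one category."""
    return round(100.0 * kbytes / total_kbytes, 1)


for name, (pages, kbytes) in SUMMARY.items():
    # Sanity check: the KBytes column is just pages times the page size.
    assert pages * PAGE_KB == kbytes
    print(name, kbytes, "KB", share_percent(kbytes), "%")
```

Keep in mind that vadump reports the working set, so these are shares of what is currently resident, not of everything the process has committed.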

Step 4: Attach Windbg and load SOS

C:\> windbg -p 4496

Once the debugger loads use this command to load our extension DLL

0:004> .loadby sos mscorwks

This tells the debugger to load the extension "sos.dll" from the same place that mscorwks.dll was loaded.  That ensures that you get the right version of SOS (it should be the one that matches the mscorwks you are using)

Step 5: Get the CLR memory summary

This command gives you a summary of what we've allocated.  The output will be a little different depending on the version of the runtime you are using.  But for a simple application you get the loader heaps for the two base domain structures (they just hold objects that can be shared in assorted ways) plus the storage for the first real appdomain (Domain 1).  And of course the jitted code.

0:004> !EEHeap
Loader Heap:
--------------------------------------
System Domain: 5e093770
...etc...
Total size: 0x8000(32768)bytes
--------------------------------------
Shared Domain: 5e093fa8
...etc...
Total size: 0xa000(40960)bytes
--------------------------------------
Domain 1: 14f0d0
...etc...
Total size: 0x18000(98304)bytes
--------------------------------------
Jit code heap:
LoaderCodeHeap: 02ef0000(10000:7000) Size: 0x7000(28672)bytes.
Total size: 0x7000(28672)bytes
--------------------------------------
Module Thunk heaps:
...etc...
Total size: 0x0(0)bytes
--------------------------------------
Module Lookup Table heaps:
...etc...
Total size: 0x0(0)bytes
--------------------------------------
Total LoaderHeap size: 0x31000(200704)bytes

So here we've got 200k of stuff associated with what has been loaded, about 28k of which is jitted code. 
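As a cross-check, the per-heap "Total size" lines should add up to the "Total LoaderHeap size" line. Here is a small Python sketch that does the sum from captured text; the helper name `loader_heap_totals` is hypothetical and the regular expression assumes the line format shown above.

```python
import re

# The "Total size" lines from the !EEHeap output above, in order.
EEHEAP_TOTALS = """\
Total size: 0x8000(32768)bytes
Total size: 0xa000(40960)bytes
Total size: 0x18000(98304)bytes
Total size: 0x7000(28672)bytes
Total size: 0x0(0)bytes
Total size: 0x0(0)bytes
Total LoaderHeap size: 0x31000(200704)bytes
"""

SIZE_LINE = re.compile(
    r"^(Total(?: LoaderHeap)? size): 0x([0-9a-fA-F]+)\((\d+)\)\s*bytes",
    re.MULTILINE,
)


def loader_heap_totals(text):
    """Return (list of per-heap sizes, reported grand total) in bytes."""
    parts, grand_total = [], None
    for label, hex_size, dec_size in SIZE_LINE.findall(text):
        size = int(hex_size, 16)
        # The hex and decimal renderings on each line should agree.
        assert size == int(dec_size)
        if label == "Total size":
            parts.append(size)
        else:
            grand_total = size
    return parts, grand_total


parts, grand_total = loader_heap_totals(EEHEAP_TOTALS)
print(sum(parts), grand_total)
```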

Next in the output (same command) is the summary of the GC Heap.

=======================================
Number of GC Heaps: 1
generation 0 starts at 0x00a61018
generation 1 starts at 0x00a6100c
generation 2 starts at 0x00a61000
ephemeral segment allocation context: none
segment     begin       allocated   size
001b8630    7a8d0bbc    7a8f08d8    0x0001fd1c(130332)
001b4ac8    7b4f77e0    7b50dcc8    0x000164e8(91368)
00157690    02c10004    02c10010    0x0000000c(12)
00157610    5ba35728    5ba7c4a0    0x00046d78(290168)
00a60000    00a61000    00aac000    0x0004b000(307200)
Large object heap starts at 0x01a61000
segment     begin       allocated   size
01a60000    01a61000    01a66d90    0x00005d90(23952)
Total Size 0xcdd18(843032)
------------------------------
GC Heap Size 0xcdd18(843032)

You will likely have many fewer small-heap segments than I did because I did this test on an internal debug build, so there are funny things like a 12 byte segment in the dump.  But you'll see what segments there are and how big they are, and you can see the current boundaries of the generations, from which you can compute their current exact size.  (Note that this is likely different from what the performance counters report, as you can see in Maoni's blog -- those counters are budgeted from the last GC, not the instantaneous value -- it would be too expensive to keep updating the instantaneous value.)
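Computing the generation sizes from the boundaries is simple subtraction. The sketch below assumes the three generations sit back to back in the ephemeral segment (the one that begins at 00a61000), with generation 2 lowest and generation 0 running up to that segment's "allocated" address; the other segments are ignored. The function name is hypothetical.

```python
# Values copied from the !EEHeap output above.
GEN2_START = 0x00a61000
GEN1_START = 0x00a6100c
GEN0_START = 0x00a61018
EPHEMERAL_ALLOCATED = 0x00aac000  # "allocated" column of segment 00a60000

# "size" column of every segment, including the large object heap segment.
SEGMENT_SIZES = [0x1fd1c, 0x164e8, 0xc, 0x46d78, 0x4b000, 0x5d90]


def generation_sizes(gen2, gen1, gen0, allocated_end):
    """Sizes in bytes, assuming gen2 < gen1 < gen0 are contiguous in one segment."""
    return {
        "gen2": gen1 - gen2,
        "gen1": gen0 - gen1,
        "gen0": allocated_end - gen0,
    }


sizes = generation_sizes(GEN2_START, GEN1_START, GEN0_START, EPHEMERAL_ALLOCATED)
print(sizes)

# The segment sizes should add up to the reported "GC Heap Size".
print(hex(sum(SEGMENT_SIZES)), sum(SEGMENT_SIZES))
```

Under those assumptions nearly everything in the ephemeral segment is generation 0, which is what you'd expect if no GC has promoted anything yet.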

So in this case we can see that there is about 843k of GC heap.  Comparing that to the vadump summary, there was about 2M of total dynamic data.  The CLR accounts for about 1M of that (843k of GC heap plus 200k of loader heaps).  The rest is likely bitmaps allocated from my winforms application's controls but whatever it is, it isn't CLR stuff...
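The back-of-the-envelope split can be written out as follows. This is rough accounting only: it assumes all of the CLR's GC heap and loader heap memory is resident and counted in vadump's "Total Dynamic Data" line, which need not be exactly true since vadump reports the working set.

```python
GC_HEAP_BYTES = 0xcdd18        # "GC Heap Size" from !EEHeap
LOADER_HEAP_BYTES = 0x31000    # "Total LoaderHeap size" from !EEHeap
TOTAL_DYNAMIC_DATA_KB = 2140   # "Total Dynamic Data" from vadump

# CLR-attributed memory in KB, and whatever is left over for everything else.
clr_kb = (GC_HEAP_BYTES + LOADER_HEAP_BYTES) / 1024.0
non_clr_kb = TOTAL_DYNAMIC_DATA_KB - clr_kb

print(round(clr_kb), round(non_clr_kb))
```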

Step 5: Dump the GC Heap statistics

Next we'll want to know, by type, what's on the heap at this exact instant

0:004> !DumpHeap -stat
... sorted from smallest to biggest ... etc. etc...

7b586c7c 436 10464 System.Internal.Gdi.WindowsGraphics
5ba867ac 208 11648 System.Reflection.RuntimeMethodInfo
7b586898 627 12540 System.Internal.Gdi.DeviceContext
5baa4954 677 39992 System.Object[]
5ba25c9c 8593 561496 System.String
Total 17427 objects

Note that this dump includes both reachable and unreachable objects, so unless you know that the GC just ran before you did this command you'll see some dead stuff in this report as well.  Sometimes it's interesting and useful to force a GC to run before you do this so that you can get a summary of just the live stuff.  Sometimes it's useful to do dumps before and after forcing a GC so that you can see what sort of things are dying.  This may be a way to gather evidence that a forced GC is necessary.  See my blog on When to Call GC.Collect().
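Comparing a before and an after dump by hand gets tedious, so here is a minimal Python sketch that diffs two captured `!DumpHeap -stat` listings by type name. The function names are hypothetical, the "before" rows are taken from the output above, and the "after" rows are made-up numbers purely for illustration.

```python
def parse_stat(text):
    """Map class name -> (count, total size) from !DumpHeap -stat rows."""
    stats = {}
    for line in text.splitlines():
        # Expected row shape: MT, Count, TotalSize, Class Name
        fields = line.split(None, 3)
        if len(fields) == 4 and fields[1].isdigit() and fields[2].isdigit():
            stats[fields[3].strip()] = (int(fields[1]), int(fields[2]))
    return stats


def died(before, after):
    """Types whose instance count dropped: name -> (objects freed, bytes freed)."""
    freed = {}
    for name, (count, size) in before.items():
        after_count, after_size = after.get(name, (0, 0))
        if after_count < count:
            freed[name] = (count - after_count, size - after_size)
    return freed


BEFORE = """\
5ba867ac 208 11648 System.Reflection.RuntimeMethodInfo
5baa4954 677 39992 System.Object[]
5ba25c9c 8593 561496 System.String
Total 17427 objects
"""

# Hypothetical numbers, NOT real output: what a dump after a forced GC might show.
AFTER = """\
5ba867ac 208 11648 System.Reflection.RuntimeMethodInfo
5baa4954 650 38000 System.Object[]
5ba25c9c 6100 400000 System.String
"""

print(died(parse_stat(BEFORE), parse_stat(AFTER)))
```

Types that do not show up in the result survived the collection intact, which makes them the more interesting leak suspects.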

So let's suppose that there weren't supposed to be 208 System.Reflection.RuntimeMethodInfo objects allocated here and that we thought that was a leak.  One of the things we'll want to do is to use CLR Profiler to see where those objects are being allocated -- that will give us half the picture.  But we can get the other half of the picture right here in the debugger.

Step 6: Dump Type Specific Information

We can dump each object whose type name includes a given string with a simple command

0:004> !DumpHeap -type System.Reflection.RuntimeMethodInfo
Address MT Size
00a63da4 5baa62c0 32
00a63e04 5baa6174 20
00a63e2c 5ba867ac 56
00a63e64 5baa5fa8 16
00a63e88 5baa5fa8 16
00a63f24 5baa6174 20
00a63f4c 5ba867ac 56
00a63f84 5baa5fa8 16
etc. etc. etc.
total 630 objects
Statistics:
MT Count TotalSize Class Name
5baa62c0 3 96 System.RuntimeType+RuntimeTypeCache+MemberInfoCache`1[[System.Reflection.RuntimeMethodInfo, mscorlib]]
5baa5fa8 211 3376 System.Reflection.CerArrayList`1[[System.Reflection.RuntimeMethodInfo, mscorlib]]
5baa6174 208 4160 System.Collections.Generic.List`1[[System.Reflection.RuntimeMethodInfo, mscorlib]]
5ba867ac 208 11648 System.Reflection.RuntimeMethodInfo
Total 630 objects

Note that the type we wanted was System.Reflection.RuntimeMethodInfo and we can see that it has a method table 5ba867ac.  Those are the 56 byte objects. Now we can investigate some of these and see what is causing them to stay alive.
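Because `-type` matches on a substring, the listing mixes several types. A small Python sketch like the one below can pick out just the addresses with the method table you care about from captured output; the function name is hypothetical and the sample rows are copied from the dump above. (SOS also has a `-mt` option on `!DumpHeap` for filtering by method table directly, if your version supports it.)

```python
def addresses_with_mt(dump, method_table):
    """Return addresses of per-object rows whose MT column equals method_table."""
    wanted = method_table.lower()
    hits = []
    for line in dump.splitlines():
        fields = line.split()
        # Per-object rows have exactly three columns: Address, MT, Size.
        if len(fields) == 3 and fields[1].lower() == wanted:
            hits.append(fields[0])
    return hits


# Rows copied from the !DumpHeap -type output above.
DUMP = """\
Address MT Size
00a63da4 5baa62c0 32
00a63e04 5baa6174 20
00a63e2c 5ba867ac 56
00a63e64 5baa5fa8 16
00a63e88 5baa5fa8 16
00a63f24 5baa6174 20
00a63f4c 5ba867ac 56
00a63f84 5baa5fa8 16
total 630 objects
5ba867ac 208 11648 System.Reflection.RuntimeMethodInfo
"""

print(addresses_with_mt(DUMP, "5ba867ac"))
```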

Step 7: Identify the roots of suspected leaks

One of the lines in the dump was

00a63e2c 5ba867ac 56    

So that tells us there is an object of the type we want at address 00a63e2c.  Let's see what's keeping it alive

0:004> !gcroot 00a63e2c
Scan Thread 0 OSTHread 1598
Scan Thread 2 OSTHread 103c

DOMAIN(0014F0D0):
HANDLE(WeakLn):3f10f0:
Root:00a63d20(System.RuntimeType+RuntimeTypeCache)
->00a63da4(System.RuntimeType+RuntimeTypeCache+MemberInfoCache`1[[System.Reflection.RuntimeMethodInfo,mscorlib]])
->00a63e88(System.Reflection.CerArrayList`1[[System.Reflection.RuntimeMethodInfo, mscorlib]])
->00a63e98(System.Object[])
->00a63e2c(System.Reflection.RuntimeMethodInfo)

DOMAIN(0014F0D0):
HANDLE(Pinned):3f13ec:
Root:01a64b50(System.Object[])
->00a62f20(System.ComponentModel.WeakEventHandlerList)
->00a63fb4(System.ComponentModel.WeakEventHandlerList+ListEntry)
->00a63ec4(System.ComponentModel.WeakEventHandlerList+ListEntry)
->00aa5f6c(System.ComponentModel.WeakDelegateHolder)
->00a63e2c(System.Reflection.RuntimeMethodInfo)

I've added some extra line breaks to the output above to make it easier to read but otherwise it's the raw output.

The gcroot command tells you whether the object is reachable and, if so, how it is reached from each root.  The dump won't include every way the object is reachable, but you do get at least one path to the object -- usually that's enough.  If multiple paths are dumped they often have a common tail.  Whichever way the object is reachable (here it looks like maybe only weak references are left, so this guy might go away on the next collect), the paths should give you a hint about (some of) the remaining references.  From there you can decide which pointers to null so that the object is properly released.
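When there are many paths, it helps to reduce each one to its handle kind and the object that directly holds the reference, since that last hop is usually the pointer you end up nulling. The Python sketch below does that for captured `!gcroot` text. The function names are hypothetical, and it only understands the HANDLE-rooted layout shown above (with the line breaks I added); stack roots and other versions of SOS may print things differently.

```python
import re

HANDLE = re.compile(r"^HANDLE\((\w+)\):")
NODE = re.compile(r"^(?:Root:|->)\s*([0-9a-fA-F]+)\((.*)\)\s*$")


def parse_gcroot(text):
    """Return a list of (handle kind, [(address, type name), ...]) chains."""
    chains = []
    for raw in text.splitlines():
        line = raw.strip()
        handle = HANDLE.match(line)
        if handle:
            chains.append((handle.group(1), []))
            continue
        node = NODE.match(line)
        if node and chains:
            chains[-1][1].append((node.group(1), node.group(2)))
    return chains


def immediate_referrers(chains):
    """For each chain, the object that points directly at the target."""
    return [(kind, nodes[-2]) for kind, nodes in chains if len(nodes) >= 2]


# Abbreviated copy of the !gcroot output above.
GCROOT = """\
DOMAIN(0014F0D0):
HANDLE(WeakLn):3f10f0:
Root:00a63d20(System.RuntimeType+RuntimeTypeCache)
->00a63e88(System.Reflection.CerArrayList`1[[System.Reflection.RuntimeMethodInfo, mscorlib]])
->00a63e98(System.Object[])
->00a63e2c(System.Reflection.RuntimeMethodInfo)

DOMAIN(0014F0D0):
HANDLE(Pinned):3f13ec:
Root:01a64b50(System.Object[])
->00a62f20(System.ComponentModel.WeakEventHandlerList)
->00aa5f6c(System.ComponentModel.WeakDelegateHolder)
->00a63e2c(System.Reflection.RuntimeMethodInfo)
"""

for kind, (address, type_name) in immediate_referrers(parse_gcroot(GCROOT)):
    print(kind, address, type_name)
```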

Resources

You can get information on windbg at this location.
Vadump has usage information and a download from msdn and microsoft.com respectively.

If those links break, searching for windbg and vadump on the microsoft.com home page gave good results; that's how I got those links in the first place.

CLR Profiler is available here.

It comes with documentation but there is additional material available in the Performance PAG in Chapter 13.

Comments

  • Anonymous
    December 10, 2004
    Great post, Rico. I would love to know more about doing this type of low-level debugging. My current experience is limited to source code level debuggers.

    Also, I'm a little bit curious about that last dump in your post -- the output of the gcroot command.

    Does this imply that there will be events/delegates in .Net 2.0 that use weak refs?

  • Anonymous
    December 10, 2004
    You know I'm often surprised by what I see when I dump the heap. This guy System.ComponentModel.WeakEventHandlerList seems like he could be quite interesting and I don't know thing one about him. It might be generally useful but it might also be a private type with an unfortunate name. Even internal types appear in the low level dumps like this.

    I'll see if I can't find out something for my own curiosity if nothing else.

  • Anonymous
    December 10, 2004
    Excellent post Rico. What I would like to see someone work on is dumping this information into a file directly from CLRProfiler. That way you can view the dump and the heap and object allocations knowing they all are generated under the same circumstances.

  • Anonymous
    December 10, 2004
    For 1.1 it's probably better to use sos.dll that comes with the latest debuggers. It's loaded automatically when you attach to a process that has CLR dlls loaded, and it has some fixes and new commands/shortcuts that the original Everett version of sos doesn't.

  • Anonymous
    December 10, 2004
    >That way you can view the dump and the heap and object allocations knowing they all are generated under the same circumstances.

    It is often very useful to start the process under CLRProfiler and then also attach with the debugger so you can do both on the same process. This works just fine. In fact it's extra handy because CLR Profiler's "Dump Heap Now" forces a garbage collection so you can use it to see what dead things are going away in the debugger dump and visually in CLRProfiler.

    >>For 1.1 it's probably better to use sos.dll that comes with the latest debuggers. It's loaded automatically when you attach to a process that has CLR dlls loaded, and it has some fixes and new commands/shortcuts that the original Everett version of sos doesn't.

    Nothing is ever easy :)

    It turns out there is this other thing that is also called SOS which isn't quite the same thing as the SOS that we build along with the runtime even though it has many of the same commands and common heritage. That one seems to be auto-loaded (you can find it in a subdirectory under wherever you install windbg) and I think it works on v1.0 and v1.1 of the runtime. I think it has a few features not present in the original SOS we deployed so it can be useful.

    It may be that there will be an enhanced SOS for version 2.0 of the runtime some time after it ships. So basically you can try the commands without loading an SOS explicitly and just get whatever is there or you can go with the "golden" version and take what was originally shipped.

    For myself, I always use .loadby sos mscorwks but of course I use the runtime build of the moment so anything else would be lunacy. For you, gentle reader, you may find that you like some of the enhanced features in the other SOS and it works fine on your runtime.

    Either way the leak tracking instructions are the same.

  • Anonymous
    December 11, 2004
    Rico,

    Why the VS team doesn't include some functionality, which can automate or help tracing those kinds of managed memory leaks? Those steps can be automated, can't they ?

  • Anonymous
    December 13, 2004
    Ivan Peev asks: "Why the VS team doesn't include some functionality, which can automate or help tracing those kinds of managed memory leaks? Those steps can be automated, can't they?"

    There was very little in terms of memory analysis features in Visual Studio.NET -- I think that reflects two things: first that there were lots of problems to solve and we couldn't solve them all in one release and second is that some problems we didn't really know how to solve anyway. I think some of both is going on in this case.

    Which brings me to the second point: Can this all be automated? Well, sort of, the tricky bits are in Step 5 -- how do you automatically know which types are the ones that should have gone away and which types were supposed to be living (because they are in a cache or something) -- and in Step 7 -- how do you automatically know which instances of the type are the problematic ones?

    It's rather tricky.

  • Anonymous
    December 14, 2004
    this is a good post.. thanks

  • Anonymous
    December 15, 2004
    Great blog well worth bookmarking.

  • Anonymous
    December 16, 2004
    Great post - but why not use a tool do all this hardwork? During our release cycle this fall I found MemProfiler (www.scitech.se/memprofiler/ ) and blogged a bit about my experience (http://dotnetjunkies.com/WebLog/mlevison/archive/2004/09/30/27265.aspx).

    For $100, I'm too lazy to work as hard as you.

  • Anonymous
    December 16, 2004
    Great article will make a great set of interview questions :)

  • Anonymous
    December 16, 2004
    Your blog rocks....

  • Anonymous
    December 17, 2004
    I've found the CLRProfiler provides the same information, but in a much easier to use interface.

    Just my 2 cents!

  • Anonymous
    December 17, 2004
    You can get similar/related information from CLRProfiler and even better information from some 3rd party tools.

    Advantages to the approach given above:

    1) You can do this after the fact if you witness a problem, you don't have to start under the profiler as you can attach the debugger

    2) You can get per-object information about objects and their reachability (!gcroot) which is much trickier to get from say the heap dump

    3) You can get valuable summary information about the overall memory usage of the CLR (!EEHeap)

    Plus it's all free :)

  • Anonymous
    December 23, 2004
    The tool looks really neat. I'll give it a whirl when I'm back from vacation.

    Happy Holidays everyone!

  • Anonymous
    December 28, 2004
    http://www.scitech.se/memprofiler/

  • Anonymous
    January 05, 2005
    Thanks for the excellent article Rico. I have one (somewhat related) question about GC memory management/leaks and the CLR. Beyond the steps you outline to identify memory leaks, is there any way to control the min and max managed heap size used by the CLR? Most JVMs allow a min and max managed heap size to be specified as start-up params, and I haven't seen any mention of something similar for the CLR. I'd ideally like to be able to tell a .NET windows service app I'm developing a maximum allowed heap size, and then let the GC do its thing within that constraint. Any thoughts?

  • Anonymous
    January 05, 2005
    Some quick responses sort of in order:

    Q: Will this work on the ASP.NET worker process?

    A: I don't see why it wouldn't. It's not magic or anything, and you can attach to it with the debugger same as any other.

    Re: http://www.scitech.se/memprofiler/

    It looks pretty cool, I'll have to play with it some more. I wonder if it works on the daily build :)

    Q: Is there any way to control the min and max managed heap size used by the CLR?

    I don't think there are environment variables for that but you can do this and more with the hosting api (the CLR calls you to get memory and so forth so that it can be hosted in more exotic processes like say SQL Server where you don't want us to go and get memory directly)

    http://www.gotdotnet.com/team/clr/about_clr_Hosting.aspx

  • Anonymous
    January 06, 2005
    Hi Rico - thanks for the pointer to the CLR Hosting articles. From what I can tell, the hosting APIs don't provide a way to limit the amount of memory used by the CLR beyond ICorRuntimeHost's Start() and Stop() methods. That seems like a strange way to manage CLR resource use - hard stopping it which unloads it from the current process and then restarting it in a new process. Is stopping the CLR the only way to release resources back to the system, and is the CLR team considering any enhancement to the CLR startup shim to allow the min/max managed heap size to be defined at application start?

    Thanks again for all your excellent posts over this past year.

  • Anonymous
    January 06, 2005
    Scratch that - I had overlooked ICorConfiguration's SetGCHostControl method :)
