Narrowing Down Performance Problems in Managed Code
My last entry offered some generic advice about how to do a good performance investigation. I actually think it's too generic to be really useful -- in fact, I think it fails my Peanut Butter Sandwich Test.
Digression to discuss the Peanut Butter Sandwich Test
I review a lot of documents, and sometimes they say things that are so obvious as to be uninteresting. The little quip I have for this situation is, "Yes, what you are saying is true of [the system], but it's also true of peanut butter sandwiches." Consider a snippet like this one, "Use a cache where it provides benefits," and compare it with, "Use a peanut butter sandwich where it provides benefits." Both seem to work... that's a bad sign.
You certainly don't want to get an F on the Peanut Butter Sandwich Test but hopefully you won't settle for just a C-.
Back on topic
I thought it would be good to follow up the generic advice with some specific suggestions for things to look at. These are things I look at in step 2 or 3 of the investigation.
Under .NET CLR Memory, check "% Time in GC". If it's getting near 10% or higher you may have some memory issues; consider these secondary tests:
- is the raw allocation rate "Allocated Bytes/sec" too high? -> reduce total allocations
- is the promotion rate "Promoted Memory from Gen 1" too high? -> be careful about object lifetimes, avoid "mid-life crisis" (objects that live just long enough to reach Gen 2 and then die)
- is the finalization rate "Finalization Survivors" too high? -> make sure you are disposing the key objects
- is the heap ("# Bytes in all Heaps") growing when it shouldn't? -> check for reference leaks
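As a rough sketch of how the memory checks above could be encoded, here is a small Python function that works on counter snapshots you have already collected by whatever means you like. The 10% figure comes from the text; everything else (the function name, the `limits` ceilings, the `heap_growth_limit` allowance, and treating "near 10%" as a hard cutoff) is an assumption made for illustration, since "too high" depends entirely on your workload.

```python
GC_TIME_THRESHOLD = 10.0  # percent; the "near 10% or higher" figure from the text

# (counter name, follow-up) pairs for the rate-style secondary checks
SECONDARY_CHECKS = [
    ("Allocated Bytes/sec", "reduce total allocations"),
    ("Promoted Memory from Gen 1", "be careful about object lifetimes"),
    ("Finalization Survivors", "make sure key objects are disposed"),
]


def gc_triage(samples, limits, heap_growth_limit):
    """Suggest follow-ups from a time-ordered list of counter snapshots.

    samples: list of dicts mapping counter name -> reading (oldest first).
    limits: hypothetical per-counter ceilings chosen by the caller.
    heap_growth_limit: hypothetical allowance, in bytes, for heap growth
        between the first and last snapshot.
    """
    latest = samples[-1]
    if latest["% Time in GC"] < GC_TIME_THRESHOLD:
        return []  # GC cost looks fine, so skip the secondary checks

    advice = [tip for name, tip in SECONDARY_CHECKS if latest[name] > limits[name]]

    # Heap size is judged by its trend rather than by a single reading.
    growth = latest["# Bytes in all Heaps"] - samples[0]["# Bytes in all Heaps"]
    if growth > heap_growth_limit:
        advice.append("check for reference leaks")
    return advice
```

The secondary checks only run once "% Time in GC" crosses the threshold, which mirrors the order of the investigation: the primary counter tells you whether to bother, the others tell you where to look.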
Is the CPU not saturated when it should be? Look under .NET CLR LocksAndThreads
- is the "Contention Rate / sec" counter high compared to your throughput rate? -> you should re-examine your locking strategy
- is the "# of current physical Threads" too low for the problem? -> (amended) more parallelism may be helpful; consider using the ThreadPool if it's not already in use, and possibly adjust the ThreadPool parameters to get more threads (not usually needed)
- in the "Thread" category, examine "Context Switches / sec". Is it high compared to your throughput rate? -> perhaps the work item you are giving to the threads in the thread pool is too small; consider something chunkier
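"High compared to your throughput rate" is easier to reason about as a ratio: how many contentions or context switches are you paying per unit of work? The Python sketch below shows one way to express that. The function names and the two `max_..._per_item` ceilings are hypothetical; the post does not give numbers, so pick ceilings that make sense for your own work items.

```python
def per_unit_of_work(events_per_sec, work_items_per_sec):
    """Counter rate divided by throughput; None when there is no throughput."""
    if work_items_per_sec <= 0:
        return None
    return events_per_sec / work_items_per_sec


def threading_triage(contentions_per_sec, switches_per_sec, work_items_per_sec,
                     max_contentions_per_item, max_switches_per_item):
    """Flag counters that look high relative to throughput.

    The two max_* arguments are caller-chosen, hypothetical ceilings.
    """
    advice = []
    contention = per_unit_of_work(contentions_per_sec, work_items_per_sec)
    switches = per_unit_of_work(switches_per_sec, work_items_per_sec)
    if contention is not None and contention > max_contentions_per_item:
        advice.append("re-examine the locking strategy")
    if switches is not None and switches > max_switches_per_item:
        advice.append("hand out chunkier work items")
    return advice
```

Normalizing by throughput keeps the check meaningful as load changes: 20,000 context switches per second is alarming at 1,000 work items per second and unremarkable at 100,000.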
Is the throughput rate low even though the CPU is saturated?
- look under ".NET CLR Exceptions", is "# of Exceps Thrown / sec" high compared to your throughput? -> consider reducing use of exceptions in common paths
- look under ".NET CLR Interop", is "# of marshalling" growing too fast? -> consider simplifying the arguments passed in interop cases so that marshalling is cheaper
- look under ".NET CLR Security", is "% Time in RT checks" significant? -> consider simplifying the demands being placed on the security system to lower the cost of security checks
- look under ".NET CLR Jit", is "% Time in Jit" significant? This counter shouldn't stay high because jitting should settle out. If it remains high then perhaps there is dynamic code generation via reflection going on -> simplify dynamic code cases
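The Jit check differs from the others because it is about whether the counter settles rather than about its level at one instant. A minimal Python sketch of that idea, assuming you already have a series of "% Time in Jit" readings: the function name, the `warmup_samples` window, and the `significant_pct` cutoff are all hypothetical, since the post only says "significant".

```python
def jit_still_busy(jit_time_samples, warmup_samples, significant_pct):
    """True when "% Time in Jit" stays significant after the warm-up window.

    jit_time_samples: readings in time order (oldest first).
    warmup_samples: how many leading readings to ignore as start-up jitting.
    significant_pct: caller-chosen, hypothetical cutoff for "significant".
    """
    steady = jit_time_samples[warmup_samples:]
    if not steady:
        return False  # not enough data past warm-up to judge
    # "Stays high" means every post-warm-up reading is at or above the cutoff.
    return min(steady) >= significant_pct
```

A high reading during start-up is expected and gets ignored here; only a counter that never comes back down is treated as a hint that code is being generated dynamically.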
This is just a taste, of course, and each of these items would likely lead to further investigation with a profiling tool suited to drilling into that particular kind of problem, but these are examples of the leading indicators that I use.
For more information on the GC performance counters specifically, see Maoni's blog entry on that subject. Her most recent article, on using the GC efficiently, is also very interesting; lots of good details there.
Comments
- Anonymous
May 25, 2005
That set of guidance sounds just prescriptive enough to be encodable, Rico. Do I smell a PerfCop to watch running .NET apps in the future? - Anonymous
May 25, 2005
Awesome post, thanks a bundle! - Anonymous
May 26, 2005
Excellent addition to your advice! A couple quick "most likely" checks for CLRProfiler users would also be useful. - Anonymous
May 26, 2005
Robin's right on. I hardly ever have to touch those settings even though I often check the level of parallelism -- which is why it made my list in the first place.
Actually, the most common reason I find for not enough parallelism is that the Thread Pool was not used when it could have been. So I think I'd like to amend my advice to be: if parallelism seems low -> consider using our thread pool if it's not already in use - Anonymous
June 04, 2005
Visual Studio Team System
Bill Sheldon from InterKnowlogy has an item in the June 3rd edition of... - Anonymous
June 09, 2005
Thanks for the great blog and this post. It seems to be the most interesting blog about .NET performance on the net.
I have a small comment and a question :-)
Very often I feel like all performance advice is given for ASP-like applications, like "use the thread pool". It is a great idea to use the thread pool if you are a web server and have multiple clients. But I found out that in my case (tons of small computation tasks) it is faster to have my own task queue and my own worker thread(s) than to use the thread pool.
I had to learn the hard way how to make .NET perform faster.
For example, I found out that I should avoid using interfaces, because they involve double virtual calls and are never inlined. Or that I should not use Math.Min or Math.Max for double: it is 3-4 times slower than writing my own Min/Max function. Surprise!
Or, reading PerformanceCounter.NextValue() is apparently very slow :-(
There is no information about details like this, which are crucial for me, but I found hundreds of pieces of advice about using the thread pool :-)
What should I do if my application has no contention, few context switches, no exceptions, almost no security checks, no Jit, no Gen 2 or Gen 1 collections, and GC usage below 1%?
But I still have CPU maxed out... - Anonymous
June 10, 2005
>>What should I do if my application has no contention, few context switches, no exceptions, almost no security checks, no Jit, no Gen 2 or Gen 1 collections, and GC usage below 1%? But I still have CPU maxed out...
Hmmm... Sounds like a good article :)