Thoughts on IIS Memory Recycling for 3rd party Applications
Sigh... it seems that the Application Health Monitoring features added in IIS6 are merely used by VARs to cover up for their own mistakes, instead of being left in the user's control as a crutch to fall back on when 3rd party web applications fail.
Question:
We have a CRM application which runs on IIS6. The application has been crashing intermittently with the message "The server is temporary busy, please try again after a moment."
The manufacturer of the program had us adjust the default IIS AppPool properties as follows:
[x] Recycle worker processes (in minutes): 1440
[x] Recycle worker processes (number of requests): 35000
[x] Maximum virtual memory (in megabytes): 600
[x] Maximum used memory (in megabytes): 500
Changing these settings has tremendously improved the situation, but we are still occasionally getting the "The server is temporary busy" message.
I would go back to the manufacturer to ask for additional guidance but I got the impression they were just changing settings without knowing exactly what they were doing.
So my question is: given that the server has 2GB of RAM, can we tweak these values further to eliminate this problem? What settings are recommended and why?
Also would adding additional RAM to the server help? Any assistance or documentation anyone could point me to regarding these settings would be greatly appreciated.
Thank You,
Answer:
Actually, if you want to eliminate this problem, you need to insist that your application manufacturer debug their CRM application to determine the cause of the "server is temporarily busy..." error. Only after determining the cause of the issue through debugging can one:
- produce a fix for the bug, OR
- determine a work-around for the bug
<soapbox>
In other words, without knowing the cause of the issue, you have no idea whether tweaking any server parameters will help, much less eliminate, the issue. This means that it does not help to ask for "recommended values" because there are never any generally recommended values for an application - it all depends on the application requirements, status, resource footprint, etc - and since the bug is "unknown", the recommendation is also "unknown". The same goes for changing system hardware like adding more RAM - once again, without understanding the issue, making changes is like gambling - and why gamble when the application manufacturer is supposed to support you in getting their CRM application working?
And if your concern is that the application manufacturer has no idea what they are doing with tuning/debugging their own application... then I wonder how you can trust their CRM application at all. The same group of people supporting the CRM application probably also wrote the CRM application - and if they cannot get you proper support for their application, then how can you rely on them long term?
In general, if someone tells you that "to run my application, you have to tweak THESE Application Pool Health Monitoring metrics", it means that their application is not written well enough to stay running.
</soapbox>
Right now, what it sounds like is that this CRM application has some bug in it, either in the program logic or in its related configuration, and this bug eventually leads to the "server is temporarily busy..." error. The vendor either has no idea about the bug or knows about it but does not want to fix it.
There are two general ways to "resolve" a bug:
- Find a work-around to avoid the bug
- Fix the actual bug
Right now, it sounds like the vendor does not want to do #2 to truly eliminate the problem, so they distract you by telling you to tweak IIS AppPool Properties in an attempt to do #1. The problem with #1 is that it does NOT eliminate the problem - the bug is still there in the application you are running - and without debugging the issue, you cannot identify the problem and have no idea whether the setting changes actually work around it.
But... the vendor managed to get you distracted and wishfully thinking that YOU are actually empowered to resolve THEIR bug by simply tweaking IIS settings, adding more RAM, or otherwise tuning your system - when you have no idea what issue is being worked around. Meanwhile, the bug is still there in the application you are running...
Does this make any sense?
For example, adding more RAM does not solve logical bugs in an application - at best, it delays the problem. Suppose the issue is a memory leak. Adding more RAM simply means it takes longer for the application to consume all of your server's memory; it does not fix the leak such that the application never consumes more memory than it needs. Sure, you can recycle the process more frequently to periodically "free" up the memory, but recycling destroys in-process state and caches, which has downstream performance and reliability effects... it all depends on the application's architecture. You merely substitute one unknown problem for another and do not make progress towards eliminating problems.
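To put rough numbers on the memory leak example, here is a minimal sketch in Python. The baseline footprint and leak rate are made-up values for illustration only (nothing here was measured from the CRM application), and the helper name `hours_until_exhaustion` is hypothetical.

```python
def hours_until_exhaustion(limit_mb: float, baseline_mb: float,
                           leak_mb_per_hour: float) -> float:
    """Hours before a steadily leaking process grows from baseline to limit."""
    if leak_mb_per_hour <= 0:
        return float("inf")  # no leak: the limit is never reached
    return max(limit_mb - baseline_mb, 0.0) / leak_mb_per_hour


# Assumed values, purely illustrative.
BASELINE_MB = 200.0        # memory the application legitimately needs
LEAK_MB_PER_HOUR = 25.0    # memory lost to the (unfixed) leak every hour

for limit_mb in (500.0, 1000.0, 2000.0):
    hours = hours_until_exhaustion(limit_mb, BASELINE_MB, LEAK_MB_PER_HOUR)
    print(f"limit {limit_mb:6.0f} MB -> exhausted after {hours:5.1f} hours")
```

Raising the limit only moves the failure further out in time; the leak rate, which is the actual bug, is untouched by any of these numbers.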
Remember, AppPool Health Monitoring Metrics are just that - generic metrics to determine the health of the application, and if deemed unhealthy by those metrics, recycle the worker process to clear away stale application state and start afresh. It is like rebooting the application.
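As a rough illustration of how generic these metrics are, here is a toy model in Python of a recycle decision using the four thresholds from the question. This is a sketch of the idea only, not how IIS implements it; the names `RecycleLimits`, `WorkerSnapshot`, and `recycle_reason` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RecycleLimits:
    # Thresholds taken from the settings quoted in the question.
    max_uptime_minutes: int = 1440
    max_requests: int = 35000
    max_virtual_mb: int = 600
    max_used_mb: int = 500


@dataclass
class WorkerSnapshot:
    # Hypothetical point-in-time view of one worker process.
    uptime_minutes: float
    requests_served: int
    virtual_mb: float
    used_mb: float


def recycle_reason(worker: WorkerSnapshot,
                   limits: RecycleLimits) -> Optional[str]:
    """Return the first threshold the worker has crossed, or None if healthy."""
    if worker.uptime_minutes >= limits.max_uptime_minutes:
        return "elapsed time"
    if worker.requests_served >= limits.max_requests:
        return "request count"
    if worker.virtual_mb >= limits.max_virtual_mb:
        return "virtual memory"
    if worker.used_mb >= limits.max_used_mb:
        return "used memory"
    return None


# A worker that is young and lightly used, but over the virtual memory limit.
print(recycle_reason(WorkerSnapshot(90, 4000, 650, 310), RecycleLimits()))
```

Notice that nothing in the decision knows anything about the application or its bug. It only answers "has a number crossed a line?", which is why tuning those lines cannot tell you what is actually wrong.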
In other words, it is meant as a possible crutch to keep the application running while support personnel debug the issue and provide a fix. It is not meant as the "solution" to buggy software because it simply consumes one of YOUR defensive trump cards against buggy applications.
In general, the best solution to buggy software is to identify the issue and get it fixed, and this is what I suggest you insist your application manufacturer perform. Here are some useful blog entries for this endeavor:
- Basics of IIS6 Troubleshooting
- HOWTO: Understand and Diagnose an Application Pool Crash
- HOWTO: Thoughts on Applications Running out of Threads
//David
Comments
- Anonymous
April 20, 2006
To be fair I'm a massive Microsoft supporter and I'd look at everything possible before blaming IIS or dotnet.
This problem was something that as a dotnet developer I'd have blamed on the app 99 times out of 100... all I'm saying is it's something to consider especially on heavily hit sites.
The KB article is http://support.microsoft.com/?id=841195
Basically the response buffers fragment the heap and the CLR can't allocate another 64MB contiguous block, resulting in an OutOfMemoryException even with shed loads of free memory.
Jon
- Anonymous
April 20, 2006
I take your point, though, that dotnet is not IIS, but then neither is the worker process, and that doesn't stop there being a recycle worker process setting in IIS6, as discussed at the top of this thread ;)
- Anonymous
April 20, 2006
Jon - Thanks for providing the KB article.
Yup, this issue is with ASP.Net and not IIS.
The problem is that ASP.Net ends up using memory in a way that fragments it, and having that fragmented memory pinned during a native call to IIS (such as sending a response over a slow connection) simply exacerbates the implementation flaw in ASP.Net. IIS is just the helpless victim here. :-)
//David
- Anonymous
April 20, 2006
.... and maybe the third party app ;)
Thanks for the discussion and your insight, I'll be tuning into your blog more often.
Jon
- Anonymous
April 20, 2006
Jon - hehe, ah, well, w3wp.exe is almost always the helpless victim on IIS6. ;-)
Just look at its work contract: It is told to run everything with NO control of what that actually is... and Windows always reports "w3wp.exe" as the Faulting Process, not the code that is run. It is like the ultimate scapegoat.
You should see our Watson OCA buckets when we debug those crash reports you send in for w3wp.exe...
//David
- Anonymous
October 23, 2009
Please install Service Pack 2 on the server and your IIS memory problem should go away. Try it and let me know; we were facing the same issue, and it is now sorted out.