次の方法で共有


The case of the non-responsive MVC Web Application

 

A colleague (and friend) from Azure team asked me to help narrow down a strange case of a MVC Entity Framework application (hosted in Virtual machine) which occasionally stopped responding, requiring a IIS reset to work again. The products I support do not include IIS, MVC nor Entity Framework. I was invited to help as consultant for NetExt extension.

This case study uses WinDbg with NetExt extension. See this to get started with WinDbg and NetExt:
https://blogs.msdn.com/b/rodneyviana/archive/2015/03/10/getting-started-with-netext.aspx

Verifying how many web application are running under the worker process

Listing the .NET Application Domains can gives the number of web applications under the application pool. Just get the number and subtract 3. This technique was discussed here:: https://blogs.msdn.com/b/rodneyviana/archive/2015/03/12/the-strange-case-of-the-application-pool-recycling-causing-high-cpu.aspx

 

 0:000> !wdomain
 Address  Domain Name                                 Modules Base Path & Config
 747e0c08 System                                           43 
 747e0f60 Shared                                                          
 01804078 DefaultDomain                                    10 Base Path: D:\Windows\SysWOW64\inetsrv\ Config: w3wp.exe.config 
 0184a7b0 /LM/W3SVC/2033431354/ROOT-1-130705968419350748   75 Base Path: D:\home\site\wwwroot\ Config: web.config 

Having only one web application to the application pool ruled out the problem of another web site locking this one. Ruling out is as imported as ruling in. Notice that default domain points to SysWOW64 which means that the w3wp process is running as 32-bits. To debug this dump file I had to use WinDbg 32-bits and the 32-bits version of NetExt. You should never run a Web Application as 32-bits unless it is necessary (i.e. if the application depends on a native 32-bits dll). IIS application  running as 32-bits will always run out of memory because of memory fragmentation. This is a frequent problem in 32-bit processes that run for long time. Memory is allocated and de-allocated all the time leaving “holes” in memory. In order to allocate, let’s say 10MB of memory, it is necessary at least one 10MB block of “contiguous” memory available. If there is 100 x 1MB free memory segments but not a single segment with 10MB free, your application will not be able to allocate 10MB of contiguous memory and will yield a “Out of Memory” exception. The maximum memory available for a 32-bits process is 4GB (because the OS is 64-bits, otherwise it would be only 2GB). However, memory fragmentation was not the problem in this case.

Verifying how many requests were active

To run commands requiring heap look up, you need to index the heap first.

 0:000> !windex
 Starting indexing at 20:29:08 PM
 1000000 objects...
 Indexing finished at 20:29:10 PM
 97,208,569 Bytes in 1,490,134 Objects
 Index took 00:00:02
  
  
 0:000> !wruntime
 Runtime Settings per Application Pool
  
 =========================================================================
 Address         : 066C5404
 First Request   : 3/24/2015 1:14:35 PM
 App Pool User   : IIS APPPOOL\contosofarm
 Trust Level     : Full
 App Domnain Id  : /LM/W3SVC/1264340133/ROOT-1-130716764614361954
 Debug Enabled   : False
 Active Requests : 0n548
 Path            : D:\home\site\wwwroot\ (local disk)
 Temp Folder     : D:\local\Temporary ASP.NET Files
 Compiling Folder: D:\local\Temporary ASP.NET Files\root\7135b59b\46bce28c
 Shutdown Reason : Not shutting down

We have 548 active connections. This is just another information we will need to find the problem, but the fact of having that many requests is not a problem per se.

Checking the running requests pattern

 0:000> !whttp -running
 HttpContext    Thread Time Out Running  Status Verb     Url
 02f1b334           -- 00:01:00 00:20:04    200 GET      https://contosoapp.com:443/OrderSystem/Order
 03095104           -- 00:01:00 00:21:22    200 GET      https://contosoapp.com:443/OrderSystem/Order
 030968ec           -- 00:01:00 00:21:03    200 GET      https://contosoapp.com:443/OrderSystem/Order
 0309873c           -- 00:01:00 00:21:11    200 GET      https://contosoapp.com:443/OrderSystem/Order
 030990b4           -- 00:01:00 00:21:11    200 GET      https://contosoapp.com:443/OrderSystem/Order
 0309f154           -- 00:01:00 00:21:40    200 GET      https://contosoapp.com:443/OrderSystem/Order
 030a2a70           -- 00:01:00 00:21:30    200 GET      https://contosoapp.com:443/OrderSystem/Order
 030a3404           -- 00:01:00 00:21:39    200 GET      https://contosoapp.com:443/OrderSystem/Order
 030a6d20           -- 00:01:00 00:21:39    200 GET      https://contosoapp.com:443/OrderSystem/Order
 030aa63c           -- 00:01:00 00:21:39    200 GET      https://contosoapp.com:443/OrderSystem/Order
 030adf58           -- 00:01:00 00:21:39    200 GET      https://contosoapp.com:443/OrderSystem/Order
 030b1874           -- 00:01:00 00:21:38    200 GET      https://contosoapp.com:443/OrderSystem/Order
 030b5190           -- 00:01:00 00:21:38    200 GET      https://contosoapp.com:443/OrderSystem/Order
 030b8aac           -- 00:01:00 00:21:38    200 GET      https://contosoapp.com:443/OrderSystem/Order
 030bc3c8           -- 00:01:00 00:21:38    200 GET      https://contosoapp.com:443/OrderSystem/Order
 030bfce4           -- 00:01:00 00:21:38    200 GET      https://contosoapp.com:443/OrderSystem/Order
 030c3600           -- 00:01:00 00:21:38    200 GET      https://contosoapp.com:443/OrderSystem/Order
 030c6f1c           -- 00:01:00 00:21:38    200 GET      https://contosoapp.com:443/OrderSystem/Order
 030dbc34           -- 00:01:00 00:20:16    200 GET      https://contosoapp.com:443/OrderSystem/Order
 030dcaa0           -- 00:01:00 00:20:18    200 GET      https://contosoapp.com:443/OrderSystem/Order
 030df8a8           -- 00:01:00 00:21:37    200 GET      https://contosoapp.com:443/OrderSystem/Order
 (... lines removed for brevity ...)
 0f8461b0           -- 00:01:00 00:20:06    200 GET      https://contosoapp.com:443/OrderSystem/Order
 0f8471f4           -- 00:01:00 00:20:06    200 GET      https://contosoapp.com:443/OrderSystem/Order
 0f8503b0           -- 00:01:00 00:20:05    200 GET      https://contosoapp.com:443/OrderSystem/Order
 0f857058           -- 00:01:00 00:20:05    200 GET      https://contosoapp.com:443/OrderSystem/Order
 0f85e0e4           -- 00:01:00 00:20:05    200 GET      https://contosoapp.com:443/OrderSystem/Order
 0f8678f4           -- 00:01:00 00:20:05    200 GET      https://contosoapp.com:443/OrderSystem/Order
 0f871104           -- 00:01:00 00:20:04    200 GET      https://contosoapp.com:443/OrderSystem/Order
 0f874a40           -- 00:01:00 00:20:04    200 GET      https://contosoapp.com:443/OrderSystem/Order
 0f87b2c8           51 00:01:00 00:20:04    200 GET      https://contosoapp.com:443/OrderSystem/Order
  
 548 HttpContext object(s) found matching criteria
  
 13 HttpContext object(s) skipped by filter

Note that only one of the requests has an associated thread. The others have “---“ as thread which means that either IIS is waiting for a thread to be created or the thread was aborted and IIS was not informed yet. Normally it is because the request is queued. As you can see, the requests are running for over 20 minutes even though the time out is set to 1 minute. HttpContext the object wrapping an IIS request includes a field to tell the timeout state. You can verify most of .NET framework source code here: https://referencesource.microsoft.com/

 HttpContext.cs
         private int        _timeoutState;   // 0=non-cancelable, 1=cancelable, -1=canceled

 

Running HttpContext are in a non-cancelable state (_timeoutState = 0)

 

 0:000> !wselect * from 0f87b2c8
 [System.Web.HttpContext]
 (int64)System.Int64 _timeoutStartTimeUtcTicks = 8d2347382c7e3c2 (0n635628168132682690)
 (int64)System.Int64 _timeoutTicks = 4190ab00 (0n1100000000)
 System.Web.IHttpAsyncHandler _asyncAppHandler = 00000000
 System.Web.HttpApplication _appInstance = 0F878910
 System.Web.IHttpHandler _handler = 0F22B824
 System.Web.HttpRequest _request = 0F87B3C4
 System.Web.HttpResponse _response = 0F87B48C
 System.Web.HttpServerUtility _server = 0717B2D8
 System.Collections.Stack _traceContextStack = 00000000
 System.Web.TraceContext _topTraceContext = 00000000
 System.Collections.Hashtable _items = 0F165538
 System.Collections.ArrayList _errors = 00000000
 System.Exception _tempError = 00000000
 System.Web.IPrincipalContainer _principalContainer = 0F87B5D4
 System.Web.Profile.ProfileBase _Profile = 00000000
 System.Web.HttpWorkerRequest _wr = 0F87B050
 System.Web.VirtualPath _configurationPath = 00000000
 System.Globalization.CultureInfo _dynamicCulture = 026D8EF8
 System.Globalization.CultureInfo _dynamicUICulture = 026D8EF8
 System.Collections.Stack _handlerStack = 00000000
 System.Web.Instrumentation.PageInstrumentationService _pageInstrumentationService = 00000000
 System.Collections.ObjectModel.ReadOnlyCollection _webSocketRequestedProtocols = 00000000
 System.Web.Util.CancellationTokenHelper _timeoutCancellationTokenHelper = 038239C4
 System.Web.Util.DoubleLink _timeoutLink = 0F1653E0
 System.Threading.Thread _thread = 02FCE050
 System.Web.CachedPathData _configurationPathData = 00000000
 System.Web.CachedPathData _filePathData = 036B34C8
 (string)System.String _sqlDependencyCookie = NULL
 System.Web.SessionState.SessionStateModule _sessionStateModule = 0F878AE0
 System.Web.UI.TemplateControl _templateControl = 00000000
 System.Web.NotificationContext _notificationContext = 038D7118
 System.Web.ThreadContext IndicateCompletionContext = 038D6C90
 System.Threading.Thread ThreadInsideIndicateCompletion = 02FCE050
 System.Object ThreadContextId = 0f87b3b8
 System.Web.AspNetSynchronizationContextBase _syncContext = 0F87B940
 System.Threading.Thread _threadWhichStartedWebSocketTransition = 00000000
 (string)System.String _webSocketNegotiatedProtocol = NULL
 System.Web.IHttpHandler _remapHandler = 0F22B824
 System.Web.IHttpHandler _currentHandler = 0F22B824
 System.Web.RootedObjects _rootedObjects = 0F87B5D4
 System.Web.Security.CookielessHelperClass _CookielessHelper = 0F87B55C
 System.Security.Principal.IPrincipal k__BackingField = 00000000
 (int32)System.Web.Configuration.AsyncPreloadModeFlags _asyncPreloadModeFlags = 0 (0n0) None
 (int32)System.Int32 _serverExecuteDepth = 0 (0n0)
 (int32)System.Int32 _timeoutState = 0 (0n0)
 (native int)System.IntPtr _rootedPtr = 0 (0n0)
 (int32)System.Web.SessionState.SessionStateBehavior <SessionStateBehavior>k__BackingField = 0 (0n0) Default
 (bool)System.Boolean _asyncPreloadModeFlagsSet = 1 (True)
 (bool)System.Boolean _errorCleared = 0 (False)
 (bool)System.Boolean _skipAuthorization = 0 (False)
 (bool)System.Boolean _preventPostback = 0 (False)
 (bool)System.Boolean _runtimeErrorReported = 0 (False)
 (bool)System.Boolean _threadAbortOnTimeout = 1 (True)
 (bool)System.Boolean _delayedSessionState = 0 (False)
 (bool)System.Boolean _isAppInitialized = 1 (True)
 (bool)System.Boolean _isIntegratedPipeline = 1 (True)
 (bool)System.Boolean _finishPipelineRequestCalled = 0 (False)
 (bool)System.Boolean _impersonationEnabled = 0 (False)
 (bool)System.Boolean HideRequestResponse = 0 (False)
 (bool)System.Boolean InIndicateCompletion = 1 (True)
 (uint8)System.Web.WebSocketTransitionState _webSocketTransitionState = 0x0 (0n0) 
 (bool)System.Boolean _requiresSessionStateFromHandler = 1 (True)
 (bool)System.Boolean _readOnlySessionStateFromHandler = 0 (False)
 (bool)System.Boolean InAspCompatMode = 0 (False)
 (bool)System.Boolean _ProfileDelayLoad = 1 (True)
 (bool)System.Boolean <FirstRequest>k__BackingField = 0 (False)
 (bool)System.Boolean <DisableCustomHttpEncoder>k__BackingField = 0 (False)
 System.DateTime _utcTimestamp = -mt 725719B8 00000000 3/24/2015 5:40:12 PM
 System.Web.Util.SubscriptionQueue> _requestCompletedQueue = -mt 706898A8 00000000
 System.Web.Util.SubscriptionQueue _pipelineCompletedQueue = -mt 70689990 00000000
 static System.Reflection.Assembly SystemWebAssembly = 066C1B34
 static (bool)System.Boolean s_eurlSet = 0 (False)
 static System.String s_eurl = 00000000 00000000
 static System.Action CS$9__CachedAnonymousMethodDelegate3 = 00000000
 static System.Func CS$9__CachedAnonymousMethodDelegate6 = 00000000
 static System.Func CS$9__CachedAnonymousMethodDelegate8 = 0A92BE14
 static System.Func CS$9__CachedAnonymousMethodDelegatea = 0F0449D4

So, this is why the request does not times out. The command below will show all requests that cannot be cancelled (i.e. it will not time out):

 0:000> !wfrom -type System.Web.HttpContext where (_timeoutState==0) $addr()
 calculated: 02AE9C60
 calculated: 02F1B334
 calculated: 03095104
 calculated: 030968EC
 calculated: 0309873C
 calculated: 030990B4
 calculated: 0309F154
 (...)
 calculated: 0FAF7AE4
 calculated: 0FAFE02C
  
 561 Object(s) listed

So, that is it. Something is preventing the requests to go away, Looking at the stack trace of thread 51 (the only one running a http request) we found that the thread was waiting for a SQL request to finish. We also saw that this was happening in the custom Global.asax handler method Application_BeginRequest. In the HttpContext fields we can also see that this application is using session (_requiresSessionStateFromHandler = 1 (True)).

Only one request per session can run at a time

Now comes the theory. The request has timeout, is in pipeline compatibility mode and is relying on ASP.NET session. The session cookie is ASP.NET_SessionId. NetExt includes a command to check and summarize cookies. Let’s summarize cookies by the key and value pair.

 0:000> !wcookie -summary
 Action Total Finished Cookie=Value
 ======================================================================================
 (list)   561        13 ARRAffinity=7c89acbf92e4f1bf14a78494f66466c71fcb071249a85f5e303bf5149086b5bc
 (list)   548         0 ASP.NET_SessionId=0odyr1kc0cdqoaaomed1v4pe
 (list)     9         9 ASP.NET_SessionId=21oqjz1amz4bfkfgru5bheth
 (list)     1         1 ASP.NET_SessionId=ew1qbqsugeiglq5uisf0csw1
 (list)     1         1 ASP.NET_SessionId=hwxatvsprf4c2nutxlanzcxl
 (list)     1         1 ASP.NET_SessionId=xqmy5iteqgosfmagesegpk1h
 (list)   559        11 SBFtest_cookie=test_cookie_value
 (list)     9         9 __RequestVerificationToken=IyKw3PREthJanOgCZLqTKzYAtXMyFgYFduNaXmT6xW42djDGVlH7HFGnB1ttFlnlUhz0eXHNOfJpWoBnG6g6Rp4PSGly0T6OFuXCNM0CgGMs_jxgyW-zSjXDVfMnr3vJn2aUEbAeEK8-RlTFgY2CGZmCC-F9FDRKgZHRIfE-tT01
 (list)     1         1 __RequestVerificationToken=ToT-glFP2dW7U7AYtPjcjAT-7hWBe3iIiUYjRg4tMinWUhBN4Be8Qo36FyC-qL8_mOfY7MWSgET2M8kIpuX0MleYmLJn3-A6_E9FveN8j_36rcTjmuIV2pkZ9892luiRlHK2HgXepUPFDGqcVJmmdn-HCzC9YEbHEPH1OyDSWs41
 (list)   548         0 __RequestVerificationToken=gbM2YmFRGi7yYZ0ic-ipw0umX3RtSz4Kn7y9yTsXbMRExnv-wX40MKKbgygNH01aPflh7rybsyYpIu9v1g-s1Hs-IIrpqPjw5ZndQNaUNfD_2Yx3XaaKUzl808dusffa7bpjVQa1HZ8VAz55L5w_7kUkYUQSfg2BmJaWM7aDTxU1
 (list)     1         1 __RequestVerificationToken=mec5uGE2UTVP-iCM6mEMbYVsElHdeUZD7ay9XBkYfSzYYHkwqzeG3pbboAQQ7hiU2rZBh-wDuoLifMuGG6HYnr5Byo0A1EKaD6bwfIvk710pHGpq4A9QZrjWb2S7tPbpOHYqO3VB5ItaetJPrEHnlanToXD6wisV8s-d2Czo7-o1
 (list)     1         1 __RequestVerificationToken=z6UahlNkt3UlEiCW5TnGmoORlues-U-O1cs7FihQQPlG52L7qLmWG3vkxQY6uQQRL6F4xzuet8zmZ-8I4aD9cMTYxfa64pf5sktHofl7p9ezJJ4cK2pDJqWrImK5gBTUkGrd1M3d-OAE6lwDx0gWuIeHCSOtDIEvVX3v2LeU_lU1
 (list)     1         1 __ar_v4=NPVOPR4AB5FCPB2G4EOFPH%3A20150212%3A2%7CMXNG24JOKNCSXAEC45BD3K%3A20150212%3A2%7CNBPDN5WPCBGR3ATBF2EEHC%3A20150212%3A2
 (list)     9         9 __ar_v4=UMR4R6WNO5A5XDYLDDQJ5W%3A20150008%3A2%7CNPVOPR4AB5FCPB2G4EYREW%3A20141117%3A6%7CMXNG24JOKNCSXAEC45BD3K%3A20141117%3A12%7CNBPDN5WPCBGR3ATBF2EEHC%3A20141117%3A12%7CHR63AHCPWBHH5LZ72UPYIS%3A20141117%3A3
 (list)     1         1 __utma=35909538.1267946845.1427218502.1427218502.1427218502.1
 (list)     1         1 __utma=35909538.154579644.1427202773.1427202773.1427202773.1
 (list)     1         1 __utma=35909538.156555049.1423582295.1423582295.1423586012.2
 (list)     9         9 __utma=35909538.360373432.1406151229.1425847161.1427219991.14
 (list)     1         1 __utma=35909538.432412459.1427202844.1427202844.1427202844.1
 (list)   548         0 __utma=35909538.596493886.1426607592.1426607592.1427218695.2

 

Bingo! There is a single session associated with 548 requests, the exact number of running requests. And unlike the other sessions, no request has finished. Clicking on (list) we see the very same 548 contexts we see in !whttp -running.

 0:000> !wcookie -name ASP.NET_SessionId -value 0odyr1kc0cdqoaaomed1v4pe
 02f1b334 https://contosoapp.com:443/OrderSystem/Order (200 NULL) Running (00:20:04)
 ======================================================================================
 ASP.NET_SessionId=0odyr1kc0cdqoaaomed1v4pe
  
 1 printed
  
 ======================================================================================
 03095104 https://contosoapp.com:443/OrderSystem/Order (200 NULL) Running (00:21:22)
 ======================================================================================
 ASP.NET_SessionId=0odyr1kc0cdqoaaomed1v4pe
  
 1 printed
  
 ======================================================================================
 030968ec https://contosoapp.com:443/OrderSystem/Order (200 NULL) Running (00:21:03)
 ======================================================================================
 ASP.NET_SessionId=0odyr1kc0cdqoaaomed1v4pe
  
 1 printed
  
 ======================================================================================
 0309873c https://contosoapp.com:443/OrderSystem/Order (200 NULL) Running (00:21:11)
 ======================================================================================
 ASP.NET_SessionId=0odyr1kc0cdqoaaomed1v4pe
  
 1 printed
  
 ======================================================================================
 030990b4 https://contosoapp.com:443/OrderSystem/Order (200 NULL) Running (00:21:11)
 ======================================================================================
 ASP.NET_SessionId=0odyr1kc0cdqoaaomed1v4pe
  
 1 printed
  
 ======================================================================================
 0309f154 https://contosoapp.com:443/OrderSystem/Order (200 NULL) Running (00:21:40)
 ======================================================================================
 ASP.NET_SessionId=0odyr1kc0cdqoaaomed1v4pe
  
 1 printed
  
 ======================================================================================
 030a2a70 https://contosoapp.com:443/OrderSystem/Order (200 NULL) Running (00:21:30)
 ======================================================================================
 ASP.NET_SessionId=0odyr1kc0cdqoaaomed1v4pe
  
 1 printed
  
 ======================================================================================
 030a3404 https://contosoapp.com:443/OrderSystem/Order (200 NULL) Running (00:21:39)
 ======================================================================================
 ASP.NET_SessionId=0odyr1kc0cdqoaaomed1v4pe
  
 1 printed
  
 (...)

All the calls were coming from the same IP address. So, there are only two possibilities: this site is under a DoS (Denial of Service) attack or a cat is walking all over the keyboard. Let me know your thoughts in the comments.

Comments

  • Anonymous
    March 29, 2015
    A few years ago we had something like this, one worker process would randomly go out of control. A recycle would correct it and it would eventually come back up. The incident would last about 1 hour or so. Turns out we had a few users that would flip their keyboards up to eat lunch. This action would have the benefit of mashing a few keys; namely the F5 key. We would have to track the user down by IP address (we're not using Windows auth) and ask them to move their keyboard. I can only imagine the "big-brother"/NSA feeling that person must have had. FYI, when we started doing the analysis on the logs we noticed about 30 - 33 requests per second from the IP address. The default per second repeat rate is 31. [HKCUControl PanelKeyboardKeyboardSpeed]