The magical 2 minute logon delay mystery
Some time ago I had an interesting escalation where the problem description was as following:
I'm running a Citrix Metraframe Presentation Server farm with around 20 servers in it, after about 2 days my users start getting a logon delay of *exactly* 2 minutes.
If I reboot the server everything returns back to normal but the problem resurfaces within 48 hours, sometimes earlier.
Perfmon analysis showed that no CPU load was present on the server during that time and no resource leak either that we could see.
We asked the customer to install a checked build of Winlogon.exe and enable Winlogon debug logging, this showed us that a timeout condition was being hit.
So, taking a page out of Mark Russinovich's book I asked the customer to run Procmon during a logon attempt. After filtering through the massive amount of data I found that Winlogon.exe was calling mpnotify.exe which then finished after exactly 2 minutes.
Looking through the Specs for Winlogon I determined that mpnotify was being called by Winlogon to talk to each of the registered Network Providers on the machine. If one of the NP's doesn't respond Winlogon will be sitting around until the timeout of 120 seconds is hit, and then move on.
After this, we looked at the registered NP's on one of the Citrix boxes with the intent of removing them one by one until we found the culprit. As it turned out, the first one we removed gave immediate results and the user was able to logon without a delay once the reference to the NP had been removed from the registry.
To summarize; here's what was happening:
On any platform that has installed the Lotus Notes client with the 'Single Sign-on' option chosen during the setup, a network provider called npnotes is added to the list of NP's that Winlogon should notify during logon and password changes. When this NP failed (usually within 48 hours on a busy TS), Winlogon started timing out waiting for an answer from it (default timeout is 2 minutes).
The workaround is to remove the npnotes entries from the Network Provider registry keys which prevents it from loading (no reboot required). In this case the npnotes entry was the problem and removing it resolved the issue.
The following registry key is taken from a TS running Citrix Presentation Server 4.5:
[HKEY_LOCAL_MACHINESYSTEMCurrentControlSetControlNetworkProviderHwOrder]
"ProviderOrder"="Cwbnetnt,CdmService,RDPNP,LanmanWorkstation,WebClient,PnSson,npnotes"
[HKEY_LOCAL_MACHINESYSTEMCurrentControlSetControlNetworkProviderOrder]
"ProviderOrder"="Cwbnetnt,CdmService,RDPNP,LanmanWorkstation,WebClient,PnSson,npnotes"
Using a checked build of Winlogon, the following is seen when the problem occurs:
15:05:46.881: 11472.21432> Winlogon-Trace-Timeout: Enabling timeout after 120 seconds
15:07:46.895: 11472.21432> Winlogon-Trace-Timeout: Input timer went off, sending TIMEOUT
Using Procmon you see Winlogon spawn an instance of mpnotify.exe to notify the NP's, which would normally terminate immediately after notifying all the NP's.
When the problem occurs mpnotify.exe terminates after 2 minutes and the logon proceeds normally. Additional symptoms are that the logon is normal for a period of time after a reboot, however when the problem occurs (usually within 48 hours) all logons on that server are affected (since NPNotes stops responding).
On a Citrix box the logon session will be waiting for 120 seconds while the "Checking your credentials" prompt is displayed within the TS client (with the Citrix GINA installed). Replacing the GINA changes the message but doesn't otherwise make any difference (since the problem is with the faulty NP).
Note that this may also affect normal workstations, however the symptoms will be much more apparent on a Terminal Server since it is less likely to be rebooted regularly and has a greater number of users logging on to it.
Comments
Anonymous
January 01, 2003
There's an option in Prcomon somewhere called 'Log Boot' or 'Log startup' or similar, selecting that maintains the Procmon filter driver over the next boot until you start Procmon again (you should get a message to save the log file or view it if I remember correctly).Anonymous
January 01, 2003
Checked builds of Winlogon should be available for download through an MSDN subscription - the version you install should match the service pack of the system and preferably be as close to the build number of the original binary as possible. For additional details, see http://blogs.msdn.com/ntdebugging/archive/2007/06/09/how-windows-shuts-down.aspxAnonymous
January 01, 2003
Checked Builds of Windows: Obtaining the Checked Build http://msdn.microsoft.com/en-us/library/ms792429.aspx If you don't find the files you're looking for there then you need to contact your MSDN repAnonymous
January 01, 2003
Hi Costin, going through procmon logs to look for a problem is a time-consuming task that is beyond the scope of this blog. The way I would tackle it is to simplify as suggested on http://blogs.technet.com/instan/archive/2008/04/17/troubleshooting-the-intermittent-slow-logon-or-slow-startup.aspx Diagnosing the issue is a combination of parsing Userenv logs, Procmon logs, network traces and any other useful data that you can get. Procmon will only show you the I/O operations going on, it doesn't really tell you what each I/O was intended to do. If you get an exact timeout of 120 seconds you're probably looking at a WMI query or a network provider timing out.Anonymous
May 01, 2009
hi, Your article is the most advanced I could find on this subject. (along with this analysis done by someone in a forum:http://is.gd/vQqb) Can you take a look please at my procmon output too ? It seems that lanmanworkstation doesn't start until the 120-second timeout expires... Thank you very much in advance! i've posted the pml files here: http://costinel.googlepages.com/procmon.7zAnonymous
February 16, 2010
Exactly my problem. I had 3 times npnote in the provider list on one server making that one 19 seconds to log on, compared to 10 seconds on the other servers. Great Thank youAnonymous
October 15, 2010
Okay, dumb question from a network newb: How do I keep ProcMon'ing during a logon? Won't it quit when I log off the account?