Error 18056 can be unwanted noise in certain scenarios

I saw a lot of hits on the web when I searched for the Error message 18056 with State 29. I even saw two Microsoft Connect items for this issue filed for SQL Server 2008 instances:

https://connect.microsoft.com/SQL/feedback/ViewFeedback.aspx?FeedbackID=468478

https://connect.microsoft.com/SQLServer/feedback/details/540092/sql-server-2008-sp1-cu6-periodically-does-not-accept-connections

So, I thought it was high time that we pen a blog post on when this message can be safely ignored and when it is supposed to raise alarm bells. Before I get into the nitty-gritty details, let me explain under what condition is 18056 raised with the state = 29.

Most applications today make use of connection pooling to reduce the number of times a new connection need to be opened to the backend database server. When the client application reuses the connection pool to send a new request to the server, SQL Server performs certain operations to facilitate the connection reuse. During this process (we shall call it Redo Login for this discussion) if any exception occurs, we report an 18056 error. The state numbers like the famous 18456: Login Failed error message give us more insight into why the Redo Login task fails. State 29 occurs when there is an Attention received from the client while the Redo Login code is being executed. This is when you would see the message below which has plagued many a mind till date on SQL Server 2008 instances:

2009-02-19 04:40:03.41 spid58 Error: 18056, Severity: 20, State: 29.

2009-02-19 04:40:03.41 spid58 The client was unable to reuse a session with SPID 58, which had been reset for connection pooling. This error may have been caused by an earlier operation failing. Check the error logs for failed operations immediately before this error message.

Is this a harmful message?

The answer that always brings a smile to my face: It depends! The dependency of this error message being just plain noise to something that should send all the admins in the environment running helter-skelter can be summarized in one line.

If the above error message (note that the state number should reflect 29) is the only message in the SQL Server Errorlog along with no other errors noticed in the environment (connectivity failures to the SQL instance in question, degraded performance, high CPU usage, Out of Memory errors), then this message can be treated as benign and safely ignored.

Why is this message there?

Well our intentions here were noble and we didn’t put the error message out there to create confusion. This error message is just reporting that a client is reusing a pooled connection and when the connection was reset, the server received an attention (in this case, a client disconnect) during the connection reset processing on the server side. This could be due to either a performance bottleneck on the server/environment or a plain application disconnect. The error message is aimed at helping in troubleshooting the first category of problems. If you do see some other issues at the same time though, these errors may be an indicator of what is going on at the engine side.

What should you do when you see your Errorlog bloating with these error messages?

a. The foremost task would be to scan the SQL Errorlog and determine if this error message is accompanied before/after by some other error message or warning like Non-yielding messages, Out of Memory (OOM) error message (Error 701, Failed Allocate Pages etc.).

b. The next action item would be to determine if there is high CPU usage on the server or any other resource bottleneck on the Windows Server. Windows Performance Monitor (Perfmon) would be your best friend here.

c. Lastly, check if the Network between the Client and Server is facing any latency issues or if network packets drops are occurring frequently. A Netmon trace should help you here.

 

Tejas Shah

Escalation Engineer - Microsoft

Comments

  • Anonymous
    May 08, 2010
    The comment has been removed

  • Anonymous
    May 11, 2010
    How about state 23? This is my case...

  • Anonymous
    June 15, 2010
    The comment has been removed

  • Anonymous
    August 04, 2010
    Thanks Tejas Brilliant explanation, well written and easy to understand. Thankyou

  • Anonymous
    October 04, 2010
    The comment has been removed

  • Anonymous
    October 17, 2010
    Hello, We have been seeing the same thing Glenn reported as well.  Everything else seems to be running fine (resources, network, etc.) Our clients just have a problem connecting to SQL when we see these errors in the error log. Any help would be greatly appreciated. Thanks, Jason

  • Anonymous
    October 17, 2010
    Hello, We have been seeing the same thing Glenn reported as well.  Everything else seems to be running fine (resources, network, etc.) Our clients just have a problem connecting to SQL when we see these errors in the error log. Any help would be greatly appreciated. Thanks, Jason

  • Anonymous
    November 05, 2010
    The comment has been removed