How to prepare for a SQL Developer / Connectivity case

Today I thought I give you some tips on how to prepare for a SQL Developer / Connectivity case.

Sometimes when a case gets logged the information provided is not that great or the information provided is not very informational to me.

Let’s illustrate with a fictional example; a new case comes in and the error description is as follows:

URGENT: Clients can no longer connect to server, this used to work.

This is not very useful, even though I understand the general problem.

This is obviously urgent to the customer, but right away we are going to need to spend time on getting more useful information, for example,

what is the client (.Net/Standalone/Webapplication etc)?, what server is being connected to (Sql Server, Oracle, etc)?,

what provider is used (SqlClient, ODBC, etc), what is the error message? etc, etc.

So in this entry I thought I give some tips on what you can do in preparation for a case like this (we are going to ask you for it anyway if you haven’t J) and why it may be useful.

.1 The Connectionstring

This is very useful because a lot of information can be gathered from it. Let’s look at the following example:

String cs = @"Provider=SQLNCLI10;Data Source=tcp:MySqlServer\TestServer,56644;User ID=<user>;Password=<password>;Initial Catalog=MyDatabase;Pooling=False";

 

This tells me the following,

. You are using the SQLNCLI10 provider, this comes with SQL Server 2008, so you are likely to use Sql Server 2008.

. Since you are using this provider, you are most likely to use the classes in the System.Data.OleDb namespace (OleDbConnection,OleDbCommand, etc).

. You are connecting via tcp, ie. the connection is most likely to a remote server rather than a local one.

. The port you are connecting to is 56644

. You are not using Windows Authentication (Integrated Security=SSPI) since the User ID and Password is provided.

. You are trying to connect to the database called MyDatabase.

. You are not using pooling.

and so on.

The connectionstring may seem irrelevant, but it contains a lot of useful information. More on connection strings here:

".NET Framework Class Library - SqlConnection.ConnectionString Property"

https://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlconnection.connectionstring.aspx

".NET Framework Class Library - OleDbConnection.ConnectionString Property"

https://msdn.microsoft.com/en-us/library/system.data.oledb.oledbconnection.connectionstring.aspx

2. The SQL Server log.

This is found here (if you have not changed the default location):

C:\Program Files\Microsoft SQL Server\MSSQL10.<your sql server instancename>\MSSQL\Log\errorlog

This is very useful because it also tells us lot about your system. Let’s look at the following example (some parts removed because they are not relevant here):

2009-01-14 15:24:48.33 Server Microsoft SQL Server 2008 (RTM) - 10.0.1600.22 (X64)

            Jul 9 2008 14:17:44

            Copyright (c) 1988-2008 Microsoft Corporation

            Enterprise Edition (64-bit) on Windows NT 6.0 <X64> (Build 6001: Service Pack 1)

2009-01-14 15:24:48.35 Server (c) 2005 Microsoft Corporation.

2009-01-14 15:24:48.35 Server All rights reserved.

2009-01-14 15:24:48.35 Server Server process ID is 4100.

2009-01-14 15:24:48.35 Server System Manufacturer: 'Dell Computer Corporation', System Model: 'PowerEdge 830'.

2009-01-14 15:24:48.35 Server Authentication mode is MIXED.

2009-01-14 15:24:48.35 Server Logging SQL Server messages in file 'C:\Program Files\Microsoft SQL Server\MSSQL10.<instancename>\MSSQL\Log\ERRORLOG'.

2009-01-14 15:24:48.35 Server Registry startup parameters:

            -d C:\Program Files\Microsoft SQL Server\MSSQL10.<instancename>\MSSQL\DATA\master.mdf

            -e C:\Program Files\Microsoft SQL Server\MSSQL10.<instancename>\MSSQL\Log\ERRORLOG

            -l C:\Program Files\Microsoft SQL Server\MSSQL10.<instancename>\MSSQL\DATA\mastlog.ldf

2009-01-14 15:24:48.36 Server SQL Server is starting at normal priority base (=7). This is an informational message only. No user action is required.

2009-01-14 15:24:48.36 Server Detected 2 CPUs. This is an informational message; no user action is required.

...

2009-01-14 15:24:49.37 spid7s Server name is 'servername\instancename'. This is an informational message only. No user action is required.

...

2009-01-14 15:24:49.87 Server Server is listening on [ 'any' <ipv6> 56644].

2009-01-14 15:24:49.87 Server Server is listening on [ 'any' <ipv4> 56644].

2009-01-14 15:24:49.87 Server Server local connection provider is ready to accept connection on [ \\.\pipe\SQLLocal\<instancename> ].

2009-01-14 15:24:49.87 Server Server local connection provider is ready to accept connection on [ \\.\pipe\MSSQL$<instancename>\sql\query ].

2009-01-14 15:24:49.87 Server Server is listening on [ ::1 <ipv6> 56645].

2009-01-14 15:24:49.87 Server Server is listening on [ 127.0.0.1 <ipv4> 56645].

...

2009-01-14 15:25:19.52 Logon Error: 18456, Severity: 14, State: 8.

2009-01-14 15:25:19.52 Logon Login failed for user 'User'. Reason: Password did not match that for the login provided. [CLIENT: xxx.xxx.xxx.xxx]

I’ve highlighted some of the interesting parts. What we can tell from this is the following:

. You are running SQL Server 2008, RTM version, no CUs applied, 64bit version.

. The process ID

. The hardware that the server is running on.

. The server can be connected to using both Windows Authentication and Sql Authentication (MIXED mode).

. The server is running on a dual CPU machine.

. Server and instance names

. The server is listening on port 56644 if using TCP/IP or on “\\.\pipe\MSSQL$<instancename>\sql\query” if using Named Pipes.

. And in this case, we have a proper error message with an explanation (the login exists, however, the user have used an invalid password).

And so on, the Sql Server error log contains a lot of useful information, if I have this, I can find answers to questions I may have without having to contact you. I.e. save time.

If you find a lot of errors in the Sql Server log, head over to the blog of my colleague Graham Kent:

https://blogs.msdn.com/grahamk/

3. The error message, perhaps the most underrated piece of information J

Rather than having to guess what is happening when “the client can’t connect” the error message provides a lot of useful information, for example:

System.InvalidOperationException: Timeout expired. The timeout period elapsed prior to obtaining a connection from the pool.

This may have occurred because all pooled connections were in use and max pool size was reached.

   at System.Data.ProviderBase.DbConnectionFactory.GetConnection(DbConnection owningConnection)

   at System.Data.ProviderBase.DbConnectionClosed.OpenConnection(DbConnection outerConnection, DbConnectionFactory connectionFactory)

   at System.Data.SqlClient.SqlConnection.Open()

   ...

This tells me that I do not have to focus on Login errors, Network errors, Sql Server has stopped errors.

From this error message it is clear that there is a problem with the Connection Pooling, so the investigation should start focusing on that.

To resolve this particular error, have a look at:

"Timeout expired. The timeout period elapsed prior to obtaining a connection from the pool."

https://blogs.msdn.com/spike/archive/2008/08/25/timeout-expired-the-timeout-period-elapsed-prior-to-obtaining-a-connection-from-the-pool.aspx

Error messages = good.

4. Network Monitor traces.

This is also very useful, so if possible create these as well.

. First download and install it on both the client and server (at least on the client). You’ll find it here:

"Microsoft Network Monitor 3.2"

https://www.microsoft.com/downloads/details.aspx?FamilyID=f4db40af-1e08-4a21-a26b-ec2f4dc4190d&DisplayLang=en

. Then start it on the client and server (at least on the client) and reproduce the error.

. Save the trace(s), compress it and give it to me.

For example, if we have a trace that looks as follows:

...

88 xxx.xxx.xx.78 50313 xxx.xxx.xx.187 1234 TCP TCP: Flags=.S......, SrcPort=50313, DstPort=1234, Len=0, Seq=3060051214, Ack=0, Win=8192 (scale factor not found)

89 xxx.xxx.xx.187 1234 xxx.xxx.xx.78 50313 TCP TCP: Flags=..R.A..., SrcPort=1234, DstPort=50313, Len=0, Seq=0, Ack=3060051215, Win=0 (scale factor not found)

...

100 xxx.xxx.xx.78 50313 xxx.xxx.xx.187 1234 TCP TCP: Flags=.S......, SrcPort=50313, DstPort=1234, Len=0, Seq=3060051214, Ack=0, Win=8192 (scale factor not found)

101 xxx.xxx.xx.187 1234 xxx.xxx.xx.78 50313 TCP TCP: Flags=..R.A..., SrcPort=1234, DstPort=50313, Len=0, Seq=0, Ack=3060051215, Win=0 (scale factor not found)

...

we can see that the server ( xxx.xxx.xxx.187) is immediately resetting (..R.A…) connection attempts on port 1234 from the client (xxx.xxx.xxx.187) that is trying to connect from port 50313.

So here the investigation should start focusing on the server and why it is resetting the connections. Once we know this, then we can troubleshoot the client.

In this particular case (and a common one as well) the problem comes from the fact that the server is not listening on the tcp/ip port the client is trying to connect to.

A little bit more on this here:

"(provider: TCP Provider, error: 0 - No connection could be made because the target machine actively refused it.)"

https://blogs.msdn.com/spike/archive/2008/12/19/provider-tcp-provider-error-0-no-connection-could-be-made-because-the-target-machine-actively-refused-it.aspx

5. Crash dumps.

Sometimes there may be no error message and you may not have the source code and the application just dies.

Then the way forward is to create a crash dump.

If you have an exception for when the error/crash/etc occurs, for example a System.Data.SqlClient.SqlException, then it is
very simple to use DebugDiag in order to create/collect dumps.

This is done like so:
.1 Download and install DebugDiag from here:
Debug Diagnostic Tool v1.1
https://www.microsoft.com/downloads/details.aspx?FamilyID=28bd5941-c458-46f1-b24d-f60151d875a3&displaylang=en
.2 Make sure that your application/process is up and running. Then start DebugDiag.
.3 Now you should see the "Select Rule Type" dialog, select "Crash" and then Next.
.4 Now you should see the "Select Target Type" dialog, select "A Specific Process" and then Next.
.5 Now you should see the "Select Target" dialog, select your process and then Next.
.6 Now you should see the "Advanced Options" dialog, select Exceptions under "Advanced Settings".
.7 Select "Add Exception", select "CLR (.Net) Exception" (should be at top of the list).
.8 In the ".Net Exception Type" textbox, enter "System.Data.SqlClient.SqlException" (no quotes, also, not that this is casesensitive).
If you have a different exception, then of course you should replace the above with that exception.
.9 In the "Action Type" drop down, select "Full Userdump" and then OK. Then "Save & Close" and then Next.
.10 Now you should see the "Select Dump Location And Rule Name" dialog, change the userdump location to C:\Dumps (this dir will be create for you) and then Next.
.11 Now you should see the "Rule Completed" dialog, select "Activate the rule now" and then Finish.
.12 Crash you application, ie. repro the problem. Note that this may kill your process.

Presto, you should now have a *.dmp file in the C:\Dumps directory with a filename along the lines:
<ProcessName>__PID__<ProcessId>__Date__<Date>__Time_<Time>__First Chance System.Data.SqlClient.SqlException.dmp
Compress it and send.

-- OR --

. Download the "Debugging Tools for Windows" (choose the latest release for your system).

https://www.microsoft.com/whdc/devtools/debugging/default.mspx

. Install it. My suggestion is to install it to C:\Debuggers\

. Start your application.

. Start a command prompt (Start->Run->cmd)

. Go to where you installed the debuggers, ie. run ‘cd C:\Debuggers’ (no quotes).

. Type ‘tlist’ (no quotes), this will list all your running process and their id. So take notice of the pid (process id) for your application, for example: 4140

. Then run a script called AdPlus that comes with the debugger installation, this is done with the following syntax (replace #### with the PID for your application):

“adplus -crash -p #### -o C:\Debuggers\Dumpfiles” (no quotes).

. Ignore the Adplus warning and informal messages if any.

. Reproduce the error, ie. make your application fail.

. This should now have created a directory named Crash_Mode__Date_<date>__Time_<time>PM under the C:\Debuggers\Dumpfiles directory.

. Zip this up and pass send this to us in order to analyze it.

If you are tempted to do analyze it yourself, you’ll find lots of information on this subject on the blog of my colleague Tess Ferrandez:

https://blogs.msdn.com/Tess/

And some more info on Adplus:

"How to use ADPlus to troubleshoot "hangs" and "crashes""
https://support.microsoft.com/default.aspx?scid=kb;EN-US;286350

6. All versions of all files involved.

This seems like a major task, but fear not. It is as simple as follows.

. Download and install “Process Explorer”, found here:

https://technet.microsoft.com/en-us/sysinternals/bb896653.aspx

. Uncompress it and start it.

. Start your application and find it in the upper pane in Process Explorer.

. Make sure that the lower pane is visible (View->Show Lower Pane)

. Make sure that the lower pane is showing the DLLs (CTRL+D)

. Now just save it and send it.

For example:

Process: MyApplication.exe Pid: 3228

Name Description Version Path

...

ole32.dll Microsoft OLE for Windows 6.00.6001.18000 C:\Windows\system32\ole32.dll

System.Data.dll .NET Framework 2.00.50727.3053 C:\Windows\assembly\GAC_64\...

System.Data.ni.dll .NET Framework 2.00.50727.3053 C:\Windows\assembly\NativeImages_v2.0.50727_64\...

System.ni.dll .NET Framework 2.00.50727.3053 C:\Windows\assembly\NativeImages_v2.0.50727_64\...

...

Here we can see that the application uses version 2.00.50727.3053 of System.Data.dll, this tells us that .Net 3.5 is being used.

It also tells us where the dll’s are loaded from and so on. So by doing this, we quickly have all dll’s used by your application and all the versions used.

So to summarize, if you spend some time collecting and preparing as many as possible of the suggestions above, me (and my team) will have a lot of useful information to work on.

It is also very useful to have if we need move the case to another team.

Comments

  • Anonymous
    January 23, 2009
    Spike in our data access support team has a really good blog talking about data access related issues

  • Anonymous
    January 24, 2009
    Thank you for submitting this cool story - Trackback from DotNetShoutout

  • Anonymous
    January 25, 2009
    This is a fantastic article! Thank you for contributing

  • Anonymous
    January 25, 2009
    Troubleshooting Data Connectivity

  • Anonymous
    May 30, 2012
    The comment has been removed