Unable to failover a Named instance of SQL server 2005 in cluster or unable to bring a Named Instance of SQL server 2005 Online
The problem is that we have a Named instance of SQL Server 2005 in a 2 node cluster. While trying to bring the SQL Services online we get these error messages in the Application Event Log.
Application Event Log:
Event Type: Error
Event Source: MSSQL$SQL2k5
Event Category: (3)
Event ID: 19019
Date: 7/17/2009
Time: 7:12:00 AM
User: N/A
Computer: BLRS2R17-3
Description:
[sqsrvres] ODBC sqldriverconnect failed
Event Type: Error
Event Source: MSSQL$SQL2k5
Event Category: (3)
Event ID: 19019
Date: 7/17/2009
Time: 7:12:00 AM
User: N/A
Computer: BLRS2R17-3
Description:
[sqsrvres] checkODBCConnectError: sqlstate = 08001; native error = ffffffff; message = [Microsoft][SQL Native Client]SQL Network Interfaces: Error Locating Server/Instance Specified [xFFFFFFFF].
Event Type: Error
Event Source: MSSQL$SQL2k5
Event Category: (3)
Event ID: 19019
Date: 7/17/2009
Time: 7:12:00 AM
User: N/A
Computer: BLRS2R17-3
Description:
[sqsrvres] ODBC sqldriverconnect failed
Event Type: Error
Event Source: MSSQL$SQL2k5
Event Category: (3)
Event ID: 19019
Date: 7/17/2009
Time: 7:12:00 AM
User: N/A
Computer: BLRS2R17-3
Description:
[sqsrvres] checkODBCConnectError: sqlstate = HYT00; native error = 0; message = [Microsoft][SQL Native Client]Login timeout expired
Cause:
The cluster service is not able to Connect with Clustered Named instance of SQL Server. SQL Browser even though running is not actually listening for requests.
Note: The SQL Server Browser program runs as a Windows service. SQL Server Browser listens for incoming requests for Microsoft SQL Server resources and provides information about SQL Server instances installed on the computer.
Resolution:
We started the SQL Server Browser Service with the -c parameter from command prompt it gave us the below response.
F:\Program Files (x86)\Microsoft SQL Server\90\Shared>sqlbrowser -c
SQLBrowser: starting up in console mode
We checked the same in a working cluster and found that when a browser is started it should have been like below,
F:\Program Files (x86)\Microsoft SQL Server\90\Shared>sqlbrowser -c
SQLBrowser: starting up in console mode
SQLBrowser: starting up SSRP redirection service
SQLBrowser is successfully listening on 0.0.0.0[1434]
[3728]: Waiting for next request...
[3728]: Received request...
[3768]: Waiting for next request...
[0532]: Waiting for next request...
[4612]: Waiting for next request...
[3728]: Waiting for next request...
[3768]: Received request...
[3768]: Waiting for next request...
We found that the SsrpListener key was 0
Path Of The Registry SsrpListener Key :
HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Microsoft SQL Server\90\SQL Browser (x64 Machines)
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\90\SQL Browser (x86 Machines)
Note: The SsrpListener registry value allows or disallows the SQL Browser to expose Engine instance discovery information, So if the value is 0 it disallows the SQL Browser to expose Engine instance discovery.
After we changed value to 1 and restarted the browser service we were getting the response as expected.
We stopped the browser service from the command prompt and then started it from services console and then we were able to bring the SQL Server Services online.
Another Workaround for this issue is to manually create a TCP alias with port number in all the nodes. In that case we need to make sure that SQL is not set to work with dynamic ports.
More Information:
In x64 bit machines you may get into another issue. Once after you change the SsrpListener value to 1 and then when you try to start the browser service through command prompt you may get this response.
C:\Program Files (x86)\Microsoft SQL Server\90\Shared>sqlbrowser -c
SQLBrowser: starting up in console mode
SQLBrowser: starting up SSRP redirection service
SQLBrowser is successfully listening on 0.0.0.0[1434]
SQLBrowser: failed starting SSRP redirection services -- shutting down.
This is caused by the entry in the Wow6432Node registry pertaining to this installation eventhough this was a 64 bit installation. This happens only in x64 machines.
You can remove the checkpoint and then delete this key from both the nodes and then add the checkpoint back.
HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Microsoft SQL Server\MSSQL.2\MSSQLServer
If registry checkpoint is not removed properly before deleting this key you may get this error message continously in the Event Logs.
Cluster service could not write to a file (C:\DOCUME~1\SVC~1.WES\LOCALS~1\Temp\CLS2AF.tmp). The disk may be low on disk space, or some other serious condition exists.
Ashwin Menon
SE, Microsoft Sql Server
Reviewed By
Karthick Krishnamurthy
Technical Lead, Microsoft SQL Server
Comments
- Anonymous
January 07, 2010
Hi, I have exactly got the same issue. In dev environment, we have 2 node Active/Active cluster setup with SQL Server 200564 bit with SP3 on Windows 2003 EE x64 with SP2 On Node1: We have 2 SQL Server 2005 instances VS1Ins1 VS2ins2 On Node 2:We have 2 SQL Server 2005 instances VS3Ins3 VS4ins4 All 4 instances are running fine, before I start the SQL Sevrer 2008 installation. I tried to upgrade Ins1 to SQL Server 2008. The installation went well but SQL resource group(SQL Service & SQLAgent did not come online). I have noticed the below errors in event viewer. 1st error: Event Type: Error Event Source: MSSQL$INS1 Event Category: (3) Event ID: 19019 Date: 1/7/2010 Time: 9:52:14 AM User: N/A Computer: ABC Description: [sqsrvres] ODBC sqldriverconnect failed 2nd error: Event Type: Error Event Source: MSSQL$INS1 Event Category: (3) Event ID: 19019 Date: 1/7/2010 Time: 9:52:14 AM User: N/A Computer: ABC Description: [sqsrvres] checkODBCConnectError: sqlstate = IM002; native error = 0; message = [Microsoft][ODBC Driver Manager] Data source name not found and no default driver specified The above error occurred continuously & then the below error 3rd error: Event Type: Error Event Source: MSSQL$INS1 Event Category: (3) Event ID: 19019 Date: 1/7/2010 Time: 9:52:14 AM User: N/A Computer: ABC Description: [sqsrvres] OnlineThread: Error connecting to SQL Server. and then I can see the below information message and the error Event Type: Information Event Source: MSSQL$INS1 Event Category: (2) Event ID: 26038 Date: 1/7/2010 Time: 9:52:36 AM User: N/A Computer: ABCVS1 Description: The SQL Network Interface library could not deregister the Service Principal Name (SPN) for the SQL Server service. Error: 0x2098. Administrator should deregister this SPN manually to avoid client authentication errors. Event Type: Error Event Source: MSSQL$INS1 Event Category: (3) Event ID: 19019 Date: 1/7/2010 Time: 9:52:36 AM User: N/A Computer: ABC Description: [sqsrvres] CheckServiceAlive: Service is dead Event Type: Error Event Source: MSSQL$INS1 Event Category: (3) Event ID: 19019 Date: 1/7/2010 Time: 9:52:36 AM User: N/A Computer: ABC Description: [sqsrvres] OnlineThread: service stopped while waiting for QP. So I have Uninstalled all 4 instances from both nodes. Then I tried to install SQL Server 2005 and got the below message while installation is running "All resources did not come online and therefore you will need to manually set the cluster restart" I have clicked OK and continued with installation & completed succussfully. But again same issue SQL resource group(SQL Service & SQLAgent did not come online) and same error in event viewer. Again I have Uninstalled everything related to SQL Server from both nodes. Then I went to the below location in registry HKEY_LOCAL_MACHINESOFTWAREWow6432NodeMicrosoftMicrosoft SQL Server Here, I can see the folders 100 Ins2 RefCount. I did not see any browser folder. Is deleting the 'Microsoft SQL Server' folder from above registry location allows me to install SQL Server 2005 successfully? Is it safe to remove this 'Microsoft SQL Server' folder from HKEY_LOCAL_MACHINESOFTWAREWow6432NodeMicrosoft thanks