SQL Server Resource IsAlive Troubleshooting
When SQL Server is deployed in a cluster environment, an associated IsAlive check is responsible for verifying that SQL Server is online and accepting new connections.
When the check cannot connect, cluster.log contains an entry like this:
(…)[sqsrvres] checkODBCConnectError: sqlstate = 08001; native error = 6; message = [Microsoft][ODBC SQL Server Driver][TCP/IP Sockets](…)
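When a cluster.log is large, it can help to pull these entries out with a script. The following is a minimal Python sketch that assumes entries look like the sample above; the function name parse_odbc_connect_error and the regular expression are illustrative and are not part of any SQL Server or cluster tool.

```python
import re

# Pattern derived only from the sample entry above; other log layouts may differ.
_ODBC_ERROR = re.compile(
    r"\[sqsrvres\]\s+checkODBCConnectError:\s*"
    r"sqlstate\s*=\s*(?P<sqlstate>\w+);\s*"
    r"native error\s*=\s*(?P<native_error>\w+);\s*"
    r"message\s*=\s*(?P<message>.*)"
)


def parse_odbc_connect_error(line):
    """Return sqlstate, native_error and message as strings, or None if no match."""
    match = _ODBC_ERROR.search(line)
    if match is None:
        return None
    # Values are kept as strings; the native error is not converted to a number.
    return {key: value.strip() for key, value in match.groupdict().items()}


sample = (
    "[sqsrvres] checkODBCConnectError: sqlstate = 08001; native error = 6; "
    "message = [Microsoft][ODBC SQL Server Driver][TCP/IP Sockets]"
)
print(parse_odbc_connect_error(sample))
```

Applying the function to every line of cluster.log and keeping the results that are not None gives a quick list of failed connection attempts.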
In multi-instance environments with mixed versions of SQL Server, there is only one IsAlive DLL, and it matches the highest version of SQL Server deployed on the cluster. For example:
SQL Server 2008 – c:\windows\system32\sqsrvres.dll should be 10.00.xxx
SQL Server 2005 – c:\windows\system32\sqsrvres.dll should be 9.00.xxx
SQL Server 2000 – c:\windows\system32\sqsrvres.dll should be 8.00.xxx
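The list above can be written as a small lookup. This is only a Python sketch: the helper names are hypothetical, and it interprets a version string you have already read from the DLL's file properties rather than reading the file itself.

```python
# Major file version of sqsrvres.dll -> SQL Server release, taken from the list above.
SQSRVRES_MAJOR_TO_RELEASE = {
    10: "SQL Server 2008",
    9: "SQL Server 2005",
    8: "SQL Server 2000",
}


def release_for_dll_version(file_version):
    """Map a version string such as '10.00.xxx' to a release name, or None if unlisted."""
    major = int(file_version.split(".")[0])
    return SQSRVRES_MAJOR_TO_RELEASE.get(major)


def dll_matches_highest_release(file_version, installed_releases):
    """True if the DLL version corresponds to the highest release in installed_releases."""
    release_to_major = {name: major for major, name in SQSRVRES_MAJOR_TO_RELEASE.items()}
    highest_major = max(release_to_major[name] for name in installed_releases)
    return int(file_version.split(".")[0]) == highest_major


# A cluster hosting both 2005 and 2008 instances should carry the 10.00 DLL.
print(dll_matches_highest_release("10.00.xxx", ["SQL Server 2005", "SQL Server 2008"]))
```

Only the major part of the version is compared, since the list above leaves the build number unspecified.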
Starting with SQL Server 2005, it is possible to set a registry key that makes IsAlive log extended information to cluster.log, which helps when troubleshooting IsAlive problems.
Be aware that if you previously deployed SQL Server 2008 and later remove it, you must not remove SQL Server Native Client 10.0, or IsAlive will start failing. The sqsrvres.dll version does not roll back, so it remains directly linked to the SQL Native Client provider (for SQL Server 2005) or SQL Server Native Client 10.0 (for SQL Server 2008). Make sure these providers are not removed.
If the IsAlive check fails only on a specific node and you want to troubleshoot it, you can manually change the default Looks Alive and Is Alive settings so that SQL Server keeps trying to start on that node for a long time, which gives you time to investigate.
After that, open a command prompt using the same credentials as the cluster service account (because IsAlive runs as a child process of the cluster service) and try to connect to SQL Server using:
sqlcmd.exe -S server_name -E (for SQL Server 2005 and 2008)
osql.exe -S server_name -E (for SQL Server 2000)
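The same connection test can be scripted if you need to repeat it. Below is a minimal Python sketch, assuming sqlcmd.exe or osql.exe is on the PATH and that the script is started under the cluster service account. The function names, the timeout, and the SELECT @@VERSION query passed with -Q are illustrative choices, not a reproduction of what the IsAlive check itself runs.

```python
import subprocess


def build_connect_command(server_name, sql_major_version):
    """Return the argument list for a trusted-connection test.

    sql_major_version: 8 for SQL Server 2000, 9 for 2005, 10 for 2008.
    """
    tool = "osql.exe" if sql_major_version <= 8 else "sqlcmd.exe"
    # -S: server to connect to, -E: trusted (Windows) authentication,
    # -Q: run the given query and exit (illustrative query, see the note above)
    return [tool, "-S", server_name, "-E", "-Q", "SELECT @@VERSION"]


def try_connect(server_name, sql_major_version, timeout_seconds=30):
    """Run the connection test and return (exit_code, combined_output)."""
    command = build_connect_command(server_name, sql_major_version)
    result = subprocess.run(
        command, capture_output=True, text=True, timeout=timeout_seconds
    )
    return result.returncode, result.stdout + result.stderr


# Only prints the command; call try_connect("server_name", 10) on the node to run it.
print(build_connect_command("server_name", 10))
```

Building the command as a list avoids shell quoting problems when the server name contains a backslash, as in a named instance.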
I hope this helps you understand some techniques for troubleshooting SQL Server cluster installation problems caused by IsAlive failures.