SQL cluster cannot fail over; SQL services cannot start

Chong 866 Reputation points
2024-08-12T04:36:02.9333333+00:00

Hi,

I have a two-node SQL Server cluster that cannot fail over. The SQL Server program files and databases are located on the shared disk.

When we try to fail over the resources, they do not fail over automatically from node A to node B. We have to stop node A and manually bring all resources online on node B. After that, SQL Server starts normally.

We then tried to fail back from node B to node A. After the failover to node A, the disk is visible on node A, but the SQL Server service cannot start. We also tried stopping node B and manually bringing all resources online on node A, but the service still failed to start.

SQL.log

Thanks

SQL Server
Windows Server Clustering

1 answer

  1. MANISH RAJDOOT 0 Reputation points
    2024-08-15T06:00:31.5433333+00:00

    A SQL Server cluster that cannot fail over or start its services usually has a problem in one of four areas: network connectivity, shared storage, cluster configuration, or the SQL Server service itself.

    Initial Steps:

    1. Check Cluster Health: Verify all nodes are online and communicating. Inspect the cluster network and storage resources for issues.
    2. Review Event Logs: Examine the Windows event logs, the cluster log, and the SQL Server error log on the node where the service fails to start, looking for errors logged at the time of the failover attempt or service startup.
    3. Validate Cluster Configuration: Run cluster validation (for example, the Validate a Configuration Wizard in Failover Cluster Manager) to identify configuration issues.
    4. Check SQL Service Dependencies: Confirm that the SQL Server resource depends on its disk and network name resources, and that the SQL Server Agent resource depends on the SQL Server resource.
    5. Verify Disk Configuration: Confirm that all disk resources are online, accessible, and properly configured in the cluster.
    6. Test Network Connectivity: Verify network connectivity between cluster nodes and the SQL Server instance.
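
For step 2, the SQL Server error log marks failures with lines of the form "Error: N, Severity: N, State: N.", usually followed by a line of detail. The sketch below is one possible way to pull those entries out of a log that has already been read into a list of lines. The function name `find_errors`, the severity threshold of 16, and the sample lines are illustrative assumptions; the sample is not taken from the attached SQL.log.

```python
import re
from typing import NamedTuple

# Matches the "Error: N, Severity: N, State: N." header line in a SQL Server error log.
ERROR_LINE = re.compile(r"Error:\s*(\d+),\s*Severity:\s*(\d+),\s*State:\s*(\d+)")


class LogError(NamedTuple):
    number: int
    severity: int
    state: int
    detail: str  # the line that follows the error header, if any


def find_errors(lines, min_severity=16):
    """Return errors at or above min_severity, each paired with the next line as detail."""
    found = []
    for i, line in enumerate(lines):
        match = ERROR_LINE.search(line)
        if not match:
            continue
        number, severity, state = (int(g) for g in match.groups())
        if severity >= min_severity:
            detail = lines[i + 1].strip() if i + 1 < len(lines) else ""
            found.append(LogError(number, severity, state, detail))
    return found


# Hypothetical sample lines for illustration only, NOT from the poster's log.
SAMPLE = [
    "2024-08-12 04:30:01.10 Server      Server process ID is 1234.",
    "2024-08-12 04:30:02.45 spid9s      Error: 17204, Severity: 16, State: 1.",
    "2024-08-12 04:30:02.45 spid9s      FCB::Open failed: Could not open file (hypothetical detail).",
    "2024-08-12 04:30:03.00 Server      Error: 1234, Severity: 10, State: 1.",
]

if __name__ == "__main__":
    for err in find_errors(SAMPLE):
        print(err.number, err.severity, err.detail)
```

To run this against a real log, read the file into a list of lines first. The error log is often UTF-16 encoded, so the `encoding` argument to `open` may need adjusting.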

    Potential Issues and Solutions:

    • Network Problems: If network connectivity is disrupted, failover and service startup will fail. Check network adapters, cables, switches, and firewalls.
    • Storage Failures: Disk failures or resource group issues can prevent failover. Check disk status, rebuild failed disks, and verify resource group configuration.
    • Cluster Configuration Errors: Incorrectly configured cluster resources or dependencies can cause problems. Review cluster configuration and run validation tests.
    • SQL Server Service Issues: Issues with SQL Server service startup might be due to configuration errors, resource conflicts, or file system permissions. Check service configuration, resolve conflicts, and verify permissions.
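
For the network item above, a quick first check is whether a TCP connection to the SQL Server port can be opened from each node or from a client. The sketch below is a minimal version of that check. The helper name `can_connect` and the host name in the comment are hypothetical, and port 1433 applies only to a default instance with default settings. A successful connection shows only that the port is reachable, not that the cluster is healthy.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example call (hypothetical host name; replace with the cluster's SQL network name):
#   can_connect("sqlcluster.example.local", 1433)
```

Running the same check against the SQL network name while the role is on node A and again on node B can show whether the listener is reachable on both owners.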

    Additional Considerations:

    • Manual Failover: If automatic failover is not working, move the SQL Server role manually and note which resource fails to come online. That narrows down where the problem lies.
    • Third-Party Tools: Consider using specialized tools for cluster and SQL Server troubleshooting.
    • Support Involvement: If the issue persists, contact Microsoft SQL Server support for assistance.

    Working through these areas in order should narrow down why automatic failover does not occur and why the SQL Server service fails to start.
