SQL Server 2012 AlwaysOn High Availability and Disaster Recovery Design Patterns

This is one of the popular blog posts on sqlcat.com. Reposting (with updated links) here after sqlcat.com is de-commissioned.

 

Author: Sanjay Mishra
Contributors: Justin Erickson, Mike Weiner
Reviewers: Prem Mehra, Juergen Thomas, Steve Howard, Chuck Heinzelman, Jimmy May

 

SQL Server 2012 AlwaysOn provides flexible design choices for selecting an appropriate high availability and disaster recovery solution for your application. SQL Server AlwaysOn was developed for applications that require high uptime, need protection against failures within a data center (high availability) and adequate redundancy against data center failures (disaster recovery). https://msdn.microsoft.com/en-us/library/hh781257.aspx provides an overview of high availability and disaster recovery solutions available in SQL Server 2012 AlwaysOn.

Through working with customers who are deploying SQL Server 2012 AlwaysOn currently, we have seen the following design patterns emerge as end-to-end HA+DR solutions:

  • Using Multi-site Failover Cluster Instance (FCI) for local high availability and disaster recovery solution
  • Using Availability Groups (AG) for local high availability and disaster recovery solution
  • Using Failover Cluster Instance (FCI) for local high availability, and Availability Groups (AG) for disaster recovery solution

We expect most of the SQL Server 2012 AlwaysOn deployments to match one of these design patterns or contain slight variations.

So, how do these three design patterns compare and contrast? This blog highlights the salient features of each of these design patterns. A link to a detailed whitepaper on each of these is provided.

Multi-site Failover Cluster Instance (FCI) for HA and DR

The ability to implement a multi-site FCI as a HA and DR solution has been available in the SQL Server product for a number of previous releases, and many customers have been successfully using the solution (example: https://download.microsoft.com/download/d/9/4/d948f981-926e-40fa-a026-5bfcf076d9b9/SQLServer_HADR_QR.docx). In earlier versions of SQL Server, multi-site FCI required a stretch VLAN. SQL Server 2012 removes that requirement (along with a number of other improvements to the failover cluster instance technology) enabling multi-site FCI to be more commonly adopted as a HA and DR solution.

Multi-site FCI requires storage level replication (provided by the storage vendor) to maintain a copy of the databases at the DR site. Even though there are separate storage volumes at each site, to SQL Server, this looks like a Shared Storage solution. Other important attributes of this solution are:

  • The unit of failover for both local HA, and remote DR is SQL Server instance.
  • No requirement on specific database recovery model, as storage level replication is used for maintaining the remote copy of the data.
  • The DR copy of the data is not readable.

The whitepaper https://download.microsoft.com/download/D/2/0/D20E1C5F-72EA-4505-9F26-FEF9550EFD44/SQLServer2012_MultisiteFailoverCluster%20(2).docx provides architecture details and best practices for this solution.

Availability Group for HA and DR

Using Database Mirroring for local high availability, and combining it with Log Shipping for a disaster recovery solution is a popular deployment architecture prior to SQL Server 2012 (example: https://download.microsoft.com/download/d/9/4/d948f981-926e-40fa-a026-5bfcf076d9b9/MSIT_SAP_Data_Tier_HA_DR.docx).

With SQL Server 2012, the Database Mirroring and Log Shipping solution can be replaced with an Availability Group solution with multiple secondaries.

This is considered a non-shared storage solution, as each SQL Server in the topology has its own copy of data and does not need to share storage.  Other important attributes of this solution are:

  • The unit of failover for local HA, and DR is the Availability Group (a group of one or more databases).
  • The database is required to be in the FULL recovery model.
  • The DR replica can be utilized as an Active Secondary (Readable Secondary Replicas, Backup on Secondary Replicas).

The whitepaper https://msdn.microsoft.com/en-us/library/jj191711.aspx provides architecture details and best practices for this solution.

Failover Cluster Instance for local HA and Availability Group for DR

Using Failover Cluster Instance for local high availability, and combing it with database mirroring for a disaster recovery solution is a popular deployment architecture prior to SQL Server 2012 (example: https://msdn.microsoft.com/en-us/library/ee355221.aspx).

With SQL Server 2012, the Database Mirroring can be replaced with an Availability Group for the DR solution, while continuing to use Failover Cluster instance for local HA.

This architecture is a combined Shared Storage and Non-Shared Storage solution. Other important attributes of this solution are:

  • The unit of failover for local HA is the SQL Server instance.
  • The unit of failover for DR is the Availability Group (a group of one or more databases).
  • The database is required to be in the FULL recovery model.
  • The DR replica can be utilized as an Active Secondary (Readable Secondary Replicas, Backup on Secondary Replicas).

The whitepaper https://msdn.microsoft.com/en-us/library/jj215886.aspx provides architecture details and best practices for this solution.

Summary

A brief outline and comparison of three common HA/DR design patterns with SQL Server 2012 AlwaysOn is provided above. The detailed architecture guides on each of these design patterns are published as separate whitepapers and the links are provided above.

Comments

  • Anonymous
    February 12, 2014
    To recover the corrupt SQL database MDF file sufficiently the MS SQL Server database recovery software is the best and relaible solution which gives desirable solution.

  • Anonymous
    July 11, 2014
    I hope this will help you to recover database from disaster situation. www.sqlrecoverysoftware.net/sql-server-mdf-file

  • Anonymous
    August 07, 2014
    Sanjay, My question is specific to Multisite Failover Cluster using Win2012 hosting SQL 2012 FCI Instance. In one of your documents (SQL Server 2012 AlwaysOn: Multisite Failover Cluster Instance) you have mentioned Storage Replication in 'Figure 1'. I am confused as to why would one need Storage Replication across the site for the cluster. Can you please elaborate on this?

  • Anonymous
    July 29, 2015
    Sanjay, we have the scenario where we will deploy HA for Local Site and AG for DR. The setup will be like 2 Nodes SQL FCI in Site A which will host Instance A and 2 Node SQL FCI on Site B instance B. We want to replicate Instance A to Site B using AG and Instance B to Site A. Likewise we will have more instance for both the sites. If any node goes down for any of the instance, it will failover to the second node and if both the nodes goes down, it will switch to the AG available in the second site. Is this possible and what are the recommendation. Thanks!!!!!

  • Anonymous
    July 29, 2015
    Sanjay, we have 2 Sites, Site A & Site B. We want to implement 2 Instance, Instance A at Site A and Instance B at Site B. Site A will have 2 SQL FCI nodes and failover will work locally and AG to the Site B and same vice versa. All four servers will be part of same WFC. Will it work fine as Instance A will be active at Site A and Instance B will be active at Site B and their respective AG will be on the remote sites. Please suggest. Thanks.