Windows Server Failover Cluster on Azure IaaS VM – Part 1 (Storage)

Hello, cluster fans. This is Mario Liu, and I am a Support Escalation Engineer on the Windows High Availability team in Microsoft CSS Americas. I have good news for you: starting in April 2015, Microsoft supports Windows Server Failover Cluster (WSFC) on Azure IaaS Virtual Machines. Here is the supportability announcement for Windows Server on Azure VMs:

Microsoft server software support for Microsoft Azure virtual machines
https://support.microsoft.com/en-us/kb/2721672

The Failover Cluster feature is part of that announcement. The above knowledge base article is subject to change as more improvements for WSFC on Azure IaaS VMs are made. Please check the above link for the latest updates.

Today, I’d like to share the main differences between deploying WSFC on-premises and deploying it within Azure. First, the Azure VM operating system must be Windows Server 2008 R2, Windows Server 2012, or Windows Server 2012 R2. Please note that Windows Server 2008 R2 and Windows Server 2012 both require this hotfix to be installed.

At a high level, the Failover Cluster feature does not change inside the VM and is still a standard Server OS feature. The challenges are outside the VM and relate to Storage and Network. In this blog, I will be discussing Storage.

The biggest challenge to implementing Failover Clustering in Azure is that Azure does not provide native shared block storage to VMs, unlike on-premises environments, where Fibre Channel SAN, SAS, or iSCSI is available. That has made SQL Server AlwaysOn Availability Groups (AG) the primary use case in Azure, because SQL AG does not use shared storage. Instead, it relies on its own replication at the application layer to replicate the SQL data across the Azure IaaS VMs.

Now we have a few more options to work around the shared storage limitation, and that is how we can expand the scenarios beyond SQL AlwaysOn.

Option 1: Application-level replication for non-shared storage

Some applications leverage replication through their own means at the application layer.  SQL Server AlwaysOn Availability Groups uses this method.

Option 2: Volume-level replication for non-shared storage

In other words, third-party storage replication.

A common third-party solution is SIOS DataKeeper Cluster Edition. There are other solutions on the market; this is just one example. For more details, please check SIOS’s website:

DataKeeper Cluster Edition: Real-Time Replication of Windows Server Environments
https://us.sios.com/products/datakeeper-cluster/

Option 3: Leverage ExpressRoute to use a remote iSCSI Target as shared block storage for file-based storage from Azure IaaS VMs

ExpressRoute is an Azure-exclusive feature. It enables you to create dedicated private connections between Azure datacenters and infrastructure that’s on your premises. It provides high-throughput network connectivity, which helps ensure that disk performance is not degraded.

One of the existing examples is NetApp Private Storage (NPS).  NPS exposes an iSCSI Target via ExpressRoute with Equinix to Azure IaaS VMs.

Availability on Demand - ASR with NetApp Private Storage
https://channel9.msdn.com/Blogs/Windows-Azure/Availability-on-Demand-ASR-with-NetApp-Private-Storage

For more details about ExpressRoute, please see

ExpressRoute
https://azure.microsoft.com/en-us/services/expressroute/

Option 4 (Not supported): Use an Azure VM as iSCSI Target to provide shared storage to cluster nodes

The fourth option is similar in concept to Option 3, but it is much simpler to set up. In this option, we simply move the iSCSI target into Azure.

We do not recommend or support this option mainly due to the performance hit. However, if you'd like to set up a cluster in Azure VMs as a proof of concept, you are welcome to do so.

Note: Please limit this option to development and lab purposes, and do not use it in production.

There will be more options to present “shared storage” to Failover Clusters as new scenarios emerge in the future. We will update this blog along with the KB once new announcements become available. Once you have sorted out the storage, you’ve built the foundation of the cluster.

In my next blog, Part 2, I’ll go through the network part and the creation of a cluster.

Stay tuned and enjoy Clustering in Azure!

Mario Liu

Support Escalation Engineer

CSS Americas | WINDOWS | HIGH AVAILABILITY

Comments

  • Anonymous
    June 17, 2015
    Hi Mario, Thank you for the great post. According to the announcement, support.microsoft.com/.../2721672, there is another option for presenting "shared storage" to the cluster, and this is Azure Files. Is there a reason that you haven't included Azure Files as an option in this post? Thanks Vaggelis

  • Anonymous
    June 18, 2015
    Hello Mario, Appreciate your timely information. We are attempting to create a two-node Always On Failover Cluster Instance with SQL 2014 Standard on Azure VMs (IaaS). Please advise if we can do so using Storage Spaces for shared storage as described here: blogs.msdn.com/.../using-storage-spaces-on-an-azure-vm-cluster-for-sql-server-storage.aspx Or, must we use a third-party solution for data replication between nodes? Thank you for your quick reply. Joe

  • Anonymous
    June 25, 2015
    @Vaggelis: Azure Files is still pending SMB3 support in Azure. I'll check with the engineering team and we may update that KB announcement.

  • Anonymous
    June 25, 2015
    @Joe: If this is a SQL Always On Availability Group, you do not need to have shared storage. SQL AG uses app-level replication, which is Option 1 in this blog. The blog you mentioned is meant to resolve the IOPS limitation for Azure blob storage. It does not give you the shared storage option. You can build a storage pool separately on each node and use a third-party solution to replicate the storage pool. You can refer to these two links. The first one is a one-stop script to automate the creation of storage spaces. If you need more of a breakdown of the scripts, the second link is a good one to start with. gallery.technet.microsoft.com/.../Reviews blogs.technet.com/.../extending-sql-server-2014-alwayson-resource-group-with-storage-spaces-on-microsoft-azure.aspx

  • Anonymous
    September 30, 2015
    With the recent GA of Azure Files with SMB3 support, can you please confirm if shared disk functionality (using Azure files) is now supported in SQL Server FCI?

  • Anonymous
    October 14, 2015
    @Bobbie Couhbor: Yes you're right. SQL Server FCI is fully supported by Azure Files with SMB3. We will have more documents soon and update this blog.

  • Anonymous
    November 23, 2015
    There seems to be a conflict in the SLA of Virtual Machines and Storage. Effectively the best SLA for writes to Storage is 99.9%; however, two or more VMs configured in the same Availability Set have an SLA of 99.95% availability for "external connectivity". If I'm building a failover cluster with Azure Files as the backend storage, I'm assuming my actual SLA for the overall availability of the solution is 99.9%, since having my VMs available doesn't amount to a hill of beans if I can't write to the underlying database. What about VMs that have premium storage attached? I suppose that premium storage also falls under the 99.9% availability, so effectively I can only assume my overall SLA for the entire solution (VM+Storage) is 99.9% (assuming reads are required). Do I understand that correctly?
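
A rough way to check the arithmetic in the question above is to treat the VMs and the storage as a serial dependency: the solution is only up when both are up. Assuming the two failure modes are independent, the combined availability is the product of the two figures, which comes out slightly below the lower of the two. The sketch below is illustrative only. The function names are hypothetical, and the SLA values are the ones quoted in the comment, not a statement of current Azure commitments.

```python
def composite_availability(*sla_percentages):
    """Combined availability (in percent) of components that must ALL be up.

    Assumes independent failures and a serial dependency, so the
    individual availabilities are multiplied together.
    """
    combined = 1.0
    for sla in sla_percentages:
        combined *= sla / 100.0
    return combined * 100.0


def downtime_minutes(availability_percent, days=30):
    """Downtime budget in minutes over a period of `days` days."""
    return (1.0 - availability_percent / 100.0) * days * 24 * 60


# Figures quoted in the comment above (assumed, not verified here).
vm_sla = 99.95       # two or more VMs in the same Availability Set
storage_sla = 99.9   # storage writes

combined = composite_availability(vm_sla, storage_sla)
print(f"Combined availability: {combined:.3f}%")
print(f"Downtime budget per 30 days: {downtime_minutes(combined):.1f} minutes")
```

Under these assumptions the combined figure is about 99.85%, so 99.9% is a reasonable first approximation but a slightly optimistic one.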