Multi-Terabyte File Storage in the Cloud using DFS!
We recently had a question come to the team about DFS (Distributed File System) in Azure. The short answer: it works as documented, so simply follow the Windows documentation for setting up a DFS Namespace. There are plenty of DFS links at the bottom of this post.
The concern came up because the largest VM size currently supported, Extra Large, only allows 16 x 1 TB data disks. So if you need a 45 TB file server, what are you to do? Easy: attach as many disks to as many VMs as you need to get the space you want, and create a DFS namespace.
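To make the arithmetic concrete, here is a minimal sketch of the sizing calculation. The 16-disk and 1 TB figures come from the limits mentioned above; the function names `vms_needed` and `plan` are purely illustrative and not part of any Azure tooling.

```python
import math


def vms_needed(required_tb: float, disks_per_vm: int = 16, disk_size_tb: float = 1.0) -> int:
    """Return how many VMs are needed to reach required_tb of raw data-disk space."""
    if required_tb <= 0 or disks_per_vm <= 0 or disk_size_tb <= 0:
        raise ValueError("all sizes must be positive")
    per_vm_tb = disks_per_vm * disk_size_tb
    # Round up: a partially used VM still has to exist.
    return math.ceil(required_tb / per_vm_tb)


def plan(required_tb: float, disks_per_vm: int = 16, disk_size_tb: float = 1.0) -> dict:
    """Summarize VM count, total raw capacity, and leftover headroom."""
    count = vms_needed(required_tb, disks_per_vm, disk_size_tb)
    total_tb = count * disks_per_vm * disk_size_tb
    return {"vms": count, "total_tb": total_tb, "spare_tb": total_tb - required_tb}


if __name__ == "__main__":
    # The 45 TB example from the text, with 16 x 1 TB disks per VM.
    print(plan(45))
```

For the 45 TB case this works out to three VMs with 48 TB of raw space, leaving about 3 TB of headroom. It only counts raw disk capacity and ignores file system overhead.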
In my case I created three XL VMs with 16 drives each. Since I am lazy and did not want to share 16 separate drives on each server, I used RAID 0 across the 16 drives to get one nice large 16 TB drive and added that to the DFS Namespace. If you are worried about using RAID 0, don't be: the data disks live in Azure Storage, which already keeps multiple replicas of them. You can read the details under Locally Redundant Storage. I also created an AD controller for the purpose of testing a domain-based DFS namespace and pretty much isolated my test. In the real world I likely would have extended my on-premises domain, or created a separate domain in the cloud and set up a trust relationship. In any case, the problems you will be solving will be less about DFS and more about how you want to expose it to your user base.
If you need less than 16 TB per server, attach fewer disks or use a smaller VM. Each VM size has a different limit on the number of data disks that can be attached, so be sure to check the table in the Virtual Machines link below, under the section Defining the Size.
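As a rough sketch of that lookup, the snippet below picks the smallest VM size whose disk limit covers a per-server capacity target. Only the Extra Large limit of 16 disks is stated in this post; the other entries in `MAX_DATA_DISKS` are assumed values for illustration and should be checked against the table in the Virtual Machines link. The function name `smallest_size_for` is hypothetical.

```python
import math
from typing import Optional

# Maximum data disks per VM size.
# ASSUMPTION: only ExtraLarge = 16 comes from this post; the other
# values are illustrative and must be verified against the official table.
MAX_DATA_DISKS = {
    "ExtraSmall": 1,
    "Small": 2,
    "Medium": 4,
    "Large": 8,
    "ExtraLarge": 16,
}


def smallest_size_for(required_tb: float, disk_size_tb: float = 1.0) -> Optional[str]:
    """Return the smallest VM size that can hold required_tb, or None if none can."""
    if required_tb <= 0 or disk_size_tb <= 0:
        raise ValueError("sizes must be positive")
    disks = math.ceil(required_tb / disk_size_tb)
    # Walk the sizes from the fewest allowed disks to the most.
    for size, limit in sorted(MAX_DATA_DISKS.items(), key=lambda item: item[1]):
        if limit >= disks:
            return size
    return None


if __name__ == "__main__":
    for target_tb in (3, 6, 16, 17):
        print(target_tb, "TB ->", smallest_size_for(target_tb))
```

A result of `None` means a single server cannot hold that much, which is the point at which you spread the data across several VMs behind the namespace.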
If you want to get a domain started in the cloud, follow the instructions in the link below (SQLCAT - Create a Domain in Azure) to get a few VMs set up on the same network in the cloud. You can skip the SQL SSAS material, as you will not need it for DFS; essentially, stop when you get to step 4. I may revisit this in a later post to give you exactly what you need, minus the BI stuff.
SQLCAT - Create a Domain in Azure (Quick and Dirty Guide)
DFS Namespace Strategy
https://technet.microsoft.com/en-us/library/cc962137.aspx
DFS Namespaces and DFS Replication Overview
https://technet.microsoft.com/en-us/library/jj127250.aspx
DFS Step-by-Step Guide for Windows Server 2008
https://technet.microsoft.com/en-us/library/cc732863(v=WS.10).aspx
What's New in Distributed File System (2008 R2)
https://technet.microsoft.com/en-us/library/ee307957(v=ws.10).aspx
Guidelines for Deploying Windows Server Active Directory on Windows Azure Virtual Machines
https://msdn.microsoft.com/en-us/library/windowsazure/jj156090.aspx#BKMK_Safe
Virtual Machines (Azure)
https://msdn.microsoft.com/en-us/library/windowsazure/jj156003.aspx
Data Series: Introducing Locally Redundant Storage for Windows Azure Storage
Windows Azure Storage: A Highly Available Cloud Storage Service with Strong Consistency
https://sigops.org/sosp/sosp11/current/2011-Cascais/printable/11-calder.pdf
Comments
- Anonymous
October 04, 2013
Hi SqlShep, Thanks for the informative post. We are trying to replicate what you have done by using a standalone DFS in Azure. We were able to create a standalone DFS in an Azure VM, but we are not able to access it from a client machine with a pattern like `\\abcd.cloudapp.net\public`. We always get the error 'Network Path not found.' We are able to create normal shares and access them from a client machine using the above syntax. Are such shares only available within the network, which means we need to create a network -> domain, then VPN into the network to access DFS? Thanks, Chandermani