Why does it take so long to create a fixed size virtual hard disk?

If you have ever created a fixed-size virtual hard disk that was larger than, oh - 2GB, you probably noticed that it takes quite a while to create.  It takes so long because, when we create a new fixed-size virtual hard disk, we explicitly zero out all of the disk space that is being assigned to the new file.

Now - we could do this practically instantaneously by not zeroing out the data - but this has an interesting potential security problem.

Imagine the following situation:

  • You have a virtual machine with a bunch of confidential data running on a central server (e.g. your company payroll).
  • This virtual machine gets moved to a new physical server in response to increased work load.
  • You create a new virtual machine which is given to someone from the in-house dev team - but the virtual hard disk data was not zeroed out.
  • The developer then runs data recovery tools on their new, blank virtual machine and is able to recover data from the old payroll server (yikes!)

You see - data is never actually deleted from a disk when a file is moved or deleted (it is just dereferenced) so to avoid the above scenario - we must take the time to "do the right thing" and zero out the VHD contents.
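
To make the cost concrete, here is a minimal Python sketch of what "zeroing out" a fixed-size file means: every byte is explicitly written, so creation time scales with the size of the file. The function name and chunk size are made up for illustration. This is not how our VHD creation code is implemented, and the result is a plain zero-filled file, not a valid VHD.

```python
import os


def create_zeroed_file(path, size_bytes, chunk_size=1024 * 1024):
    """Create a file of size_bytes by explicitly writing zeros, chunk by chunk.

    Hypothetical helper for illustration only. The output is a plain
    zero-filled file, not a valid VHD.
    """
    zeros = bytes(chunk_size)  # one reusable buffer of zero bytes
    remaining = size_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(chunk_size, remaining)
            f.write(zeros[:n])  # every byte of the file is really written
            remaining -= n
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data to the disk
    return os.path.getsize(path)
```

Because every chunk goes through a real write, a file that is ten times larger takes roughly ten times longer to create on the same storage.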

Cheers,
Ben

Update: We have provided a tool to create quick, but not secure, fixed virtual hard disks. Details here.

Comments

  • Anonymous
    December 10, 2008
    It would be nice if there was a switch to bypass this for the more usual times when you are creating the fixed sized disks on a brand new server.

  • Anonymous
    December 10, 2008
    Doesn't NTFS already guarantee that sectors read from a new file will be zeroed?  I think you're just duplicating work the filesystem is doing for you.

  • Anonymous
    December 10, 2008
    Having an option to skip file zeroing out would be a great thing.

  • Anonymous
    December 10, 2008
    Kieran Walsh - We have discussed this, but the problem is that we would be providing a "do this in an insecure fashion if you know what you are doing" checkbox, which would need a heck of a lot of text to try and explain to people why you do not want to do it - and then most people would not read the text anyway :)
    Jon - Actually - a couple of folk at Microsoft have just been emailing me on this too.  NTFS will zero out a blank file for you - but it zeroes from the beginning of the file up to the point that you tried to write to - which would cause unexpected performance problems.  Alternatively it is possible to disable this behavior in NTFS (which is what I was referring to as part of "creating it quickly") which would cause the problem I highlighted above.
    Cheers, Ben

  • Anonymous
    December 11, 2008
    It's a good idea not to trust developers ;-) But this approach hurts IT pros when they build new VMs on a brand new HDD. Dmitri

  • Anonymous
    December 11, 2008
    Thanks for the reply Ben. All these replies show that it's certainly a pain point out in the field.

  • Anonymous
    December 11, 2008
    Is there a script for converting a dynamic disk to fixed?  Thanks.

  • Anonymous
    December 11, 2008
    NTFS will never return data from a previous file on disk.  That would violate the government security standards that it adheres to.  Also, NTFS on Win2K8 only allocates blocks as you use them (it has had this capability since Win2K).  So, you will not see the performance problem you're mentioning.

  • Anonymous
    December 11, 2008
    Jsheehan - When you write to a new file in a location beyond the current valid data length (VDL) NTFS will zero fill the file up to that point and extend the VDL.  This isn't a problem on small files or files with sequential data access - but it is problematic on large files that are written to non-sequentially (like a VHD).  You can disable this behavior in NTFS by using SetFileValidData to set the VDL to the logical end of the file - see: http://msdn.microsoft.com/en-us/library/aa365544(VS.85).aspx Of course this will cause the problem I mentioned above. Cheers, Ben

  • Anonymous
    December 16, 2008
    Is there any official documentation or KBs on this behavior? This doesn't mix well with SAN based thin or on-demand provisioning and I'd like to include this in a best practice doc that I'm working on. Thanks for the great post and discussion in this thread!

  • Anonymous
    December 24, 2008
    The right thing to do is to ask. If you think people should go a particular way, set a default. Instead you waste the valuable time of everyone else in the world zeroing out what isn't payroll data 99.99% of the time. This "MS knows best" is what professionals hate about MS products.

  • Anonymous
    November 13, 2013
    Is there any way to calculate the time it will take based on the fixed drive size created? Just a ball park figure?

    • Anonymous
      September 01, 2016
      Dave: Just check the Disk section of Resource Monitor - you should be able to recognize the path to the created VHD. Then you can see the transfer rate (byte/sec) and calculate from there how long it should take.
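
For a ballpark figure, the calculation described in the last reply is just the disk size divided by the observed write rate. A minimal Python sketch follows. The function name and the sample numbers (a 100 GiB disk, 80 MiB/s sustained writes) are hypothetical. Plug in the rate you actually see in Resource Monitor.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def estimate_seconds(disk_size_bytes, write_bytes_per_sec):
    """Rough creation time: total bytes to zero divided by the write rate."""
    if write_bytes_per_sec <= 0:
        raise ValueError("write rate must be positive")
    return disk_size_bytes / write_bytes_per_sec


# Hypothetical example: 100 GiB fixed disk, 80 MiB/s observed write rate.
seconds = estimate_seconds(100 * GIB, 80 * MIB)
print(f"{seconds:.0f} seconds (~{seconds / 60:.1f} minutes)")
```

This assumes the write rate stays constant for the whole operation, which may not hold on busy or shared storage, so treat the result as an estimate only.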