

Fun with Deduplication in Windows 8 Server…

One of the first things I wanted to play with in my Windows 8 lab was the new data deduplication feature. 

In my case, I decided to make a small volume and see how well it worked with VHD files.  Well, I’m happy to announce that it works pretty well!

First of all, you need to have the File and Storage Services role installed.  Make sure you drill down in there and get ‘data deduplication’ checked.

image
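If you would rather skip the wizard, the feature can also be added from PowerShell. This is a minimal sketch that assumes the feature name `FS-Data-Deduplication` used in released Windows Server 2012 builds; the name may differ in pre-release builds, so the first command lists what is actually available.

```powershell
# List matching features to confirm the name on your build.
Get-WindowsFeature -Name *Dedup*

# Install the Data Deduplication role service (assumed feature name).
Install-WindowsFeature -Name FS-Data-Deduplication

# Load the deduplication cmdlets into the current session.
Import-Module Deduplication
```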

Once you get that going, you need to configure deduplication on a volume.  It cannot be the C:\ drive.  In my test server I have a giant RAID 5 array, so I used Disk Management to peel off 100GB and created an F:\ drive that I named “DEDUP”.

You can now enable deduplication and configure the options to suit your environment.

image
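The same configuration can be scripted. A rough sketch, assuming the F: volume described above and the `Enable-DedupVolume`, `Set-DedupVolume`, and `Get-DedupVolume` cmdlets from the Deduplication module. The zero-day file age is an assumption that suits a demo, not a recommended production setting.

```powershell
# Turn on deduplication for the data volume (F: is the volume created earlier).
Enable-DedupVolume -Volume "F:"

# Demo-only assumption: treat files as eligible immediately instead of
# waiting for them to age.
Set-DedupVolume -Volume "F:" -MinimumFileAgeDays 0

# Confirm the volume is enabled and review its settings.
Get-DedupVolume -Volume "F:" | Format-List
```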

It’s important to note that deduplication is not ‘real time’.  The optimization process runs every hour, but you can force it to run manually using some simple PowerShell commands.  (This is great for demos when you want to copy a file into the directory, for example, and then immediately show the effect of dedup’ing.)

You can trigger an optimization job on demand in PowerShell using the Start-DedupJob cmdlet. For example:

PS C:\> Start-DedupJob F: -Type Optimization

You can query the progress of the job on the volume by using the Get-DedupJob cmdlet:

PS C:\> Get-DedupJob

The Get-DedupJob command shows current jobs that are running or are queued to run.

You can query the key status statistics, including the achieved savings on the volume, by using the Get-DedupStatus cmdlet:

PS C:\> Get-DedupStatus
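Put together, a demo run might look like the sketch below. It assumes the F: volume created earlier and uses the `-Wait` switch of `Start-DedupJob`, so the prompt only returns once the job has finished.

```powershell
# Queue an optimization job and block until it completes.
Start-DedupJob -Volume "F:" -Type Optimization -Wait

# Any jobs still running or queued for the volume (empty when idle).
Get-DedupJob -Volume "F:"

# Full savings statistics for the volume.
Get-DedupStatus -Volume "F:" | Format-List
```

Without `-Wait`, the job is queued in the background and you poll it with `Get-DedupJob` instead.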

In my case, with all VHD files (and a mix of Windows 7, Windows 8 Client and Server), I saw some pretty significant space savings on the dedup’d volume.

Properties on the disk show me:

image

What I actually have on the drive:

A Windows 7 VHD @ 7.4GB with 3 copies.  This would use ~22GB without dedup.

A Windows 8 Client CTP VHD @ 9.2GB

A Windows 8 Server VHD @ 9.0GB

So, in total, I would have seen ~40GB of space used without dedup.
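That estimate is simply the sum of the file sizes, with the Windows 7 VHD counted three times: 3 × 7.4 + 9.2 + 9.0 = 40.4 GB. A quick way to check this kind of sum in PowerShell, using only the sizes listed above (the variable names are illustrative):

```powershell
# Logical size of each VHD copy on the volume, in GB.
$vhdSizesGB = @(7.4, 7.4, 7.4, 9.2, 9.0)

# Space the files would occupy without deduplication.
$totalGB = ($vhdSizesGB | Measure-Object -Sum).Sum

# Round to one decimal place for display.
[math]::Round($totalGB, 1)
```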

With Windows 8 Deduplication enabled:

image

Nice job Windows Server team!

Worth noting…in a dual-boot scenario…what happens when you are in another OS and want to access that dedup’d volume?

  • Any file that was deduped by the server will not be available (you will only be able to see the file system reparse points that define the optimized file stub for the deduped file)
  • Any file that was not deduped will be available
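
From the other OS, one way to see which files are affected is to list the files that carry the reparse point attribute. A small sketch, assuming PowerShell 3.0 or later and that the volume is mounted as F:. The function name `Get-ReparsePointFile` is made up for this example, and it lists every reparse point file it finds, not only dedup stubs.

```powershell
function Get-ReparsePointFile {
    param([Parameter(Mandatory)][string]$Path)

    # -Attributes filters on file system attributes; optimized files are
    # stored as reparse points, so they show up in this listing.
    Get-ChildItem -Path $Path -Recurse -File -Attributes ReparsePoint -ErrorAction SilentlyContinue
}

# Example (hypothetical mount point): list the stubs on the dedup volume.
# Get-ReparsePointFile -Path 'F:\' | Select-Object FullName, Length
```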

Comments

  • Anonymous
    January 01, 2003
    Hey Rhiannon - that should never happen...please feel free to shoot me an email and I can get you in touch with one of the Program Managers for this feature to better understand what happened.  ken dot lince at Microsoft.com Thanks!

  • Anonymous
    June 05, 2012
    This is a really exciting feature. I'd love to see some figures on performance too. Maybe i'll just have to give it a go :)

  • Anonymous
    January 03, 2013
    Just don't have a power outage without a UPS, it'll scramble your entire dedup volume.  Microsoft's answer seems to be "don't run dedup unless you're in a clustered environment" meaning another server with the same copy of the data "just in case".  Nice try.  They shouldn't be bragging about a feature that isn't even trustworthy.  Learned this the hard way just after deploying a file server.