Windows Server Troubleshooting: Storage Data Deduplication
Data Deduplication is a role service in Windows Server 2012, 2012 R2, and 2016. It identifies and removes duplicate data without compromising data integrity.
Purpose: to store more data in less physical disk space.
Improvements in Windows Server 2016:
- Support for volume sizes up to 64 TB. Data Deduplication in Windows Server 2012 R2 does not perform well on volumes larger than 10 TB;
- Support for file sizes up to 1 TB. In Windows Server 2012 R2, very large files are not good candidates for Data Deduplication;
Data Deduplication volume requirements:
After you install the role service, you can enable Data Deduplication on a per-volume basis. Each volume must meet the following requirements:
- Volumes must not be a boot or system volume;
- Volumes can be partitioned using the master boot record (MBR) or GUID partition table (GPT) format and must be formatted using the NTFS file system (ReFS volumes are supported only from Windows Server 2019 onward);
- Volumes must be attached to the Windows Server and must appear as non-removable drives (removable media and remotely mapped drives are not supported);
- Volumes can be on shared storage, such as a Fibre Channel, iSCSI SAN, or SAS array;
- Files with extended attributes, encrypted files, files smaller than 32 KB, and reparse point files are not processed by Data Deduplication;
- Data Deduplication is not available for Windows client operating systems.
In Windows Server 2016, Data Deduplication removes duplicates transparently, without changing access semantics.
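The installation and per-volume enablement described above can be sketched in PowerShell as follows. This is a minimal example rather than a definitive procedure: the drive letter E: is an assumption chosen for illustration, and the commands must be run from an elevated session on the server.

```powershell
# Install the Data Deduplication role service and load its cmdlets.
Install-WindowsFeature -Name FS-Data-Deduplication
Import-Module Deduplication

# Check the candidate volume against the requirements above
# (file system, fixed rather than removable drive). E: is an assumed drive letter.
Get-Volume -DriveLetter E | Select-Object DriveLetter, FileSystem, DriveType, Size

# Enable Data Deduplication on the data volume (never a boot or system volume).
Enable-DedupVolume -Volume "E:"
```

Enabling a volume only marks it for processing; files are deduplicated later, when an optimization job runs on a schedule or is started manually.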
Planning a Data Deduplication deployment:
- Target deployments. Data Deduplication is designed to be applied to primary data volumes, not to logically extended volumes, and it requires no additional dedicated hardware;
- Determine which volumes are candidates for deduplication. Deduplication can be very effective at optimizing storage and reducing the amount of disk space consumed, saving 50 to 90 percent of a system's storage space when applied to the right data;
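One way to check whether a volume is a good candidate is the Deduplication Evaluation Tool, DDPEval.exe, which is placed in the System32 folder when the role service is installed. The sketch below is illustrative only: the path E:\ is an assumption, and the reported savings are an estimate whose accuracy depends on the data.

```powershell
# Estimate potential savings for an assumed data volume E:\
# before enabling Data Deduplication on it.
& "$env:SystemRoot\System32\DDPEval.exe" E:\
```

As a rough reading of the result, a volume holding 1 TB of data with an estimated savings rate of 50 percent would occupy about 0.5 TB after optimization.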
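Useful commands:
- Start-DedupJob -Type Optimization : Runs an optimization job, which deduplicates the files on the volume that meet the deduplication policy;
- Start-DedupJob -Type Scrubbing : Runs an integrity scrubbing job, which checks the deduplicated data for corruption and attempts to repair it;
- Start-DedupJob -Type GarbageCollection : Runs a garbage collection job, which reclaims space from data chunks that are no longer referenced;
- Start-DedupJob -Type Unoptimization : Runs an unoptimization job, which undoes deduplication for all optimized files on the volume;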
- Get-DedupStatus : The most commonly used cmdlet. It returns the deduplication status of volumes that have Data Deduplication metadata;
- Get-DedupVolume : Returns the volumes that are enabled for Data Deduplication, along with their settings and savings;
- Get-DedupJob : Returns the status and details of deduplication jobs that are running or queued.
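A typical troubleshooting sequence using the commands above might look like the following sketch. The drive letter E: is again an assumption, and the selected properties are only a subset of what each cmdlet returns.

```powershell
# Start an optimization job manually on the assumed volume E:
Start-DedupJob -Volume "E:" -Type Optimization

# List running and queued deduplication jobs and their progress.
Get-DedupJob

# Review the detailed deduplication status of the volume.
Get-DedupStatus -Volume "E:" | Format-List

# Summarize whether the volume is enabled and how much space is saved.
Get-DedupVolume -Volume "E:" | Select-Object Volume, Enabled, SavingsRate, SavedSpace

# Reclaim space from unreferenced chunks after large deletions.
Start-DedupJob -Volume "E:" -Type GarbageCollection
```

Jobs started this way run in the background and are queued if another job is already running on the volume, so Get-DedupJob is the usual way to confirm that a job was accepted and to follow it until completion.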