With Windows Server 2012, Microsoft has added loads of new features and functionality, and I’m going to walk through the new stuff.
Data Deduplication
Data deduplication is a technology that eliminates duplicate copies of repeating data: instead of storing the same data twice, you store it once and keep a pointer to the existing copy.
But a picture says more than a thousand words. Let's say you have a file which consists of these blocks, and block 5 repeats data that is already stored in another block.
So instead of storing that data again, we can just set up a pointer to the original data and remove the duplicated data block.
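To make the idea concrete, here is a toy sketch in PowerShell. It is not how Windows implements it (the real feature works differently in detail); the block values and the variable names $blocks, $store and $refs are made up for illustration, and I simply use the block content itself as the lookup key.

```powershell
# Toy model: a "file" made of 6 blocks, where block 5 repeats block 2.
$blocks = 'A', 'B', 'C', 'D', 'B', 'E'

# The store holds each unique block exactly once.
$store = [ordered]@{}

# For every block, keep only a reference (here: the key into the store).
$refs = foreach ($b in $blocks) {
    if (-not $store.Contains($b)) { $store[$b] = $b }
    $b
}

# The file can still be rebuilt by following the references.
$rebuilt = $refs | ForEach-Object { $store[$_] }

"{0} blocks referenced, {1} blocks stored" -f $refs.Count, $store.Count
```

The file still reads back as six blocks, but only five are actually stored.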
So if you have a large file with a lot of duplicate data, you will be able to reduce the space it uses significantly.
So let's finish up with an example using WS2012.
First off, we need to install the Data Deduplication feature.
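If you prefer PowerShell over the Add Roles and Features wizard, the same thing can be done from an elevated prompt. This is a minimal sketch, assuming you run it locally on the server.

```powershell
# Install the Data Deduplication role service (part of File and Storage Services).
Install-WindowsFeature -Name FS-Data-Deduplication

# Confirm that it shows up as installed.
Get-WindowsFeature -Name FS-Data-Deduplication
```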
After it is installed, head back to Server Manager –> File and Storage Services –> Disks
If you are working in a VM, add a new hard disk to the VM, then set up the new disk and create a new volume on it.
From here, press Configure Data Deduplication.
Then tick Enable data deduplication. By default it only deduplicates files that are older than 5 days. Set that to 0 days for the purpose of this demo.
We are going to leave it at that, and press Apply –> OK.
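The same settings can be applied from PowerShell. A minimal sketch, assuming the new volume is E: (that is the letter in my lab, yours may differ):

```powershell
# Turn on deduplication for the volume.
Enable-DedupVolume -Volume 'E:'

# Demo only: include files regardless of age instead of waiting 5 days.
Set-DedupVolume -Volume 'E:' -MinimumFileAgeDays 0

# Check the resulting settings.
Get-DedupVolume -Volume 'E:' | Format-List
```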
Now go back to the Volumes view.
You can see that deduplication is enabled for that volume, but since there is nothing on the volume yet, there is nothing to deduplicate.
So what we are going to do now for the purpose of this demo is to create multiple VHD files on that volume.
NOTE: This part is just to demonstrate how deduplication works
Open Disk Management and choose Create VHD.
Enter a name for it and place it on the deduplicated volume. (Make it 3GB)
Open Explorer to the volume and make a copy of that VHD file in the same folder.
So now you should have 2 VHD files of the same size.
If you head back to Server Manager now, you can see that the free space has decreased by 6 GB.
Go back and make one more copy of the VHD file, so you have 3 duplicate files.
The policy now covers files older than 0 days, but optimization only runs on a schedule, so we need to run a PowerShell command to start a manual job and get instant results.
Run the command Start-DedupJob -Type Optimization -Volume E: (in my case it is E:, replace the letter with the one you have).
After that, you can run the command Get-DedupJob to see the status.
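If you don't want to keep re-running the status command by hand, a small loop can do the polling. This is just a sketch: the E: drive letter and the 5 second interval are my own choices, and it assumes Get-DedupJob returns nothing once no job is queued or running on the volume.

```powershell
# Queue a manual optimization job on the demo volume.
Start-DedupJob -Type Optimization -Volume 'E:'

# Print the job status every 5 seconds until nothing is left for E:.
while ($jobs = Get-DedupJob -Volume 'E:' -ErrorAction SilentlyContinue) {
    $jobs | Out-Host
    Start-Sleep -Seconds 5
}
```

Start-DedupJob also has a -Wait switch if you would rather have the command block until the job is finished.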
Once the job is done, go back to the Volumes view in Server Manager. You can now see that Windows has saved 6 GB, which is the equivalent of 2 of the 3 VHD files.
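The savings can also be read from PowerShell instead of Server Manager. A short sketch, again assuming the volume is E::

```powershell
# Per-volume settings and savings.
Get-DedupVolume -Volume 'E:' | Format-List Volume, Enabled, SavedSpace, SavingsRate

# Result of the most recent jobs on the volume.
Get-DedupStatus -Volume 'E:' | Format-List
```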
You can also check the Event Log for deduplication events if you run into errors.
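The events can be pulled from PowerShell as well. In this sketch the log name is my assumption of where the operational events end up, so the first command lists the matching logs on your server so you can verify it.

```powershell
# Find the deduplication event logs that exist on this server.
Get-WinEvent -ListLog '*Deduplication*'

# Show the 20 most recent events (log name assumed, adjust to the list above).
Get-WinEvent -LogName 'Microsoft-Windows-Deduplication/Operational' -MaxEvents 20 |
    Format-Table TimeCreated, LevelDisplayName, Message -Wrap
```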
If you wish to undo deduplication on a volume, you can run the command
Start-DedupJob -Type Unoptimization -Volume E:
There are also other options when running from PowerShell. You can see all the deduplication cmdlets by running
Get-Command -Module Deduplication
So when should you use deduplication in a production environment?
Remember that deduplication is most useful where you have a lot of duplicate data. But Microsoft does not recommend that you use this for files that are constantly changing or for virtual machines.
For instance, a VMM library ISO share could be a good choice. You can also specify which types of files deduplication should leave alone, but in general use it for mostly static files (files that undergo little change).
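Controlling what gets optimized is done with exclusions on the volume. A minimal sketch, where the extensions and the folder path are made-up examples and E: is my demo volume:

```powershell
# Skip file types that change often (example extensions, pick your own).
Set-DedupVolume -Volume 'E:' -ExcludeFileType 'tmp', 'log'

# Skip a whole folder (hypothetical path).
Set-DedupVolume -Volume 'E:' -ExcludeFolder 'E:\Scratch'

# Review the exclusions together with the other volume settings.
Get-DedupVolume -Volume 'E:' | Format-List
```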
Great candidates for deduplication:
- Folder redirection servers
- Virtualization depot or provisioning library
- Software deployment shares
- SQL Server and Exchange Server backup volumes
Should be evaluated based on content:
- Line-of-business servers
- Static content providers
- Web servers
- High-performance computing (HPC)
Not good candidates for deduplication:
- Hyper-V hosts
- VDI VHDs
- WSUS
- Servers running SQL Server or Exchange Server
- Files approaching, or larger than, 1 TB in size
This has been part 1 of Windows Server Storage, stay tuned.