How long will it take to back up and restore Exchange data with DPM?
One of the most frequently asked, and most difficult to answer, questions about Microsoft System Center Data Protection Manager (DPM) is how long it will take to back up and restore data... As you'd expect, the answer is 'it depends'. So I thought I'd provide an example from my own experience, which you might find useful for extrapolating an estimate for your own environment...
So first let's have a quick look at my test environment...
Some other key bits of information:
- 7 Protection Groups each protecting a single Storage Group
- Custom Volumes
- 8 Day retention period
- Weekly Express Full schedule
- 15 minute incremental schedule
- 2 minute incremental offset
- No Express Full offset
So here are my results...
1. Replica Creation and Consistency Check
This test is the initial job which copies the data from Exchange Server to DPM and runs a consistency check between the Exchange Server volumes protected by DPM and the DPM volume where the corresponding data will be stored. For a definition of a consistency check please go to the following location: https://technet.microsoft.com/en-us/library/cc161653.aspx
Protection Group | Data Transferred | Change Percentage | Time to Completion (h:m:s) |
PG1 | 80,689.88MB | 100% | 01:42:29 |
PG2 | 80,788.82MB | 100% | 01:52:41 |
PG3 | 80,821.94MB | 100% | 01:43:30 |
PG4 | 80,609.82MB | 100% | 01:46:00 |
PG5 | 80,647.94MB | 100% | 01:50:26 |
PG6 | 80,131.94MB | 100% | 01:51:04 |
PG7 | 80,611.82MB | 100% | 01:44:01 |
Total Data Transferred | Bandwidth Consumption | Backup Time |
564,302.16MB | 368Mbps | 01:52:41 |
* Bandwidth consumption was calculated as an average bytes received per second for the duration of the peak transfer of data during the consistency check.
** All jobs ran in parallel and so the backup time is the time it took for the longest job to complete.
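As a quick sanity check, the totals can be recomputed from the per-group rows. This is only a sketch: the figures are copied from the replica creation table, and the names (`rows`, `to_seconds`) are my own for illustration, not anything DPM provides.

```python
# Per-protection-group results copied from the replica creation table:
# (data transferred in MB, time to completion as h:m:s).
rows = {
    "PG1": (80_689.88, "01:42:29"),
    "PG2": (80_788.82, "01:52:41"),
    "PG3": (80_821.94, "01:43:30"),
    "PG4": (80_609.82, "01:46:00"),
    "PG5": (80_647.94, "01:50:26"),
    "PG6": (80_131.94, "01:51:04"),
    "PG7": (80_611.82, "01:44:01"),
}


def to_seconds(hms):
    """Convert an h:m:s string into a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Data transferred simply adds up across the jobs.
total_mb = round(sum(mb for mb, _ in rows.values()), 2)

# The jobs ran in parallel, so the overall backup time is the time of the
# longest job, not the sum of the individual job times.
slowest_pg = max(rows, key=lambda pg: to_seconds(rows[pg][1]))
backup_time = rows[slowest_pg][1]

print(total_mb, slowest_pg, backup_time)
```

The same approach works for the other result tables: sum the data column and take the longest time.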
2. First Express Full Backup
This job takes across any changes since the original replica creation and consistency check. Have a look here for more information about the Express Full job.
Protection Group | Data Transferred | Change Percentage | Time to Completion (h:m:s) |
PG1 | 1,164.19MB | 1.4% | 00:24:43 |
PG2 | 1,155.94MB | 1.4% | 00:33:35 |
PG3 | 1,129.50MB | 1.36% | 00:25:10 |
PG4 | 267.38MB | 0.3% | 00:24:48 |
PG5 | 1,144.25MB | 1.4% | 00:33:12 |
PG6 | 286.88MB | 0.3% | 00:21:21 |
PG7 | 1,171.44MB | 1.4% | 00:33:04 |
* A consistency check had to be run previously against Protection Groups 4 and 6 which is why there was less data to transfer. The issue was resolved by the installation of the DPM Feature Pack.
Total Data Transferred | Bandwidth Consumption | Backup Time |
6,320MB | 104Mbps | 00:33:35 |
* Bandwidth consumption was calculated as an average bytes received per second for the duration of the peak transfer of data during the backup job.
3. Second Express Full Backup
This job takes across any changes since the last Express Full backup job.
Protection Group | Data Transferred | Change Percentage | Time to Completion (h:m:s) |
PG1 | 4,114.50MB | 5% | 00:26:44 |
PG2 | 3,997.31MB | 4.5% | 00:34:41 |
PG3 | 3,926.00MB | 4.5% | 00:40:28 |
PG4 | 4,123.56MB | 5% | 00:41:18 |
PG5 | 1,144.25MB | 4.5% | 00:35:36 |
PG6 | 4,019.13MB | 4.5% | 00:19:19 |
PG7 | 4,002.19MB | 5% | 00:36:56 |
Total Data Transferred | Bandwidth Consumption | Backup Time |
28,338.44MB | 240Mbps | 00:41:18 |
4. Example Incremental synchronisation job
This test is the regular incremental synchronisation which occurs by default every 15 minutes. Have a look here for more information about this type of job.
Protection Group | Data Transferred | Time to Completion (h:m:s) |
PG1 | 51.19MB | 00:01:05 |
PG2 | 56.19MB | 00:01:04 |
PG3 | 55.19MB | 00:01:05 |
PG4 | 57.19MB | 00:01:06 |
PG5 | 46.19MB | 00:01:04 |
PG6 | 40.19MB | 00:01:04 |
PG7 | 36.19MB | 00:01:07 |
Total Data Transferred | Bandwidth Consumption | Backup Time |
342.33MB | 4Mbps | 00:01:07 |
5. Database Restore
The final test I ran was to restore a single database over the top of a 'failed' database to its original location.
Protection Group | Total Data Transferred | Time to Completion (h:m:s) | Bandwidth Consumption |
PG2 | 92,176.44MB | 00:21:33 | 744Mbps |
What is quite interesting is what happens to the bandwidth consumption when an incremental backup job kicks off during the restore, as you would expect to see on a production server. The following screenshot shows how the bytes sent per second drops from around 744Mbps to about 520Mbps... So parallel backups will increase restore times. Pretty obvious I know, but interesting to see.
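To get a feel for what that drop means, here is a rough back-of-the-envelope sketch. It assumes 1MB is 10^6 bytes and 1Mbps is 10^6 bits per second (I haven't stated which convention the counters use), and that the rate stays constant for the whole restore, which it clearly doesn't in practice. The function name `transfer_seconds` is made up for this example.

```python
def transfer_seconds(megabytes, mbps):
    """Seconds needed to move `megabytes` of data at a constant `mbps`.

    Assumes 1MB = 10**6 bytes and 1Mbps = 10**6 bits per second.
    """
    return megabytes * 8 / mbps


restore_mb = 92_176.44        # data transferred in the restore test
observed = 21 * 60 + 33       # measured restore time, 00:21:33, in seconds

at_peak = transfer_seconds(restore_mb, 744)      # restore running alone
with_backup = transfer_seconds(restore_mb, 520)  # incremental job competing

print(f"at 744Mbps: {at_peak / 60:.1f} min")
print(f"at 520Mbps: {with_backup / 60:.1f} min")
print(f"observed:   {observed / 60:.1f} min")
```

Under those assumptions the measured restore time falls between the two figures, which fits a restore that ran at the higher rate for part of the time and at the lower rate while the incremental job was active.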
So what conclusions can we draw from these results?
Well, first of all, DPM seems to make pretty efficient use of the available bandwidth. At times during the consistency check and the restore, DPM was approaching the saturation point of Gigabit Ethernet...
Secondly, because DPM is only concerned with changes following the initial replica creation and consistency check, backups are fast. My express full backups were taking under 45 minutes and, because DPM is VSS based, the interruption to service was always under 2 seconds. The regular transaction log syncs were taking about a minute each time.
And of course, as we are backing up from the replica database, there is no impact on the active database and therefore on the clients.
You should also be able to get pretty decent restore rates - around 92GB in under 25 minutes is pretty fast I reckon.
Lastly, I'd like to point out that the tests were performed using the latest Feature Pack for DPM (available for download at https://www.microsoft.com/downloads/details.aspx?familyid=AD5CD1A2-9B87-4A2C-90A2-9DBAF1024310&displaylang=en) and for the duration of the testing DPM proved to be very stable...
Of course, I do have to point out that this was a test rig with nothing else going on on its isolated network. Unfortunately, it's very difficult to say exactly how DPM might perform in your environment based on these results. However, I hope the information is useful as a guide for some...
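If you do want a first-pass number for your own environment, one crude approach is to scale linearly from one of the measurements above. This sketch assumes time grows in proportion to data size, which ignores differences in disk, CPU and network, so treat the result as a starting point only. The 200,000MB database size is a made-up example and `scale_estimate` is a hypothetical helper of my own.

```python
from datetime import timedelta


def scale_estimate(your_mb, observed_mb, observed_seconds):
    """Scale an observed job time linearly to a different data size."""
    return timedelta(seconds=round(your_mb / observed_mb * observed_seconds))


# Restore test from this post: 92,176.44MB in 00:21:33.
restore_observed_mb = 92_176.44
restore_observed_seconds = 21 * 60 + 33

# Hypothetical 200,000MB database, purely for illustration.
estimate = scale_estimate(200_000, restore_observed_mb, restore_observed_seconds)
print(estimate)
```

Swapping in the replica creation or Express Full figures gives equivalent rough estimates for those job types.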
---
Doug Gowans
Comments
Anonymous
January 01, 2003
..on-the-wire compression was not turned on during my tests. As I understand it this would have a definite impact on the performance of the CPU but might improve the time to backup if you are particularly bandwidth constrained or have a bottleneck at the destination which means that performance would be improved by reducing the amount of data being thrown at once at the destination... I believe that if more than one storage group for the same mailbox server is a member of the same protection group the jobs will run serially so the Express Full backup will take longer. Mine ran in parallel.
Anonymous
September 12, 2008
In the performance settings for the protection groups, did you have over-the-wire compression turned on or off? I wonder if speeds would get better or worse with the opposite setting. Also, I wonder how performance would be affected if all the storage groups were in one protection group.