Backup solutions for Exchange 2007... (3)
Continuing from my previous posts, 'Backup solutions for Exchange 2007' and 'Backup solutions for Exchange 2007... (2)', in which I discussed some of the options for backing up standalone mailbox role servers with or without LCR enabled, I now need to move on to the options for the next design, which is as follows:
Design 3 - Two node MNS Cluster with CCR enabled
...so just to reiterate (AGAIN) " ..in the context of data protection I have to mention the importance of Service Level Agreements. Before we can even start designing a backup solution it is vital that we have a good understanding of what our recovery objectives are. We really need to try and pin down firstly whether Outlook is a critical application in terms of our ability to communicate via email (so can we use a dial-tone recovery?) and secondly whether the data held within our databases is critical to the business (so we need to plan for a standard database recovery?). If it is a yes to both then we need to understand how long our business can be without access to Outlook and our Exchange data. Ok so this is a very difficult exercise and if the business will not dictate this then we should be directing the business by coming up with a number ourselves and then obtaining their agreement on this. Once we've got a number, like (at its most basic) 4 hours to restore the full service including all data, then we have something to aim for... "
The main options are as follows: (These are the same as with the previous design but with some minor changes, again specifically when considering the use of VSS.)
- Traditional streaming backup to tape
- Traditional streaming backup to disk and then to tape
- Snapshot backups based on the Volume Shadow Copy Service
Traditional streaming backup to tape
This is the standard form of backup solution that all Exchange Administrators will be familiar with in some form or another. It is made possible through the ESE API and is supported by NTBackup and the new System Center Data Protection Manager 2007 (DPM), as well as numerous third-party products from our partners. There are a number of advantages and disadvantages to using traditional streaming backup to tape in a CCR deployment. These are summarized as follows:
Advantages | Disadvantages |
Mature technology with numerous options in terms of software and hardware | Will impact the performance of the server during the course of the backup, so needs to be considered particularly with companies providing a 24 hour service |
Can run backups against multiple storage groups concurrently (NTBackup would require multiple backup jobs to do this) | Need to be aware of its impact on IS Maintenance. Your backup window should be staggered to avoid the IS Maintenance period |
Generally simple to set up | Can be relatively slow, particularly when compared to VSS snaps or streaming backup to disk |
 | Can be relatively expensive in terms of the number of tapes that are required |
 | Full backup each night is generally recommended to be able to meet most recovery objectives (alternative could be weekly fulls and daily differentials) |
 | Often restricted to relatively small databases in order to meet our recovery SLAs |
 | Must be run against the active database only |
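To illustrate the trade-off between nightly fulls and weekly fulls with daily differentials, here is a minimal sketch of which backup sets would have to be restored for a given day. The schedule, the day numbering and the `restore_chain` function are all hypothetical and only model the general rule that a differential holds every change since the last full; this is not how any particular backup product tracks its media.

```python
def restore_chain(backups, restore_day):
    """Return the backup sets needed to restore to restore_day.

    backups is a list of (day_number, kind) tuples, where kind is
    "full" or "differential". A differential contains all changes
    since the most recent full, so only the latest one is needed.
    """
    usable = [(day, kind) for day, kind in backups if day <= restore_day]
    fulls = [day for day, kind in usable if kind == "full"]
    if not fulls:
        raise ValueError("no full backup on or before the restore day")
    last_full = max(fulls)
    diffs = [day for day, kind in usable
             if kind == "differential" and day > last_full]
    chain = [(last_full, "full")]
    if diffs:
        chain.append((max(diffs), "differential"))
    return chain


# Hypothetical week: full on day 0, differentials on days 1 to 6.
week = [(0, "full")] + [(day, "differential") for day in range(1, 7)]
print(restore_chain(week, 4))  # [(0, 'full'), (4, 'differential')]
```

With nightly fulls the chain is always a single set, which is why that schedule is simpler to recover from but uses more tape.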
Traditional streaming backup to disk and then to tape
This is very similar to the above in terms of advantages and disadvantages; the differences are that any backup is going to be faster, and therefore the impact of your backup on performance and IS Maintenance will be minimized. It is also likely that if you need to restore your database from last night it will still be on disk, and therefore offline restores to a new storage group become an option (making use of database portability). (*Be careful with public folders though, as these are not 'portable'.) A traditional restore from disk is also likely to be relatively fast, especially when compared to a restore from tape. The pros and cons are as follows:
Advantages | Disadvantages |
Mature technology with numerous options in terms of software and hardware | Will impact the performance of the server during the course of the backup, so needs to be considered particularly with companies providing a 24 hour service |
Can run backups against multiple storage groups concurrently (NTBackup would require multiple backup jobs to do this) | Need to be aware of its impact on IS Maintenance. Your backup window should be staggered to avoid the IS Maintenance period |
Generally simple to set up | Can be relatively slow, particularly when compared to VSS snaps** |
Generally faster than streaming backups to tape** | Can be relatively expensive in terms of the number of tapes that are required |
 | Full backup each night is generally recommended to be able to meet most recovery objectives (alternative could be weekly fulls and daily differentials or even incrementals) |
 | Often restricted to relatively small databases in order to meet our recovery SLAs |
 | Requires additional disk space |
 | Must be run against the active database only |
**The speed of your backup and restore will be determined by a number of factors including network, tape device, RAID type, backup software and so on. To give you an idea, MSIT used to use NTBackup to back up their Exchange 2003 data to disk and then to tape, and achieved the following:
- "Individual backup throughput per storage group can be sustained at approximately 1.2 GB per minute
- Total throughput can be sustained at approximately 4.8 GB per minute per Exchange virtual server with four concurrent backups running.
- Restore rates can be achieved in the range of 2 GB per minute for a disk-to-disk-based restoration. This throughput is achievable once the disks being written to are not under any form of production load."
This information was taken from a 'Note on IT' article.
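Figures like these can be turned into a rough sizing check against a recovery SLA. The sketch below is back-of-the-envelope arithmetic only: it assumes the quoted rates are sustained for the whole job, ignores log replay, tape mounts and operator time, and uses the 4 hour target mentioned earlier purely as an example. The function names and the 200 GB figure are illustrative, not taken from any product.

```python
def transfer_minutes(data_gb, rate_gb_per_min):
    """Minutes needed to move data_gb at a sustained rate."""
    return data_gb / rate_gb_per_min


def max_restorable_gb(sla_hours, rate_gb_per_min):
    """Largest amount of data that can be restored inside the SLA window."""
    return sla_hours * 60 * rate_gb_per_min


# Disk-to-disk restore at roughly 2 GB per minute (figure quoted above).
print(transfer_minutes(200, 2.0))   # 100.0 minutes for a 200 GB database
print(max_restorable_gb(4, 2.0))    # 480.0 GB fits in a 4 hour window

# Backup of the same 200 GB at roughly 1.2 GB per minute per storage group.
print(round(transfer_minutes(200, 1.2), 1))
```

In practice you would want to leave generous headroom under the calculated ceiling, since the quoted restore rate only holds when the target disks are not under production load.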
Snapshot backups based on the Volume Shadow Copy Service
The third option, which many administrators might not be so familiar with, is to take snapshot, 'point in time' backups of your Exchange data. Snapshots can be run against the active copy of a storage group, although continuous replication now enables us to offload snaps to the replica database. In a deployment using CCR, VSS snaps can be taken of the replica database rather than the active database, which reduces the impact of the snap on the active database and on the active node. Products such as DPM will also take care of transaction log truncation on the active database even when the snap is operating against the replica, and will 'follow' the replica database. So in the event of a failover, when the replica becomes the active, DPM will protect the formerly active database, which is now the new replica. This behaviour is configurable in DPM, so administrators can choose to override it.
Support for VSS has been in place since Exchange 2003 but, in my experience, has not been widely adopted. (Indeed NTBackup does not provide support for 'Exchange aware' snaps.) VSS allows files to be backed up while they are still open, essentially by pausing disk I/O. On an Exchange Server a read-only copy of the Exchange data is copied to disk, which will typically take a couple of seconds and will almost imperceptibly interrupt Outlook if run against the active database. There will be no impact on clients if snaps are taken against the replica database. We can take snaps every night, for example, alongside transaction log synchronisations every 15 minutes, and so will be able to restore to multiple points in time using a combination of the last snap and multiple transaction log synchronisations. A good explanation of how this works in detail can be found here. Exchange 2007 has improved support for VSS including, for example, the ability to restore VSS backups to alternative locations (database portability again), but the technology is essentially the same.
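The idea of restoring to a point in time from 'the last snap plus log synchronisations' can be sketched as follows. This is a simplified model under my own assumptions, not the algorithm used by DPM or any other product: snaps and log synchronisations are represented only by their timestamps, and `plan_restore` is a hypothetical helper. A real restore also depends on the log sequence being contiguous.

```python
from datetime import datetime, timedelta


def plan_restore(snapshots, log_syncs, target):
    """Pick the latest snap at or before target, plus the log syncs
    taken after that snap and no later than target."""
    candidates = [s for s in snapshots if s <= target]
    if not candidates:
        raise ValueError("target predates the earliest snapshot")
    base = max(candidates)
    logs = sorted(t for t in log_syncs if base < t <= target)
    return base, logs


# Hypothetical schedule: one nightly snap at 23:00, then a log
# synchronisation every 15 minutes for the following 24 hours.
snap = datetime(2007, 12, 17, 23, 0)
syncs = [snap + timedelta(minutes=15 * k) for k in range(1, 97)]

base, logs = plan_restore([snap], syncs, datetime(2007, 12, 18, 10, 20))
print(base)       # 2007-12-17 23:00:00
print(len(logs))  # 45
print(logs[-1])   # 2007-12-18 10:15:00
```

Here a request to recover to 10:20 lands on the 10:15 synchronisation, so the worst-case data loss in this model is just under 15 minutes.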
Again there are numerous partner products that can provide you with the ability to take snapshots but DPM is the product which I think will really interest administrators who want to re-evaluate their backup solution.
DPM's approach is described as follows:
"DPM uses a combination of transaction log replication and block-level synchronization in conjunction with the Exchange VSS Writer to help ensure your ability to recover Exchange Server databases. After the initial baseline copy of data, two parallel processes enable continuous data protection with integrity:
· Transaction logs are continuously synchronized to the DPM server, as often as every 15 minutes.
· An “express full” uses the Exchange Server VSS Writer to identify which blocks have changed in the entire production database, and send just the updated blocks or fragments. This provides a complete and consistent image of the data on the DPM 2007 server. DPM 2007 maintains up to 512 shadow copies of the full Exchange Server database(s) by storing only the differences between any two images.
Assuming one “express full” per week, stored as one of 512 shadow copy differentials between one week and the next, plus seven days x 24 hours x 4 (every 15 minutes), DPM 2007 provides over 344,000 data consistent recovery points for Exchange."
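The arithmetic behind the "over 344,000" figure in that quote can be checked quickly. The sketch below simply multiplies the numbers given in the quote (512 retained express fulls, one per week, with a log synchronisation every 15 minutes); the variable names are mine.

```python
EXPRESS_FULLS_RETAINED = 512      # shadow copies kept, one per week
SYNCS_PER_HOUR = 4                # one log synchronisation every 15 minutes

# Log synchronisation points in one week: 7 days x 24 hours x 4.
points_per_week = 7 * 24 * SYNCS_PER_HOUR

recovery_points = EXPRESS_FULLS_RETAINED * points_per_week

print(points_per_week)   # 672
print(recovery_points)   # 344064
```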
Using VSS in an environment with CCR obviously has a number of advantages and disadvantages:
Advantages | Disadvantages |
Backup can be offloaded to the replica database, reducing the impact on the active database and clients alike** | Recovering historical data from a point in time prior to my first snap means I need to retain my tape devices - say beyond 7 days and up to 7 years |
Might be able to eliminate or at least significantly reduce any reliance on tape based backups | If I am mandated to keep data offsite I may need to retain my tape devices or replicate my backups offsite |
Very fast backup (after the 1st) | Might require large amounts of additional disk space |
Potentially very fast recovery | Often a little more complex to design and configure |
Only one backup per storage group, but with E2K7 a 1:1 ratio of databases:storage groups is recommended and you can run multiple VSS snaps in parallel | |
Faster backup and recovery times mean that databases can be larger, so fewer servers might be required | |
IS Maintenance will not be interrupted, as snaps take far less time than traditional streaming backups | |
Aside from the first full backup there is little performance impact for clients | |
A solution like DPM means that most backups and recoveries are controlled by the messaging team and not by a separate team, which can confuse and delay recoveries** | |
**Depends on the solution that is deployed as to whether you can take advantage of this.
In an Exchange Server 2007 deployment with CCR, the solution that is the easiest to manage and potentially the least expensive is the use of VSS through something like System Center Data Protection Manager. I particularly like the fact that it can easily be managed from within the messaging team. In my experience numerous disaster recovery situations have taken longer to resolve than they should have, due to miscommunication or a lack of knowledge between the messaging teams and the teams responsible for the backup solution. Another huge advantage is that the VSS requestor can operate against the replica database and therefore not affect either the performance of the active node or the service to the user community. Taking only the changes and continuously synchronising transaction logs means not only that the administrator has numerous recovery options, but also that both snaps and restores will be very fast when compared to more traditional backup methods.
The only issue I have with VSS based snaps of this type is that it is vital, during the design phase, that administrators understand exactly why backups are being taken, by understanding exactly what the requirements of the business are. For example, if you need very long term retention of data you are probably going to need to retain tape based backups of some sort, and if you have to keep most of your tape infrastructure then do you need disk based recovery as well? Some companies will need to cater for this but others won't. The release of SCR in Exchange 2007 SP1 might also stimulate discussions over whether any form of backup is required at all.
Design 4 - Multiple two node MNS Clusters with CCR plus SCR
....to follow.
Comments
Anonymous
August 18, 2009
On 2003, streaming backup to disk achieved similar rates to MSIT. On 2007 CCR, streaming backup to disk only gets a fraction of the speed (like 50MB/min) and according to the event log, the delay is during the log file backup and deletion. Any ideas?