Exchange 2007 – Using VSS to perform an online offline database seed.

When using continuous replication in Exchange 2007, an operation that sometimes needs to be performed is a database seed.  This operation is typically performed as part of enabling replication, and infrequently it is performed as part of the process for recovering from divergence.

Seeding is most often performed with the Update-StorageGroupCopy cmdlet. During seeding, the ESE streaming backup API is used to copy the database from the source to the target.  There are times when this process fails or, for various reasons, cannot be used.  In those cases an alternate way to seed the database is needed.

One method is to perform a manual offline seeding. In this operation, the source database is dismounted, verified to be in a clean shutdown state, and then manually copied offline to the target. This can obviously be inconvenient, since the source database has to be down while the copy procedure is being performed.

Another method is to use a VSS backup of the database to seed the database copy.  You can use VSS to back up the database, and VSS to restore the database.  (Sorry – if you are using the online streaming backup API for your databases you will not be able to use these instructions).

When using an Exchange-aware VSS application, there are typically four destinations for a restore (note, your backup software may not enable all the options):

  1. Original storage group
  2. Alternate storage group
  3. Recovery storage group
  4. File system

To use the VSS backup and restore method, you would choose to restore to the file system.

The following steps outline a high level process on how to utilize a VSS backup and restore to file system to complete an online offline database seed operation.

====================================

The first step is to enable replication for the storage group.  In CCR this is handled for you automatically every time you create a database. When using SCR or LCR, this is accomplished using the Enable-StorageGroupCopy cmdlet. It is important to ensure that neither circular logging nor backups truncate any of the log files necessary to complete this process.

(SCR)
Enable-StorageGroupCopy -Identity <ServerName\StorageGroupName> -StandbyMachine <SCRTargetName> -SeedingPostponed

(LCR)
Enable-DatabaseCopy -Identity <ServerName\DatabaseName> -CopyEdbFilePath "path\database.edb"

(LCR)
Enable-StorageGroupCopy -Identity <ServerName\StorageGroupName> -CopyLogFolderPath <path> -CopySystemFolderPath <path> -SeedingPostponed

For more information on enabling SCR, please see my blog post at https://blogs.technet.com/timmcmic/archive/2009/01/22/inconsistent-results-when-enabling-standby-continuous-replication-scr-in-exchange-2007-sp1.aspx

If you have already enabled continuous replication for the storage group, proceed to the second step.

====================================

The second step is to ensure that the storage group copy is in a suspended state.  Storage group copies can be suspended either in bulk or one at a time.  The following are example commands:

(All Storage Groups)
Get-StorageGroup -Server <SourceServerName> | Suspend-StorageGroupCopy -StandbyMachine <TargetMachineName>

(Single Storage Group)
Suspend-StorageGroupCopy -Identity <ServerName\StorageGroupName> -StandbyMachine <TargetMachineName>

It is important that in the SCR environment these commands are run on both the source and target servers.  All servers should indicate a suspended status, reflecting that both Active Directory replication and the Microsoft Exchange Replication service configuration updates occurred successfully.

====================================

The third step is to note the important paths that are necessary to complete the rest of these steps. Specifically, we are interested in the storage group log file path, the system folder path and copy system folder path, and the log file prefix.  For the mailbox database we are interested in the database file path and copy database file paths.

To get all paths for all storage groups on the source, use the following command:

Get-StorageGroup -Server <ServerName> | fl Name,LogFolderPath,SystemFolderPath,CopyLogFolderPath,CopySystemFolderPath,LogFilePrefix

This will give you a formatted list of storage group names, log paths, and system paths.

To get the paths for all mailbox databases, use the following command:

Get-MailboxDatabase -Server <ServerName> | fl Name,EdbFilePath,CopyEdbFilePath

This will give you a formatted list of mailbox database names and mailbox database paths.

Here is an example of the output you can expect to see (copy path attributes will only be populated if you are utilizing LCR):

Name : Mailbox Database LCR
EdbFilePath : d:\SG1\DB1.edb
CopyEdbFilePath : d:\SG1-LCR\DB1.edb

Name : Mailbox Database CCR or SCR
EdbFilePath : d:\SG2\DB2.edb
CopyEdbFilePath :

Name : Storage Group LCR
LogFolderPath : d:\SG1
SystemFolderPath : d:\SG1
CopyLogFolderPath : d:\SG1-LCR
CopySystemFolderPath : d:\SG1-LCR
LogFilePrefix : E00

Name : Storage Group CCR or SCR
LogFolderPath : d:\SG2
SystemFolderPath : d:\SG2
CopyLogFolderPath :
CopySystemFolderPath :
LogFilePrefix : E01
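If you save this output to a text file, the paths can be pulled back out for the later steps instead of being retyped. The following is a minimal Python sketch, not part of the Exchange tooling: the function name parse_format_list is hypothetical, and it assumes each property sits on a single "Name : value" line with records separated by blank lines, as in the sample above. Values that the console has wrapped onto a second line are not handled.

```python
def parse_format_list(text):
    """Split 'Name : value' list output into one dict per blank-line-separated record."""
    records, current = [], {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the record being built, if there is one.
            if current:
                records.append(current)
                current = {}
            continue
        # Split on the first colon only, so drive letters in the value survive.
        key, sep, value = line.partition(":")
        if sep:
            current[key.strip()] = value.strip()
    if current:
        records.append(current)
    return records
```

Each returned dictionary is keyed by property name, so record["CopyEdbFilePath"] is an empty string for a CCR or SCR database and a path for an LCR database.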

====================================

The fourth step is to verify that the source log file sequence is in order.  If the source log files have been manually manipulated, or if any log file gaps are present, the seed operation fails.  This step ensures that log files are in sequence on the source machine.

To ensure that the log sequence on the source machine is in the correct order, perform the following operations:

1. Open a command prompt and navigate to the log directory of the storage group.  This path can be found from the output gathered in step 3 above.

2. Run the following eseutil command:

eseutil /ml <LogFilePrefix>

The log file prefix can be found from the output gathered in step 3.

When you run this command it will scan every log file found in the source directory.  If any gaps or errors are identified, you cannot continue with these steps.  An error on the last log file in the series is expected when the storage group is online, because the current log (Exx.log) is open for writing and cannot be scanned.  The following is sample output that you should receive for a storage group that is online.

Extensible Storage Engine Utilities for Microsoft(R) Exchange Server
Version 08.02
Copyright (C) Microsoft Corporation. All Rights Reserved.
Initiating FILE DUMP mode...

Verifying log files...
Base name: e00

Log file: d:\SG1\E0000001353.log - OK
Log file: d:\SG1\E0000001354.log - OK
Log file: d:\SG1\E0000001355.log - OK
Log file: d:\SG1\E0000001356.log - OK
Log file: d:\SG1\E0000001357.log - OK
Log file: d:\SG1\E0000001358.log - OK
Log file: d:\SG1\E0000001359.log - OK
Log file: d:\SG1\E000000135A.log - OK
Log file: d:\SG1\E000000135B.log - OK
Log file: d:\SG1\E000000135C.log - OK
Log file: d:\SG1\E000000135D.log - OK
Log file: d:\SG1\E000000135E.log - OK
Log file: d:\SG1\E000000135F.log - OK
Log file: d:\SG1\E0000001360.log - OK
Log file: d:\SG1\E0000001361.log - OK
Log file: d:\SG1\E0000001362.log - OK
Log file: d:\SG1\E0000001363.log - OK
Log file: d:\SG1\E0000001364.log - OK
Log file: d:\SG1\E0000001365.log - OK
Log file: d:\SG1\E0000001366.log - OK
Log file: d:\SG1\E0000001367.log - OK
Log file: d:\SG1\E0000001368.log - OK
Log file: d:\SG1\E0000001369.log - OK
Log file: d:\SG1\E00.log
ERROR: Cannot open log file (d:\SG1\E00.log). Error -1032.

Operation terminated with error -1032 (JET_errFileAccessDenied, Cannot access file, the file is locked or in use) after 368.625 seconds.
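With hundreds of log files, a gap is easy to miss when reading this output by eye. The following Python sketch shows one way to cross-check a saved copy of the output. It is an illustration only: the function name check_log_sequence is hypothetical, and it assumes the file names follow the pattern seen in the sample above (the log file prefix followed by eight hexadecimal digits). It does not replace reading any errors that eseutil itself reports.

```python
import re


def check_log_sequence(output, prefix):
    """Return a list of problems found in saved 'eseutil /ml' output (empty if none)."""
    # Assumed naming pattern: <prefix> + 8 hex digits + ".log", e.g. E0000001353.log
    numbered = re.compile(r"^%s([0-9A-Fa-f]{8})\.log$" % re.escape(prefix), re.IGNORECASE)
    problems, previous = [], None
    for line in output.splitlines():
        line = line.strip()
        if not line.startswith("Log file:"):
            continue
        rest = line[len("Log file:"):].strip()
        ok = rest.endswith("- OK")
        if ok:
            rest = rest[:-len("- OK")].strip()
        filename = rest.replace("\\", "/").rsplit("/", 1)[-1]
        match = numbered.match(filename)
        if not match:
            continue  # the open Exx.log carries no generation number in its name
        if not ok:
            problems.append("%s was not reported OK" % filename)
        generation = int(match.group(1), 16)
        if previous is not None and generation != previous + 1:
            problems.append("gap between generation %X and %X" % (previous, generation))
        previous = generation
    return problems
```

An empty list means every numbered log was reported OK and the generations were consecutive. The unnumbered Exx.log line is skipped on purpose, since the error on that file is expected for an online storage group.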

====================================

The fifth step is to perform a VSS backup of the database. Please consult with your backup vendor to ensure that a successful FULL backup is performed.  Please also make sure that a consistency check of the backup is performed.

====================================

The sixth step is to restore the VSS backup.  When you perform the restore, you should select the option to restore to file system.  This may require that you restore to the file system of an Exchange server, so it may be necessary to ensure that sufficient free space exists on a volume on the Exchange server where the restore will be performed.

For SCR and CCR, restore to the original server.  In our example we are restoring to the original server at x:\Restore.

For LCR we can restore to the CopyEdbFilePath.  In our example you would restore to d:\SG1-LCR.  This will prevent us from having to run a copy operation at a later time.

If multiple databases are being restored, I recommend that databases be restored individually.

At this point we now have the EDB file on the file system, and we will use it for the seeding operation.
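Because the restore needs sufficient free space on the volume, it can help to check before starting a long restore. The following Python sketch is one way to do that. The function name has_room_for_restore and the 10 percent headroom are assumptions made for illustration; the database size has to come from your own backup software or from the size of the source EDB file.

```python
import shutil


def has_room_for_restore(restore_path, database_bytes, headroom=0.10):
    """Return True if the volume holding restore_path can fit the database plus headroom."""
    # disk_usage reports total, used, and free bytes for the volume containing the path.
    free = shutil.disk_usage(restore_path).free
    return free >= database_bytes * (1 + headroom)
```

For example, has_room_for_restore(r"x:\Restore", 50 * 1024**3) asks whether a 50 GB database plus 10 percent would fit on the example restore volume. The restore folder must already exist for the check to run.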

====================================

The seventh step is to ensure that the target paths are ready to have the database moved in place.  The paths referenced in these steps can be obtained from the output gathered in step 3.

For SCR – ensure that the logFolderPath, systemFolderPath, and edbFilePath are empty on the SCR target.

For CCR – ensure that the logFolderPath, systemFolderPath, and edbFilePath are empty on the passive node.

For LCR – ensure that the copyLogFolderPath, copySystemFolderPath, and copyEdbFilePath are empty.

At this point the destination paths are empty and ready for the database to be moved.

We now need to create the directory structure where logs, system, and database files will be copied.

For SCR and CCR - create the log, system, and database folder.  In our example logs, system, and database files are located at d:\SG1.  Therefore on the SCR target or CCR passive node I would create the directory structure d:\SG1.

For LCR - we would create the copy folder.  In our example we have copies placed in d:\SG1-LCR.  Therefore on the server I would create the directory structure D:\SG1-LCR.

If you are using nested folders you need to create the entire directory structure.
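The folder preparation described in this step can be sketched in Python as follows. This is an illustration under assumptions, not a required tool: the function name prepare_target_folders is hypothetical, and it expects folder paths, so for the database pass the folder that contains the EDB file rather than the EDB file path itself. It creates any missing folders, including nested ones, and reports folders that still contain files. It never deletes anything.

```python
import os


def prepare_target_folders(paths):
    """Create each folder (with any nested parents) and return the ones that are not empty."""
    not_empty = []
    for path in paths:
        # exist_ok=True leaves an existing folder untouched instead of raising an error.
        os.makedirs(path, exist_ok=True)
        if os.listdir(path):
            not_empty.append(path)
    return not_empty
```

Run on the SCR target or CCR passive node with the example path, prepare_target_folders([r"d:\SG1"]) returns an empty list when the folder is ready. Any path it returns still holds files that need to be reviewed and cleared before the copy.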

====================================

The eighth step is to move the restored database to the target directory.  This can be accomplished in a few different ways, but I will make a recommendation below.

For CCR and SCR:

From the source server, map a drive to the drive$ share of the target.  For example, I would map the drive Y: to \\SCRTarget\d$ using our example.

Open a command prompt and navigate to the restore directory.  In our example this is X:\Restore

Use Eseutil to copy the database from the source directory to the target directory.  The sample command using our example is:

eseutil /y SG1-DB1.edb /d y:\SG1-DB1.edb

Here is the expected output from this command:

Extensible Storage Engine Utilities for Microsoft(R) Exchange Server
Version 08.02
Copyright (C) Microsoft Corporation. All Rights Reserved.

Initiating COPY FILE mode...
Source File: SG1-DB1.edb

Destination File: y:\SG1-DB1.edb

                      Copy Progress (% complete)

          0 10 20 30 40 50 60 70 80 90 100

          |----|----|----|----|----|----|----|----|----|----|

          ...................................................

Operation completed successfully in 13.281 seconds.

At this point the copy has been seeded on the target server.

For LCR, this step is not necessary as the restoration to file system was performed to the LCR location.

Information on the usage of Eseutil can be found at https://technet.microsoft.com/en-us/library/aa998249(EXCHG.80).aspx

====================================

The ninth step is to verify the health of the copied database.  We need to ensure that the database was not corrupted as a part of the copy process.

For SCR and CCR:

Log on locally to the SCR target or CCR passive node, open a command prompt, and navigate to the database directory.  In our example this would be d:\SG1.

Use Eseutil /k to perform a checksum of the database:

eseutil /k SG1-DB1.edb

The following output will be observed when the command completes:

Extensible Storage Engine Utilities for Microsoft(R) Exchange Server
Version 08.02
Copyright (C) Microsoft Corporation. All Rights Reserved.

Initiating CHECKSUM mode...
Database: SG1-DB1.edb
Temp. Database: TEMPCHKSUM3888.EDB

File: SG1-DB1.edb

                     Checksum Status (% complete)

          0 10 20 30 40 50 60 70 80 90 100

          |----|----|----|----|----|----|----|----|----|----|

          ...................................................

514 pages seen
0 bad checksums
0 correctable checksums
129 uninitialized pages
0 wrong page numbers
0x4676 highest dbtime (pgno 0x86)
65 reads performed
4 MB read
1 seconds taken
4 MB/second
2755 milliseconds used
42 milliseconds per read
78 milliseconds for the slowest read
15 milliseconds for the fastest read

Operation completed successfully in 0.140 seconds.

We are interested in ensuring that there are 0 bad checksums (the "0 bad checksums" line in the output above).

For LCR, this command should be run locally on the machine.

Open a command prompt and navigate to the copy database directory.  In our example, this would be d:\SG1-LCR.

Use Eseutil /k to perform a checksum of the database:

eseutil /k SG1-DB1.edb

The following output will be observed when the command completes:

Extensible Storage Engine Utilities for Microsoft(R) Exchange Server
Version 08.02
Copyright (C) Microsoft Corporation. All Rights Reserved.

Initiating CHECKSUM mode...
Database: SG1-DB1.edb
Temp. Database: TEMPCHKSUM3888.EDB

File: SG1-DB1.edb

                     Checksum Status (% complete)

          0 10 20 30 40 50 60 70 80 90 100

          |----|----|----|----|----|----|----|----|----|----|

          ...................................................

514 pages seen
0 bad checksums
0 correctable checksums
129 uninitialized pages
0 wrong page numbers
0x4676 highest dbtime (pgno 0x86)
65 reads performed
4 MB read
1 seconds taken
4 MB/second
2755 milliseconds used
42 milliseconds per read
78 milliseconds for the slowest read
15 milliseconds for the fastest read

Operation completed successfully in 0.140 seconds.

We are interested in ensuring that there are 0 bad checksums (the "0 bad checksums" line in the output above).
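If you redirect the Eseutil /k output to a file, the bad checksum count can be checked by a script instead of by eye. The following Python sketch assumes the summary lines look like the sample output above. The function names bad_checksum_count and copy_is_healthy are hypothetical and are not part of Eseutil.

```python
import re


def bad_checksum_count(output):
    """Return the number on the 'N bad checksums' line, or None if that line is absent."""
    match = re.search(r"^\s*(\d+)\s+bad checksums\s*$", output, re.MULTILINE)
    return int(match.group(1)) if match else None


def copy_is_healthy(output):
    """True only when zero bad checksums were reported and the run completed successfully."""
    return bad_checksum_count(output) == 0 and "Operation completed successfully" in output
```

Treat a missing summary line as a failure rather than a pass. That is why copy_is_healthy returns False when bad_checksum_count returns None.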

====================================

The last step in the process is to resume the storage group copy:

(SCR):   Get-StorageGroup -Server <SourceServerName> | Resume-StorageGroupCopy -StandbyMachine <SCRTargetName>

(CCR / LCR):   Get-StorageGroup -Server <SourceServerName> | Resume-StorageGroupCopy

(Note:  These commands resume storage group copy for all storage groups.  If you have a storage group that has copy suspended for another reason, it may be necessary to resume storage groups individually instead.)

When replication has resumed successfully, you can note the following events in the application log indicating that replication began copying log files.

Event Type: Information
Event Source: MSExchangeRepl
Event Category: Action
Event ID: 2084
Date: 3/16/2010
Time: 10:12:50 AM
User: N/A
Computer: SERVER
Description: Replication for storage group SERVER\Storage Group SCR or CCR has been resumed.

For more information, see Help and Support Center at https://go.microsoft.com/fwlink/events.asp.

Event Type: Information
Event Source: MSExchangeRepl
Event Category: Service
Event ID: 2114
Date: 3/16/2010
Time: 10:13:19 AM
User: N/A
Computer: SERVER
Description: The replication instance for storage group SERVER\Storage Group SCR or CCR has started copying transaction log files. The first log file successfully copied was generation 31201.

For more information, see Help and Support Center at https://go.microsoft.com/fwlink/events.asp.

====================================

The following are links to references from this post.

· Enable-StorageGroupCopy (https://technet.microsoft.com/en-us/library/aa996389(EXCHG.80).aspx)

· Enable-DatabaseCopy (https://technet.microsoft.com/en-us/library/aa996389(EXCHG.80).aspx)

· Suspend-StorageGroupCopy (https://technet.microsoft.com/en-us/library/aa998182(EXCHG.80).aspx)

· Get-StorageGroup (https://technet.microsoft.com/en-us/library/aa998331(EXCHG.80).aspx)

· Get-MailboxDatabase (https://technet.microsoft.com/en-us/library/bb124924(EXCHG.80).aspx)

· ESEUTIL (https://technet.microsoft.com/en-us/library/aa998249(EXCHG.80).aspx)

· Resume-StorageGroupCopy (https://technet.microsoft.com/en-us/library/bb124529(EXCHG.80).aspx)