Exchange 2010 Datacenter switchover

When Exchange 2010 was released and the DAG feature was first introduced, everybody was excited about it and looked forward to deploying it. Disaster recovery, however, can be a nightmare because it requires multiple steps. I thought that sharing the steps for a site switchover as a worked scenario would help anyone who wants to create a DR document or perform an actual site switchover.

In this blog I will only cover the Mailbox role, and I will simulate a failure of the mailbox servers in the production site that requires a full site switchover. Activating the other server roles requires additional steps that are not covered here.

We have the following organization, which contains two sites:

CLT site (production)

ADA-CLT-MBX, ADA-CLT-MBX02, ADA-CLT-HC

SEA site (DR)

ADA-SEA-MBX, ADA-SEA-HC

We are going to deploy a DAG on the three Mailbox servers and learn how to do a site switchover.

First, create a new domain called Adatum.com.

Place one DC in the CLT site and another one in the SEA site.

Install Exchange on all servers in both the CLT and SEA sites.

Follow this blog to implement the DAG:

 https://blogs.technet.com/b/winde76/archive/2011/03/23/step-by-step-create-a-database-availability-group-dag.aspx

 

DAC mode not enabled

To test the DR site switchover, we need to do the following steps:

  • Simulate a power failure in the production site by switching off all Exchange mailbox servers in the main site.
  • Stop the Cluster service on each DAG member in the second datacenter by running the following command on each member:

net stop clussvc

  • On a DAG member in the second datacenter, force a quorum start of the Cluster service by running the following command:

net start clussvc /forcequorum

 

 

  • Open the Failover Cluster Management tool and connect to the DAG's underlying cluster. Expand the cluster, and then expand Nodes. Right-click each node in the primary datacenter, select More Actions, and then select Evict.

 

  • Activate the mailbox servers in the DR site.

The quorum must be modified based on the number of DAG members in the second datacenter.

  • If there's an odd number of DAG members, change the DAG quorum model from Node and File Share Majority to Node Majority by running the following command:

    cluster <DAGName> /quorum /nodemajority

 

  • If there's an even number of DAG members, reconfigure the witness server and directory by running the following command in the Exchange Management Shell:

    Set-DatabaseAvailabilityGroup <DAGName> -WitnessServer <ServerName>

In our scenario there is an odd number of DAG members in the second datacenter (one), so we will use:

cluster loayaldag /quorum /nodemajority
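The odd/even rule above can be captured in a small helper. This is only a sketch: the function name Get-QuorumCommand and its parameters are hypothetical, and it merely returns the command text from this article instead of running it.

```powershell
# Hypothetical helper (not part of Exchange): returns the quorum command
# this article prescribes for the number of DAG members that survive
# in the second datacenter. It builds a string and changes nothing.
function Get-QuorumCommand {
    param(
        [Parameter(Mandatory = $true)][string]$DagName,
        [Parameter(Mandatory = $true)][int]$SurvivingMembers,
        [string]$WitnessServer = '<ServerName>'
    )

    if ($SurvivingMembers -lt 1) {
        throw "At least one DAG member must survive."
    }

    if ($SurvivingMembers % 2 -eq 1) {
        # Odd number of members: switch to Node Majority.
        return "cluster $DagName /quorum /nodemajority"
    }

    # Even number of members: reconfigure the witness server.
    return "Set-DatabaseAvailabilityGroup $DagName -WitnessServer $WitnessServer"
}

# Our scenario: one surviving member (ADA-SEA-MBX) in the SEA site.
Get-QuorumCommand -DagName loayaldag -SurvivingMembers 1
```

For our single surviving member this returns the same cluster command shown above.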
  

  
  • Start the Cluster service on any remaining DAG members in the second datacenter by running the following command:

net start clussvc

  • Perform server switchovers to activate the mailbox databases in the DAG by running the following command:

Get-MailboxDatabase | Move-ActiveMailboxDatabase -ActivateOnServer ADA-SEA-MBX -SkipActiveCopyChecks -SkipHealthChecks -SkipClientExperienceChecks -SkipLagChecks -MountDialOverride:BestEffort

 

 

 

If the databases are still not mounted after the above command, you can run this command:

Get-MailboxDatabase -Server <DAGMemberinSecondSite> | Mount-Database
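Before touching client access or mail flow, it can help to confirm where each database ended up. A minimal sketch, run from the Exchange Management Shell and assuming the DR server name ADA-SEA-MBX from this scenario:

```powershell
# List every database copy hosted on the DR mailbox server with its state.
# After a successful switchover the active copies should show as Mounted.
Get-MailboxDatabaseCopyStatus -Server ADA-SEA-MBX |
    Format-Table Name, Status, CopyQueueLength, ReplayQueueLength -AutoSize

# Show which server each database is currently mounted on.
Get-MailboxDatabase -Status |
    Format-Table Name, MountedOnServer, Mounted -AutoSize
```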

 

Now you need to change the OWA URL and the MX records to point to the DR Hub Transport and CAS servers.
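How the OWA URL is changed depends on your environment. In many deployments it is only a DNS change, and MX records are always changed in DNS, which is not shown here. If the external URL is also stored on the virtual directory, a sketch like the following may apply. The host name mail.adatum.com is an assumption for illustration.

```powershell
# Hypothetical external name; replace it with the URL your users actually use.
$owaUrl = 'https://mail.adatum.com/owa'

# Set the external URL on the OWA virtual directory of the DR CAS server.
Set-OwaVirtualDirectory -Identity 'ADA-SEA-HC\owa (Default Web Site)' -ExternalUrl $owaUrl

# Confirm the change.
Get-OwaVirtualDirectory -Server ADA-SEA-HC | Format-List Server, Name, ExternalUrl
```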

 

 

After power is restored in the primary site, we need to reactivate the service there:

  • Start all mailbox servers.

  • Remove the database copies from the primary site.

  • Remove the servers from the DAG.

  • Add the servers back to the DAG, either using the EMC or Add-DatabaseAvailabilityGroupServer.

  • Add the database copies again.
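The failback steps above could be scripted roughly as follows. This is a sketch, not a tested procedure: it assumes that every mailbox database should get a copy on each primary-site server again, and that old database and log files left on disk have been cleaned up before the copies are re-added.

```powershell
# Primary-site mailbox servers from this scenario.
$primaryServers = 'ADA-CLT-MBX', 'ADA-CLT-MBX02'

foreach ($server in $primaryServers) {
    # Collect the copies first, then remove each stale copy on this server.
    $copies = Get-MailboxDatabaseCopyStatus -Server $server
    foreach ($copy in $copies) {
        Remove-MailboxDatabaseCopy -Identity $copy.Name -Confirm:$false
    }

    # Take the server out of the DAG, then add it back.
    Remove-DatabaseAvailabilityGroupServer -Identity loayaldag -MailboxServer $server -Confirm:$false
    Add-DatabaseAvailabilityGroupServer -Identity loayaldag -MailboxServer $server

    # Assumption: every database gets a copy on this server again.
    $databases = Get-MailboxDatabase
    foreach ($db in $databases) {
        Add-MailboxDatabaseCopy -Identity $db.Name -MailboxServer $server
    }
}
```

The results are collected into variables before looping because nesting Exchange cmdlets inside a running pipeline can fail in a remote Exchange Management Shell session.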

 

DAC mode enabled

  • First, enable DAC mode by running:

 

Set-DatabaseAvailabilityGroup -Identity loayaldag -DatacenterActivationMode DagOnly 

 

 

  • Simulate the failure by shutting down the mailbox servers in the primary site.

The Cluster service must be stopped on each DAG member in the second datacenter. From the Exchange Management Shell:

Stop-Service ClusSvc

or, from cmd:

net stop clussvc

 

  • The DAG members in the primary datacenter must be marked as stopped. Stopped is a state of Active Manager that prevents databases from mounting. Active Manager on each server in the failed datacenter is put into this state by using the Stop-DatabaseAvailabilityGroup cmdlet. If the Mailbox servers are unavailable but Active Directory is still operating in the primary datacenter, the Stop-DatabaseAvailabilityGroup command must be run with the ConfigurationOnly parameter against all servers in this state in the primary datacenter.

In our scenario the mailbox servers are off but AD is still active, so we will use the ConfigurationOnly parameter:

 

Stop-DatabaseAvailabilityGroup -Identity loayaldag -ActiveDirectorySite CLT -ConfigurationOnly
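To check that the primary-site members are now recorded as stopped, you can inspect the DAG object. A minimal sketch, assuming the DAG name loayaldag used in this scenario:

```powershell
# Show the DAC mode and which members are recorded as started or stopped.
# After the Stop-DatabaseAvailabilityGroup command above, the CLT servers
# should appear under StoppedMailboxServers.
Get-DatabaseAvailabilityGroup -Identity loayaldag -Status |
    Format-List Name, DatacenterActivationMode, StartedMailboxServers, StoppedMailboxServers
```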

 

 

  • Complete the activation of the mailbox servers in the second datacenter as follows.

The Mailbox servers in the standby datacenter are activated by using the Restore-DatabaseAvailabilityGroup cmdlet. The Active Directory site of the standby datacenter is passed to the Restore-DatabaseAvailabilityGroup cmdlet to identify which servers to use to restore service and to configure the DAG to use an alternate witness server. If the alternate witness server wasn't previously configured, you can configure it by using the AlternateWitnessServer and AlternateWitnessDirectory parameters of the Restore-DatabaseAvailabilityGroup cmdlet.

 

 

 

Restore-DatabaseAvailabilityGroup -Identity loayaldag -ActiveDirectorySite sea -AlternateWitnessServer ada-sea-hc -AlternateWitnessDirectory c:\loayaldag

 

  •  The databases can now be activated. Depending on the specific configuration used by the organization, this may not be automatic. If the servers in the standby datacenter have an activation blocked setting, the system won't do an automatic failover from the primary datacenter to the standby datacenter of any database. If no failover restrictions are present for any of the database copies in the standby datacenter, the system will activate copies in the second datacenter assuming they are healthy. If databases are configured with an activation blocked setting that requires explicit manual action, there are two choices for action:
  1. Clear the setting that blocks activation. This will make the system return to its default behavior, which is to activate any available copy.
  2. Leave the setting unchanged and use the Move-ActiveMailboxDatabase cmdlet to complete the database activation in the second datacenter. To complete this step using the Move-ActiveMailboxDatabase cmdlet when activation blocked is set, you must explicitly identify the target of the move.

In our scenario there is no activation block, so we will go straight to activation:
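If you are not sure whether activation is blocked, you can check before moving the databases. The sketch below assumes the block, if any, is set at the server level through the DatabaseCopyAutoActivationPolicy setting. A block set on individual database copies is not covered here.

```powershell
# Check whether automatic activation is blocked on the DR mailbox server.
Get-MailboxServer -Identity ADA-SEA-MBX |
    Format-List Name, DatabaseCopyAutoActivationPolicy

# Choice 1 from the list above: clear the block so that the system
# may activate any healthy copy on this server.
Set-MailboxServer -Identity ADA-SEA-MBX -DatabaseCopyAutoActivationPolicy Unrestricted
```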

 

Get-MailboxDatabase | Move-ActiveMailboxDatabase -ActivateOnServer ADA-SEA-MBX -SkipActiveCopyChecks -SkipClientExperienceChecks -SkipHealthChecks -SkipLagChecks -MountDialOverride:BestEffort

 

  • After we fix the issues on the servers in the CLT site, we need to restore the service and add them back to the DAG:

 

  Start-DatabaseAvailabilityGroup -Identity loayaldag -ActiveDirectorySite clt

 

 

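Then run Set-DatabaseAvailabilityGroup against the DAG with no other parameters, so that the DAG uses the proper quorum model:

Set-DatabaseAvailabilityGroup -Identity loayaldag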

 

  • After the Mailbox servers in the primary datacenter have been incorporated into the DAG, they will need some time to synchronize their database copies. Depending on the nature of the failure, the length of the outage, and the actions taken by an administrator during the outage, this may require reseeding the database copies.
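One way to see which copies have caught up and which need a reseed is sketched below. It assumes that only copies in the FailedAndSuspended state need a full reseed, which may not match every failure. Repeat it for each primary-site server.

```powershell
$server = 'ADA-CLT-MBX'

# Review the state of every copy on the restored server.
$copies = Get-MailboxDatabaseCopyStatus -Server $server
$copies | Format-Table Name, Status, CopyQueueLength, ReplayQueueLength -AutoSize

# Assumption: copies that are FailedAndSuspended are reseeded from scratch.
foreach ($copy in $copies) {
    if ("$($copy.Status)" -eq 'FailedAndSuspended') {
        # -DeleteExistingFiles discards the old files for this copy before seeding.
        Update-MailboxDatabaseCopy -Identity $copy.Name -DeleteExistingFiles -Confirm:$false
    }
}
```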
  

I hope the above is clear and straightforward.