What You Need To Know About Exchange 2010 DAG Failover Behavior
Article written by Jaroslav Zikmund, Microsoft Premier Field Engineer.
In this article, I’ll provide some insight into how Database Availability Group (DAG) failover behavior changed in Microsoft Exchange Server 2010. An Exchange 2010 Mailbox server that is a DAG member uses Windows failover clustering differently than previous versions did, so administrators who are familiar with earlier versions of Exchange may have incorrect expectations.
If Exchange 2007 or 2003 runs in a failover cluster and we stop an Exchange service (for example, the Exchange Store), most people would expect the Exchange virtual server to move to another node in the cluster. Legacy clusters work this way, but what happens if we stop the Microsoft Exchange Information Store service on a DAG member? From the failover cluster's point of view, basically nothing happens. From the end user's point of view, depending on the Microsoft Outlook version and mode, Outlook will stop receiving email, freeze, or be disconnected from the server.
In an Exchange 2010 DAG environment, if an administrator or another service stops the Information Store service, the DAG will not try to fail the database over to a different node. The database is simply dismounted.
Failover behavior in Exchange 2007
All cluster resources are monitored and tested by the Cluster Resource Monitor (resrcmon.exe). If an administrator manually takes the Information Store offline using Resource Manager, the cluster manager detects this change and, depending on the resource's settings in the cluster, either restarts the resource or initiates failover.
Failover behavior in Exchange 2010
The Exchange Information Store is no longer a cluster resource, so the Cluster Resource Monitor is no longer responsible for checking the Information Store status. If an administrator stops the service, Exchange treats this as a standard administrative action and not as an error. This is similar to taking a resource offline with the cluster manager in Exchange 2007. This behavior doesn’t mean that Exchange no longer checks the service status: if the service crashes, Active Manager initiates failover, and the behavior is similar to that of Exchange 2007. For test purposes, a service crash can be simulated using the PowerShell command get-process store | kill
(for obvious reasons, this type of test should be run in a lab and not in a production environment).
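The crash test described above can be sketched as a short Exchange Management Shell sequence that records the database copy status before and after the simulated crash. This is a minimal sketch for a lab DAG only. The database name DB1 and the 30-second wait are placeholders chosen for illustration, not values from this article, and the time Active Manager needs to react will vary by environment.

```powershell
# LAB ONLY. Run from the Exchange Management Shell on the DAG member
# that currently hosts the active copy of the database under test.
# "DB1" is a hypothetical database name; substitute your own.

# 1. Record the state of every copy of the database before the test.
Get-MailboxDatabaseCopyStatus -Identity DB1 |
    Format-Table Name, Status, CopyQueueLength, ReplayQueueLength -AutoSize

# 2. Simulate a crash of the Information Store process (store.exe).
#    This is the long form of "get-process store | kill".
Get-Process store | Stop-Process -Force

# 3. Wait briefly (arbitrary value), then check the copies again.
#    After a crash, the copy on another DAG member is expected to
#    show a Status of Mounted.
Start-Sleep -Seconds 30
Get-MailboxDatabaseCopyStatus -Identity DB1 |
    Format-Table Name, Status, CopyQueueLength, ReplayQueueLength -AutoSize
```

Repeating the same before-and-after check around Stop-Service MSExchangeIS instead of step 2 is one way to observe the contrasting case, where the service is stopped as an administrative action and no failover is initiated.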
What do Exchange 2007 and 2010 failover have in common?
One aspect of failover is common to Exchange 2007 and 2010: Windows Cluster Administrator should not be used for regular Exchange administration or to initiate a failover or switchover. The Exchange Management Console or the Exchange Management Shell should be your primary tools for Exchange administration.
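As an illustration of using the Exchange Management Shell rather than Cluster Administrator, a switchover in an Exchange 2010 DAG might look like the following sketch. The names DB1 (database) and MBX2 (target DAG member) are hypothetical placeholders, and the mount dial setting shown is only one possible choice.

```powershell
# Run from the Exchange Management Shell.
# "DB1" and "MBX2" are hypothetical names; substitute your own.

# Check that the passive copy on the target server is healthy
# and that its copy and replay queues are low.
Get-MailboxDatabaseCopyStatus -Identity DB1\MBX2 |
    Format-List Name, Status, CopyQueueLength, ReplayQueueLength

# Activate the copy on MBX2. The cmdlet prompts for confirmation.
Move-ActiveMailboxDatabase DB1 -ActivateOnServer MBX2 -MountDialOverride:None

# Confirm which copy is now mounted.
Get-MailboxDatabaseCopyStatus -Identity DB1 |
    Format-Table Name, Status -AutoSize
```

Checking the copy status first is a precaution: activating a copy that is not healthy or that has a long queue may fail or lose data, depending on the mount dial setting.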