Exchange 2010 SP1: StartDagServerMaintenance.ps1 fails on databases that have only two database copies.

In Exchange 2010 Service Pack 1 we introduced several new DAG management scripts. They are located in the Scripts folder under the Exchange Server installation directory (usually C:\Program Files\Microsoft\Exchange Server\V14\Scripts).

 

One of the scripts introduced is the StartDagServerMaintenance.ps1 script. More information on this script can be found at:

https://technet.microsoft.com/en-us/library/ff625233.aspx

https://technet.microsoft.com/en-us/library/dd298065.aspx

 

When an administrator runs this script, the following actions are taken:

1) All active database copies are moved to another server in the DAG, based on the selection of the next best copy.

2) If the cluster core resources are owned by the node, they are arbitrated to a different DAG member (thereby moving the Primary Active Manager functionality to another node).

3) The DatabaseCopyAutoActivationPolicy property of the mailbox server is set to Blocked, thereby preventing the DAG member from receiving or activating database copies.

4) The individual database copies hosted on the DAG member are activation suspended.

5) The node is paused within the cluster service, preventing the cluster core resources from arbitrating to the node (and thereby preventing the node from becoming the Primary Active Manager).
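
The five actions above can be approximated by hand. The following is a minimal sketch, not the shipped script: it assumes the Exchange Management Shell on a Windows Server 2008 R2 DAG member (where the FailoverClusters module is available), and the function name and its TargetNode parameter are hypothetical. It follows the order of the list above; the script itself may sequence or guard these steps differently.

```powershell
# Sketch only. Start-ManualDagMaintenance and -TargetNode are hypothetical names.
function Start-ManualDagMaintenance {
    param(
        [Parameter(Mandatory = $true)][string]$ServerName,  # DAG member entering maintenance
        [Parameter(Mandatory = $true)][string]$TargetNode   # another DAG member
    )

    Import-Module FailoverClusters

    # 1) Move all active databases off the server (next best copy is selected).
    Move-ActiveMailboxDatabase -Server $ServerName -Confirm:$false

    # 2) Move the cluster core resources only if this node owns them.
    #    Assumes $ServerName is the short node name the cluster reports.
    $owner = (Get-ClusterGroup -Name 'Cluster Group').OwnerNode.Name
    if ($owner -eq $ServerName) {
        Move-ClusterGroup -Name 'Cluster Group' -Node $TargetNode
    }

    # 3) Block automatic activation at the server level.
    Set-MailboxServer -Identity $ServerName -DatabaseCopyAutoActivationPolicy Blocked

    # 4) Suspend activation for each copy hosted on the server.
    Get-MailboxDatabaseCopyStatus -Server $ServerName | ForEach-Object {
        Suspend-MailboxDatabaseCopy -Identity $_.Name -ActivationOnly -Confirm:$false
    }

    # 5) Pause the node in the cluster.
    Suspend-ClusterNode -Name $ServerName
}

# Example (not run here): Start-ManualDagMaintenance -ServerName 'DAG-4' -TargetNode 'DAG-3'
```

Unlike the script, this sketch performs no redundancy check before moving databases, so verify copy health first.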

 

When an administrator attempts to place a DAG member into maintenance mode and the DAG member hosts the ACTIVE copy of a database that has only two copies, the following occurs:

1)  The active database is moved to the other node hosting the passive copy (provided that copy is healthy).

2)  The command fails with the following error after the database is moved.  (In this example the mounted copy is on server DAG-4).

 

*Pre StartDagServerMaintenance*

Name              Status   CopyQueue  ReplayQueue  LastInspectedLogTime   ContentIndex
                           Length     Length                              State
----              ------   ---------  -----------  --------------------   ------------
TESTSCRIPT\DAG-4  Mounted  0          0                                   Healthy
TESTSCRIPT\DAG-3  Healthy  0          0            7/25/2011 10:17:30 AM  Healthy

*StartDagServerMaintenance*

 

[PS] C:\Program Files\Microsoft\Exchange Server\V14\Scripts>.\StartDagServerMaintenance.ps1 DAG-4
The following objects are hosted by 'DAG-4', before attempting to move them off: `n(Database='TESTSCRIPT', Reason='Copy is active'))
Write-Error : The following objects are still hosted by 'DAG-4', even after attempting to move them off: `n(Database='TESTSCRIPT', Reason='Copy is critical for redundancy according to Red Alert script'))
At C:\Program Files\Microsoft\Exchange Server\V14\Scripts\StartDagServerMaintenance.ps1:216 char:16
+ write-error <<<< ($StartDagServerMaintenance_LocalizedStrings.res_0014 -f ( PrintCriticalMailboxResourcesOutput($criticalMailboxResources)),$shortServerName) -erroraction:stop
+ CategoryInfo : NotSpecified: (:) [Write-Error], WriteErrorException
+ FullyQualifiedErrorId : Microsoft.PowerShell.Commands.WriteErrorException,Microsoft.PowerShell.Commands.WriteErrorCommand

*Post StartDagServerMaintenance*

 

Name              Status   CopyQueue  ReplayQueue  LastInspectedLogTime   ContentIndex
                           Length     Length                              State
----              ------   ---------  -----------  --------------------   ------------
TESTSCRIPT\DAG-3  Mounted  0          0                                   Healthy
TESTSCRIPT\DAG-4  Healthy  0          0            7/25/2011 10:33:57 AM  Healthy

When an administrator attempts to place a DAG member into maintenance mode and the DAG member hosts the PASSIVE copy of a database that has only two copies, the following occurs:

1) The command fails with the following error. (In this example the passive copy is on server DAG-4, so no database move takes place.)

 

*Pre StartDagServerMaintenance*

 

Name              Status   CopyQueue  ReplayQueue  LastInspectedLogTime   ContentIndex
                           Length     Length                              State
----              ------   ---------  -----------  --------------------   ------------
TESTSCRIPT\DAG-3  Mounted  0          0                                   Healthy
TESTSCRIPT\DAG-4  Healthy  0          0            7/25/2011 10:33:57 AM  Healthy

 

*StartDagServerMaintenance*

 

[PS] C:\Program Files\Microsoft\Exchange Server\V14\Scripts>.\StartDagServerMaintenance.ps1 DAG-4
The following objects are hosted by 'DAG-4', before attempting to move them off: `n(Database='TESTSCRIPT', Reason='Copy is active'))
Write-Error : The following objects are still hosted by 'DAG-4', even after attempting to move them off: `n(Database='TESTSCRIPT', Reason='Copy is critical for redundancy according to Red Alert script'))
At C:\Program Files\Microsoft\Exchange Server\V14\Scripts\StartDagServerMaintenance.ps1:216 char:16
+ write-error <<<< ($StartDagServerMaintenance_LocalizedStrings.res_0014 -f ( PrintCriticalMailboxResourcesOutput($criticalMailboxResources)),$shortServerName) -erroraction:stop
+ CategoryInfo : NotSpecified: (:) [Write-Error], WriteErrorException
+ FullyQualifiedErrorId : Microsoft.PowerShell.Commands.WriteErrorException,Microsoft.PowerShell.Commands.WriteErrorCommand

 

*Post StartDagServerMaintenance*

 

Name              Status   CopyQueue  ReplayQueue  LastInspectedLogTime   ContentIndex
                           Length     Length                              State
----              ------   ---------  -----------  --------------------   ------------
TESTSCRIPT\DAG-3  Mounted  0          0                                   Healthy
TESTSCRIPT\DAG-4  Healthy  0          0            7/25/2011 10:33:57 AM  Healthy
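
Before running the script, it may help to know which databases on the server have only two copies, since those are the ones that trigger the error shown above. The sketch below is an illustration under stated assumptions: the function name is hypothetical, and it relies only on the copy naming format visible in the output above (Database\Server). It takes a list of copy names and returns the databases that have exactly two copies, one of which is on the given server.

```powershell
# Sketch only. Find-TwoCopyDatabase is a hypothetical helper name.
function Find-TwoCopyDatabase {
    param(
        [Parameter(Mandatory = $true)][string]$ServerName,   # server going into maintenance
        [Parameter(Mandatory = $true)][string[]]$CopyName    # names like 'TESTSCRIPT\DAG-4'
    )

    # Group the copies by database name (the text before the backslash).
    $groups = @($CopyName | Group-Object { ($_ -split '\\')[0] })

    foreach ($g in $groups) {
        # Server names are the text after the backslash.
        $servers = @($g.Group | ForEach-Object { ($_ -split '\\')[1] })
        if ($g.Count -eq 2 -and $servers -contains $ServerName) {
            $g.Name
        }
    }
}

# Using the copy names from the example output above:
Find-TwoCopyDatabase -ServerName 'DAG-4' -CopyName 'TESTSCRIPT\DAG-3', 'TESTSCRIPT\DAG-4'
# -> TESTSCRIPT

# In the Exchange Management Shell the names could be gathered like this (assumption):
# $names = Get-MailboxDatabase | ForEach-Object {
#     Get-MailboxDatabaseCopyStatus -Identity $_.Name } | ForEach-Object { $_.Name }
```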

Instructions for placing a DAG member into maintenance mode manually are available in the following blog post:

https://blogs.technet.com/b/timmcmic/archive/2011/07/25/exchange-2010-sp1-startdagservermaintenance-ps1-fails-when-a-server-contains-databases-with-a-single-copy.aspx

 

After completing the manual instructions, and once maintenance mode is no longer needed, the administrator can use the StopDagServerMaintenance.ps1 script to revert the manual changes.
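
As noted in the comments on this post, the stop script resets DatabaseCopyAutoActivationPolicy to Unrestricted rather than to whatever was configured before maintenance. A wrapper along the following lines is one way to reapply a custom value afterwards. This is a sketch under assumptions: the function name and its parameters are hypothetical, and the default script path assumes the ExchangeInstallPath environment variable set by Exchange setup.

```powershell
# Sketch only. Stop-MaintenanceAndRestorePolicy and its parameters are hypothetical.
function Stop-MaintenanceAndRestorePolicy {
    param(
        [Parameter(Mandatory = $true)][string]$ServerName,
        [Parameter(Mandatory = $true)][string]$PreferredPolicy,  # e.g. 'IntrasiteOnly'
        [string]$ScriptPath = (Join-Path $env:ExchangeInstallPath 'Scripts\StopDagServerMaintenance.ps1')
    )

    # Take the server out of maintenance mode using the shipped script.
    & $ScriptPath -serverName $ServerName

    # Reapply the custom activation policy that the script reset.
    Set-MailboxServer -Identity $ServerName -DatabaseCopyAutoActivationPolicy $PreferredPolicy
}

# Example (not run here):
# Stop-MaintenanceAndRestorePolicy -ServerName 'DAG-4' -PreferredPolicy 'IntrasiteOnly'
```

The preferred policy is passed in explicitly because the server's value is already Blocked while it is in maintenance mode, so it cannot be read back at that point.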

Comments

  • Anonymous
    January 01, 2003
    @Dan: I'd like to see the output of the failure to confirm. I do not expect the design of this script to be changed. TIMMCMIC

  • Anonymous
    January 01, 2003
    @Peddy1st You are correct.  This is a switch that was added to correct this condition in SP2 RU2 (I believe). TechNet documentation is being updated to officially reflect this. TIMMCMIC

  • Anonymous
    January 01, 2003
    @Amir The script today has protections to ensure > 1 viable copy left after a node is put into maintenance mode. TIMMCMIC

  • Anonymous
    January 01, 2003
    @Dan: You are correct - the stop and start DAG server scripts reset your attributes to their defaults.  That is, if you set a mailbox server's auto activation policy to IntrasiteOnly, it will be reset to Unrestricted.  There's nothing in the script to remember what you had before. Customers with custom settings either wrap these scripts in another script that resets them to their preferred settings or, given your circumstances, abandon their use. TIMMCMIC

  • Anonymous
    January 01, 2003
    @adi You are correct.  It was pointed out to me after this that it's when there are two copies of a database in greater than a two node DAG. Interestingly enough this scenario is fixed in Exchange 2010 SP2 RU1. TIMMCMIC

  • Anonymous
    October 16, 2011
    Is there a specific reason as to why the script behaves in this way? Is it related to DCs not replicating correctly or what?

  • Anonymous
    December 30, 2011
    Tim, any chance the person who wrote the script will offer a modification of the script to work with a DAG with only 2 database replicas? FYI - we have 3 replicas, with the 3rd being on redundant DAG nodes in a second datacenter for DR purposes, and the script fails. So it looks like this fails not just for DAGs with 2 database replicas, but more specifically for DAGs with only 2 local database replicas.

  • Anonymous
    January 13, 2012
    We will have to wait until our next maintenance window to get you the output. Side note - we had originally set all four of our DAG nodes in our passive datacenter to DatabaseCopyAutoActivationPolicy : IntrasiteOnly, on top of setting our DAG to DAC mode, to help prevent database failovers/moves to the passive datacenter. We used the start and stop scripts on the nodes in our passive datacenter, which seemed to work fine (since they held a tertiary copy of the databases), but we just noticed at our last maintenance window that some databases actually migrated over to the passive datacenter, and we were stumped as to why. Apparently the StopDagServerMaintenance.ps1 script set the mailbox servers back to "Unrestricted" after our first post-DR deployment maintenance window, which allowed this to happen as a result of the Move-ActiveMailboxDatabase cmdlet in our second post-DR deployment maintenance window. Do you have any suggestions, other than completely abandoning the start and stop maintenance scripts, for how to keep our servers in the passive datacenter from being potential targets for the Move-ActiveMailboxDatabase cmdlet? It seems as if the StopDagServerMaintenance script will always set the AutoActivationPolicy back to Unrestricted, and this is not ideal in an Active/Passive datacenter DAG deployment.

  • Anonymous
    January 30, 2012
    Had the same scenario - 2 member DAG with 2 copies - and didn't get this error. We got the same error with a 3 member DAG where some databases had only 2 copies. Environment is at E2K10 SP1 RU4v2.

  • Anonymous
    February 07, 2012
    @Tim - That's excellent news, as we had almost given up and started to write our own maintenance scripts to try to automate the same steps in the current scripts but w/o the limitation discussed. Also, do you still want to see the output of the script failing in our environment, given the information on the change in SP2 RU1? Also, and I know this is always a tough question to answer, but do you have a rough time frame for when RU1 might be out? I'm not looking for a specific date, just a rough idea in months of when it might be out, so we know how long we will have to limp along running all the commands by hand. Thanks for the follow-up on this BTW.

  • Anonymous
    February 13, 2012
    Nevermind on the question of when SP1 RU1 will be out - it was released today and here is the specific KB# regarding the scripting issues: support.microsoft.com/.../2585649 Thanks again for staying on top of stuff like this Tim!

  • Anonymous
    July 07, 2014
    Greetings,

    Should this script work with Exchange 2010 SP3 when there are only two copies of the databases? (One active, one passive) Powershell gives an error when running the script. Manual switchover works fine, and suspending replication on each DB works fine.

    I am sure this worked previously, but maybe I did not have a copy set up.

    Any info would be appreciated.

  • Anonymous
    June 24, 2015
    I am also experiencing the EXACT same behavior on a fully patched Exchange 2010 environment (2010 Enterprise, SP3, Rollup 10). In this case, I have 2 Nodes, looking to setup a 3rd in another Datacenter for DR. Databases have DB copies on each node, but when I attempt to run the StartDagServerMaintenance script I get the following:
    Copy is critical for redundancy according to Red Alert script

    What's even more frustrating is even though all resources are moved to DAG2, when the server hosting DAG1 is rebooted, DAG2's Databases all dismount as well.

    This is unacceptable.

    • Anonymous
      May 08, 2016
      @John: You should most likely open a support case. I believe there may be other issues present. TIMMCMIC