Exchange 2010: Remove-DatabaseAvailabilityGroupServer -ConfigurationOnly does not evict the member from the cluster

Administrators may encounter conditions where DAG members cannot be gracefully removed from a database availability group.  For example, a member server may have encountered an unrecoverable failure or the server may need to be removed from the DAG in order to perform a server recovery.

 

To account for these and similar conditions, the Remove-DatabaseAvailabilityGroupServer -ConfigurationOnly command exists.  This command simply removes the member from the database availability group's Active Directory object.

 

Here is an example…

 

Using Get-DatabaseAvailabilityGroup -Status | fl Name,Servers,OperationalServers, the membership of the DAG can be verified.  In this example the only operational server is MBX-1, since that is the only server currently running in the DAG.

 

[PS] C:\>Get-DatabaseAvailabilityGroup DAG -status | fl name,Servers,OperationalServers

Name : DAG
Servers : {MBX-1, MBX-2}
OperationalServers : {MBX-1}

 

Using the Remove-DatabaseAvailabilityGroupServer -ConfigurationOnly command, a DAG member can be removed.

 

[PS] C:\>Remove-DatabaseAvailabilityGroupServer -Identity DAG -MailboxServer MBX-2 -ConfigurationOnly

Confirm
Are you sure you want to perform this action?
Removing Mailbox server "MBX-2" from database availability group "DAG".
[Y] Yes [A] Yes to All [N] No [L] No to All [?] Help (default is "Y"): a

 

The results of the command can be verified using Get-DatabaseAvailabilityGroup -Status | fl Name,Servers,OperationalServers:

 

[PS] C:\>Get-DatabaseAvailabilityGroup DAG -status | fl name,Servers,OperationalServers

Name : DAG
Servers : {MBX-1}
OperationalServers : {MBX-1}

 

When a server is removed from the DAG in this manner, it is not evicted from the corresponding cluster.  You can verify cluster membership using the built-in cluster commands.  Here is an example from this test:

 

( Windows 2008 / Windows 2008 R2 )

 

[PS] C:\>cluster.exe node
Listing status for all available nodes:

Node Node ID Status
-------------- ------- ---------------------
MBX-1 1 Up
MBX-2 2 Down

( Windows 2008 R2 )

[PS] C:\>Import-Module FailoverClusters

[PS] C:\>Get-ClusterNode

Name State
---- -----
mbx-1 Up
mbx-2 Down
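
The comparison between the two outputs above can also be scripted. The following is a minimal sketch, not an Exchange or FailoverClusters cmdlet: Get-StaleClusterNode is a hypothetical helper name, and it simply reports cluster node names that no longer appear in the DAG's Servers list.

```powershell
# Hypothetical helper (not part of Exchange or the FailoverClusters module):
# returns cluster node names that are no longer listed as DAG members.
function Get-StaleClusterNode {
    param(
        [string[]]$DagServers,    # names from the DAG's Servers property
        [string[]]$ClusterNodes   # names reported by Get-ClusterNode
    )
    # -notcontains compares case-insensitively, so "mbx-2" matches "MBX-2"
    $ClusterNodes | Where-Object { $DagServers -notcontains $_ }
}

# Values taken from the example output in this article
Get-StaleClusterNode -DagServers 'MBX-1' -ClusterNodes 'mbx-1','mbx-2'
# -> mbx-2
```

In a live environment the two lists would presumably come from (Get-DatabaseAvailabilityGroup DAG).Servers and Get-ClusterNode, passing the Name of each entry; that wiring is an assumption and is not shown here.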

 

In general, this issue surfaces when administrators complete a server rebuild operation and find that the rebuilt node cannot be added back because it still exists in the cluster.  Here is an example:

 

[PS] C:\>Add-DatabaseAvailabilityGroupServer -Identity DAG -MailboxServer MBX-2

WARNING: The operation wasn't successful because an error was encountered. You may find more details in log file
"C:\ExchangeSetupLogs\DagTasks\dagtask_2012-06-24_14-51-47.841_add-databaseavailabiltygroupserver.log".
A server-side database availability group administrative operation failed. Error: The operation failed. CreateCluster errors may result from incorrectly configured static addresses. Error: An error occurred while attempting a cluster operation. Error: Node mbx-2 is already joined to a cluster. [Server: MBX-1.domain.com]
+ CategoryInfo : InvalidArgument: (:) [Add-DatabaseAvailabilityGroupServer], DagTaskOperationFailedException
+ FullyQualifiedErrorId : D05F37CD,Microsoft.Exchange.Management.SystemConfigurationTasks.AddDatabaseAvailabilityGroupServer

 

 

When using Remove-DatabaseAvailabilityGroupServer -ConfigurationOnly, administrators must also remove the node from the cluster themselves.  This can be accomplished through two methods:

 

( Windows 2008 / Windows 2008 R2 )

 

Administrators may use Failover Cluster Manager.  After connecting to the cluster servicing the database availability group, expand the Nodes container.  Then right-click the node that was removed -> select More Actions -> Evict.

 


 

( Windows 2008 R2 )

[PS] C:\>Import-Module FailoverClusters

[PS] C:\>Remove-ClusterNode MBX-2

Remove-ClusterNode
Are you sure you want to evict node mbx-2?
[Y] Yes [N] No [S] Suspend [?] Help (default is "Y"): y
Remove-ClusterNode : The cluster node 'MBX-2' was evicted from the cluster, but was not fully cleaned up. Please see the Failover Clustering application event log on node MBX-2 for more information.
The RPC server is unavailable
At line:1 char:19
+ Remove-ClusterNode <<<< MBX-2
+ CategoryInfo : NotSpecified: (:) [Remove-ClusterNode], ClusterCmdletException
+ FullyQualifiedErrorId : Remove-ClusterNode,Microsoft.FailoverClusters.PowerShell.RemoveClusterNodeCommand

 

(Note:  The RPC error is expected, as the command attempts to clean up the local cluster configuration on the node but the node is not accessible.)

 

After cleaning up the cluster configuration, the administrator can run Set-DatabaseAvailabilityGroup -Identity <DAGNAME> to ensure the appropriate cluster configuration is utilized.
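
Putting the steps above together, the recovery sequence on Windows 2008 R2 might look like the sketch below. It only strings together commands already shown in this article; DAG and MBX-2 are the example names used here, and the ordering assumes the failed server has been rebuilt before step 3.

```powershell
# Sketch only: DAG and MBX-2 are the example names from this article.
# Run from the Exchange Management Shell on a surviving DAG member.
Import-Module FailoverClusters

# 1. Evict the stale node (an RPC error is expected if the node is offline)
Remove-ClusterNode MBX-2

# 2. Have Exchange re-apply the DAG's cluster configuration
Set-DatabaseAvailabilityGroup -Identity DAG

# 3. Once the server has been rebuilt, add it back to the DAG
Add-DatabaseAvailabilityGroupServer -Identity DAG -MailboxServer MBX-2

# 4. Confirm membership in both the DAG and the cluster
Get-DatabaseAvailabilityGroup DAG -Status | fl Name,Servers,OperationalServers
Get-ClusterNode
```

Running the verification commands after each step, rather than only at the end, makes it easier to see which layer (Active Directory or the cluster) still holds a stale entry.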

Comments

  • Anonymous
    April 20, 2014
    Great article! Thanks for taking the time to put this together and for sharing it with the rest of the Exchange Community!

    Best Regards,

    FT
  • Anonymous
    April 23, 2014
    Excellent Article

    I have DAG with 4 nodes on the primary site and 3 nodes on the dr site
    I need to shutdown all 4 nodes on the primary site at the same time for maintenance
    I assume that there is a way of doing that without having to use classic DAG Disaster Recovery (and having the whole DAG going down) ?

    If I use the procedure described in your article to evict the 4 servers of the primary site , will it be easy to add them back (reseed all databases... ) ?

    Thanks
  • Anonymous
    June 25, 2015
    Thanks for this, really saved my bacon!
  • Anonymous
    July 14, 2015
    Great article! Wondering if you could comment on a scenario. Take a server that's going to be down for a while, but isn't going to be otherwise rebuilt, maybe we need to wait some time for parts or something. The server is part of a 4 node DAG and in the primary data center, so for the long duration of the outage we don't have Quorum in the primary data center alone; a WAN outage could bring down production! I add another server to the DAG to get an extra vote, but that switches the quorum model to "Majority Node" effectively removing the witness vote. The issue is I still don't have Quorum in the primary data center for the duration of the outage, and would have to add yet another server to get it. There are 2 questions I gather from this:

    1) Is there any way to force the Quorum model to "Majority Node & Witness" regardless of node count?
    2) If no to number 1, does adding the second server at least give me a quick way to re-establish quorum by using Stop-DatabaseAvailabilityGroup -MailboxServer, or is there another take, like using Remove-DatabaseAvailabilityGroupServer -ConfigurationOnly as you've mentioned above?

    Let me know what you think. Thanks
  • Anonymous
    July 14, 2015
    @Steven:

    This is always a tough situation. There is no way to force the quorum model to change as Exchange enforces it and it's not supported to do so outside of Exchange.

    Stop and remove will not help much here overall, as they would just have the same effect as adding or removing the node.

    You may have to consider adding additional voter nodes.

    TIMMCMIC
  • Anonymous
    August 20, 2015
    The comment has been removed
  • Anonymous
    August 23, 2015
    @KC. ...

    Just to clarify - I have never seen an official support policy for a dormant file share witness server. I personally would not recommend this approach. There are potential issues with this - both with the Kerberos security context of the VM up to the file share witness and the paxos information potentially contained in the file share witness.

    Those that are concerned about witness redundancy should really either consider a legitimate third-site witness or keep an additional witness server available, adjusting with Set-DatabaseAvailabilityGroup as necessary.

    TIMMCMIC