Exchange Server: Fail Over and Dynamic Quorum for a Two Node DAG Cluster

Background

Most of the available information on DAG quorum behaviour is based on clusters with three or more nodes. It did not really seem to apply to a two-node DAG cluster, where things sometimes kept falling apart.

The Test Environment:

  • Two Exchange 2013 SP1 servers: NodeA, NodeB

  • One FileShare Witness: WitA

  • Two DCs

  • AD Site: Default-First-Site

  • Domain\Forest: Single

  • OS: Windows Server 2012 R2

Test Scenarios:

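Witness Stays Active

| Seq. | NodeA | NodeB (PAM\Active DBs) | Witness | Quorum | DAG | Recovery |
|------|-------|------------------------|---------|--------|-----|----------|
| 1 | Up | Up | Up | Quorum | Online | - |
| 2 | Down | Up | Up | Quorum | Online | - |
| 3 | Down | Down | Up | No Quorum | Offline | - |
| 4a | Down | Up | Up | Quorum | Online | Auto |
| 4b | Up | Down | Up | No Quorum | Offline | Manual |
| 5 | Up | Up | Up | Quorum | Online | Auto |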

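PAM Stays Active

| Seq. | NodeA | NodeB (PAM\Active DBs) | Witness | Quorum | DAG | Recovery |
|------|-------|------------------------|---------|--------|-----|----------|
| 1 | Up | Up | Up | Quorum | Online | - |
| 2 | Down | Up | Up | Quorum | Online | - |
| 3 | Down | Up | Down | No Quorum | Offline | - |
| 4a | Down | Up | Up | Quorum | Online | Auto |
| 4b | Up | Up | Down | Quorum | Online | Auto |
| 5 | Up | Up | Up | Quorum | Online | Auto |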

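PAM Takeover

| Seq. | NodeA | NodeB | Witness | Quorum | DAG | Recovery |
|------|-------|-------|---------|--------|-----|----------|
| 1 | Up | Up(PAM) | Up | Quorum | Online | - |
| 2 | Up(PAM) | Down | Up | Quorum | Online | - |
| 3 | Up(PAM) | Down | Down | No Quorum | Offline | - |
| 4a | Up(PAM) | Down | Up | Quorum | Online | Auto |
| 4b | Up(PAM) | Up | Down | Quorum | Online | Auto |
| 5 | Up | Up | Up | Quorum | Online | Auto |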

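PAM Takeover-Down

| Seq. | NodeA | NodeB | Witness | Quorum | DAG | Recovery |
|------|-------|-------|---------|--------|-----|----------|
| 1 | Up | Up(PAM) | Up | Quorum | Online | - |
| 2 | Up(PAM) | Down | Up | Quorum | Online | - |
| 3 | Up(PAM) | Down | Down | No Quorum | Offline | - |
| 4 | Down(PAM) | Down | Down | No Quorum | Offline | - |
| 5a | Up(PAM) | Down | Up | Quorum | Online | Auto |
| 5b | Up(PAM) | Up | Down | Quorum | Online | Auto |
| 5c | Down(PAM) | Up | Up | No Quorum | Offline | Manual |
| 6 | Up | Up | Up | Quorum | Online | Auto |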

 

Test Scenarios description:

  1. NodeA, NodeB and the witness are up, and we have quorum.
  2. NodeA goes down. All DBs are mounted on NodeB, the witness is up, and we still have quorum.
      1. The witness goes down. We lose quorum, NodeB loses the cluster and the DAG, and the databases dismount (PAM Stays Active, Seq. 3).
        1. NodeA comes up and forms quorum with NodeB. DBs come online.
        2. The witness comes up and forms quorum with NodeB. DBs come online.
      2. Alternatively, NodeB goes down, leaving the witness alone. We lose quorum and no DB\Mailbox server is in service (Witness Stays Active, Seq. 3).
        1. NodeB comes up and forms quorum with the witness. DBs come online.
        2. NodeA comes up while the witness is up, but quorum is not formed. DBs stay offline (manual /forcequorum required).
  3. The last server comes up, checks with the other votes, and adds itself to the quorum.
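
The sequences above can be replayed with a small model. The following Python sketch only models the outcomes observed in these tests; it is not how the cluster service is implemented. It assumes three voters (NodeA, NodeB, Witness), a requirement of two votes, and one rule inferred from the results: once quorum is lost, service resumes automatically only if the node that was running last is among the voters that are up. The names `TwoNodeDag`, `set_state`, `needs_manual_recovery` and `replay` are hypothetical, and sequences that were not tested are outside what the model can claim.

```python
from dataclasses import dataclass, field
from typing import Optional

NODES = ("NodeA", "NodeB")
VOTERS = NODES + ("Witness",)
VOTES_REQUIRED = 2  # two of three votes, as seen in the tests


@dataclass
class TwoNodeDag:
    """Toy model of the test results; not the cluster service's real logic."""
    up: set = field(default_factory=lambda: set(VOTERS))
    online: bool = True
    last_node_standing: Optional[str] = None  # node that ran alone most recently

    def set_state(self, voter: str, is_up: bool) -> bool:
        """Mark a voter up or down and return whether the DAG is online."""
        if is_up:
            self.up.add(voter)
        else:
            self.up.discard(voter)
        enough_votes = len(self.up) >= VOTES_REQUIRED
        if self.online:
            self.online = enough_votes
        else:
            # Assumed rule: after an outage, the node that was running last
            # must be present before service resumes automatically.
            last = self.last_node_standing
            self.online = enough_votes and (last is None or last in self.up)
        if self.online:
            nodes_up = [n for n in NODES if n in self.up]
            self.last_node_standing = nodes_up[0] if len(nodes_up) == 1 else None
        return self.online

    @property
    def needs_manual_recovery(self) -> bool:
        """Two votes are present but service did not resume on its own."""
        return not self.online and len(self.up) >= VOTES_REQUIRED


def replay(events):
    """Apply (voter, is_up) events to a fresh DAG and return the final model."""
    dag = TwoNodeDag()
    for voter, is_up in events:
        dag.set_state(voter, is_up)
    return dag


# "Witness Stays Active", Seq. 4b: NodeA returns while NodeB is still down.
dag = replay([("NodeA", False), ("NodeB", False), ("NodeA", True)])
print(dag.online, dag.needs_manual_recovery)
```

In this replay the model reports the DAG as offline and flags manual recovery, which matches row 4b of the Witness Stays Active table. Swapping the last event for `("NodeB", True)` brings the model back online, as in row 4a.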

Note that if DAC mode is enabled in a cross-site scenario, additional factors have to be considered for activation, such as the boot times of the PAM and the witness server. We will cover that some other time.

NOTE: We have not covered all scenarios here, as the list can go on.

 

Key Takeaways\Interesting Finds:

  1. If the PAM is the last man standing and then goes down, service is not restored automatically, even when the other node and the witness provide the required two votes. This is to avoid a tie, and because the server coming up now may have old data. It is up to the admin to decide whether to force quorum with the old data or wait until the PAM server (the last man standing, with the latest data) comes online.
  2. Dynamic Quorum does not have much of a role to play in this scenario, where we have a witness and the last two nodes. You can refer to my earlier question and understanding on this point in the forum thread listed under References.

Basically, when we have three voters, one of which is a witness, the quorum requirement freezes at two votes, irrespective of nodes\witness going down. Also, the dynamic weight of all three votes stays fixed at 1, even if servers are down.
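
The arithmetic behind that observation is small enough to write out. The sketch below assumes a plain majority rule over a fixed voter count, which is what the results suggest for this three-voter setup. The function names are hypothetical, and the sketch deliberately ignores any dynamic vote adjustment.

```python
def votes_required(total_voters: int) -> int:
    """Majority of a fixed voter count: more than half."""
    return total_voters // 2 + 1


def has_majority(votes_up: int, total_voters: int = 3) -> bool:
    """True when the voters that are up meet the fixed requirement."""
    return votes_up >= votes_required(total_voters)


# Walk through every possible number of voters that are up.
for votes_up in range(4):
    state = "Quorum" if has_majority(votes_up) else "No Quorum"
    print(f"{votes_up} of 3 up -> {state}")
```

Two votes are necessary in every row of the tables, but as takeaway 1 shows, they are not always sufficient for automatic recovery.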

 

References

How does Dynamic Quorum work for a two-node DAG:

https://social.technet.microsoft.com/Forums/office/en-US/ee5bac97-45af-4738-8753-6ef5322bc590/how-does-dynamic-quorum-work-for-a-two-node-dag?forum=exchangesvravailabilityandisasterrecovery