Another Reason to Include a Director in Your Lync Server 2010 Deployment

Update 1/1/11 - Updated post with more information on what happens with an Enterprise Edition pool and how EndpointConfiguration.cache adds a wrinkle to the whole process.

In Lync Server 2010 the Director is now a dedicated role.  Because there's an actual Director role now, you don't have to worry about users accidentally getting homed on the Director, as you did in OCS 2007 R2.  But the real reason to take another look at the Director role in Lync Server 2010 has to do with the new resiliency option, specifically the ability to associate a backup registrar pool.  The Director plays an important role in informing clients what their primary and backup registrars are.

So the natural question that comes up is "Is a Director required for the backup registrar ability to work?"  The answer is technically no, it isn't required.  The Director isn't providing any special functionality.  Just like in OCS 2007 R2, the Front End Servers have the same Registrar service/functionality as the Director.  If a user registers against a registrar that isn't their primary registrar, a 301 Redirect will be returned that includes the user's primary and backup registrars.  A Front End Server and a Director both have the ability to do this.  There is another option described below, but it has some limitations of its own.  Using a Director is the recommended and simplest way to take advantage of this functionality.

The trace below shows the response from the Director to the client during client sign in: 

12/20/2010|11:56:24.001 728:A3C INFO :: Data Received - 172.16.8.8:5061 (To Local Address: 172.16.8.7:50327) 756 bytes:
12/20/2010|11:56:24.001 728:A3C INFO :: SIP/2.0 301 Redirect request to Home Server
Authentication-Info: TLS-DSK qop="auth", opaque="E44B1AA4", srand="50E900F0", snum="1", rspauth="23003a3babd1b5a8b3b02c34372b07e66a8b0bfa", targetname="BETA-LS14-DIR.beta.deitterick.com", realm="SIP Communications Service", version=4
From: <sip:acooper@beta.deitterick.com>;tag=4787346104;epid=c68d71323c
To: <sip:acooper@beta.deitterick.com>;tag=FBF2A5131E02B962310A8078B52D77C3
Call-ID: a59c3223b23b4d398d4c77b95769e1b1
CSeq: 4 REGISTER
Via: SIP/2.0/TLS 172.16.8.7:50327;ms-received-port=50327;ms-received-cid=100
Contact: <sip:beta-ls14-se1.beta.deitterick.com:5061;transport=TLS>;q=0.7
Contact: <sip:beta-ls14-se2.beta.deitterick.com:5061;transport=TLS>;q=0.3
Expires: 2592000
Content-Length: 0

12/20/2010|11:56:24.001 728:A3C INFO :: End of Data Received - 172.16.8.8:5061 (To Local Address: 172.16.8.7:50327) 756 bytes

The two important lines from the redirect are these:

Contact: <sip:beta-ls14-se1.beta.deitterick.com:5061;transport=TLS>;q=0.7
Contact: <sip:beta-ls14-se2.beta.deitterick.com:5061;transport=TLS>;q=0.3

The Director passes back to the client both the user's primary and backup registrars.  The q= value tells you which server is which: the Contact with the higher value (q=0.7) is the primary registrar and the Contact with the lower value (q=0.3) is the backup registrar.  In this case, beta-ls14-se1.beta.deitterick.com is the primary and beta-ls14-se2.beta.deitterick.com is the backup.
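As a rough illustration of how the q= values rank the two Contact headers, the following Python sketch parses the Contact lines from the redirect above and sorts them by q-value.  The function name rank_registrars and the regular expression are assumptions made for this example, not part of Lync Server or the Lync client; the sketch only shows the ordering logic, not how the client actually processes the response.

```python
import re

# Matches a Contact header of the form seen in the 301 Redirect:
#   Contact: <sip:host:port;transport=TLS>;q=0.7
CONTACT_RE = re.compile(
    r"^Contact:\s*<sip:([^:;>]+)[^>]*>.*?;q=([0-9.]+)", re.IGNORECASE
)


def rank_registrars(response_lines):
    """Return (fqdn, q) pairs from Contact headers, highest q first."""
    contacts = []
    for line in response_lines:
        match = CONTACT_RE.match(line.strip())
        if match:
            contacts.append((match.group(1), float(match.group(2))))
    return sorted(contacts, key=lambda c: c[1], reverse=True)


# The relevant lines copied from the trace above.
redirect = [
    "SIP/2.0 301 Redirect request to Home Server",
    "Contact: <sip:beta-ls14-se1.beta.deitterick.com:5061;transport=TLS>;q=0.7",
    "Contact: <sip:beta-ls14-se2.beta.deitterick.com:5061;transport=TLS>;q=0.3",
]

ranked = rank_registrars(redirect)
primary, backup = ranked[0][0], ranked[1][0]
print("primary:", primary)
print("backup: ", backup)
```

Sorting on the q-value rather than relying on header order keeps the sketch correct even if the two Contact lines arrive in the opposite order.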

If we take the Director out of the environment and set the SRV record to point to beta-ls14-se1.beta.deitterick.com, any user that is homed on that pool won't be notified of their backup registrar.  A trace from a user homed on that server shows that no 301 Redirect is returned to the client.

This means that the client doesn't know about its backup registrar.  If the server/pool that the user is homed on is unavailable, the user will be unable to register with the backup registrar.

There are two ways to resolve this issue.  The first is to include a Director in your design.  The SRV record for automatic client log on will point to the Director and users will be returned their primary and backup registrars.  The second option is to use multiple SRV records with different priorities.  Using this method you can specify both the primary and backup registrar in DNS for automatic client log on so that in the event that the primary registrar is unavailable the client has a way to contact the backup registrar.

The only issue with the second option is that it doesn't scale very well.  For a small environment with two pools, this would work fine, but for a large environment with multiple pools in multiple data centers this may not be the most efficient option and a Director might make more sense.
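To make the second option more concrete, here is a minimal Python sketch of how two SRV records with different priorities would be ordered (per RFC 2782, the lowest priority value is tried first).  The record values are hypothetical: _sipinternaltls._tcp is the name typically used for internal automatic sign-in, but the priorities, weights, and the first_reachable helper are assumptions for illustration, and the Lync client's actual retry behavior is not modeled.

```python
from typing import Callable, List, NamedTuple, Optional


class SrvRecord(NamedTuple):
    priority: int  # lower value is tried first (RFC 2782)
    weight: int
    port: int
    target: str


# Hypothetical records for _sipinternaltls._tcp.beta.deitterick.com
records = [
    SrvRecord(priority=20, weight=0, port=5061,
              target="beta-ls14-se2.beta.deitterick.com"),
    SrvRecord(priority=10, weight=0, port=5061,
              target="beta-ls14-se1.beta.deitterick.com"),
]


def first_reachable(srv: List[SrvRecord],
                    is_up: Callable[[str], bool]) -> Optional[str]:
    """Return the first target, in priority order, that is reachable."""
    for record in sorted(srv, key=lambda r: r.priority):
        if is_up(record.target):
            return record.target
    return None


# Primary registrar up: the client lands on se1.
print(first_reachable(records, lambda host: True))
# Primary registrar down: the client falls through to se2.
print(first_reachable(records, lambda host: "se1" not in host))
```

With more pools, every primary/backup pair would need its own pair of records, which is the scaling problem described above.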

Enterprise Edition Pool

An Enterprise Edition pool behaves pretty much the same as a Standard Edition Server, with one difference: there are multiple registrars in the pool that the user can register with.  In Lync Server 2010 a user homed on an Enterprise Edition pool has one Front End Server defined as their primary registrar.  The Registrar service has also been split out into its own component, and there is no longer a shared registrar database.  Because of this, all of the endpoints for a user need to register against the same registrar server in the user's home pool.  What that all means is that depending on the Front End Server you try to register with, you may or may not get a 301 Redirect to your home server returned to you.  If you try to register with the registrar that is defined as your primary registrar, you won't be redirected.  If you try to register with any of the other Front End Servers in the pool, you'll get the 301 Redirect.  And as I mentioned above, the 301 Redirect is where the client is informed of the user's backup registrar.

So another question that comes up is "How do I figure out which Front End Server in the pool is defined as a user's primary registrar?"  The answer is pretty simple...PowerShell.

You can use the PowerShell cmdlet Get-CsUserPoolInfo to display a user's primary and backup registrars as well as the order of Front End Servers that the user will register against.

Get-CsUserPoolInfo -Identity lcarter@beta.deitterick.com

PrimaryPoolFqdn                     : ls14pool.beta.deitterick.com
BackupPoolFqdn                      : beta-ls14-se2.beta.deitterick.com
UserServicesPoolFqdn                : ls14pool.beta.deitterick.com
PrimaryPoolMachinesInPreferredOrder : {1:5-2, 1:5-1}
BackupPoolMachinesInPreferredOrder  : {1:3-1}

As you can see, the user's primary pool is ls14pool.beta.deitterick.com and their backup pool is beta-ls14-se2.beta.deitterick.com.  For their primary pool they will register against server 1:5-2 first, then 1:5-1.  That's great, but how do you match those identifiers up with the actual Front End Server FQDNs?  The answer to that is more PowerShell, of course!

If you pipe the command above to the Select-Object cmdlet, you can expand the properties of PrimaryPoolMachinesInPreferredOrder.  I also piped all of that to Format-List to pull out the important pieces of the output:

Get-CsUserPoolInfo -Identity lcarter@beta.deitterick.com | Select-Object -ExpandProperty PrimaryPoolMachinesInPreferredOrder | Format-List MachineId,Fqdn

MachineId : 1:5-2
Fqdn      : beta-ls14-ee2.beta.deitterick.com

MachineId : 1:5-1
Fqdn      : beta-ls14-ee1.beta.deitterick.com

So from this you can see that beta-ls14-ee2.beta.deitterick.com is the user's primary registrar.

Now that we know which Front End Server in the pool is the user's primary registrar, we know that if they try to register with beta-ls14-ee2.beta.deitterick.com, no 301 Redirect will be returned.  If they try to register with beta-ls14-ee1.beta.deitterick.com, a 301 Redirect will be returned and it will include the backup registrar.
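The rule described above can be sketched as a small Python model.  This is a simplification under stated assumptions: register_response is a hypothetical function, the 200 status merely stands in for "no redirect" (the real exchange also involves authentication challenges), and the q-values are simply the ones seen in the traces in this post.

```python
from typing import List, Optional, Tuple


def register_response(contacted: str, primary_order: List[str],
                      backup: Optional[str]) -> Tuple[int, List[Tuple[str, float]]]:
    """Model the registrar's answer to a single REGISTER.

    primary_order mirrors PrimaryPoolMachinesInPreferredOrder; its first
    entry is treated as the user's primary registrar.
    """
    primary = primary_order[0]
    if contacted.lower() == primary.lower():
        return 200, []  # already at the primary registrar, no redirect
    contacts = [(primary, 0.7)]
    if backup:
        contacts.append((backup, 0.3))
    return 301, contacts


order = ["beta-ls14-ee2.beta.deitterick.com", "beta-ls14-ee1.beta.deitterick.com"]
backup_pool = "beta-ls14-se2.beta.deitterick.com"

# Contacting the primary registrar: no redirect, so no backup is learned.
print(register_response("beta-ls14-ee2.beta.deitterick.com", order, backup_pool))
# Contacting the other Front End Server: 301 with primary and backup.
print(register_response("beta-ls14-ee1.beta.deitterick.com", order, backup_pool))
```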

In this trace, the user contacts their primary registrar (172.16.8.10), so no 301 Redirect is returned:

12/31/2010|13:18:17.346 630:358 INFO :: Sending Packet - 172.16.8.10:5061 (From Local Address: 172.16.8.7:50430) 795 bytes:
12/31/2010|13:18:17.346 630:358 INFO :: REGISTER sip:beta.deitterick.com SIP/2.0
Via: SIP/2.0/TLS 172.16.8.7:50430
Max-Forwards: 70
From: <sip:lcarter@beta.deitterick.com>;tag=e5fc295f13;epid=159a38c2a2
To: <sip:lcarter@beta.deitterick.com>
Call-ID: 50490a0fd7534119ad400d5ca46e20aa
CSeq: 1 REGISTER
Contact: <sip:172.16.8.7:50430;transport=tls;ms-opaque=d4b2948966>;methods="INVITE, MESSAGE, INFO, OPTIONS, BYE, CANCEL, NOTIFY, ACK, REFER, BENOTIFY";proxy=replace;+sip.instance="<urn:uuid:C816ACB8-5459-5B19-ADB3-2A9F0A6974A7>"
User-Agent: UCCAPI/4.0.7577.0 OC/4.0.7577.0 (Microsoft Lync 2010)
Supported: gruu-10, adhoclist, msrtc-event-categories
Supported: ms-forking
Supported: ms-cluster-failover
Supported: ms-userservices-state-notification
ms-keep-alive: UAC;hop-hop=yes
Event: registration
Content-Length: 0

12/31/2010|13:18:17.346 630:358 INFO :: End of Sending Packet - 172.16.8.10:5061 (From Local Address: 172.16.8.7:50430) 795 bytes

In this trace the user contacts the other server in the pool (172.16.8.9) and a 301 Redirect is returned.  You can see in the response that it also includes the primary and backup registrars:

01/01/2011|13:09:59.182 B34:A44 INFO :: Sending Packet - 172.16.8.9:5061 (From Local Address: 172.16.8.7:50779) 795 bytes:
01/01/2011|13:09:59.182 B34:A44 INFO :: REGISTER sip:beta.deitterick.com SIP/2.0
Via: SIP/2.0/TLS 172.16.8.7:50779
Max-Forwards: 70
From: <sip:lcarter@beta.deitterick.com>;tag=e820708c06;epid=159a38c2a2
To: <sip:lcarter@beta.deitterick.com>
Call-ID: 5de48871972847a0b16669398687d8dd
CSeq: 1 REGISTER
Contact: <sip:172.16.8.7:50779;transport=tls;ms-opaque=edd211a910>;methods="INVITE, MESSAGE, INFO, OPTIONS, BYE, CANCEL, NOTIFY, ACK, REFER, BENOTIFY";proxy=replace;+sip.instance="<urn:uuid:C816ACB8-5459-5B19-ADB3-2A9F0A6974A7>"
User-Agent: UCCAPI/4.0.7577.0 OC/4.0.7577.0 (Microsoft Lync 2010)
Supported: gruu-10, adhoclist, msrtc-event-categories
Supported: ms-forking
Supported: ms-cluster-failover
Supported: ms-userservices-state-notification
ms-keep-alive: UAC;hop-hop=yes
Event: registration
Content-Length: 0

01/01/2011|13:09:59.197 B34:A44 INFO :: End of Sending Packet - 172.16.8.9:5061 (From Local Address: 172.16.8.7:50779) 795 bytes

01/01/2011|13:09:59.260 B34:A44 INFO :: Data Received - 172.16.8.9:5061 (To Local Address: 172.16.8.7:50779) 758 bytes:
01/01/2011|13:09:59.260 B34:A44 INFO :: SIP/2.0 301 Redirect request to Home Server
Authentication-Info: TLS-DSK qop="auth", opaque="212BCC58", srand="A46A87C1", snum="1", rspauth="9a517940f6f912c5fd30a4ab48f95c4bb256adf7", targetname="BETA-LS14-EE1.beta.deitterick.com", realm="SIP Communications Service", version=4
From: <sip:lcarter@beta.deitterick.com>;tag=e820708c06;epid=159a38c2a2
To: <sip:lcarter@beta.deitterick.com>;tag=9E9546E818951C08C2E9CA2E852F3D2E
Call-ID: 5de48871972847a0b16669398687d8dd
CSeq: 4 REGISTER
Via: SIP/2.0/TLS 172.16.8.7:50779;ms-received-port=50779;ms-received-cid=9B000
Contact: <sip:beta-ls14-ee2.beta.deitterick.com:5061;transport=TLS>;q=0.7
Contact: <sip:beta-ls14-se2.beta.deitterick.com:5061;transport=TLS>;q=0.3
Expires: 2592000
Content-Length: 0

01/01/2011|13:09:59.260 B34:A44 INFO :: End of Data Received - 172.16.8.9:5061 (To Local Address: 172.16.8.7:50779) 758 bytes

At this point the client knows about the user's backup registrar.  If the client lost connectivity to the entire pool/data center, it would be able to contact the backup registrar.

So now for the wrinkle...EndpointConfiguration.cache.  Upon the first successful logon, the client writes out the user's primary registrar to the EndpointConfiguration.cache file.  All subsequent logons will use this file to determine which server to send the initial REGISTER to.  That means that even if you have a Director configured in the environment, it won't be used after the first logon.  This also means that the client will connect directly to its primary registrar, so no 301 Redirect will be returned to the client.  Also, the backup registrar is not cached on the client.  It needs to be provided to the client at each logon.

What happens if the user's primary registrar is/goes down?  In this case the client can't use the EndpointConfiguration.cache file, so it will fall back to automatic or manual configuration, depending on which you have configured in your environment.  At that point the client would connect either to a Director and be redirected to the next Front End Server in the user's PrimaryPoolMachinesInPreferredOrder list or to the pool and possibly be redirected to the next Front End Server in the user's PrimaryPoolMachinesInPreferredOrder list.  I say possibly because there's no way of knowing which Front End Server the client will try to connect to.  In my lab environment I'm using DNS load balancing.  The IP addresses for both Front End Servers will be returned to the client and then the client will pick one and attempt to connect.  It's possible that the IP that the client picks is the user's registrar.  Without a Director, there is no way to guarantee that the client will always get the user's backup registrar returned to it.
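The logon behavior described in the last two paragraphs can be summarized in a short Python sketch.  It is only a model of the decision as described here: initial_register_target, its parameters, and the lower-cased Director FQDN are hypothetical names chosen for the example, and nothing in it reads the real EndpointConfiguration.cache file.

```python
from typing import Callable, Optional


def initial_register_target(cached_primary: Optional[str],
                            is_up: Callable[[str], bool],
                            discovered: str) -> str:
    """Pick where the first REGISTER of a logon is sent.

    cached_primary: registrar remembered from an earlier logon, if any.
    discovered: server found by automatic (SRV) or manual configuration,
                for example a Director or a pool FQDN.
    """
    if cached_primary and is_up(cached_primary):
        return cached_primary  # Director is bypassed, so no 301 is seen
    return discovered          # fall back to automatic/manual configuration


director = "beta-ls14-dir.beta.deitterick.com"
cached = "beta-ls14-ee2.beta.deitterick.com"

# First logon: nothing cached, so the client goes to the Director.
print(initial_register_target(None, lambda host: True, director))
# Later logon, primary registrar up: the cached server is used directly.
print(initial_register_target(cached, lambda host: True, director))
# Later logon, primary registrar down: back to the Director.
print(initial_register_target(cached, lambda host: host != cached, director))
```

Only the first and third cases reach the Director, which is why those are the logons where the client is guaranteed to be told its backup registrar.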

So...what does this all mean?  When you are planning your Lync Server 2010 deployment and intend to define a backup registrar for your users, you need to understand how the backup registrar will be returned to the clients and make sure that all clients will be able to connect to their backup registrar, either by using a Director, multiple SRV records, or both.  By doing this you can make sure that in the event of a failover, you achieve the resiliency that you are looking for.

Comments

  • Anonymous
    January 01, 2003
    @Hemal You answered your own question.  It will know the same way any other client would find out the user's primary registrar.  The client in the branch will do an SRV lookup the first time the user signs in.  The SRV record is going to point somewhere, most likely to a pool or Director.  The registrar service on the Front End Server or Director will look the user up in AD and determine its homed pool, the SBA in this example.  It will respond with the 301 Redirect telling the client the FQDN of the SBA.  On subsequent logons the EndpointConfiguration.cache file will tell the client to connect to the SBA first.

  • Anonymous
    January 01, 2003
    @TygaB The main point of the backup registrar is to provide voice resiliency for the users.  You are correct, the user's contacts and conferences are not replicated between pools, so when the users fail over to their backup registrar, they will lose their presence, contacts, conferences, and the ability to start multi-party sessions.  Policies that you've defined, e.g. Client Policy or Conferencing Policy, are stored in the CMS and are replicated to all servers in the environment.  Those will be provided to the users via in-band provisioning.  As far as setting up another Standard Edition Server, it depends on which of the above functionality you want to provide.  If you need users to be able to make and receive phone calls if their primary registrar is unavailable, then maybe the backup registrar functionality would be useful to you.

  • Anonymous
    January 01, 2003
    @Keenan Are you talking about a failover between servers in the same pool or between the primary registrar pool and the backup registrar pool?

  • Anonymous
    January 01, 2003
    @Andody Yes, you can set the backup registrar pool to be another Standard Edition Server or Enterprise Edition pool in another Central Site.  Just make sure if you are going to use a Standard Edition Server that it can handle all of the users from the Enterprise Edition pool.

  • Anonymous
    January 01, 2003
    @Harry Because you've defined the primary/backup registrar relationship in Topology Builder, users that aren't homed on that pool will be able to register with the backup registrar.  The definition of a homed user hasn't changed (none of the user's data is replicated to the backup registrar pool); some additional capability has simply been added.

  • Anonymous
    January 01, 2003
    @Harry It would be another pool.

  • Anonymous
    January 01, 2003
    @Mike To answer your first question, yes, there will be an external web services url for each pool.  For your second question, you can forward those requests to either the Director or the VIP of the Front End Pool.  You can find some more information on that here: technet.microsoft.com/.../gg425874.aspx

  • Anonymous
    January 01, 2003
    @Alex The client won't receive responses to the SIP traffic it sends.

  • Anonymous
    January 01, 2003
    @Sachin Desai One point that I want to clarify about my response to your first question is that you can't guarantee that the clients have been informed about their backup registrar.  It would depend on what FQDN was set in manual configuration, Director or pool.

  • Anonymous
    January 01, 2003
    @John Thanks for the feedback!  RNL will happen on the SBA.  Users will still be able to make and receive phone calls when the WAN connection is down.

  • Anonymous
    January 01, 2003
    @Sachin Desai

  1. No, with manual configuration set, the client won't fall back to SRV lookup.  If the Standard Edition server or all of the Enterprise Edition servers for the pool that the user is homed on are down, you're dead in the water.  That's why it's recommended to use automatic configuration.
  2. After the first successful logon, the client would use the EndpointConfiguration.cache file for all subsequent logons, which lists the user's primary registrar.  The client would only go back to the Director if the user's primary registrar was down and the client was not aware of what the user's backup registrar was.
  3. Yes.
  • Anonymous
    January 01, 2003
    @Stealthnet I would expect failover of the client between servers in the same pool to be a little faster than that, but there are many factors involved.  Your best solution would be to open a case with CSS to see if client failover is working as expected in your environment.

  • Anonymous
    January 01, 2003
    @Shawn If you're asking if you can use a Standard Edition Server as the primary registrar, then yes, you can.  You can also mix and match.  The primary registrar can be Standard Edition and the backup registrar can be Enterprise Edition, and vice versa.

  • Anonymous
    January 01, 2003
    @Scott It doesn't change for an SBA.  The SBA contains the same registrar functionality as a Front End Server or Director.  You are correct that for users in the branch office, the SBA would be the user's primary registrar.  Even a user that is homed on an SBA will utilize services from the SBA's associated central site pool.

  • Anonymous
    December 29, 2010
    You rock!  I had been looking for these answers for a few days.  Very helpful.  Thanks, Sachin Desai

  • Anonymous
    January 08, 2011
    Agreed, I tested this.  I found that if I set the Director's pool name in the client connection settings (hardcoded), it still sends a 301 Redirect to the client and sets the primary and backup registrars.  If I set the user's pool name, it doesn't send a 301 Redirect to the client.  As you concluded, a Director is required to achieve primary/backup failover capability easily.

  • Anonymous
    January 11, 2011
    Hello Doug, as there is not much information available regarding the Director, would you please help me clarify a question I have about it:

  • The Lyncweb (address book, group expansion...) will be forwarded from the reverse proxy server to the Lync Front End Pool. Director is not involved?
  • The meet and dialin simple URLs will be forwarded from the reverse proxy server to the Director(s), which will forward to the corresponding Front End Pool? Thanks. Mike
  • Anonymous
    January 14, 2011
    How does this differ when using an SBA? For scalability reasons, the SBA is supposed to be the primary registrar. The client is only supposed to go out to the WAN in the event of an SBA outage.

  • Anonymous
    January 17, 2011
    How does the client in the branch site know that the SBA is its primary registrar, given that the client in the branch site would do the same SRV query that a client in the central site does?

  • Anonymous
    February 09, 2011
    Dodiette, thanks man :)  This makes me happy, because Standard Edition comes with SQL Express bundled, so I don't have to worry about SQL planning in the DR.  As for the users, there are fewer than 200.

  • Anonymous
    March 08, 2011
    Doug, great post!  Quick question for you or the community.... We have an HLB in place supporting two Lync Front Ends.  When we launch the Lync client, it connects to the HLB VIP.  If the HLB directs the request to the primary Lync registrar, the Lync client remains connected through the HLB VIP. For other users though, their Lync clients connect to the HLB VIP and may receive the SIP/2.0 301 Redirect request to Home Server which points the client to the correct registrar: Contact: <sip:lyncfe2.contoso.com:5061;transport=TLS>;q=0.7 Contact: <sip:lyncfebackup.contoso.com:5061;transport=TLS>;q=0.3 Once the Lync client receives the 301 redirect, the client connects directly to the front end.  The redirect will then get cached in the EndpointConfiguration.cache, which ultimately negates the use of the HLB as the connection point for the clients.  Due to this “routing,”  we are seeing reconnect times ranging from 1-3 minutes for failover to occur during a server failure. Is this the expected behavior when using a hardware load balancer?

  • Anonymous
    April 11, 2011
    I have 3 sites and correspondingly have 3 Enterprise pools. For the backup registrar pool, it is asking me for a different Enterprise pool, not a server in the same pool. Can you please confirm whether the backup registrar pool is another pool or a server in the same pool?

  • Anonymous
    May 05, 2011
    Great article.  For the primary registrar, can I put users on one Front End Server with Lync Standard Edition?

  • Anonymous
    August 05, 2011
    Hi Doug, Is the point of this "resiliency" only so that users can still just "log in" to Lync?  From reading the comments you posted here since no data is replicated to the backup registrar's pool....does this mean that users will not see their groups or contacts that are saved in sql?  Does this also mean that the policies don't get applied to them? We have a SE with sql express on it...so I'm guessing that this backup registrar wouldn't really benefit our situation here even if we were to setup another SE with sql express in our DR site....would that be correct? Thanks!

  • Anonymous
    February 20, 2012
    Really clear and useful article, thanks, but one additional question (of course :-) )  When users are registered with a local SBA does Reverse Number Lookup use services from the pool/central servers or is this information replicated to the SBA - basically the question is 'will RNL still work if the WAN between the SBA and pool is down' ?

  • Anonymous
    May 23, 2012
    Wonderful article, and the questions posted are excellent; most of the scenarios are discussed.  Cheers!!

  • Anonymous
    November 06, 2012
    How does the client know that its primary registrar is down?