SQL Server 2016 AlwaysOn Availability Group Enhancements: Load Balance Read-Only Routing
Overview
SQL Server 2104 and SQL 2012 read-only routing directed traffic to the first available replica in the routing list, unless it was not accessible, and then it would direct the connection to the next replica in the routing list. When you have multiple secondary replicas available for read, it is not possible to spread the read load across those replicas.
In SQL Server 2016 you can configure load-balancing across a set of read-only replicas.
Benefits
This will allow following benefits
- Use advanced configuration to ‘bundle’ read-only routed connections to certain replicas.
- Load balance read-only routed requests across a set of preconfigured available readable secondary replicas.
Configure an Availability Group for Load Balanced Read-Only Routing
First, configure each replica to allow for read access when in the secondary role, and define the connection string to the replica. These two steps are completed just like you would configure read only routing in SQL Server 2012 or SQL Server 2014.
1 Configure read-only access
Enable read access so that the replica can be accessed when it is a secondary:
ALTER AVAILABILITY GROUP [ag]
MODIFY REPLICA ON N'SQL16N2' WITH (SECONDARY_ROLE(ALLOW_CONNECTIONS = READ_ONLY))
GOALTER AVAILABILITY GROUP [ag]
MODIFY REPLICA ON N'SQL16N3' WITH (SECONDARY_ROLE(ALLOW_CONNECTIONS = READ_ONLY))
GO
2 Configure read-only routing URL
For each secondary replica, define the address to connect to it:
ALTER AVAILABILITY GROUP ag MODIFY REPLICA ON N'SQL16N2' WITH (SECONDARY_ROLE (READ_ONLY_ROUTING_URL = N'TCP://SQL16N2:1433'));
GO
ALTER AVAILABILITY GROUP ag MODIFY REPLICA ON N'SQL16N3' WITH (SECONDARY_ROLE (READ_ONLY_ROUTING_URL = N'TCP://SQL16N3:1433'));
GO
3 SQL Server 2016 introduces load balanced lists in the read-only routing list
Create a read-only routing list for SQL16N1 when it is in the primary role. Here is an example:
ALTER AVAILABILITY GROUP ag MODIFY REPLICA ON N'SQL16N1' WITH (PRIMARY_ROLE (READ_ONLY_ROUTING_LIST=(('SQL16N3', 'SQL16N2'), 'SQL16N1')));
This routing list behavior 'load balances' read-only connections between SQL16N3 and SQL16N2. Above, we have two embedded lists in the routing list:
List 1: 'SQL16N3', 'SQL16N2'
List 2: 'SQL16N1'
Here is how the routing plays out:
Route to the replicas in the first list SQL16N3 and SQL16N2 are accessible to read-only connections. The first incoming read-only connection will be routed to SQL16N3, the second read-only connection will be routed to SQL16N2, the third read-only connection will be routed to SQL16N3, the fourth read-only connection will be routed to SQL16N2, and so on, with a ‘round-robin’ distribution of read-only connections between the two replicas in the first list.
If any replicas become unavailable, routing will continue with remaining replicas in the first list If SQL16N3 or SQL16N2 becomes inaccessible to read-only connections, then the read-only connections will only be routed to the accessible read only replicas in the first list. For example, if SQL16N3 is in a not synchronized state or ALLOW_CONNECTIONS is set to NO, then all read-only connections will be routed to SQL16N2. So long as one of the servers is available for read-only connections, NO read-only connections will be routed to SQL16N1.
If all replicas in first list are inaccessible route to replicas in next list If SQL16N3 and SQL16N2 become inaccessible to read-only connections, then all read-only connections will only be routed to the next list of replicas, which in this case is SQL16N1.
Resume routing to first list if any replicas become available As secondary replicas that have higher priority in the list become accessible to read-only connections, future read-only connections will connect to them as appropriate.
Test your read-only routing
SQLCMD is a good tool for testing read-only routing and can be used from the primary replica. Remember to specify the listener name, read intent connection parameter set to read only and an availability database. If these are not all present, SQLCMD connection will not route to the secondary, sort of a silent failure, the connection may succeed, but to the primary replica.
I have configured my routing list for load balance just as the example above demonstrates. Here is my test. I connect twice using SQLCMD and notice that the first connection attempt results in a connection to SQL16N3, my second connection attempt results in a connection to SQL16N2, just as my load balanced routing list defined.
Troubleshooting Read-Only Routing
Useful Queries for Diagnosing Read-Only Routing Issues
Execute this query against the primary replica to see the settings and state information on your availability replicas that can impact the replica accessibility for read-only routing. See the FAQ section below for how this query can be used to trouble-shoot different scenarios you might encounter.
select ar.replica_server_name, ar.endpoint_url, ar.read_only_routing_url, secondary_role_allow_connections_desc, ars.synchronization_health_desc
from sys.availability_replicas ar join sys.dm_hadr_availability_replica_states ars on ar.replica_id=ars.replica_id
View replica routing list priority
select ar.replica_server_name,arl.routing_priority, ar2.replica_server_name,ar2.read_only_routing_url
from sys.availability_read_only_routing_lists arl join sys.availability_replicas ar
on (arl.replica_id = ar.replica_id) join sys.availability_replicas ar2
on (arl.read_only_replica_id = ar2.replica_id)
order by ar.replica_server_name asc, arl.routing_priority asc
The routing list for availability group ar is defined like this, this is how a SQL Server 2012 or SQL Server 2014 routing list might look.
ALTER AVAILABILITY GROUP ag MODIFY REPLICA ON N'SQL16N1' WITH (PRIMARY_ROLE (READ_ONLY_ROUTING_LIST=( 'SQL16N3', 'SQL16N2', 'SQL16N1' )));
Execute the replica routing list priority query. There is a clearly defined routing priority across all three replicas.
Now, adjust the routing list to implement load balancing between SQL16N3 and SQL16N2:
ALTER AVAILABILITY GROUP ag MODIFY REPLICA ON N'SQL16N1' WITH (PRIMARY_ROLE (READ_ONLY_ROUTING_LIST=(( 'SQL16N3', 'SQL16N2'), 'SQL16N1' )));
Query the replica routing list priority again. The priority of the load balanced replicas, SQL16N3 and SQL16N2 are tied at 1, indicating that they are both eligible to route to but SQL16N1 will be a lower priority:
Diagnose Read-Only Routing with Extended Events
There are extended events that can be used to assist in diagnosing read-only routing connection behavior.
sqlserver.hadr_evaluate_readonly_routing_info -- Despite what the name implies, this XEvent is not triggered when a read-only connection attempt is made. This event will be triggered on the primary replica only when the routing list is ‘evaluated’ for one of the following reasons:
- When log scanning of an availability database begins or ceases, which occurs when an availability group goes offline or comes online, or when synchronization of an availability database is suspended or resumed.
- When the routing list or routing URL is modified.
- When any other availability group configuration change is made at the primary replica.
Here is what the XEvent looks like captured. The is_routing_replica_not_found is only TRUE when SQL Server cannot return a routing URL to the client because there are no accessible replicas.
sqlserver.read_only_route_complete This XEvent is triggered on the primary replica only. This XEvent is triggered when SQL Server sends a replica URL to the client, but does not mean the connection will be successful. This indicates that there is an availability replica from the routing list that is accessible for read-only routing. If the URL is bad, or there are connectivity issues to the replica, the connection will fail, but this event will show complete.
Here is what the read_only_route_complete XEvent looks like:
sqlserver.read_only_route_fail -- This XEvent is triggered on the primary replica only, in the event that SQL Server is unable to respond to the client attempting a read-only connection with a readonly URL. This occurs when no secondary is accessible for reasons such as they are not synchronized, offline, not configured for read access, or their readonly URL is incorrectly formatted.
Here is what the XEvent looks like when triggered. There is no interesting information reported, since we know that the event means that no read-only URL could be returned to the client, there is no report of the circumstances behind the event.
Here is a script to create an XEvent session that captures these events. Execute this on the primary replica of your availability group:
CREATE EVENT SESSION [AlwaysOn_ROO] ON SERVER
ADD EVENT sqlserver.hadr_evaluate_readonly_routing_info,
ADD EVENT sqlserver.read_only_route_complete,
ADD EVENT sqlserver.read_only_route_fail
ADD TARGET package0.event_file(SET filename=N'AlwaysOn_ROO.xel',max_file_size=(5),max_rollover_files=(4))
WITH (MAX_MEMORY=4096 KB,EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,MAX_DISPATCH_LATENCY=30 SECONDS,MAX_EVENT_SIZE=0 KB,MEMORY_PARTITION_MODE=NONE,TRACK_CAUSALITY=OFF,STARTUP_STATE=ON)
GO
ALTER EVENT SESSION AlwaysOn_ROO ON SERVER STATE=START
GO
To monitor activity, locate the XEvent session and ‘Watch Live Data’:
Here is our routing list:
ALTER AVAILABILITY GROUP ag MODIFY REPLICA ON N'SQL16N1' WITH (PRIMARY_ROLE (READ_ONLY_ROUTING_LIST=('SQL16N3', 'SQL16N2')));
First, connect to SQL Server using read-routing, and we will see we are successfully connected to the first replica in our routing list. The read_only_route_complete is triggered because SQL Server was able to return an accessible replica routing URL to the client.
Next, let’s alter the read access of our two replicas listed in the routing list, we will set them both to NO for read access.
ALTER AVAILABILITY GROUP [ag]
MODIFY REPLICA ON N'SQL16N2' WITH (SECONDARY_ROLE(ALLOW_CONNECTIONS = NO))
GO
ALTER AVAILABILITY GROUP [ag]
MODIFY REPLICA ON N'SQL16N3' WITH (SECONDARY_ROLE(ALLOW_CONNECTIONS = NO))
GO
The hadr_evaluate_readonly_routing_info XEvent is triggered because we altered the availability group’s routing information. The read_only_route_fail XEvent is triggered because no secondary replica in the routing list could be accessed, and the client fails to connect and receives message ‘Unable to access the ‘agdb’ database because no online secondary replicas are enabled for read only access.’
FAQs
I configured my application to connect with read-only routing but I successfully connect to the primary replica
Reason Your application must properly specify the database name of an availability group database. For example, when attempting to connect using SQLCMD, if we specify all the parameters, but fail to pass the database parameter, set to an availability database, the connection fails to route to the secondary and is routed to the primary. Here we see SQLCMD passing the read-only routing parameter but does not specify the availability database as a parameter, resulting in connection to the primary instead of a read-only secondary.
Reason If you defined the primary replica in your read-only routing list and your secondary replicas cannot be accessed (for example, synchronization is suspended), then the routing will connect the application to the primary replica. For example, the current primary replica is SQL16N1 and we define the routing list like this:
/*Configure a routing list. If SQL16N3 is not available read-only connections connect to SQL16N2*/
ALTER AVAILABILITY GROUP ag MODIFY REPLICA ON N'SQL16N1' WITH (PRIMARY_ROLE (READ_ONLY_ROUTING_LIST=('SQL16N3', 'SQL16N2', 'SQL16N1')));
GO
I configured my routing list but I am connecting to a replica I didn’t expect to connect to
Reason The secondary replica you anticipated connecting to is not accessible, here are the reasons:
- The secondary is not configured for read access. You must set the secondary for read access, see step 1 in the above section ‘Configure an Availability Group for Load Balanced Read-Only Routing.’
- The secondary is suspended. If the secondary is not in a synchronizing or synchronized state, you cannot read-only connect to that secondary replica.
- The SQL Server hosting the secondary replica is not running.
- The read-only routing URL for that secondary is not defined correctly. Compare how that routing URL has been defined for the problematic secondary, see step 2 in the above section ‘Configure an Availability Group for Load Balanced Read-Only Routing.’ To see the existing read-only URL query:
The following query can be executed against the primary replica and it will provide the details needed to trouble-shoot all of the above scenarios.
select ar.replica_server_name, ar.endpoint_url, ar.read_only_routing_url, secondary_role_allow_connections_desc, ars.synchronization_health_desc
from sys.availability_replicas ar join sys.dm_hadr_availability_replica_states ars on ar.replica_id=ars.replica_id
In the following graphic we can see the results of this query, note that SQL16N2 and SQL16N3 are not configured for read-only connections. Also SQL16N2 is not healthy right now, it may be suspended or SQL Server may not be running on SQL16N2. Finally, there is a problem with the read-only routing URL for SQL16N1:
For more information see the Read-Only Routing page:
Configure Read-Only Routing for an Availability Group (SQL Server)