SQL Server Parallel Query Placement Decision Logic
Recently I had a conversation with Jonathan Kehayias (jonathan@sqlskills.com) about the placement of workers belonging to a parallel query. As I asked around and reviewed the code, I quickly found that the common assumption is still 'least loaded node,' but this behavior changed in SQL Server 2012, and broad awareness among both our support engineers and customers is a bit light. This post highlights the various decision options available in the SQL Server 2012, 2014 and 2016 releases.
There are many different DOP (degree of parallelism) placement options SQL Server has to pick from. The table below outlines the selection types and documents their behaviors.
| Selection type | Placement | Behavior |
| --- | --- | --- |
| SMP or Connection Bound | Place threads on one node (the connection node). Requires: -T2479 | Either the system only has a single node or it is treated as if the connection node is the only node on the system. DOP is allowed as long as the free threads on the node are >= (DOP * branches in the query). |
| Full | Place threads on all nodes | Node zero is always the starting node. Starting at node id zero, SQL Server loops across schedulers and nodes until all workers are placed. DOP is allowed as long as all workers can be placed on the full system. |
| Least Loaded Node | Place threads on the least loaded node. Requires: -T2467 | Loop over the online nodes, determining whether any single node has enough free threads to support the current DOP request, while making sure there are enough schedulers online within the node to support the request without stacking requests from the query on the same scheduler(s). |
| Use Next Node | Place threads within a node. Requires: -T2468 | Find the next node that can service the DOP request. Unlike full mode, the global resource manager keeps track of the last node used. Starting from the last position and moving to the next node, SQL Server checks for query placement opportunities. If a node can't support the request, SQL Server continues advancing nodes and searching. |
| Spread | Place threads on multiple nodes | This is the most common decision made by SQL Server. The decision spreads the workers across multiple nodes as required. The design is similar to full, except the starting position is based on the saved "next node" global enumerator. |
- SQL Server checks for online nodes and skips offline nodes.
- SQL Server considers the number of online schedulers within each node.
- When assigning workers, SQL Server avoids placing more than one of the query's workers on the same scheduler.
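These checks are driven by node and scheduler state that is already visible through the DMVs. The query below is a simple, read-only way to look at the same per-node inputs (online state, online scheduler count and current worker load); it does not expose the placement decision itself, just the raw numbers the decision weighs.

```sql
-- Per-node view of the inputs the placement decision considers:
-- online nodes, online schedulers in each node, and current worker load.
-- VISIBLE ONLINE schedulers are the ones parallel workers can be placed on.
SELECT  n.node_id,
        n.node_state_desc,
        n.online_scheduler_count,
        n.active_worker_count,
        SUM(s.current_workers_count) AS current_workers,
        SUM(s.runnable_tasks_count)  AS runnable_tasks
FROM sys.dm_os_nodes AS n
JOIN sys.dm_os_schedulers AS s
      ON s.parent_node_id = n.node_id
WHERE s.status = N'VISIBLE ONLINE'          -- skip offline and hidden schedulers
  AND n.node_state_desc <> N'ONLINE DAC'    -- ignore the dedicated admin connection node
GROUP BY n.node_id, n.node_state_desc, n.online_scheduler_count, n.active_worker_count
ORDER BY n.node_id;
```

Running this while a large parallel query starts up shows which nodes had scheduler and worker headroom at the time the placement decision was made.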
The decision logic is protected and synchronized across queries using the QRY_PARALLEL_THREAD_MUTEX (see sys.dm_os_wait_stats).
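Because each parallel query synchronizes on this mutex while its workers are being placed, heavy concurrent parallel query startup shows up as waits on it. The accumulated waits are visible with a standard wait-stats query:

```sql
-- Cumulative waits on the mutex that serializes parallel placement decisions.
SELECT wait_type,
       waiting_tasks_count,
       wait_time_ms,
       signal_wait_time_ms,
       max_wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type = N'QRY_PARALLEL_THREAD_MUTEX';
```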
The following are a few of the outputs surrounding the DOP calculation decisions.
- XEvent: XeSqlPkg::calculate_dop
- SQL Server 2016 DMV: sys.dm_exec_query_parallel_workers (example query below)
- Statistics XML: runtime information
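For example, the SQL Server 2016 DMV can be queried directly to see per-node parallel worker accounting. SELECT * is used here because the exact column list is not shown in this post; expect per-node counts such as the maximum, reserved and free workers.

```sql
-- Per-node parallel worker accounting (SQL Server 2016 and later).
-- Expect per-node counts such as maximum, reserved and free workers.
SELECT *
FROM sys.dm_exec_query_parallel_workers;
```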
Bob Dorr - Principal SQL Server Escalation Engineer
Comments
- Anonymous (March 23, 2016): This is the most documentation I've ever seen on node/thread/work placement. I have so many basic questions, like, what is the penalty for crossing nodes, do nodes share buffers, does it matter if cores are hyperthreaded, etc.
- Anonymous (May 16, 2016): DOP? Is that degree of parallelism? "DOP is allowed as long as...." What does that mean? Is DOP a synonym for "parallelism"?