Поделиться через


Service Broker Wait Types

SQL server engine keeps track of wait operations (aka wait types) performed by all its executing threads, either to serialize access to protected structures or to wait for asynchronous events/notifications. Sys.dm_os_wait_stats DMV can be used to get the statistics for all wait types and can potentially point to performance issues and code paths with high degrees of contention.

Service Broker threads use 12 different wait types. The sections below describe these wait types in detail and their expected values depending on usage of specific Service Broker features.

[Note: Let avg_wait_time_ms = wait_time_ms/waiting_tasks_count for each wait type from the DMV]

1.     BROKER_CONNECTION_RECEIVE_TASK

Each Service Broker (and Database Mirroring) connection endpoint has a list of buffers posted to receive data from the network. There are two threads working on this list, one that posts buffers for receive and one that processes them after receiving the data.

This wait type is charged whenever these threads attempt to access this list to add or remove buffers.

Waiting_tasks_count and wait_time_ms for this wait type should both be proportional to the amount of network data received by all Service Broker (and Database Mirroring) connection endpoints and avg_wait_time_ms should be a really small value.

2.     BROKER_ENDPOINT_STATE_MUTEX

This wait type is charged each time there is some state change for a Service Broker (or Database Mirroring) connection endpoint during the connection establishment (i.e. handshake) phase - e.g. initialization before connect or after accept, login negotiation (authentication, encryption), validation, error, arbitration and error. This wait type is also charged per connection endpoint every time sys.dm_broker_connections (or sys.dm_db_mirroring_connections) DMV is queried to serialize access to each connections handshake state.

Avg_wait_time_ms for this wait type should be very small and the wait_time_ms and waiting_tasks_count should both be proportional to the number of times Service Broker (or Database Mirroring) establishes connection with some other SQL server instance and the number of times sys.dm_broker_connections (or sys.dm_db_mirroring_connections) DMV is queried.

Rapidly increasing values of wait_time_ms and waiting_tasks_count for this DMV could indicate very frequent connection establishment (and teardown). Since service broker transport tears down connections after ~90 seconds of inactivity, these values can increase if applications use service broker once every ~90 seconds.

3.     BROKER_EVENTHANDLER

Each SQL server instance has a primary event handler thread for processing Service Broker startup/shutdown and timer events. This thread never goes away and is always either waiting for such events or processing them.

This wait type is charged each time Service Broker's primary event handler waits for instance startup/shutdown or any dialog timer events (dialog timeouts) and mirrored routes timeouts.

Wait_time_ms for this wait type should approximately be equal to the interval since instance startup. Waiting_tasks_count merely indicates the number of times the primary event handler had to wait due to absence of any events.

Neither of these two fields in the DMV indicates any performance issue in the engine. If Service Broker is not being used at all (either directly or through DBMail, Event Notification), then max_wait_time_ms and wait_time_ms would approximately be the same and waiting_tasks_count would be really small value.

4.     BROKER_INIT

This wait type is charged each time Service Broker fails to initialize internal broker managers for any database. Service Broker waits for about 1 second before re-attempting to initialize broker for same database. These events should be rare.

Waiting_tasks_count for this wait type is the number of times Service Broker failed to initialize broker on any database. Wait_time_ms will be proportional to waiting_tasks_count with avg_wait_time_ms being close to 1 second. 

High or increasing values of waiting_tasks_count for this wait type indicate some problem in the SQL instance.

5.     BROKER_MASTERSTART

This wait type is charged only during instance startup, when Service Broker is waiting for master database to startup.

Waiting_tasks_count should be just 1 and wait_time_ms should be really small for this wait type.

6.     BROKER_RECEIVE_WAITFOR

This wait type is charged once per WAITFOR RECEIVE SQL statement, where the statement execution waits for messages to arrive in the user queue.

Waiting_tasks_count must be same as the number of times such statements have been executed and wait_time_ms should be the total time their execution had to wait before messages arrived or WAITFOR timeout for each.

If avg_wait_time_ms is much higher than expected, errorlog and profiler events should be checked on both initiator and target server instances for potential problems.

7.     BROKER_REGISTERALLENDPOINTS

This wait type is charged only during instance startup, when Service Broker is waiting for all endpoint types to be registered, so that it can start Broker and/or Database Mirroring endpoints.

Waiting_tasks_count should be just 1 and wait_time_ms should be really small for this wait type.

8.     BROKER_SERVICE

This wait type is charged when next hop destination list associated with a target service/broker instance pair gets updated or re-prioritized due to addition or removal of a dialog to the target service/broker instance pair. Service Broker sends messages to these next hop destinations in the order of their priority and hence it needs to serialize access to destination-list and their effective priority changes.

Waiting_tasks_count and wait_time_ms for this wait type merely indicate the number of times Service Broker had to serialize access to these internal structures with avg_wait_time_ms being really small.

9.     BROKER_SHUTDOWN

This wait type is charged only during instance shutdown, when Service Broker waits a few seconds for its primary event handler and all connection endpoints to shutdown.

Waiting_tasks_count and wait_time_ms for this wait type should be both 0 unless instance shutdown has already started.

10. BROKER_TASK_STOP

Service Broker has several task handlers to execute broker internal tasks related to transmission of messages, asynchronous network operations and processing of received messages.

This wait type is charged only when one of these task handlers is stopping due to absence of broker internal tasks. The task handler waits for maximum 10 seconds before getting destroyed in case it needs to be restarted to execute some task.

Waiting_tasks_count and wait_time_ms should both be small values for heavy Service Broker usage scenarios. In addition, every 5 seconds, Service Broker schedules an internal cleanup task that does not do much work when broker is not being used. But, it causes one of the task handlers to wake-up, restart, execute the task and then start waiting again. As a result, even though Service Broker is not used at all, waiting_tasks_count and wait_time_ms for this wait type keep increasing, proportional to the interval since instance startup with avg_wait_time_ms being close to 5 seconds.

11. BROKER_TO_FLUSH

For performance reasons Service Broker maintains all dialog state (TO - transmission object) in memory as well as in temporary tables on disk. Every time a TO is updated, it is scheduled to be flushed lazily to the temporary table on disk. Service Broker employs an always alive lazy flusher task to do this job.

This wait type is charged when the TO lazy flusher task is waiting for some TOs to be saved to the temporary tables. The lazy flusher sleeps for 1 second before waiting again for ~1 second for TOs to be saved.

If Service Broker is not used at all, wait_time_ms and waiting_tasks_count for this wait type should be proportional to the duration since instance startup, with avg_wait_time_ms being close to ~1 second. When Service Broker is used heavily these columns should have lowe values since the lazy flusher will be busy as well.

12. BROKER_TRANSMITTER

Service Broker has a component known as the Transmitter which schedules messages from multiple dialogs to be sent across the wire over one or more connection endpoints. The transmitter has 2 dedicated threads for this purpose.

This wait type is charged when these transmitter threads are waiting for dialog messages to be sent using the transport connections.

High values of waiting_tasks_count for this wait type point to intermittent work for these transmitter threads and are not indications of any performance problem. If service broker is not used at all, waiting_tasks_count should be 2 (for the 2 transmitter threads) and wait_time_ms should be twice the duration since instance startup.

Example: Broker wait types statistics after 1 hour (3,600,000 ms) of idle system   

Service Broker Wait Type

waiting_tasks_count

wait_time_ms

BROKER_CONNECTION_RECEIVE_TASK

0

0

BROKER_ENDPOINT_STATE_MUTEX

0

0

BROKER_EVENTHANDLER

3

81 *

BROKER_INIT

0

0

BROKER_MASTERSTART

0

0

BROKER_RECEIVE_WAITFOR

0

0

BROKER_REGISTERALLENDPOINTS

0

0

BROKER_SERVICE

0

0

BROKER_SHUTDOWN

0

0

BROKER_TASK_STOP

724

3634180

BROKER_TO_FLUSH

1762

1804005

BROKER_TRANSMITTER

2

0 *

* Service Broker's primary event handler and the transmitter threads are still waiting for some dialog activity to wake them up. Since wait_time_ms gets updated only after the wait is over, we see 0/low values for these wait types.

Comments

  • Anonymous
    February 19, 2009
    This is awesome.. I was struggling to find out why SB events are comming in wait stats.  After reading this blog its clear that there are other parts of SQL Engine that uses SB too. Very useful blog.

  • Anonymous
    June 16, 2009
    Great information, and a great blog! I was wondering if you could help me understand one thing with regard to the BROKER_TASK_STOP event.  The information in this article states that "...Waiting_tasks_count and wait_time_ms should both be small values for heavy Service Broker usage scenarios..." Can this event indicate a problem with a task handler stopping?  I want to determine whether I can unilaterally disregard this event when it occurs, or if there are scenarios when I should check some more information to determine whether there is a problem.

  • Anonymous
    June 17, 2009
    Thanks! BROKER_TASK_STOP event might indicate problems if you think Service Broker is being used extensively by your application, but somehow the task handlers are stopping/waiting too often. For instance, if the network between the initiator and the target instances becomes too congested to slow down the message arrival rate, Service Broker on the target side will probably wake up a waiting task handler to process the received message, save it in the target queue and go back to waiting for next, and firing this event more often than normal.

  • Anonymous
    November 19, 2009
    The comment has been removed

  • Anonymous
    January 12, 2010
    We're experiencing a performance situation where >50% of our WAITSTATS = BROKER_ENDPOINT_STATE_MUTEX. In addition, >45% of our WAITSTATS = THREADPOOL. This is on a third party app so I didn't write the broker messaging piece. Any recommendations about what perfmon counters to gather or any other information so that we can feed it to the developers for tuning? Would we get any benefit from lowering or raising the Max Worker Threads. Currently it is set to the default = 704 for a x64 16cpu system. We aren't seeing any pressure on System:Processor Queue Length. Thanks

  • Anonymous
    September 27, 2011
    Very useful blog indeed! Many things turned clear to me after reading this post. I had no idea there were other parts of SQL engine using SB as well... YouTube converter to mp3 videotomp3converter.com

  • Anonymous
    July 02, 2014
    how to fix these issue when you see these type of wait types

  • Anonymous
    September 17, 2015
    Good explanation about wait types, sometimes we need more information to understand the concepts or take some actions about it.! Good!

  • Anonymous
    January 28, 2016
    Good explanation! Thanks!