AlwaysON - HADRON Learning Series: HADR_SYNC_COMMIT vs WRITELOG wait

The distinction between these two wait types is subtle but very helpful in tuning your Always On environment.

Committing a transaction means the log block must be written locally and, for synchronous replicas, remotely as well. When the replica is in the synchronized state, this involves specific waits for both the local and the remote log block harden operations.

HADR_SYNC_COMMIT = Waiting on a response from the remote replica that the log block has been hardened. This does not mean the remote redo has occurred, only that the log block has been successfully stored on stable media at the remote replica. You can watch the remote response behavior using the XEvent: hadr_db_commit_mgr_update_harden.

WRITELOG = Waiting on local I/O to complete for the specified log block.

The design puts the local and remote log block writes in motion at the same time (async) and then waits for their completion.   The wait order is 1) remote replica(s) and 2) the local log.

HADR_SYNC_COMMIT is usually the longer of the two waits because it involves shipping the log block to the replica, writing it to stable media on the replica, and getting a response back. Because the longer operation is waited on first, the local write has often completed by the time the response arrives, so the wait for the local write is often avoided.

Once the remote response is received, any remaining wait on the local (primary) log occurs as necessary and is recorded as WRITELOG.
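The wait ordering described above can be sketched with a small simulation. This is only an illustration, not SQL Server's implementation: the `harden` and `commit` functions and the latencies are hypothetical stand-ins, and the only behavior taken from the post is that both hardens start together and the remote one is waited on first.

```python
import asyncio
import time


async def harden(delay_s: float) -> None:
    """Hypothetical stand-in for a log block harden; delay_s is a made-up latency."""
    await asyncio.sleep(delay_s)


async def commit(remote_s: float, local_s: float) -> dict:
    # Put both log block writes in motion at the same time.
    remote = asyncio.create_task(harden(remote_s))
    local = asyncio.create_task(harden(local_s))

    t0 = time.perf_counter()
    await remote  # 1) time spent here plays the role of HADR_SYNC_COMMIT
    t1 = time.perf_counter()
    await local   # 2) time spent here plays the role of WRITELOG
    t2 = time.perf_counter()
    return {"HADR_SYNC_COMMIT": t1 - t0, "WRITELOG": t2 - t1}


# Remote harden slower than local: the local wait is mostly avoided.
slow_remote = asyncio.run(commit(remote_s=0.2, local_s=0.02))

# Local harden slower than remote: the remainder shows up as the local wait.
slow_local = asyncio.run(commit(remote_s=0.02, local_s=0.2))
```

In the first run nearly all of the elapsed time lands in the remote wait and the local wait is close to zero. In the second run the local wait picks up whatever time is left after the remote response has arrived.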

Accumulated HADR_SYNC_COMMIT wait time reflects remote activity, so you should look at the network and the log flushing activity on the remote replica.

Accumulated WRITELOG wait time reflects local log flushing, so you should look at constraints in the local I/O path.
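As a rough sketch of how that guidance might be applied, the helper below turns cumulative counters for the two wait types into an average wait per task and pairs each with the area the post says to investigate. The query text uses the `wait_type`, `waiting_tasks_count` and `wait_time_ms` columns of `sys.dm_os_wait_stats`. The `summarize` function, the `WHERE_TO_LOOK` mapping and the sample numbers are assumptions made up for illustration, and no database connection is made here.

```python
# Query one might run on the primary to collect the counters (not executed here).
WAIT_QUERY = """
SELECT wait_type, waiting_tasks_count, wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type IN ('HADR_SYNC_COMMIT', 'WRITELOG');
"""

# Where the post suggests looking for each wait type.
WHERE_TO_LOOK = {
    "HADR_SYNC_COMMIT": "network and log flushing on the remote replica",
    "WRITELOG": "local I/O path on the primary",
}


def summarize(rows):
    """rows: iterable of (wait_type, waiting_tasks_count, wait_time_ms) tuples."""
    report = {}
    for wait_type, tasks, wait_ms in rows:
        # Average wait per waiting task; guard against a zero task count.
        avg_ms = wait_ms / tasks if tasks else 0.0
        report[wait_type] = (avg_ms, WHERE_TO_LOOK.get(wait_type, "not covered here"))
    return report


# Made-up sample counters, for illustration only.
sample = [("HADR_SYNC_COMMIT", 1000, 16000), ("WRITELOG", 1000, 2000)]
report = summarize(sample)

for wait_type, (avg_ms, hint) in report.items():
    print(f"{wait_type}: {avg_ms:.1f} ms average, look at {hint}")
```

The counters in `sys.dm_os_wait_stats` are cumulative, so comparing two snapshots taken some time apart generally gives a more useful picture than a single reading.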

Reference: https://blogs.msdn.com/b/psssql/archive/2011/04/01/alwayson-hadron-learning-series-how-does-alwayson-process-a-synchronous-commit-request.aspx

Reference: https://blogs.msdn.com/b/psssql/archive/2013/04/22/how-it-works-always-on-when-is-my-secondary-failover-ready.aspx

Bob Dorr - Principal SQL Server Escalation Engineer

Comments

  • Anonymous
    September 11, 2013
    Simply nice, Bob. Thanks for the post. I have a question: do we need to ensure that the application timeout is greater than the remote log flush time in case we have geographically distributed replicas hosted for AGs?

  • Anonymous
    November 11, 2017
    Hi Bob, is there any benchmark for HADR sync time? In our scenario it is around 16 ms. The replica is in the same data center. Is it good, bad or ugly?