Deadlock Analysis (Basic)
Today I want to start a small blog series for the deadlock analysis with SQL Server. In the first part we will cover the basics about deadlocks, why and when they occur and how to capture them.
The second part which will be more exciting will cover the basic analysis of deadlocks and the different approaches to solve and to avoid them.
The last part takes the excitement to the extreme and will be an introduction and discussion of some in-depth SQL Scripts, which will help us to analyze and understand the deadlocks in more detail.
Unless otherwise stated the provided information and scripts run on SQL Server 2005 or higher.
What are Deadlocks?
A deadlock occurs, when multiple threads are waiting on each other’s resources. All threads are waiting, for the other thread(s) to release their resources, which they will never do (as they wait for the other resources as well). A deadlock has to contain at least two nodes, but can contain much more than that; the biggest deadlock I saw had 13 nodes.
The most common misconception about deadlocks is that they are a problem of the database system - the database should solve them on its own and should not throw any error to the application. But deadlocks are happening in all database systems that are using locks to guarantee the logical integrity of the data. Deadlock detection and cancellation is a database feature to prevent stale locks, which otherwise would never get resolved by their own.
You can easily create a deadlock with two nodes on SQL Server by the following procedure:
- Open two connections to the SQL Server (Connection ConA and ConB)
- On connection ConA prepare a database and a table with some data with the script:
USE master
GO
IF EXISTS (SELECT * FROM sys.databases WHERE name = N'DL_Analysis')
DROP DATABASE [DL_Analysis]
GO
CREATE DATABASE [DL_Analysis]
GO
ALTER DATABASE [DL_Analysis] SET RECOVERY SIMPLE
USE [DL_Analysis]
GO
SET ANSI_NULLS ON
SET ANSI_PADDING ON
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [SAPTABLE](
[NAME] [nvarchar](40) COLLATE SQL_Latin1_General_CP850_BIN2 NOT NULL DEFAULT N' ',
[SPRSL] [nvarchar](1) COLLATE SQL_Latin1_General_CP850_BIN2 NOT NULL DEFAULT N' ',
[TEXT] [nvarchar](70) COLLATE SQL_Latin1_General_CP850_BIN2 NOT NULL DEFAULT N' ',
CONSTRAINT [SAPTABLE~0] PRIMARY KEY CLUSTERED
(
[NAME] ASC,
[SPRSL] ASC
)ON [PRIMARY]
)ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [SAPTABLE~NAM] ON [SAPTABLE]([NAME]) ON [PRIMARY]
GO
-- Insert two rows of test data
INSERT INTO SAPTABLE([NAME],[SPRSL],[TEXT]) VALUES (N'Miller',N'E',N'English name')
INSERT INTO SAPTABLE([NAME],[SPRSL],[TEXT]) VALUES (N'Schulze',N'D',N'Deutscher Name.')
Go
on connection ConA run:
use [DL_Analysis]
BEGIN TRANSACTION
-- Update the first row, starting with Miller
UPDATE SAPTABLE SET SPRSL = N'D' WHERE NAME = N'Miller' AND SPRSL = N'E'
-- Now run the script for the deadlock partner
-- We will wait in this connection for 1 minute....
WAITFOR DELAY'00:01:00'
-- .. and then will execute the next Update here (now Schulze)
UPDATE SAPTABLE SET SPRSL = N'E' WHERE NAME = N'Schulze' AND SPRSL = N'D'
-- Deadlock occurred.
ROLLBACK -- to be able to repeat it
This transaction will update column SPRSL, where NAME equals to ‘Miller’, than it will wait for one minute and then it will try to update the row, where NAME = ‘Schulze’
- While the transaction on connection ConA is waiting, run this code on connection ConB:
use [DL_Analysis]
BEGIN TRANSACTION
-- Run both updates at once, starting with Schulze
UPDATE SAPTABLE SET SPRSL = N'F' WHERE NAME = N'Schulze' AND SPRSL = N'D'
UPDATE SAPTABLE SET SPRSL = N'F' WHERE NAME = N'Miller' AND SPRSL = N'E'
-- Deadlock occurred.
ROLLBACK -- to be able to repeat it
It will update the row where NAME = ‘Schulze’ and then the row where NAME = ‘Miller’. It can update the row ‘Schulze’, but has to wait on the lock on row ‘Miller’, as the request on connection ConA has already changed it and has an exclusive lock on it.
- Once the waiting on connection ConA is over, the request will try to update the row ‘Schulze’, but has to wait, as the request on connection ConB has already changed it and has an exclusive lock on it.
- Now we have a deadlock, the request ConA has changed ‘Miller’ and has a lock (X lock) on it, but has to wait on ‘Schulze’, which is blocked by ConB. Connection ConB has an X lock on ‘Schulze’ and has to wait for to lock on ‘Miller’, which is held by ConA.
- The SQL Server is able to detect such deadlocks and will resolve it by killing the transaction with the least amount of generated log, the transaction on connection ConA in our case. The transaction is ended and rolled back with an error 1205:
Msg 1205, Level 13, State 51, Line 3 Transaction (Process ID 60) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
- The transaction on connection ConB can continue.
How to capture deadlocks?
When a deadlock occurs in a SAP system, we do not get much information in the ST22 dumps or within the SQL Server error log. The database monitor of the SAP system (Transaction DBACOCKPIT) shows the occurred deadlocks, the table on which the deadlock occurs and the work process information. This is only done for the ABAP part of the system (no Java deadlocks are shown here) and it shows only the victims; there is no information about the winning processes.
SQL Profiler trace
Therefore SAP provides a note (32129 (login required)) to setup a more detailed deadlock tracing for SQL Server. This note sets several SQL Server trace flags and configures a SQL Server Profiler Trace.
The trace flags that get set are:
Trace flag |
Description |
Purpose |
1204 |
Deadlock victim/chain output |
This trace flag produces detailed information about deadlocks, including participants and victims. |
1222 |
XML Deadlock Information to the error log |
This trace flag pipes XML Deadlock information to the error log |
3605 |
Send trace output to the error log |
This trace flag is necessary to get the above debug information into the error log. If it is not set, the other trace flags will not be able to write to the SQL Server Error log. |
For the background profiler trace you have to execute the script that is attached to the note. It will create a profiler trace to a local file in your SQL Server log directory (where the Error Log lives), which will only contain the deadlock event of the SQL Server events. The trace will be registered as a startup trace and will be active instantly.
Once all of this (SQL Profiler Trace and Trace flags) is in place, one has to wait, that the deadlock occurs again. When this happens, a deadlock graph is written to the trace file and as XML written to the error log. These two files we need for a first deadlock analysis.
When you open the deadlock trace file (e.g. deadlock01_20130201-120016.trc) in the SQL Server Profiler, you can examine the deadlock in a graphical format:
You can identify the two nodes and some of the locks they held.
Each thread is represented with an oval; the one which is crossed is the one that was chooses as deadlock victim. The other one is the deadlock winner. The ovals are showing more information about the thread:
Line |
Content |
Server process id |
The Session ID (SPID) of the request |
Server batch id |
An internal reference number for the batch, where the statement is running in. In an SAP system it is zero. |
Execution context id |
When the query is executed in parallel, this shows the ID of one of the parallel running threads. In an SAP system it is mostly zero. |
Deadlock priority |
If the thread has an associated deadlock priority, it is shown here. SAP does not use deadlock priority. |
Log Used |
The amount of transaction log, that was already generated by this thread |
Owner ID |
An internal reference number for the transaction. |
Transaction descriptor |
An internal reference number for the state of the transaction. |
In the middle of the graph, in the rectangles, information about the two participating locks is shown:
Line |
Content |
Locktype |
Type of lock the thread holds or requests. Common values are “Key Lock”, “Page Lock” or ”RID Lock” |
HoBt ID |
Heap or B-Tree ID, the ID of the internal allocation structure of the table or partition |
Associated ObjId |
The ObjectID that is associated to the HoBt ID. The name of the object can be retrieved via the OBJECT_NAME function. |
Index name |
The name of the index, on which the locks are held or requested. |
The lines between the threads and locks are representing the held and requested locks. In our example the right thread (Winner) holds an X lock (Owner Mode: X) on the upper lock and requests a U lock (Request Mode: U) to the lower lock. The victim thread holds the X lock on the lower lock and requests a U-lock for the upper lock. Following the lines you can circle around between the ovals and will never come to an end – a deadlock occurred.
We now have information about the threads and the locks, but what is about the statement that causes all this? Where can I find it?
It is revealed when you hover over the left and right ovals:
In our example it just shows the script we ran before.
You can extract all information about the deadlock from the SQL Server deadlock trace in XML format. Therefore right click on the line with the deadlock and choose “Extract event data”:
You then can save this event as an XDL file (SQL Server Deadlock File). When you double click the file it gets opened it the SQL Server Management Studio and shown in a graphical format (as in the SQL Profiler). But when you open it with a text editor (e.g. notepad) it is opened as an XML file.
I’d like to describe the content of this XML file in more detail, as it contains quite useful information about the deadlock and the participating processes. The XML has a structure like:
XML Tag |
Description |
Deadlock-list |
The surrounding bracket around all deadlocks (most of the time only one) deadlocks |
Deadlock |
First deadlock, with Victim Process ID |
Process-list |
The list of involved Processes (most of the time only two) |
Process |
Process number one, the victim. This section contains many attributes about the Process itself, they will be discussed later |
ExecutionStack |
The statement that causes the deadlock, which SQL Handle and (sometimes) as a prepared statement. |
Inputbuffer |
The complete input buffer of the process, in an SAP system most of the time only one statement, in our example the complete script of the connection. Has the statement from the ExecutionStack included. |
Process |
Process number two, the survivor or winner of the deadlock |
ExecutionStack |
The same information as for the victim. |
Inputbuffer |
The same information as for the victim. |
Resource-List |
All the resources that are involved in the deadlock. Mostly only a list of two KEY locks, but might be other as well (e.g. PAGE locks) |
Keylock/PageLock |
Detailed information about the first lock itself, they will be discussed later. |
Owner-List |
The list of owner of this resource, mostly one process. |
Owner |
The owner with process id and lock mode (X, U, S etc.) |
Waiter-List |
The list of waiter of this resource; can be more than one. |
Waiter |
The waiter with process id and lock mode (X, U, S etc.) and request type (WAIT). |
Keylock/PageLock |
Detailed information about the second lock. |
Owner-List |
The same information as for the first lock. |
Owner |
The same information as for the first lock. |
Waiter-List |
The same information as for the first lock. |
Waiter |
The same information as for the first lock. |
Here is an example (fragment), how this looks like in our case:
XML Tag |
Example |
Deadlock-list |
|
Deadlock |
victim="process3f9f4c8" |
Process-list |
|
Process |
id="process3f9f4c8" taskpriority="0" logused="684" waitresource="KEY: 11:72057594038779904 (7e576f73234b)" waittime="2185" ownerId="286942698" transactionname="user_transaction" lasttranstarted="2013-02-01T12:00:44.620" XDES="0x86dab950" lockMode="U" schedulerid="1" kpid="3640" status="suspended" spid="59" sbid="0" ecid="0" priority="0" trancount="2" lastbatchstarted="2013-02-01T12:00:44.537" lastbatchcompleted="2013-02-01T11:51:13.350" clientapp="Microsoft SQL Server Management Studio - Query" hostname="MyHostname" hostpid="4868" Loginname="MyDomain\User" isolationlevel="read committed (2)" xactid="286942698" currentdb="11" lockTimeout="4294967295" clientoption1="671090784" clientoption2="390200" |
ExecutionStack |
<frame procname="adhoc" line="13" stmtstart="110" sqlhandle="0x0200000060960b1fa1d4f597f9ad060adbabb997bfa96045"> UPDATE [SAPTABLE] set [SPRSL] = @1 WHERE [NAME]=@2 AND [SPRSL]=@3 </frame> <frame procname="adhoc" line="13" stmtstart="724" sqlhandle="0x0200000047ca5e0e3acdcd8f7c7dc91d718edf54ac1624ac"> UPDATE SAPTABLE SET SPRSL = N'E' WHERE NAME = N'Schulze' AND SPRSL = N'D' -- Deadlock occured </frame> |
Inputbuffer |
use [DL_Analysis] BEGIN TRANSACTION -- Update the first row, starting with Miller UPDATE SAPTABLE SET SPRSL = N'D' WHERE NAME = N'Miller' AND SPRSL = N'E' -- Now run the script for the deadlock partner -- we will wait in this connection for 10 seconds .... WAITFOR DELAY '00:00:10' -- .. and then will execute the next Update here (now Schulze) UPDATE SAPTABLE SET SPRSL = N'E' WHERE NAME = N'Schulze' AND SPRSL = N'D' -- Deadlock occured |
Resource-List |
|
Keylock/PageLock |
hobtid="72057594038779904" dbid="11" objectname="DL_Analysis.dbo.SAPTABLE" indexname="SAPTABLE~0" id="lockd047c00" mode="X" associatedObjectId="72057594038779904" |
Owner-List |
|
Owner |
id="process7041708" mode="X" |
Waiter-List |
|
Waiter |
id="process3f9f4c8" mode="U" requestType="wait" |
Keylock/PageLock |
hobtid="72057594038779904" dbid="11" objectname="DL_Analysis.dbo.SAPTABLE" indexname="SAPTABLE~0" id="lock28b09180" mode="X" associatedObjectId="72057594038779904" |
Owner-List |
|
Owner |
id="process3f9f4c8" mode="X" |
Waiter-List |
|
Waiter |
id="process7041708" mode="U" requestType="wait" |
The process attributes are (from the example) :
Attribute |
Description |
id="process3f9f4c8" |
The ID of the process |
taskpriority="0" |
The priority of the task, most of the time 0. |
logused="684" |
The amount of transaction log that was generated through this transaction. The process with the least amount will be the victim. |
waitresource="KEY: 11:72057594038779904 (7e576f73234b)" |
The resource, the process is waiting on. It is defines as Type (RID, KEY, PAGE etc.):DatabaseID:AllocationUnitID(HashValue of the Key) |
waittime="2185" |
How long has the process waited on this resource? |
ownerId="286942698" |
Internal number of the transaction. |
transactionname="user_transaction" |
The name or type of transaction |
lasttranstarted="2013-02-01T12:00:44.620" |
When did the last transaction started? |
XDES="0x86dab950" |
|
lockMode="U" |
The lock mode the process is waiting for. |
schedulerid="1" |
The CPU this process is running on. |
kpid="3640" |
The Window Kernel Thread ID |
status="suspended" |
The status of the process (suspended, as it is waiting for the lock). |
spid="59" |
The SQL Server Session ID |
sbid="0" |
The System BatchID or Request ID |
ecid="0" |
The Execution Context, when executed in parallel, otherwise Zero. |
priority="0" |
The deadlock priority of the process. |
trancount="2" |
How many nested transaction has the process open? |
lastbatchstarted="2013-02-01T12:00:44.537" |
When did the last batch started? |
lastbatchcompleted="2013-02-01T11:51:13.350" |
When did the last batch completed? |
clientapp="Microsoft SQL Server Management Studio - Query" |
The name of the client application. |
hostname="MyHostname" |
The hostname where the client application is running on. |
hostpid="4868" |
The Window Process ID of the application |
loginname="MyDomain\MyUser" |
The user that is connected to the SQL Server. |
isolationlevel="read committed (2)" |
The transaction isolation level of the process. |
xactid="286942698" |
Transaction ID |
currentdb="11" |
Database ID |
lockTimeout="4294967295" |
The maximum lock timeout. |
clientoption1="671090784" |
The client options that are set (e.g. XACT_ABORT, ANSI_NULLS etc.) |
clientoption2="390200" |
The client options that are set (cont). |
The resource attributes are (from the example):
Resource Atrribute |
Description |
hobtid="72057594038779904" |
The Heap-or-B-Tree ID, the internal ID for the allocation unit of the table |
dbid="11" |
The database ID |
objectname="DL_Analysis.dbo.SAPTABLE" |
The object |
indexname="SAPTABLE~0" |
The index the resource belongs to |
id="lockd047c00" |
The internal representation of the lock. |
mode="X" |
The requested or owned lock mode |
associatedObjectId="72057594038779904" |
The object id of the table, most of the time identical with the HoBtID. Only when the table is partitioned the two are different (one ObjectID has then multiple HoBtID). |
In the next parts of the deadlock blog series we will make use of some of these attributes.
SQL Server Error Log File
When the trace flags are set correctly, you will get the exact same information as in the XML file in the SQL Server error log file (see this fragment):
2013-02-01 13:24:23.430 spid9s Deadlock encountered .... Printing deadlock information
2013-02-01 13:24:23.430 spid9s Wait-for graph
2013-02-01 13:24:23.430 spid9s NULL
2013-02-01 13:24:23.430 spid9s Node:1
2013-02-01 13:24:23.430 spid9s KEY: 11:72057594038779904 (7e576f73234b) CleanCnt:2 Mode:X
2013-02-01 13:24:23.430 spid9s Grant List 2:
2013-02-01 13:24:23.430 spid9s Owner:0x000000001E5D4780 Mode: X Flg:0x40 Ref:0
Life:02000000 SPID:58 ECID:0 XactLockInfo
0x0000000099191400
2013-02-01 13:24:23.430 spid9s SPID: 58 ECID: 0 Statement Type: UPDATE Line #: 6
2013-02-01 13:24:23.430 spid9s Input Buf: Language Event: use [DL_Analysis]
BEGIN TRANSACTION
-- Run both Updates at once, starting with Schulze
UPDATE SAPTABLE SET SPRSL = N'F' WHERE NAME = N'Schulze'
UPDATE SAPTABLE SET SPRSL = N'F' WHERE NAME = N'Miller' A
2013-02-01 13:24:23.430 spid9s Requested by:
2013-02-01 13:24:23.430 spid9s ResType:LockOwner Stype:'OR'Xdes:0x00000000071299C0 Mode: U
SPID:59 BatchID:0 ECID:0 TaskProxy:(0x0000000034AB8540)
Value:0x9845dfc0 Cost:(0/684)
2013-02-01 13:24:23.430 spid9s NULL
2013-02-01 13:24:23.430 spid9s Node:2
2013-02-01 13:24:23.430 spid9s KEY: 11:72057594038779904 (86c3d23ec34f) CleanCnt:2 Mode:X
2013-02-01 13:24:23.430 spid9s Grant List 0:
2013-02-01 13:24:23.430 spid9s Owner:0x000000008913FB80 Mode: X Flg:0x40 Ref:0
Life:02000000 SPID:59 ECID:0 XactLockInfo: 0x0000000007129A0
2013-02-01 13:24:23.430 spid9s SPID: 59 ECID: 0 Statement Type: UPDATE Line #: 13
2013-02-01 13:24:23.430 spid9s Input Buf: Language Event: use [DL_Analysis]
..
System health X-Event Session
If you are on SQL Server 2008 or higher you can select the XML directly from the internal SystemHealth X-Event session:
select
XEventData.XEvent.value('(.)[1]/@timestamp','datetime')as[DateAndTime],
cast(XEventData.XEvent.value('(data/value)[1]','nvarchar(max)')asXML)asDeadlockGraph
FROM
(selectCAST(target_dataasxml)asTargetData
fromsys.dm_xe_session_targetsstjoin
sys.dm_xe_sessionssons.address=st.event_session_address
wherename='system_health')ASData
CROSSAPPLYTargetData.nodes('//RingBufferTarget/event')ASXEventData(XEvent)
whereXEventData.XEvent.value('@name','nvarchar(4000)')='xml_deadlock_report'
DBACOCKPIT (newer releases)
In newer releases of the SAP Database Monitor you can directly access the XML from within the transaction DBACOCKPIT:
Summary
Today we discussed what deadlocks are, when they occur and how to get more information about them. Using a basic and fast approach we have collected as much information as possible. In the next part of the deadlock series, I will show how to use this information in basic analysis methods and describe some ways to solve or to avoid the deadlocks.
Comments
Anonymous
May 27, 2014
"In the next part of the deadlock series, I will show how to use this information in basic analysis methods and describe some ways to solve or to avoid the deadlocks." Has there ever been a second part of this series? I couldn't find anything. Any hint would be appreciated. Thanks! SvenAnonymous
January 05, 2015
I find it useful especially I am facing a deadlock issue now but can someone please lead me to the next part of the series.Anonymous
January 18, 2015
The next part (Examples) was published today at blogs.msdn.com/.../deadlock-analysis-examples.aspx