How to Analyze "Deadlocked Schedulers" Dumps?
How to Analyze "Deadlocked Schedulers" Dumps?
Newer version of this post is available in https://mssqlwiki.com/2010/06/15/how-to-analyze-deadlocked-schedulers-dumps/
Do you see "Deadlocked Schedulers" errors similar to one below and
stuck?
From SQLServer Errorlog
**Dump thread - spid = 0, PSS = 0x0000000000000000, EC =
0x0000000000000000
***Stack Dump being sent to C:\Program Files\Microsoft SQL
Server\MSSQL.1\MSSQL\LOG\SQLDump0001.txt
* BEGIN STACK DUMP:
* Deadlocked Schedulers
* Short Stack Dump
Stack Signature for the dump is 0x00000000000003D0
New queries assigned to process on Node 0 have not been picked up by a worker thread in the last 60 seconds. Blocking or long-running queries can
contribute to this condition, and may degrade client response time. Use the "max worker threads" configuration option to increase number of allowable
threads, or optimize current running queries. SQL Process Utilization: 0%. System Idle: 69%.
Cause
We get Deadlocked Schedulers error when Scheduler Monitor detects Threads(workers) are not Progressing on schedulers.
Some of common causes are
1.Most of the tasks are waiting on a single resource and SQL Server could not spawn new thread to take new work request and there is no Idle thread to process the new work Request . In systems with multiple nodes (Numa) If all the threads which belong to schedulers of single node is exhausted (or) Schedulers not progressing on single node can cause deadlocked scheduler condition.
2. Excessive blocking, Very long running Queries executed by all workers, All the threads waiting on some resource.
Steps to analyze "Deadlocked Schedulers" Dumps.
To analyze the dump download and Install Windows Debugger from This link
Step 1:
Open Windbg . Choose File menu –> select Open crash dump –>Select the Dump file (SQLDump000#.mdmp)
Step 2:
on command window type
.sympath srv*c:\Websymbols*https://msdl.microsoft.com/download/symbols;
Step 3:
Type .reload /f and hit enter. This will force debugger to immediately load all the symbols.
Step 4:
Verify if symbols are loaded for SQL Server by using the debugger command lmvm
0:002> lmvm sqlservr
start end module name
00000000`01000000 00000000`03679000 sqlservr T (pdb symbols) c:\websymbols\sqlservr.pdb\21E4AC6E96294A529C9D99826B5A7C032\sqlservr.pdb
Loaded symbol image file: sqlservr.exe
Image path: C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Binn\sqlservr.exe
Image name: sqlservr.exe
Timestamp: Wed Oct 07 21:15:52 2009 (4ACD6778)
CheckSum: 025FEB5E
ImageSize: 02679000
File version: 2005.90.4266.0
Product version: 9.0.4266.0
File flags: 0 (Mask 3F)
File OS: 40000 NT Base
File type: 1.0 App
File date: 00000000.00000000
Translations: 0000.04b0 0000.04e4 0409.04b0 0409.04e4
Step 5:
Type ~*kL 20 and look at the stack of all the threads to find what majority of threads are doing.
1. If it is blocking issue and If most of the threads are waiting to
acquire a lock you will find the most of the stack similar to one below. (We try
to acquire lock and go to wait, since someone is holding a lock)
ntdll!ZwSignalAndWaitForSingleObject
kernel32!SignalObjectAndWait
sqlservr!SOS_Scheduler::SwitchContext
sqlservr!SOS_Scheduler::Suspend
sqlservr!SOS_Event::Wait
sqlservr!LockOwner::Sleep
sqlservr!lck_lockInternal
sqlservr!GetLock
2. If most of threads are stuck while trying to write profiler events
to the destination you might find stack similar to one below
ntdll!ZwSignalAndWaitForSingleObject
kernel32!SignalObjectAndWait
sqlservr!SOS_Scheduler::SwitchContext
sqlservr!SOS_Task::Sleep
sqlservr!CTraceRowsetIoProvider::GetFreeBuffers
sqlservr!CTraceWriteRequest::InitForRowsetTrace
sqlservr!CTraceRowsetIoProvider::InitializeWriteRequest
sqlservr!CTrace::WriteRecord
sqlservr!CTraceController::ProduceRecord
sqlservr!CTraceData::TracePreBatchEvent
sqlservr!CSQLSource::Execute
sqlservr!process_request
sqlservr!process_commands
sqlservr!SOS_Task::Param::Execute
sqlservr!SOS_Scheduler::RunTask
sqlservr!SOS_Scheduler::ProcessTasks
3. If your stack’s are like one below refer https://support.microsoft.com/default.aspx?scid=kb;EN-US;974205
sqlservr!SpinlockBase::Sleep sqlservr!SpinlockBase::SpinToAcquire
sqlservr!TSyncHashTable_EntryAccessorsqlservr!CQSIndexStatsMgr::AddNewMissingIndex
sqlservr!CIdxSuggestion::Register
sqlservr!COptExpr::PqteConvert sqlservr!CPhyOp_Top::PqteConvert sqlservr!COptExpr::PqteConvert
sqlservr!COptExpr::PqteConvertTree sqlservr!COptContext::PcxteOptimizeQuery
sqlservr!CQuery::Optimize sqlservr!CQuery::PqoBuild sqlservr!CStmtQuery::InitQuery
sqlservr!CStmtSelect::Init
4. If you see many stacks like the one below it could be BPOOL memory pressure (or) Lazy writer waiting on I/O
sqlservr!BPool::Steal
sqlservr!SQLSinglePageAllocator::AllocatePages
sqlservr!MemoryNode::AllocatePagesInternal
sqlservr!MemoryClerkInternal::AllocatePages
sqlservr!IMemObj::PbGetNewPages
sqlservr!CSlotPageMgr::PbAllocate
5. If you see many stacks like the one below it should be because of excessive parallelism
sqlservr!CQScanXProducerNew::Open
sqlservr!FnProducerOpen
sqlservr!FnProducerThread
sqlservr!SubprocEntrypoint
6. If you see many stacks like the one below (Many threads waiting to flush log) it should be because of disk bottleneck's. Check if you see "I/O requests taking longer than 15 seconds" messages in Errorlog before Deadlocked Schedulers Dumps. Refer https://mssqlwiki.com/sqlwiki/sql-performance/io-bottlenecks/ for troubleshooting I/O issues.
sqlservr!SOS_Event::Wait
sqlservr!SQLServerLogMgr::WaitLCFlush sqlservr!SQLServerLogMgr::LogFlush sqlservr!SQLServerLogMgr::WaitLogFlush
sqlservr!XdesRMFull::Commit
Related blogs:
https://mssqlwiki.com/sqlwiki/sql-performance/basics-of-sql-server-memory-architecture/
https://mssqlwiki.com/sqlwiki/sql-performance/troubleshooting-sql-server-memory/
https://mssqlwiki.com/sqlwiki/sql-performance/io-bottlenecks/
If you liked this post, do like us on FaceBook at https://www.facebook.com/mssqlwiki and join our FaceBook group https://www.facebook.com/mssqlwiki#!/groups/454762937884205/
Regards
Karthick P.K
Comments
Anonymous
October 29, 2010
Thanks for sharing this article.Anonymous
March 17, 2011
I'm trying to review one of our dumps but do not get the nice names for the sqlservr! entries. I am not loading the SQL symbols? Thanks for this post.Anonymous
July 03, 2011
Contain very good information. Thans You KarthickAnonymous
January 28, 2012
Good article