How to track down High CPU in User Mode Applications - A live debug!
Written by Jeff Dailey.
Hello NTDebuggers, I’d like to talk about a common issue we deal with on a regular basis. We are often tasked with finding what functions are using CPU within a user mode process / application. Typically a user will find an application that is using more CPU then they expect it to, and this can affect overall system performance and responsiveness.
For this exercise I’ve written some contrived sample code called EATCPU. It’s included at the bottom of the blog post. The following image of task manager shows EATCPU consuming 41% CPU time. A customer or user may tell you this does not “normally” happen. It’s always good to ask what “normal” is. In this case we will say normal is ~10%.
The best case scenario is a live debug on the process that is running at high CPU levels. If you’re lucky enough to have a customer / user that will let you do a remote debug, and the problem reproduces on demand, you can take the following action.
You need to install the debugging tools for Windows, and set your symbol path. If at all possible acquire the symbols for the application you are debugging. We’ll assume that you are the expert that supports said program. If it’s written in house, get the symbols from the developer. If it’s from a third party, that vendor may be willing to provide you with symbols for their product. Microsoft has most of the symbols for our products available on our symbol server on the internet (!sympath srv*DownstreamStore*https://msdl.microsoft.com/download/symbols.)
The next thing is to attach windbg.exe to the process in question.
From the debuggers directory, type TLIST, this will list your process. Get the process id and then run WinDBG.EXE –p PROCESSID, or if your debugging a program like EATCPU, you can run WINDBG C:\program\EATCPU.EXE.
After the debugger is attached or the process is started in the debugger, reproduce the problem.
Microsoft (R) Windows Debugger Version 6.8.0001.0
Copyright (c) Microsoft Corporation. All rights reserved.
***** WARNING: Your debugger is probably out-of-date.
***** Check https://dbg for updates.
CommandLine: eatcpu.exe
Symbol search path is: srv*C:\symbols*\\symbols\symbols
Executable search path is:
ModLoad: 00400000 0041a000 eatcpu.exe
ModLoad: 779b0000 77b00000 ntdll.dll
ModLoad: 76780000 76890000 C:\Windows\syswow64\kernel32.dll
ModLoad: 62bb0000 62cd1000 C:\Windows\WinSxS\x86_microsoft.vc80.debugcrt_1fc8b3b9a1e18e3b_8.0.50727.762_none_24c8a196583ff03b\MSVCR80D.dll
ModLoad: 75cb0000 75d5a000 C:\Windows\syswow64\msvcrt.dll
(1090.164): Break instruction exception - code 80000003 (first chance)
eax=00000000 ebx=00000000 ecx=712b0000 edx=00000000 esi=fffffffe edi=77a90094
eip=779c0004 esp=0017faf8 ebp=0017fb28 iopl=0 nv up ei pl zr na pe nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000246
ntdll!DbgBreakPoint:
779c0004 cc int 3
0:000> g
(1090.11d4): Break instruction exception - code 80000003 (first chance)
eax=7efa3000 ebx=00000000 ecx=00000000 edx=77a1d894 esi=00000000 edi=00000000
eip=779c0004 esp=0109ff74 ebp=0109ffa0 iopl=0 nv up ei pl zr na pe nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000246
ntdll!DbgBreakPoint:
779c0004 cc int 3
0:007> .sympath SRV*c:\websymbols*https://msdl.microsoft.com/download/symbols
Symbol search path is: SRV*c:\websymbols*https://msdl.microsoft.com/download/symbols
0:007> g
(1090.17d4): Break instruction exception - code 80000003 (first chance)
eax=7efa3000 ebx=00000000 ecx=00000000 edx=77a1d894 esi=00000000 edi=00000000
eip=779c0004 esp=0109ff74 ebp=0109ffa0 iopl=0 nv up ei pl zr na pe nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000246
ntdll!DbgBreakPoint:
779c0004 cc int 3
Once the problem has started run the !runaway command.
0:007> !runaway
User Mode Time
Thread Time
2:c04 0 days 0:01:08.827 ß Note this thread, thread 2:c04 is using more CPU than any other.
7:17d4 0 days 0:00:00.000 ß Note the other threads are using very little if any CPU.
6:1a4c 0 days 0:00:00.000
5:d20 0 days 0:00:00.000
4:157c 0 days 0:00:00.000
3:1b28 0 days 0:00:00.000
1:1134 0 days 0:00:00.000
0:164 0 days 0:00:00.000
0:007> ~2s ß Using the thread number 2, set the thread context with the ~Ns command.
*** WARNING: Unable to verify checksum for eatcpu.exe
eax=cccccccc ebx=00b93c48 ecx=0000002b edx=00b937a8 esi=00000000 edi=00d9fcf0
eip=0041169c esp=00d9fcd0 ebp=00d9fd9c iopl=0 nv up ei pl nz na po nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010202
eatcpu!checkSomething+0x1c:
0041169c f3ab rep stos dword ptr es:[edi] es:002b:00d9fcf0=cccccccc
0:002> k ß Dump the call stack using k.
If you look at the following call stack, the applications code in this thread starts where you see EATCPU, the code before that is C Runtime code for begin thread. I want to trace all the code that is running under being thread. The assumption here is that I’ll find something looping and eating CPU. To do this I will use the WT command. However first I need to specify a beginning address for WT to start trace at.
Let's use UF to Unassemble the Function that started our code by taking the return address of eatcpu!myThreadFunction.
ChildEBP RetAddr
00d9fd9c 00411657 eatcpu!checkSomething+0x1c [c:\source\eatcpu\eatcpu\eatcpu.cpp @ 49]
00d9fe74 004115a8 eatcpu!trySomething+0x27 [c:\source\eatcpu\eatcpu\eatcpu.cpp @ 45]
00d9ff58 62bb4601 eatcpu!myThreadFunction+0x38 [c:\source\eatcpu\eatcpu\eatcpu.cpp @ 35]
00d9ff94 62bb459c MSVCR80D!_beginthread+0x221
00d9ffa0 768019f1 MSVCR80D!_beginthread+0x1bc
00d9ffac 77a2d109 kernel32!BaseThreadInitThunk+0xe
00d9ffec 00000000 ntdll!_RtlUserThreadStart+0x23 [d:\vistartm\base\ntos\rtl\rtlexec.c @ 2695]
0:002> uf 004115a8 ß This command will unassemble the function at this address beginning to end.
0:007> uf 004115a8
eatcpu!myThreadFunction [c:\source\eatcpu\eatcpu\eatcpu.cpp @ 30]:
30 00411570 55 push ebp
30 00411571 8bec mov ebp,esp
30 00411573 81eccc000000 sub esp,0CCh
30 00411579 53 push ebx
30 0041157a 56 push esi
30 0041157b 57 push edi
30 0041157c 8dbd34ffffff lea edi,[ebp-0CCh]
30 00411582 b933000000 mov ecx,33h
30 00411587 b8cccccccc mov eax,0CCCCCCCCh
30 0041158c f3ab rep stos dword ptr es:[edi]
31 0041158e 8b4508 mov eax,dword ptr [ebp+8]
31 00411591 8945f8 mov dword ptr [ebp-8],eax
eatcpu!myThreadFunction+0x24 [c:\source\eatcpu\eatcpu\eatcpu.cpp @ 33]:
33 00411594 b801000000 mov eax,1
33 00411599 85c0 test eax,eax
33 0041159b 7410 je eatcpu!myThreadFunction+0x3d (004115ad)
eatcpu!myThreadFunction+0x2d [c:\source\eatcpu\eatcpu\eatcpu.cpp @ 35]:
35 0041159d 8b45f8 mov eax,dword ptr [ebp-8]
35 004115a0 8b08 mov ecx,dword ptr [eax]
35 004115a2 51 push ecx
35 004115a3 e880faffff call eatcpu!ILT+35(?trySomethingYAHHZ) (00411028)
35 004115a8 83c404 add esp,4
36 004115ab ebe7 jmp eatcpu!myThreadFunction+0x24 (00411594)
eatcpu!myThreadFunction+0x3d [c:\source\eatcpu\eatcpu\eatcpu.cpp @ 37]:
37 004115ad 5f pop edi
37 004115ae 5e pop esi
37 004115af 5b pop ebx
37 004115b0 81c4cc000000 add esp,0CCh
37 004115b6 3bec cmp ebp,esp
37 004115b8 e8a1fbffff call eatcpu!ILT+345(__RTC_CheckEsp) (0041115e)
37 004115bd 8be5 mov esp,ebp
37 004115bf 5d pop ebp
37 004115c0 c3 ret
0:002> wt -or 00411570 ß We will use WT to Watch Trace this code. I’ve selected the starting address of the myThreadFunction function. I’ve also specified –or to print the return value of each function. Wt produces very visual output. It allows you to quickly identify patterns in the way the code executes that would be much more difficult just using the T (TRACE) command.
8 0 [ 0] ntdll!RtlSetLastWin32Error eax = 0
>> No match on ret
8 0 [ 0] ntdll!RtlSetLastWin32Error
2 0 [ 0] eatcpu!checkSomething
1 0 [ 1] eatcpu!ILT+345(__RTC_CheckEsp)
2 0 [ 1] eatcpu!_RTC_CheckEsp eax = 0
9 3 [ 0] eatcpu!checkSomething
1 0 [ 1] eatcpu!ILT+345(__RTC_CheckEsp)
2 0 [ 1] eatcpu!_RTC_CheckEsp eax = 0
12 6 [ 0] eatcpu!checkSomething eax = 0
>> No match on ret
12 6 [ 0] eatcpu!checkSomething
7 0 [ 0] eatcpu!trySomething
1 0 [ 1] eatcpu!ILT+345(__RTC_CheckEsp)
2 0 [ 1] eatcpu!_RTC_CheckEsp eax = 0
10 3 [ 0] eatcpu!trySomething eax = 0
>> No match on ret
10 3 [ 0] eatcpu!trySomething
9 0 [ 0] eatcpu!myThreadFunction ß I see a pattern, a loop. Beginning of loop.
1 0 [ 1] eatcpu!ILT+35(?trySomethingYAHHZ)
60 0 [ 1] eatcpu!trySomething
1 0 [ 2] eatcpu!ILT+180(?checkSomethingYAHHZ)
62 0 [ 2] eatcpu!checkSomething
5 0 [ 3] kernel32!SetLastError
16 0 [ 3] ntdll!RtlSetLastWin32Error eax = 0
64 21 [ 2] eatcpu!checkSomething
1 0 [ 3] eatcpu!ILT+345(__RTC_CheckEsp)
2 0 [ 3] eatcpu!_RTC_CheckEsp eax = 0
71 24 [ 2] eatcpu!checkSomething
1 0 [ 3] eatcpu!ILT+345(__RTC_CheckEsp)
2 0 [ 3] eatcpu!_RTC_CheckEsp eax = 0
74 27 [ 2] eatcpu!checkSomething eax = 0
67 102 [ 1] eatcpu!trySomething
1 0 [ 2] eatcpu!ILT+345(__RTC_CheckEsp)
2 0 [ 2] eatcpu!_RTC_CheckEsp eax = 0
70 105 [ 1] eatcpu!trySomething eax = 0
18 176 [ 0] eatcpu!myThreadFunction ßEnd of loop / beginning of loop
1 0 [ 1] eatcpu!ILT+35(?trySomethingYAHHZ)
60 0 [ 1] eatcpu!trySomething
1 0 [ 2] eatcpu!ILT+180(?checkSomethingYAHHZ)
62 0 [ 2] eatcpu!checkSomething
5 0 [ 3] kernel32!SetLastError ßAlways look for what might be going wrong! Last error can give you a clue. We are setting last error at the low level of the loop
16 0 [ 3] ntdll!RtlSetLastWin32Error eax = 0
64 21 [ 2] eatcpu!checkSomething
1 0 [ 3] eatcpu!ILT+345(__RTC_CheckEsp)
2 0 [ 3] eatcpu!_RTC_CheckEsp eax = 0
71 24 [ 2] eatcpu!checkSomething
1 0 [ 3] eatcpu!ILT+345(__RTC_CheckEsp)
2 0 [ 3] eatcpu!_RTC_CheckEsp eax = 0
74 27 [ 2] eatcpu!checkSomething eax = 0 ßAlso note checkSomething is returning ZERO, this might indicate a problem. You need to look at the code or assembler.
67 102 [ 1] eatcpu!trySomething
1 0 [ 2] eatcpu!ILT+345(__RTC_CheckEsp)
2 0 [ 2] eatcpu!_RTC_CheckEsp eax = 0
70 105 [ 1] eatcpu!trySomething eax = 0
27 352 [ 0] eatcpu!myThreadFunction ßEnd of loop / beginning of loop
1 0 [ 1] eatcpu!ILT+35(?trySomethingYAHHZ)
60 0 [ 1] eatcpu!trySomething
1 0 [ 2] eatcpu!ILT+180(?checkSomethingYAHHZ)
62 0 [ 2] eatcpu!checkSomething
5 0 [ 3] kernel32!SetLastError
16 0 [ 3] ntdll!RtlSetLastWin32Error eax = 0
64 21 [ 2] eatcpu!checkSomething
1 0 [ 3] eatcpu!ILT+345(__RTC_CheckEsp)
2 0 [ 3] eatcpu!_RTC_CheckEsp eax = 0
71 24 [ 2] eatcpu!checkSomething
1 0 [ 3] eatcpu!ILT+345(__RTC_CheckEsp)
2 0 [ 3] eatcpu!_RTC_CheckEsp eax = 0
74 27 [ 2] eatcpu!checkSomething eax = 0
67 102 [ 1] eatcpu!trySomething
1 0 [ 2] eatcpu!ILT+345(__RTC_CheckEsp)
2 0 [ 2] eatcpu!_RTC_CheckEsp eax = 0
70 105 [ 1] eatcpu!trySomething eax = 0
36 528 [ 0] eatcpu!myThreadFunction ßEnd of loop / beginning of loop
1 0 [ 1] eatcpu!ILT+35(?trySomethingYAHHZ)
60 0 [ 1] eatcpu!trySomething
1 0 [ 2] eatcpu!ILT+180(?checkSomethingYAHHZ)
62 0 [ 2] eatcpu!checkSomething
5 0 [ 3] kernel32!SetLastError
16 0 [ 3] ntdll!RtlSetLastWin32Error eax = 0
64 21 [ 2] eatcpu!checkSomething
1 0 [ 3] eatcpu!ILT+345(__RTC_CheckEsp)
2 0 [ 3] eatcpu!_RTC_CheckEsp eax = 0
71 24 [ 2] eatcpu!checkSomething
1 0 [ 3] eatcpu!ILT+345(__RTC_CheckEsp)
2 0 [ 3] eatcpu!_RTC_CheckEsp eax = 0
74 27 [ 2] eatcpu!checkSomething eax = 0
67 102 [ 1] eatcpu!trySomething
12930 instructions were executed in 12929 events (0 from other threads)
Function Name Invocations MinInst MaxInst AvgInst
eatcpu!ILT+180(?checkSomethingYAHHZ) 69 1 1 1
eatcpu!ILT+345(__RTC_CheckEsp) 210 1 1 1
eatcpu!ILT+35(?trySomethingYAHHZ) 70 1 1 1
eatcpu!_RTC_CheckEsp 210 2 2 2
eatcpu!checkSomething 70 60 74 73
eatcpu!myThreadFunction 1 630 630 630
eatcpu!trySomething 71 10 70 68
kernel32!SetLastError 70 5 5 5
ntdll!RtlSetLastWin32Error 70 16 16 16
0 system calls were executed
eax=cccccccc ebx=00b93c48 ecx=00000002 edx=00b937a8 esi=00000000 edi=00d9fe6c
eip=0041164c esp=00d9fda8 ebp=00d9fe74 iopl=0 nv up ei pl nz na pe nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010206
eatcpu!trySomething+0x1c:
0041164c f3ab rep stos dword ptr es:[edi] es:002b:00d9fe6c=cccccccc
0:002> !gle ßNow that we have broken in let's check and see what the last error is using !GLE (Get Last Error) This dumps out the last error from the TEB.
LastErrorValue: (Win32) 0x57 (87) - The parameter is incorrect.
LastStatusValue: (NTSTATUS) 0xc000000d - An invalid parameter was passed to a service or function.
0:007> bp kernel32!SetLastError ßLets set a breakpoint on last error to examine what is going on in the function calling it.
0:007> g
Breakpoint 1 hit
eax=cccccccc ebx=00b93c48 ecx=00000000 edx=00b937a8 esi=00d9fcd0 edi=00d9fd9c
eip=767913dd esp=00d9fcc8 ebp=00d9fd9c iopl=0 nv up ei pl zr na pe nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000246
kernel32!SetLastError:
767913dd 8bff mov edi,edi
0:002> kv ßGet the call stack
ChildEBP RetAddr
0:002> kv
ChildEBP RetAddr Args to Child
00d9fcc4 004116cb 00000057 00d9fe74 00000000 kernel32!SetLastError (FPO: [Non-Fpo]) ß0x57 Invalid parameter error, Why?
00d9fd9c 00411657 00000000 00d9ff58 00000000 eatcpu!checkSomething+0x4b (FPO: [Non-Fpo]) (CONV: cdecl) [c:\source\eatcpu\eatcpu\eatcpu.cpp @ 57]
00d9fe74 004115a8 00000000 00000000 00000000 eatcpu!trySomething+0x27 (FPO: [Non-Fpo]) (CONV: cdecl) [c:\source\eatcpu\eatcpu\eatcpu.cpp @ 45]
00d9ff58 62bb4601 0017ff34 4f9f12e9 00000000 eatcpu!myThreadFunction+0x38 (FPO: [Non-Fpo]) (CONV: cdecl) [c:\source\eatcpu\eatcpu\eatcpu.cpp @ 35]
00d9ff94 62bb459c 00b937a8 00d9ffac 768019f1 MSVCR80D!_beginthread+0x221 (FPO: [Non-Fpo])
00d9ffa0 768019f1 00b937a8 00d9ffec 77a2d109 MSVCR80D!_beginthread+0x1bc (FPO: [Non-Fpo])
00d9ffac 77a2d109 00b93c48 00d926a6 00000000 kernel32!BaseThreadInitThunk+0xe (FPO: [Non-Fpo])
00d9ffec 00000000 62bb4520 00b93c48 00000000 ntdll!_RtlUserThreadStart+0x23 (FPO: [Non-Fpo]) (CONV: stdcall) [d:\vistartm\base\ntos\rtl\rtlexec.c @ 2695]
0:002> !error 00000057 ßdouble check, using !error, this will decode the error into a human readable string.
Error code: (Win32) 0x57 (87) - The parameter is incorrect.
0:002> uf checkSomething ßlets disassemble the function calling SetLastError.
eatcpu!checkSomething [c:\source\eatcpu\eatcpu\eatcpu.cpp @ 49]:
49 00411680 55 push ebp
49 00411681 8bec mov ebp,esp
49 00411683 81ecc0000000 sub esp,0C0h
49 00411689 53 push ebx
49 0041168a 56 push esi
49 0041168b 57 push edi
49 0041168c 8dbd40ffffff lea edi,[ebp-0C0h]
49 00411692 b930000000 mov ecx,30h
49 00411697 b8cccccccc mov eax,0CCCCCCCCh
49 0041169c f3ab rep stos dword ptr es:[edi]
50 0041169e 837d0800 cmp dword ptr [ebp+8],0 ßCheck what our first parameter is on the stack, EBP+8 remember PLUS == Parameters. Note looking at the stack it’s 00000000
50 004116a2 741d je eatcpu!checkSomething+0x41 (004116c1) ßif parameter 1 ones 0 we are going to jump to this addres, else we execute the following code. (WE JUMP) eatcpu!checkSomething+0x24 [c:\source\eatcpu\eatcpu\eatcpu.cpp @ 52]:
52 004116a4 8bf4 mov esi,esp ßThe green would have been the good code path, non errant.
52 004116a6 68fa000000 push 0FAh
52 004116ab ff15a8814100 call dword ptr [eatcpu!_imp__Sleep (004181a8)] ßNote we sleep or do some work other then eat CPU here if we are passed non ZERO
52 004116b1 3bf4 cmp esi,esp
52 004116b3 e8a6faffff call eatcpu!ILT+345(__RTC_CheckEsp) (0041115e)
53 004116b8 b801000000 mov eax,1 ßWe are setting EAX to 1, this means we have succeded
53 004116bd eb15 jmp eatcpu!checkSomething+0x54 (004116d4) ßNow we jump to the clean up code for the fucntionn
eatcpu!checkSomething+0x41 [c:\source\eatcpu\eatcpu\eatcpu.cpp @ 57]:
57 004116c1 8bf4 mov esi,esp ßThis appears to be a failure case. We did not get an expected parameter so we report an error and return zero.
57 004116c3 6a57 push 57h ßPusing error 0x57 on the stack, invalid parameter.
57 004116c5 ff15a4814100 call dword ptr [eatcpu!_imp__SetLastError (004181a4)] ßOur call to setlasterror
57 004116cb 3bf4 cmp esi,esp
57 004116cd e88cfaffff call eatcpu!ILT+345(__RTC_CheckEsp) (0041115e)
58 004116d2 33c0 xor eax,eax ßXORing eax with eax will make EAX Zero. This is an error condition.
eatcpu!checkSomething+0x54 [c:\source\eatcpu\eatcpu\eatcpu.cpp @ 60]:
60 004116d4 5f pop edi
60 004116d5 5e pop esi
60 004116d6 5b pop ebx
60 004116d7 81c4c0000000 add esp,0C0h
60 004116dd 3bec cmp ebp,esp
60 004116df e87afaffff call eatcpu!ILT+345(__RTC_CheckEsp) (0041115e)
60 004116e4 8be5 mov esp,ebp
60 004116e6 5d pop ebp
60 004116e7 c3 ret
The key thing that should be observed by this scenario is that when dealing with a high CPU condition there is often some problem at the lower level of some loop condition that prevents the proper execution of code from happening. If you’re lucky the problem is reported by some error facility in the OS or the application. In either case you can use the above technique for isolation.
The following is the sample code for EATCPU.
// eatcpu.cpp : Defines the entry point for the console application.
//
#include "stdafx.h"
#include <windows.h>
#include <process.h>
void myThreadFunction(void *);
int doSomething(int);
int trySomething(int);
int checkSomething(int);
int _tmain(int argc, _TCHAR* argv[])
{
int truevalue = 1;
int falsevalue = 0;
// let's start some threads. These should represent worker threads in a process. Some of them will do “valid work” one will fail to do so.
_beginthread(myThreadFunction,12000,(void *)&truevalue);
_beginthread(myThreadFunction,12000,(void *)&truevalue);
_beginthread(myThreadFunction,12000,(void *)&truevalue);
_beginthread(myThreadFunction,12000,(void *)&falsevalue );
_beginthread(myThreadFunction,12000,(void *)&truevalue);
_beginthread(myThreadFunction,12000,(void *)&truevalue);
Sleep(10*60000);
return 0;
}
void myThreadFunction(void *value)
{
int *localvalue = (int *)value;
while(1)
{
trySomething(*localvalue);
}
}
int doSomething(int value)
{
return trySomething(value);
}
int trySomething(int value)
{
return checkSomething(value);
}
int checkSomething(int value)
{
if(value)
{ // Make sure we have have some valid input parameter.
//We will pretend we are doing work, this could be anything , file I/O etc.
Sleep(250);
return TRUE;
}
else
{ // This is an error condition, this function expects an NON Zero parameter and will report ZERO as an invalid parameter.
SetLastError(ERROR_INVALID_PARAMETER);
return FALSE;
}
}
Comments
Anonymous
May 15, 2008
PingBack from http://windows.wawblog.info/?p=38713Anonymous
May 22, 2008
Great reusable post! Thank you!Anonymous
July 08, 2008
One of the more difficult things about getting started with .NET Debugging is the mental shift that youAnonymous
July 14, 2008
The comment has been removedAnonymous
July 14, 2008
DebugHowtotrackdownHighCPUinUserModeApplications-Alivedebug!IsItaManagedoraNa...Anonymous
September 02, 2008
Can you also upload the EXE File.. that you have used