Compartir a través de


Part 2: Got Stack? No. We ran out and kv won’t tell me why!

Hello. It’s Ryan again with the second installment of my stack depletion walkthrough. Part 1 of this blog covered the initial analysis of a kernel memory dump captured due to a Stop 0x7f EXCEPTION_DOUBLE_FAULT. Our initial analysis revealed that kv was not able to provide us with a useful stack backtrace. Background information relating to Task States and Double Faults were also covered. If you haven’t yet reviewed this blog, you can find it at Part 1.

 

Previously, in part one of this blog, we reviewed the memory dump of a stop 0x7f EXCEPTION_DOUBLE_FAULT and found that we were not provided with a valid stack backtrace. Without a valid stack backtrace, we were unable to identify what depleted the stack. Whenever I am in the situation where kv is not able to properly walk the stack, my next course of action is to manually dump out the memory within the stack range using the dps command.

In the previous blog, we ran the !thread command to obtain the stack base and limit to view the thread in our fictiously named process, StackHog.exe.

!thread

Owning Process 874c6800 Image: StackHog.exe

...

Base b8ae9000 Limit b8ae6000

Let’s pass this address range to the dps command. The dps command will display the contents of memory and treat the data dumped out as pointers. It will then try to match up these pointer addresses with symbol information in the same manner that kv does.

dps b8ae6000 b8ae9000

.

.

.

 

omitting extremely long output

I'll spare the lengthy dps output here and instead describe what was observed. When dumping the stack in this manner, I noticed that one product’s drivers were appearing on the stack over and over again. This product also provided the executable that was running as the current process (I obtained this information from the !thread command that I ran to obtain the limit and base values). Running lm kv m and !lmi against these drivers verified that they were in fact all from the same product (StackHog.exe).

Using dps in this manner will often provide a good idea of what may have been using the stack. However, there are a few problems that may affect the reliability of this method. Dps simply dumps out whatever data is present anywhere on the stack. One problem is there can often be trash left over on the stack from previous activity that is unrelated to our present stack. Also, there may be trap handlers and other data that shows up to further complicate the call flow picture. In addition, the information will not clearly show the order of calls like a kv stack backtrace output will. The output can also be very lengthy since the entire range is displayed. On a stack where each call frame averages 8 frames, you may be only interested in the symbols from the return address in each frame. I have observed some stack frames where dps may resolve two or three symbols for that frame. All of this means that while dps is a good tool and often useful, on occasion some of the output from dps may be confusing instead of revealing. Sometimes when multiple unrelated code from various vendors (including Microsoft) are on the stack, you may need a more complete understanding of the true call flow. For these reasons, whenever possible, I want to be able to see the kv stack backtrace to display the call flow that led up to the crash.

Let's see if we can help kv to reconstruct the stack. Since the stack backtrace did not display properly on its own, we will need to help out a little to get the backtrace started. If we can provide good starting point values, the debugger can often dump out the rest of the stack. Also, when dealing with an overflowed stack, we will want to use the kf option instead of kv. The kf command will give us a better idea of how much space a driver and the calls it made are occupying on the stack. It does this by listing the amount of stack space between the stack frames. If you take these stack usage values as fact, you are assuming that the stack was backtraced correctly. Sometimes this isn’t the case when you don’t have symbols for all the code on the stack. So to investigate the stack usage, we need to get a good backtrace. The k command accepts parameters that will help it display the stack using address values that you provide. We need to provide a few addresses to any of the various versions of the k command by using the equal sign as I will demonstrate shortly. We need to provide the BasePtr, the StackPtr, and the InstructionPtr. For more information on the k command, refer to the msdn documentation .

Since the real problem here is that various modules have used up all the stack, we simply need to dump out most of that stack to see where most of the usage is. It isn't important that we identify what was going on when we died or what was at the very top of the stack. We just want to see as much of the stack as we can dumped out in an easy to understand format by using kf.

 So where can we find these values? I'll start by dumping out the stack using the address of the stack limit. Let's dump out this area of memory. I'll start by clearing the screen so that I can examine the output

3: kd>.cls

3: kd> dps b8ae6000 b8ae9000

                               <omitting output>

 

I wasn’t able to locate any valid patterns in the very top of the stack. This was probably due to lack of symbols and valid stack related register addresses combined with FPO or other optimizations. Things started making sense further down the stack in the stack range that I have listed below. Observe the patterns I have highlighted. Note that the symbol output is listed next to the return addresses. I will try to display my stack starting at this point. B8ae6100 is very close to the stack limit so we won’t be missing much of the output. We will get most of the stack output that we need to see if we start here.

b8ae60ec b8ae6100    <----------------------------------points to the next stack frame pointer (saved ebp) below

b8ae60f0 8081df65 nt!IofCallDriver+0x45 <---possible Return address. Start here.

b8ae60f4 8763f718

b8ae60f8 87758bd8

b8ae60fc 8b4abb00

b8ae6100  b8ae6128 This points to the next frame

b8ae6104 f7a2ec45 fltmgr!FltpDispatch+0x6f <----possible return address

b8ae6108 8b4abb00

b8ae610c 87758bd8

b8ae6110 00000000

b8ae6114 89751350

b8ae6118 00000080

b8ae611c 00000000

b8ae6120 b8ae6130

b8ae6124 8084cff9 nt!MmIsAddressValid+0xf

b8ae6128  b8ae613c

b8ae612c 8081df65 nt!IofCallDriver+0x45

b8ae6130 8765b6a8

b8ae6134 87758bd8

b8ae6138 89751350

b8ae613c  b8ae6144

b8ae6140 b958e196 BossHog+0x1196

b8ae6144  b8ae61c4

b8ae6148 b958f4bc BossHog!StackEater+0x80c

I'll start by identifying all of the stack address in the hopes of finding any candidates for saved stack frame pointers (saved ebp values). Based on the limit and base, these addresses will all start with b8ae. The next digit will be 6, 7, 8, or 9. I have identified all of the possible values. Next, I looked for patterns of addresses that are pointing to other pointers below them to create a chain. Next, I will look for possible return addresses. They should appear on the line right under the saved EBP value.

I'll try dumping the stack using the values from the first possible frame. The format is

kf=BasePtr, StackPtr, InstructionPtr

So now, it appears that we have a pattern, let me pass in these numbers to the kf command. If this works, then the stack walker will dump the stack out from this point down up to the maximum that you have set by using the .kframes command. I’ll start by raising the number of stack frames displayed using .kframes.

3: kd> .kframes 200

Default stack trace depth is 0n512 frames

Now, lets dump the stack out using kf. Kf will display the number of bytes of stack space used in hex on each line before it displays the frame.

3: kd> kf=b8ae6100 b8ae60ec 8081df65

  Memory ChildEBP RetAddr

          b8ae6100 f7a2ec45 nt!IofCallDriver+0x45

       28 b8ae6128 8081df65 fltmgr!FltpDispatch+0x6f

       14 b8ae613c b958e196 nt!IofCallDriver+0x45

WARNING: Stack unwind information not available. Following frames may be wrong.

        8 b8ae6144 b958f4bc BabyHog+0x1196

       80 b8ae61c4 8081df65 BabyHog!HogFarm+0x88c

       14 b8ae61d8 b80c18a6 nt!IofCallDriver+0x45

       84 b8ae625c b80cf367 BossHog+0x78a6

       10 b8ae626c b80cf3b7 BossHog+0x15367

       28 b8ae6294 8081df65 BossHog!DEVICEDISPATCH::DispatchPassThrough+0x48

       14 b8ae62a8 b76fbcf7 nt!IofCallDriver+0x45

       84 b8ae632c b7709ae6 BossHog01+0x7cf7

       10 b8ae633c b7709b36 BossHog01+0x15ae6

       28 b8ae6364 8081df65 BossHog01!DEVICEDISPATCH::DispatchPassThrough+0x48

       14 b8ae6378 8081e4ed nt!IofCallDriver+0x45

       18 b8ae6390 8085114a nt!IoPageRead+0x109

       9c b8ae642c 8085ea66 nt!MiDispatchFault+0xece

       84 b8ae64b0 8088c798 nt!MmAccessFault+0x89e

        0 b8ae64b0 808b64a6 nt!_KiTrap0E+0xdc

       c8 b8ae6578 bae5af2d nt!CcMapData+0x8c

       20 b8ae6598 bae5849b Ntfs!NtfsMapStream+0x4b

       74 b8ae660c bae5adf0 Ntfs!NtfsReadMftRecord+0x86

       38 b8ae6644 bae5afac Ntfs!NtfsReadFileRecord+0x7a

       38 b8ae667c bae19903 Ntfs!NtfsLookupInFileRecord+0x37

      110 b8ae678c bae1a6c4 Ntfs!NtfsLookupAllocation+0xdd

      1d0 b8ae695c bae1a87c Ntfs!NtfsPrepareBuffers+0x25d

      1dc b8ae6b38 bae1b1a6 Ntfs!NtfsNonCachedIo+0x1ee

       ec b8ae6c24 bae1b0c9 Ntfs!NtfsCommonRead+0xaf5

      1ac b8ae6dd0 8081df65 Ntfs!NtfsFsdRead+0x113

       14 b8ae6de4 f7a2ec45 nt!IofCallDriver+0x45

       28 b8ae6e0c 8081df65 fltmgr!FltpDispatch+0x6f

       14 b8ae6e20 b958e196 nt!IofCallDriver+0x45

        8 b8ae6e28 b958f4bc BabyHog+0x1196

       80 b8ae6ea8 8081df65 BabyHog!HogFarm+0x88c

       14 b8ae6ebc b80c18a6 nt!IofCallDriver+0x45

       84 b8ae6f40 b80cf367 BossHog+0x78a6

       10 b8ae6f50 b80cf3b7 BossHog+0x15367

       28 b8ae6f78 8081df65 BossHog!DEVICEDISPATCH::DispatchPassThrough+0x48

       14 b8ae6f8c b76fbcf7 nt!IofCallDriver+0x45

       84 b8ae7010 b7709ae6 BossHog01+0x7cf7

       10 b8ae7020 b7709b36 BossHog01+0x15ae6

       28 b8ae7048 8081df65 BossHog01!DEVICEDISPATCH::DispatchPassThrough+0x48

       14 b8ae705c 8081e4ed nt!IofCallDriver+0x45

       18 b8ae7074 8085114a nt!IoPageRead+0x109

       9c b8ae7110 8085ea66 nt!MiDispatchFault+0xece

       84 b8ae7194 8088c798 nt!MmAccessFault+0x89e

        0 b8ae7194 808b64a6 nt!_KiTrap0E+0xdc

       c8 b8ae725c bae5af2d nt!CcMapData+0x8c

       20 b8ae727c bae5d9d5 Ntfs!NtfsMapStream+0x4b

       30 b8ae72ac bae5f5e4 Ntfs!ReadIndexBuffer+0x8f

      174 b8ae7420 bae5f786 Ntfs!NtfsUpdateFileNameInIndex+0x62

       fc b8ae751c bae5f8c6 Ntfs!NtfsUpdateDuplicateInfo+0x2b0

      208 b8ae7724 bae5c8d9 Ntfs!NtfsCommonCleanup+0x1e82

      170 b8ae7894 8081df65 Ntfs!NtfsFsdCleanup+0xcf

       14 b8ae78a8 f7a2ec45 nt!IofCallDriver+0x45

       28 b8ae78d0 8081df65 fltmgr!FltpDispatch+0x6f

       14 b8ae78e4 b958e196 nt!IofCallDriver+0x45

        8 b8ae78ec b958f472 BabyHog+0x1196

       80 b8ae796c 8081df65 BabyHog!HogFarm+0x842

       14 b8ae7980 b80c18a6 nt!IofCallDriver+0x45

       84 b8ae7a04 b80cf367 BossHog+0x78a6

       10 b8ae7a14 b80cf3b7 BossHog+0x15367

       28 b8ae7a3c 8081df65 BossHog!DEVICEDISPATCH::DispatchPassThrough+0x48

       14 b8ae7a50 b770a8ac nt!IofCallDriver+0x45

       28 b8ae7a78 b76fb994 BossHog01!DEVICEDISPATCH::LowerDevicePassThrough+0x48

       7c b8ae7af4 b76fbbaf BossHog01+0x7994

       94 b8ae7b88 b7709ae6 BossHog01+0x7baf

       10 b8ae7b98 b7709b36 BossHog01+0x15ae6

       28 b8ae7bc0 8081df65 BossHog01!DEVICEDISPATCH::DispatchPassThrough+0x48

       14 b8ae7bd4 808f9732 nt!IofCallDriver+0x45

       30 b8ae7c04 80934bac nt!IopCloseFile+0x2ae

       30 b8ae7c34 809344ad nt!ObpDecrementHandleCount+0xcc

       28 b8ae7c5c 80934546 nt!ObpCloseHandleTableEntry+0x131

       44 b8ae7ca0 80934663 nt!ObpCloseHandle+0x82

       10 b8ae7cb0 8088978c nt!NtClose+0x1b

        0 b8ae7cb0 8082e811 nt!KiFastCallEntry+0xfc

       7c b8ae7d2c b8d8ec2d nt!ZwClose+0x11

       50 b8ae7d7c b8d8ede5 MamaHog+0x5c2d

       54 b8ae7dd0 b8d8fa85 MamaHog+0x5de5

      164 b8ae7f34 b8d917fe MamaHog+0x6a85

       40 b8ae7f74 b8d8d22a MamaHog+0x87fe

      2c4 b8ae8238 b958ecdf MamaHog+0x422a

       24 b8ae825c b958eee0 BabyHog!HogFarm+0xaf

       34 b8ae8290 8081e103 BabyHog!HogFarm+0x2b0

       30 b8ae82c0 bae1a22c nt!IopfCompleteRequest+0xcd

       10 b8ae82d0 bae5c00a Ntfs!NtfsCompleteRequest+0xc8

      104 b8ae83d4 8081df65 Ntfs!NtfsFsdCreate+0x48c

       14 b8ae83e8 f7a3c458 nt!IofCallDriver+0x45

       2c b8ae8414 8081df65 fltmgr!FltpCreate+0xe4

       14 b8ae8428 b958e196 nt!IofCallDriver+0x45

        8 b8ae8430 b958f71c BabyHog+0x1196

       60 b8ae8490 8081df65 BabyHog!HogFarm+0xaec

       14 b8ae84a4 b80d012b nt!IofCallDriver+0x45

       28 b8ae84cc b80c1862 BossHog!DEVICEDISPATCH::LowerDevicePassThrough+0x48

       8c b8ae8558 b80cf367 BossHog+0x7862

       10 b8ae8568 b80cf3b7 BossHog+0x15367

       28 b8ae8590 8081df65 BossHog!DEVICEDISPATCH::DispatchPassThrough+0x48

       14 b8ae85a4 b76f9639 nt!IofCallDriver+0x45

       4c b8ae85f0 b76fbb42 BossHog01+0x5639

       94 b8ae8684 b7709ae6 BossHog01+0x7b42

       10 b8ae8694 b7709b36 BossHog01+0x15ae6

       28 b8ae86bc 8081df65 BossHog01!DEVICEDISPATCH::DispatchPassThrough+0x48

       14 b8ae86d0 808f8f71 nt!IofCallDriver+0x45

       e8 b8ae87b8 80937942 nt!IopParseDevice+0xa35

       80 b8ae8838 80933a76 nt!ObpLookupObjectName+0x5b0

       54 b8ae888c 808eae25 nt!ObOpenObjectByName+0xea

       7c b8ae8908 808ec136 nt!IopCreateFile+0x447

       48 b8ae8950 b76ff4ae nt!IoCreateFileSpecifyDeviceObjectHint+0x52

       9c b8ae89ec b76ff0e6 BossHog01+0xb4ae

       58 b8ae8a44 b7685f8b BossHog01+0xb0e6

       50 b8ae8a94 b76867a3 daddyHog+0x3f8b

       28 b8ae8abc b76f66e7 daddyHog+0x47a3

       38 b8ae8af4 b76f6e76 BossHog01+0x26e7

      124 b8ae8c18 b770a0d5 BossHog01+0x2e76

       44 b8ae8c5c 808f5e2f BossHog01!DEVICEDISPATCH::DispatchPassThrough+0x5e7

       a4 b8ae8d00 808eed08 nt!IopXxxControlFile+0x255

       34 b8ae8d34 8088978c nt!NtDeviceIoControlFile+0x2a

        0 b8ae8d34 7c8285ec nt!KiFastCallEntry+0xfc

          0335e534 00000000 0x7c8285ec

Problem solved, we have our stack. Please note that the technique demonstrated above is only applicable to an x86 platform. The x64 architecture is completely different. For more information, please refer to Trey Nash’s previous blog

The same drivers I observed previously using the dps output are listed; however, it’s much easier to follow the call flow this time. Let’s see how much stack these hogs are using:

3: kd> ? 0x8+0x80+0x84+0x10+0x28+0x84+0x10+0x28+0x8+0x80+0x84+0x10+0x28+0x84+0x10+0x8+0x80+0x84+0x10+0x28+0x28+0x7c+0x94+0x10+0x28+0x50+0x54+0x164+0x40+0x2c4+0x24+0x34+0x8+0x60+0x28+0x8c+0x10+0x28+0x9c+0x58+0x50+0x28+0x38+0x124+0x44

Evaluate expression: 4176 = 00001050

How many kb is this?

3: kd> ? 4176/1024

Evaluate expression: 4 = 00000004

I would also like to dump out one of the functions that is using a large amount of stack space to show you how you can dig deeper into problems like this. Sometimes this level of granularity is needed when a vendor is attempting to optimize their code by showing them what caused their large allocations to occur.

Let’s review the frame listed below that used 0x2c4 bytes of stack space

       40 b8ae7f74 b8d8d22a MamaHog+0x87fe

      2c4 b8ae8238 b958ecdf MamaHog+0x422a

       24 b8ae825c b958eee0 BabyHog!HogFarm+0xaf

I have underlined the return address of this stack frame. This is the address right after the instruction where the call to MamaHog took place. After MamaHog would have completed, execution would of course have continued with the line of code right after the call. To view the call to MamaHog, let’s unassemble backwards one instruction to see the call being made.

3: kd> ub b958ecdf L1

BabyHog!HogFarm+0xa8:

             

call dword ptr BabyHog!HogFarm +0x7460 (b9596090 )  <-stores the location we called

Let’s dump out this location to see what we actually called. The code dereferences this location to make the actual call.

3: kd> dps b9596090 L1

b9596090 b8d8c0d0 BabyHog +0x30d0<-----highlighted address is the function called

3: kd> uf b8d8c0d0

b8d8c0d0 55 push ebp

b8d8c0d1 8bec mov ebp,esp

b8d8c0d3 6aff push 0FFFFFFFFh

b8d8c0d5 6840e2d9b8 push offset BabyHog +0x15240 (b8d9e240)

b8d8c0da 6808b2d9b8 push offset BabyHog +0x12208 (b8d9b208)

b8d8c0df 64a100000000 mov eax,dword ptr fs:[00000000h]

b8d8c0e5 50 push eax

b8d8c0e6 64892500000000 mov dword ptr fs:[0],esp

b8d8c0ed 81c470fdffff add esp,0FFFFFD70h

To subtract 656, we are adding -656 to esp to bump the stack up. This is where the compiler is allocating stack space for storage of local variables. I am guessing there were multiple structures being allocated directly on the stack. If the programmer had instead called ExAllocatePoolWithTag to obtain memory, we could have instead only stored the pointers to this memory on the stack. Simply using the stack for storage is of course faster than calling out to get memory. The developer should balance the performance needs of the code with the need to conserve the stack space which is a limited resource.

3: kd> .formats 0x0FFFFFD70

Evaluate expression:

  Hex: fffffd70

  Decimal: -656

So this one instruction raised the stack more than .5 kb all by itself. The function made a bunch of other pushes (some of which are listed above) which when combined with the return address pushed by the call resulted in 708 (0x2c4 was listed by kf above) bytes of stack space to be allocated by this one function.

3: kd> .formats 2c4

Evaluate expression:

  Hex: 000002c4

  Decimal: 708

Two allocations like this will use up almost 1.5kb of the 12k stack space. By itself, this isn’t necessarily an issue; however, you can see from the output above how liberal stack usage by multiple calls from stacked up drivers can quickly add up. So what have we discovered here? Without even considering the fact that this products drivers may also be responsible for some of the stack space used by calls that they may have initiated, they have used at least 4k of the 12k stack. Also, the full 12k of stack space is not available for driver use. This is because the operating system also requires stack space for such overhead as the I/O operation, the file system components, thread startup, etc. The customer removed the product which prevented further bugchecks while the vendor was being engaged to assist or provide leaner hogs.

Summary

This two part blog has covered CPU task states, stack overflows, x86 stack reconstruction techniques, and examining functions to observe the stack allocations taking place. Hopefully, this blog will help you to understand what went wrong the next time that you encounter a Stop 0x7f (EXCEPTION_DOUBLE_FAULT).

You don’t have to wait until you encounter an UNEXPECTED_KERNEL_MODE_TRAP (7f) Arg1: 00000008, EXCEPTION_DOUBLE_FAULT to get familiar with this. Open up any dump file and see if you can find the limit and base. Then observe the values of your ebp and esp registers. If you are feeling truly geeky, dump the stack range out, identify patterns, and try passing various values to kf.

Please check out our previous blog post on this topic:

https://blogs.msdn.com/ntdebugging/archive/2008/02/01/kernel-stack-overflows.aspx

Keep in mind that this two part blog has only discussed stop 0x7f bugchecks where Arg1 lists value 0x8 EXCEPTION_DOUBLE_FAULT. There are other causes for a 0x7f bugcheck. For more information, please refer to:

314102 General Causes of STOP 0x0000007F Errors

 

Bug Check 0x7F: UNEXPECTED_KERNEL_MODE_TRAP

 

 

References:

“Windows Internals, Fifth Edition” By Mark E. Russinovich, David A. Solomon, with Alex Ionescu

Chapter 9, page 786, “Kernel Stacks”

Share this post :

Comments

  • Anonymous
    March 13, 2012
    Trey Nash’s previous blog - link has "." @ end [Thanks Sam. We can't edit these older posts without breaking the formatting.  Hopefully your comment will help the next reader.]

  • Anonymous
    June 23, 2014
    This seems like a very useful tutorial in debugging kernel stack overflows, thanks. I am currently trying to debug a kernel stack overflow which happens during an interrupt and that appears to complicate things. In my case the top of the stack is correctly interpreted by windbg, but the kernel frames shown at the top of the stack only take up a fraction of the available 12k. The rest of the stack, below that, is nonsense. If I dump the raw stack I can see some user mode functions mentioned, so i wonder whether the 'nonsense' part of the stack I get with kcf/kb is the user stack (because when an interrupt occurs while running user mode the stack is switched from user stack to kernel stack). But even if that is the case, why would the kernel stack be almost full when the interrupt occurs. Any tips gratefully received! [User mode code won't use kernel stack so it will not contribute to this problem.  The limitations on the size of kernel stack only impact kernel code.  Sometimes optimized code without symbols can break the debugger's stack walking, requiring manual reconstruction of the stack as demonstrated in this article.]