NTDebugging Puzzler 0x00000005 (Better late than never)
Hello NTDebuggers, from time to time we see the following problem. It’s another access violation, and the debug notes below are from a minidump.
Here is what we need to know…
· Generally speaking what happened to cause this AV?
· What method would you use to isolate the root cause of the failure?
There are a lot of ways to do this. We look forward to hearing your approach.
We will post our methods and answer at the end of the week. If you need anything please let us know.
-------------------------------------------
Microsoft (R) Windows Debugger Version 6.8.0001.0
Copyright (c) Microsoft Corporation. All rights reserved.
Loading Dump File [D:\test123.dmp]
User Mini Dump File: Only registers, stack and portions of memory are available
0:000> k 123
ChildEBP RetAddr
0017f93c 75e4edb5 ntdll!ZwWaitForMultipleObjects+0x15
0017f9d8 75e430c3 kernel32!WaitForMultipleObjectsEx+0x11d
0017f9f4 75ef2084 kernel32!WaitForMultipleObjects+0x18
0017fa60 75ef22b1 kernel32!WerpReportFaultInternal+0x16c
0017fa74 75ebbe60 kernel32!WerpReportFault+0x70
0017fb00 7732d15a kernel32!UnhandledExceptionFilter+0x1c1
0017fb08 773000c4 ntdll!_RtlUserThreadStart+0x6f
0017fb1c 77361d05 ntdll!_EH4_CallFilterFunc+0x12
0017fb44 772eb6d1 ntdll!_except_handler4+0x8e
0017fb68 772eb6a3 ntdll!ExecuteHandler2+0x26
0017fc10 772cee57 ntdll!ExecuteHandler+0x24
0017fc10 10011127 ntdll!KiUserExceptionDispatcher+0xf
*** ERROR: Module load completed but symbols could not be loaded for crash3.exe
WARNING: Frame IP not in any known module. Following frames may be wrong.
0017ff40 0040104a 0x10011127
0017ffa0 75eb19f1 crash3+0x104a
0017ffac 7732d109 kernel32!BaseThreadInitThunk+0xe
0017ffec 00000000 ntdll!_RtlUserThreadStart+0x23
0:000> lm
start end module name
00400000 0040d000 crash3 (no symbols)
6c250000 6c288000 odbcint (deferred)
6c290000 6c2f5000 odbc32 (deferred)
72a00000 72a86000 comctl32 (deferred)
74820000 749b4000 comctl32_74820000 (deferred)
75240000 75251000 samlib (deferred)
75260000 75281000 ntmarta (deferred)
754b0000 75510000 secur32 (deferred)
75510000 75570000 imm32 (deferred)
75700000 75790000 gdi32 (deferred)
757a0000 75870000 user32 (deferred)
758a0000 758a6000 nsi (deferred)
758b0000 759f4000 ole32 (deferred)
75a00000 75aaa000 msvcrt (deferred)
75ab0000 75ba0000 rpcrt4 (deferred)
75ba0000 75c1d000 usp10 (deferred)
75c20000 75c75000 shlwapi (deferred)
75d60000 75e27000 msctf (deferred)
75e30000 75f40000 kernel32 (pdb symbols)
76140000 76189000 Wldap32 (deferred)
76190000 7624f000 advapi32 (deferred)
76250000 76d1e000 shell32 (deferred)
76d20000 76d94000 comdlg32 (deferred)
76da0000 76dcd000 ws2_32 (deferred)
77280000 77287000 psapi (deferred)
77290000 77299000 lpk (deferred)
772b0000 77400000 ntdll (pdb symbols)
Good luck and happy debugging.
Jeff-
[Update: our answer. Posted 5/13/2008]
We enjoyed seeing different people’s approaches to this week’s puzzler. This was a simple module unload: we loaded a library, called GetProcAddress, freed the library, and then called the function through the now-stale pointer. The dump was a mini dump created via .dump /m C:\dump file. There are various ways this type of scenario may arise. Obviously someone could unload a library explicitly, but why would they? In most cases I’ve seen, it was due to a ref count problem in a COM object: poor accounting leads to one too many decrements, the DLL gets unloaded, and the result is this simple crash footprint.
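The ref count scenario can be illustrated with a deliberately simplified model. This is only a sketch: ToyLoader and helper.dll are hypothetical names, and the real path to an unload in COM (DllCanUnloadNow, CoFreeUnusedLibraries) has more steps than a single counter. It only shows how one extra decrement unmaps a module that another caller still holds a function pointer into.

```python
class ToyLoader:
    """Hypothetical stand-in for the OS loader: one load count per module."""

    def __init__(self):
        self.load_counts = {}

    def load_library(self, name):
        # Each load bumps the module's count.
        self.load_counts[name] = self.load_counts.get(name, 0) + 1

    def free_library(self, name):
        # Each free drops the count; at zero the module is unmapped.
        self.load_counts[name] -= 1
        if self.load_counts[name] == 0:
            del self.load_counts[name]

    def is_mapped(self, name):
        return name in self.load_counts


loader = ToyLoader()
loader.load_library("helper.dll")  # client A takes a reference
loader.load_library("helper.dll")  # client B takes a reference
loader.free_library("helper.dll")  # client A releases its reference
loader.free_library("helper.dll")  # bug: client A releases a second time

# Client B never released, yet the module is gone. Any function pointer
# B obtained earlier now points into unmapped address space.
still_mapped = loader.is_mapped("helper.dll")
print("helper.dll still mapped:", still_mapped)
```

In the real process, the next call through client B's pointer is what produces the access violation at an address that belongs to no loaded module.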
There are quite a few ways to track this down. First of all, if you had the debugger attached and got a full dump or a /ma dump, you would have seen the unloaded module list. This would have been a dead giveaway, and it is part of why we used .dump /m. There are other options you can enable that make tracking module loads easy under the debugger. I personally like “loader snaps” if I’m trying to track down module load shenanigans. To enable this, just go into the image section of the gflags tool and enable loader snaps for the exe in question. Now attach a debugger and watch the module load and GetProcAddress details scroll by.
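Even without an unloaded module list, the lm output in the minidump supports the diagnosis. The sketch below copies the ranges from that output and checks which module owns each address taken from the stack. The helper names parse_lm and owning_module are made up for illustration, and the end address is assumed to be exclusive.

```python
# Module ranges copied from the lm output in the dump (start, end, name).
LM_OUTPUT = """\
00400000 0040d000 crash3 (no symbols)
6c250000 6c288000 odbcint (deferred)
6c290000 6c2f5000 odbc32 (deferred)
72a00000 72a86000 comctl32 (deferred)
74820000 749b4000 comctl32_74820000 (deferred)
75240000 75251000 samlib (deferred)
75260000 75281000 ntmarta (deferred)
754b0000 75510000 secur32 (deferred)
75510000 75570000 imm32 (deferred)
75700000 75790000 gdi32 (deferred)
757a0000 75870000 user32 (deferred)
758a0000 758a6000 nsi (deferred)
758b0000 759f4000 ole32 (deferred)
75a00000 75aaa000 msvcrt (deferred)
75ab0000 75ba0000 rpcrt4 (deferred)
75ba0000 75c1d000 usp10 (deferred)
75c20000 75c75000 shlwapi (deferred)
75d60000 75e27000 msctf (deferred)
75e30000 75f40000 kernel32 (pdb symbols)
76140000 76189000 Wldap32 (deferred)
76190000 7624f000 advapi32 (deferred)
76250000 76d1e000 shell32 (deferred)
76d20000 76d94000 comdlg32 (deferred)
76da0000 76dcd000 ws2_32 (deferred)
77280000 77287000 psapi (deferred)
77290000 77299000 lpk (deferred)
772b0000 77400000 ntdll (pdb symbols)
"""


def parse_lm(text):
    """Turn 'start end name ...' lines into (start, end, name) tuples."""
    modules = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue
        modules.append((int(parts[0], 16), int(parts[1], 16), parts[2]))
    return modules


def owning_module(modules, address):
    """Return the name of the module containing address, or None."""
    for start, end, name in modules:
        if start <= address < end:
            return name
    return None


modules = parse_lm(LM_OUTPUT)
# Faulting IP, then the two return addresses below it on the stack.
for ip in (0x10011127, 0x0040104A, 0x75EB19F1):
    print(f"{ip:08x} -> {owning_module(modules, ip)}")
```

The return addresses resolve to crash3 and kernel32, while 0x10011127 resolves to nothing, which matches the debugger's "Frame IP not in any known module" warning and points at code that was mapped when the pointer was obtained but is no longer loaded.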
Yet another popular approach is to use Process Monitor. This tool is not only easy to set up, but it also gives you great logs with call stacks and other details such as registry accesses.
This puzzler provided the bare minimum data required. We did not give you much to go on because sometimes in real debugging scenarios you have to work with a lack of data. I really liked how many people questioned the source of the dump file. It really shows how familiar you all are with the various dump types.
Great work!
Comments
Anonymous
May 06, 2008
No symbols? Probably mismatched PDBs... First, let's check the executable version:
>lmv crash3
If this didn't help...
>!sym noisy
>.reload
After the PDB mismatch is resolved:
>.ecxr
[You are right, symbols aren't loaded, but they aren't needed to debug the issue.]
Anonymous
May 07, 2008
(Sorry if there are duplicates - I'm not getting the "Comment Submitted" confirmation message when I submit. Please remove any duplicates.)
It looks to me that crash3 called a function (0x10011127) but the module containing that function has been unloaded. The module does not appear to have been rebased. It would seem to be desirable to determine what module loaded at 0x10000000 in crash3's address space.
If one is able to reproduce the problem by executing crash3 and doing live analysis, I can imagine that one may wish to profile the app using Dependency Walker; or use Process Explorer's DLL view (if the problem lends itself to this type of inspection); or use Process Monitor to look at the "Load Image" operations for crash3; or use a debugger to capture the LOAD_DLL_DEBUG_EVENTs. Or, I suppose one could do lma 0x10000000 in the dump.
Once the module has been determined, it would seem that one could set a breakpoint on ntdll!LdrUnloadDll and look for 0x10000000 in Param1, and check the stack at the time of the unload.
Anonymous
May 08, 2008
From my initial response:
> Or, I suppose one could do lma 0x10000000 in the dump. <
It appears this type of minidump does not contain unloaded module information, so lma would not be helpful in this case.
Anonymous
May 08, 2008
I think that you have to follow the data structure pointed to by EXCEPTION_POINTERS to eventually find your way back to where the exception happened. I keep forgetting how to do that and need to look it up each time, but I just use the Debug Diagnostic Tool now, which seems to have that logic baked in.
http://www.microsoft.com/downloads/details.aspx?FamilyID=28bd5941-c458-46f1-b24d-f60151d875a3&displaylang=en
Anonymous
May 09, 2008
Hi Skywing,
Only the WER in Vista SP1 can generate the dump files, as stated in the link below:
http://msdn.microsoft.com/en-us/library/bb787181(VS.85).aspx
So WER may not be an option if they are not using SP1 :-)
I am curious: if we specify "DumpType=1(Mini dump)" in the Vista SP1 WER configuration, will the unloaded modules be saved? What are the default flags used in this scenario?
I was surprised to hear that the windbg ".dump test.dmp" default option did not include the "/u" option. Thank you for pointing this out.
Jeffrey Tan