Driver Object Corruption Triggers Bugcheck 109
My name is Victor Mei, and I am an Escalation Engineer in Platforms Global Escalation Services in GCR. Some customers I have worked with have a strong interest in debugging, but they usually get frustrated when I tell them, “To find the cause from this dump, you have to get the code and understand the design behind it.”
This time I am going to talk about a crash dump where basic debugging commands and some knowledge of the Windows kernel are enough to find the root cause:
1: kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************
CRITICAL_STRUCTURE_CORRUPTION (109)
This bugcheck is generated when the kernel detects that critical kernel code or
data have been corrupted. There are generally three causes for a corruption:
1) A driver has inadvertently or deliberately modified critical kernel code
or data. See https://www.microsoft.com/whdc/driver/kernel/64bitPatching.mspx
2) A developer attempted to set a normal kernel breakpoint using a kernel
debugger that was not attached when the system was booted. Normal breakpoints,
"bp", can only be set if the debugger is attached at boot time. Hardware
breakpoints, "ba", can be set at any time.
3) A hardware corruption occurred, e.g. failing RAM holding kernel code or data.
Arguments:
Arg1: a3a01f5a3763f650, Reserved
Arg2: b3b72be089e32ceb, Reserved
Arg3: ffffe001a2894a20, Failure type dependent information
Arg4: 000000000000001c, Type of corrupted region, can be
0 : A generic data region
1 : Modification of a function or .pdata
2 : A processor IDT
3 : A processor GDT
4 : Type 1 process list corruption
5 : Type 2 process list corruption
6 : Debug routine modification
7 : Critical MSR modification
8 : Object type
9 : A processor IVT
a : Modification of a system service function
b : A generic session data region
c : Modification of a session function or .pdata
d : Modification of an import table
e : Modification of a session import table
f : Ps Win32 callout modification
10 : Debug switch routine modification
11 : IRP allocator modification
12 : Driver call dispatcher modification
13 : IRP completion dispatcher modification
14 : IRP deallocator modification
15 : A processor control register
16 : Critical floating point control register modification
17 : Local APIC modification
18 : Kernel notification callout modification
19 : Loaded module list modification
1a : Type 3 process list corruption
1b : Type 4 process list corruption
1c : Driver object corruption
1d : Executive callback object modification
1e : Modification of module padding
1f : Modification of a protected process
20 : A generic data region
21 : A page hash mismatch
22 : A session page hash mismatch
102 : Modification of win32k.sys
The stack only contains one frame:
# Child-SP RetAddr Call Site
00 ffffd000`223721c8 00000000`00000000 nt!KeBugCheckEx
You will be disappointed if you try to find out who called KeBugCheckEx from the stack, because KeBugCheckEx is the only function address on it.
Since there is nothing more on the stack, let’s take a closer look at what WinDbg tells us about the bugcheck parameters:
Arg3: ffffe001a2894a20, Failure type dependent information
Arg4: 000000000000001c, Type of corrupted region, can be
1c : Driver object corruption
Tip: Always use the latest version of WinDbg; older versions may not tell you that 1c means driver object corruption.
Arg4 indicates driver object corruption, so the type-dependent information in Arg3 should be the driver object, right? Let’s check the object:
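If the debugger at hand does not decode Arg4, the lookup can be done by hand from the list printed by !analyze -v. The following is only a minimal sketch: the table holds a handful of the values shown above, and the helper name describe_region is made up for illustration.

```python
# Subset of the Arg4 values listed in the !analyze -v output above.
# The table is deliberately incomplete; extend it from the full list if needed.
CORRUPTED_REGION_TYPES = {
    0x0: "A generic data region",
    0x1: "Modification of a function or .pdata",
    0x8: "Object type",
    0x12: "Driver call dispatcher modification",
    0x19: "Loaded module list modification",
    0x1C: "Driver object corruption",
    0x102: "Modification of win32k.sys",
}


def describe_region(arg4: int) -> str:
    """Hypothetical helper: map a bugcheck 0x109 Arg4 value to its description."""
    return CORRUPTED_REGION_TYPES.get(arg4, f"unknown region type {arg4:#x}")


# Arg4 exactly as printed in the dump.
print(describe_region(int("000000000000001c", 16)))
```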
1: kd> !drvobj ffffe001a2894a20
Driver object (ffffe001a2894a20) is for:
ffffe001a2894a20: is not a driver object
That did not work, so let’s try !pool:
1: kd> !pool ffffe001a2894a20
Pool page ffffe001a2894a20 region is Nonpaged pool
ffffe001a2894000 size: 510 previous size: 0 (Allocated) FMcr
ffffe001a2894510 size: 50 previous size: 510 (Allocated) Wmip
ffffe001a2894560 size: 60 previous size: 50 (Allocated) NtfJ
ffffe001a28945c0 size: 60 previous size: 60 (Allocated) EtwR
ffffe001a2894620 size: 60 previous size: 60 (Allocated) EtwR
ffffe001a2894680 size: 60 previous size: 60 (Allocated) EtwR
ffffe001a28946e0 size: 60 previous size: 60 (Allocated) EtwR
ffffe001a2894740 size: 210 previous size: 60 (Allocated) Devi
*ffffe001a2894950 size: 200 previous size: 210 (Allocated) *Driv
Pooltag Driv : Driver objects
ffffe001a2894b50 size: 2b0 previous size: 200 (Allocated) Devi
ffffe001a2894e00 size: 200 previous size: 2b0 (Allocated) Driv
So the address does belong to a driver object allocation, but what is the base address of the nt!_DRIVER_OBJECT inside it? If you have no experience with this, a quick method is to look at a known driver object, for example:
1: kd> !drvobj \driver\acpi
Driver object (ffffe001a14df060) is for:
\Driver\ACPI
1: kd> !pool ffffe001a14df060
Pool page ffffe001a14df060 region is Nonpaged pool
*ffffe001a14df000 size: 200 previous size: 0 (Allocated) *Driv
Pooltag Driv : Driver objects
ffffe001a14df200 size: 10 previous size: 200 (Free) Free
1: kd> ?ffffe001a14df060-ffffe001a14df000
Evaluate expression: 96 = 00000000`00000060
So it looks like the offset is 0x60. Let’s try again:
1: kd> !drvobj ffffe001a2894950+0x60
Driver object (ffffe001a28949b0) is for:
\FileSystem\Ntfs
Great, we got the object.
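The arithmetic behind this trick is small enough to write down. The sketch below simply repeats the two steps with the addresses from this dump; the function names are hypothetical. The offset covers the pool header and the object headers that precede the object body, so it is not guaranteed to be the same for every allocation, which is why the result is still confirmed with !drvobj.

```python
def header_offset(known_object: int, known_pool_block: int) -> int:
    """Distance from the start of a pool block to the object stored in it."""
    return known_object - known_pool_block


def candidate_object(target_pool_block: int, offset: int) -> int:
    """Apply the same distance to another pool block of the same type."""
    return target_pool_block + offset


# Addresses taken from the !drvobj and !pool output above.
acpi_object = 0xFFFFE001A14DF060   # \Driver\ACPI driver object
acpi_block = 0xFFFFE001A14DF000    # its 'Driv' pool block
ntfs_block = 0xFFFFE001A2894950    # 'Driv' pool block containing Arg3

offset = header_offset(acpi_object, acpi_block)
print(hex(offset))                                # 0x60
print(hex(candidate_object(ntfs_block, offset)))  # 0xffffe001a28949b0
```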
Arg3 is ffffe001a2894a20, which is at offset 0x70 into the driver object:
1: kd> ?ffffe001a2894a20-ffffe001a28949b0
Evaluate expression: 112 = 00000000`00000070
1: kd> dt nt!_DRIVER_OBJECT ffffe001a28949b0
+0x000 Type : 0n4
+0x002 Size : 0n336
+0x008 DeviceObject : 0xffffe001`a144c030 _DEVICE_OBJECT
+0x010 Flags : 0x92
+0x018 DriverStart : 0xfffff800`0d044000 Void
+0x020 DriverSize : 0x1f6000
+0x028 DriverSection : 0xffffe001`a142e2c0 Void
+0x030 DriverExtension : 0xffffe001`a2894b00 _DRIVER_EXTENSION
+0x038 DriverName : _UNICODE_STRING "\FileSystem\Ntfs"
+0x048 HardwareDatabase : 0xfffff802`64b31580 _UNICODE_STRING "\REGISTRY\MACHINE\HARDWARE\DESCRIPTION\SYSTEM"
+0x050 FastIoDispatch : 0xfffff800`0d0ae640 _FAST_IO_DISPATCH
+0x058 DriverInit : 0xfffff800`0d06e280 long Ntfs!GsDriverEntry+0
+0x060 DriverStartIo : (null)
+0x068 DriverUnload : 0xfffff800`0c8d5d24 void +0
+0x070 MajorFunction : [28] 0xfffff800`0d126a10 long Ntfs!NtfsFsdCreate+0
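Mapping an offset back to a field can also be done mechanically. The sketch below is illustrative only: the offsets are copied from the dt output above (so they apply to this build), the structure size comes from the Size field (0n336), and field_at is a hypothetical helper.

```python
import bisect

# (offset, name) pairs copied from the dt nt!_DRIVER_OBJECT output above.
DRIVER_OBJECT_FIELDS = [
    (0x000, "Type"), (0x002, "Size"), (0x008, "DeviceObject"),
    (0x010, "Flags"), (0x018, "DriverStart"), (0x020, "DriverSize"),
    (0x028, "DriverSection"), (0x030, "DriverExtension"),
    (0x038, "DriverName"), (0x048, "HardwareDatabase"),
    (0x050, "FastIoDispatch"), (0x058, "DriverInit"),
    (0x060, "DriverStartIo"), (0x068, "DriverUnload"),
    (0x070, "MajorFunction"),
]
DRIVER_OBJECT_SIZE = 336  # the Size field shown above (0n336 == 0x150)


def field_at(offset: int) -> tuple[str, int]:
    """Return (field name, byte offset inside that field) for a struct offset.

    Alignment padding is attributed to the preceding field.
    """
    if not 0 <= offset < DRIVER_OBJECT_SIZE:
        raise ValueError(f"offset {offset:#x} is outside the structure")
    starts = [start for start, _ in DRIVER_OBJECT_FIELDS]
    index = bisect.bisect_right(starts, offset) - 1
    start, name = DRIVER_OBJECT_FIELDS[index]
    return name, offset - start


arg3 = 0xFFFFE001A2894A20
driver_object = 0xFFFFE001A28949B0
print(field_at(arg3 - driver_object))  # ('MajorFunction', 0)
```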
The bugcheck parameters seem to indicate that the MajorFunction table is corrupted. Let’s look at the details:
1: kd> !drvobj ffffe001a2894950+0x60 f
Driver object (ffffe001a28949b0) is for:
\FileSystem\Ntfs
Driver Extension List: (id , addr)
Device Object list:
ffffe001a144c030 ffffe001a1449030 ffffe001a144f030 ffffe001a28947a0
DriverEntry: fffff8000d06e280 Ntfs!GsDriverEntry
DriverStartIo: 00000000
DriverUnload: fffff8000c8d5d24 vicm
AddDevice: 00000000
Dispatch routines:
[00] IRP_MJ_CREATE fffff8000d126a10 Ntfs!NtfsFsdCreate
[01] IRP_MJ_CREATE_NAMED_PIPE fffff802645809ac nt!IopInvalidDeviceRequest
[02] IRP_MJ_CLOSE fffff8000d10b390 Ntfs!NtfsFsdClose
[03] IRP_MJ_READ fffff8000d061590 Ntfs!NtfsFsdRead
[04] IRP_MJ_WRITE fffff8000d05c3d0 Ntfs!NtfsFsdWrite
[05] IRP_MJ_QUERY_INFORMATION fffff8000d133ca4 Ntfs!NtfsFsdDispatchWait
[06] IRP_MJ_SET_INFORMATION fffff8000d130290 Ntfs!NtfsFsdSetInformation
[07] IRP_MJ_QUERY_EA fffff8000d133ca4 Ntfs!NtfsFsdDispatchWait
[08] IRP_MJ_SET_EA fffff8000d133ca4 Ntfs!NtfsFsdDispatchWait
[09] IRP_MJ_FLUSH_BUFFERS fffff8000d0e9e94 Ntfs!NtfsFsdFlushBuffers
[0a] IRP_MJ_QUERY_VOLUME_INFORMATION fffff8000d1356b0 Ntfs!NtfsFsdDispatch
[0b] IRP_MJ_SET_VOLUME_INFORMATION fffff8000d1356b0 Ntfs!NtfsFsdDispatch
[0c] IRP_MJ_DIRECTORY_CONTROL fffff8000d12d2f0 Ntfs!NtfsFsdDirectoryControl
[0d] IRP_MJ_FILE_SYSTEM_CONTROL fffff8000d131898 Ntfs!NtfsFsdFileSystemControl
[0e] IRP_MJ_DEVICE_CONTROL fffff8000d0ed194 Ntfs!NtfsFsdDeviceControl
[0f] IRP_MJ_INTERNAL_DEVICE_CONTROL fffff802645809ac nt!IopInvalidDeviceRequest
[10] IRP_MJ_SHUTDOWN fffff8000d1eb730 Ntfs!NtfsFsdShutdown
[11] IRP_MJ_LOCK_CONTROL fffff8000d046230 Ntfs!NtfsFsdLockControl
[12] IRP_MJ_CLEANUP fffff8000d12bde0 Ntfs!NtfsFsdCleanup
[13] IRP_MJ_CREATE_MAILSLOT fffff802645809ac nt!IopInvalidDeviceRequest
[14] IRP_MJ_QUERY_SECURITY fffff8000d1356b0 Ntfs!NtfsFsdDispatch
[15] IRP_MJ_SET_SECURITY fffff8000d1356b0 Ntfs!NtfsFsdDispatch
[16] IRP_MJ_POWER fffff802645809ac nt!IopInvalidDeviceRequest
[17] IRP_MJ_SYSTEM_CONTROL fffff802645809ac nt!IopInvalidDeviceRequest
[18] IRP_MJ_DEVICE_CHANGE fffff802645809ac nt!IopInvalidDeviceRequest
[19] IRP_MJ_QUERY_QUOTA fffff8000d133ca4 Ntfs!NtfsFsdDispatchWait
[1a] IRP_MJ_SET_QUOTA fffff8000d133ca4 Ntfs!NtfsFsdDispatchWait
[1b] IRP_MJ_PNP fffff8000d158bac Ntfs!NtfsFsdPnp
Fast I/O routines:
FastIoCheckIfPossible fffff8000d1d4090 Ntfs!NtfsFastIoCheckIfPossible
FastIoRead fffff8000d0f98e0 Ntfs!NtfsCopyReadA
FastIoWrite fffff8000d12f160 Ntfs!NtfsCopyWriteA
FastIoQueryBasicInfo fffff8000d1390c0 Ntfs!NtfsFastQueryBasicInfo
FastIoQueryStandardInfo fffff8000d123bb0 Ntfs!NtfsFastQueryStdInfo
FastIoLock fffff8000d0dd54c Ntfs!NtfsFastLock
FastIoUnlockSingle fffff8000d0dd848 Ntfs!NtfsFastUnlockSingle
FastIoUnlockAll fffff8000d1d3330 Ntfs!NtfsFastUnlockAll
FastIoUnlockAllByKey fffff8000d1d35ac Ntfs!NtfsFastUnlockAllByKey
ReleaseFileForNtCreateSection fffff8000d062814 Ntfs!NtfsReleaseForCreateSection
FastIoQueryNetworkOpenInfo fffff8000d0f051c Ntfs!NtfsFastQueryNetworkOpenInfo
AcquireForModWrite fffff8000d04b6d8 Ntfs!NtfsAcquireFileForModWrite
MdlRead fffff8000d0eb2c0 Ntfs!NtfsMdlReadA
MdlReadComplete fffff80264588594 nt!FsRtlMdlReadCompleteDev
PrepareMdlWrite fffff8000d0eb574 Ntfs!NtfsPrepareMdlWriteA
MdlWriteComplete fffff802649289c8 nt!FsRtlMdlWriteCompleteDev
FastIoQueryOpen ffffe001a17d4540 +0xffffe001a17d4540
ReleaseForModWrite fffff8000d04b4d4 Ntfs!NtfsReleaseFileForModWrite
AcquireForCcFlush fffff8000d06656c Ntfs!NtfsAcquireFileForCcFlush
ReleaseForCcFlush fffff8000d066524 Ntfs!NtfsReleaseFileForCcFlush
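We found two potential issues here: DriverUnload and FastIoQueryOpen. DriverUnload resolves to a module named vicm rather than Ntfs, and FastIoQueryOpen points to an address that does not resolve to any module at all.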
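The check that makes these two entries stand out can be sketched in a few lines: every routine pointer of the Ntfs driver object is expected to resolve to a symbol in Ntfs or nt. The listing below is a hand-picked excerpt of the !drvobj output above, the list of expected modules is an assumption made for this example, and suspicious_entries is a hypothetical helper that only understands lines ending in an address and a symbol.

```python
# Hand-picked lines from the "!drvobj ... f" output above.
LISTING = """\
DriverUnload: fffff8000c8d5d24 vicm
[00] IRP_MJ_CREATE fffff8000d126a10 Ntfs!NtfsFsdCreate
[01] IRP_MJ_CREATE_NAMED_PIPE fffff802645809ac nt!IopInvalidDeviceRequest
MdlReadComplete fffff80264588594 nt!FsRtlMdlReadCompleteDev
FastIoQueryOpen ffffe001a17d4540 +0xffffe001a17d4540
ReleaseForModWrite fffff8000d04b4d4 Ntfs!NtfsReleaseFileForModWrite
"""

# Assumption for this example: only these modules should own the routines.
EXPECTED_MODULES = ("Ntfs", "nt")


def suspicious_entries(listing: str, expected=EXPECTED_MODULES):
    """Return (name, address, symbol) for routines outside the expected modules."""
    flagged = []
    for line in listing.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue  # no "name address symbol" triple on this line
        address, symbol = parts[-2], parts[-1]
        try:
            int(address, 16)
        except ValueError:
            continue  # second-to-last token is not an address
        name = " ".join(parts[:-2]).rstrip(":")
        module = symbol.split("!", 1)[0]  # text before '!' or the whole token
        if module not in expected:
            flagged.append((name, address, symbol))
    return flagged


for entry in suspicious_entries(LISTING):
    print(entry)
```

A pointer into another module is not proof of wrongdoing by itself, so a flagged entry is only a starting point for the closer look that follows.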
Take FastIoQueryOpen as an example:
1: kd> u ffffe001a17d4540
ffffe001`a17d4540 4d8bc8 mov r9,r8
ffffe001`a17d4543 4c8bc2 mov r8,rdx
ffffe001`a17d4546 488bd1 mov rdx,rcx
ffffe001`a17d4549 48b900407da101e0ffff mov rcx,0FFFFE001A17D4000h
ffffe001`a17d4553 48b83c57910c00f8ffff mov rax,offset vicm+0x6973c (fffff800`0c91573c)
ffffe001`a17d455d ffe0 jmp rax
1: kd> u fffff800`0c91573c
vicm+0x6973c:
fffff800`0c91573c 48895c2408 mov qword ptr [rsp+8],rbx
fffff800`0c915741 48896c2410 mov qword ptr [rsp+10h],rbp
fffff800`0c915746 4889742418 mov qword ptr [rsp+18h],rsi
fffff800`0c91574b 57 push rdi
fffff800`0c91574c 4883ec20 sub rsp,20h
The code at ffffe001`a17d4540 is a small thunk: it shifts the original arguments up by one register, loads a pointer of its own into rcx, and jumps to vicm+0x6973c. Obviously, FastIoQueryOpen has been modified to execute code in the module vicm.sys. DriverUnload has been modified in a similar manner.
This matches the first cause listed by !analyze: “1) A driver has inadvertently or deliberately modified critical kernel code or data. See https://www.microsoft.com/whdc/driver/kernel/64bitPatching.mspx”. Kernel Patch Protection does not allow the MajorFunction table of certain drivers to be modified; if this data is modified, the system will bugcheck as seen here. It was time to remove the vicm.sys driver. The result was positive: the machine no longer crashes.
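The jump target can also be read straight from the instruction bytes shown by the u command, which is handy when symbols are missing. This is a minimal sketch that handles only the one encoding used in this thunk (REX.W prefix 0x48, opcode 0xB8 to 0xBF, followed by an 8-byte little-endian immediate); decode_mov_imm64 is a hypothetical helper, not a general disassembler.

```python
import struct


def decode_mov_imm64(encoded_hex: str) -> int:
    """Extract the immediate from a 'mov r64, imm64' encoding (48 B8+r imm64)."""
    raw = bytes.fromhex(encoded_hex)
    if len(raw) != 10 or raw[0] != 0x48 or not 0xB8 <= raw[1] <= 0xBF:
        raise ValueError("not a REX.W mov r64, imm64 encoding")
    (immediate,) = struct.unpack("<Q", raw[2:])  # 8 bytes, little-endian
    return immediate


# Instruction bytes copied from the disassembly of the thunk above.
context = decode_mov_imm64("48b900407da101e0ffff")  # mov rcx, imm64
target = decode_mov_imm64("48b83c57910c00f8ffff")   # mov rax, imm64

print(hex(context))           # 0xffffe001a17d4000
print(hex(target))            # 0xfffff8000c91573c
# The debugger shows the target as vicm+0x6973c, so the module base is:
print(hex(target - 0x6973C))  # 0xfffff8000c8ac000
```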
Comments
Anonymous
November 24, 2014
Fantastic analysis, thanks for sharing. Internet has no info about this "vicm.sys", so no knowledge whether it was just malware, or buggy legitimate driver.
[We usually change driver names to keep the focus of our content on debugging techniques rather than blaming specific vendors for a failure. In this example the driver was not malware.]
Anonymous
April 01, 2015
Thank you a lot for sharing. Awesome!