Why is there an Ex and Io work item in WDM?
Have you ever looked at the work item APIs and wondered why there are two different
types of work items? Or for that matter, why are there so many work item APIs?
As Paul wrote last week, the work item API set has grown for Vista. Today I will try to explain
how we got into this state.
Up until Windows 2000, there was only one type of work item,
WORK_QUEUE_ITEM. You could embed the work item
structure in your own structure and it was quite simple to use. All you had to do
was call ExQueueWorkItem() and you were done. There
was one glaring problem with the way WORK_QUEUE_ITEMs worked:
you could not safely unload a driver which had queued a work item.
Safe unload is not possible with this type of work item because there is no outstanding reference on your
device or driver object. A reference on your device or driver object will keep
your driver's image from unloading. Since there is no reference on either object,
the image can be unloaded before the work item has run or while the work item is executing. But what if you added your own reference
and then released it when the work item ended?
For instance, if you had code that did something like this:
    typedef struct _MY_WORK_ITEM {
        WORK_QUEUE_ITEM WorkItem;
        PDEVICE_OBJECT DeviceObject;
    } MY_WORK_ITEM, *PMY_WORK_ITEM;

    VOID WorkItemRoutine(PVOID Context);

    NTSTATUS QueueWorkItem(PDEVICE_OBJECT DeviceObject)
    {
        PMY_WORK_ITEM pItem;

        pItem = (PMY_WORK_ITEM) ExAllocatePoolWithTag(NonPagedPool, sizeof(MY_WORK_ITEM), tag);
        if (pItem == NULL) {
            return STATUS_INSUFFICIENT_RESOURCES;
        }

        ExInitializeWorkItem(&pItem->WorkItem, WorkItemRoutine, pItem);
        pItem->DeviceObject = DeviceObject;

        // Take our own reference so the device cannot go away while queued
        ObReferenceObject(DeviceObject);
        ExQueueWorkItem(&pItem->WorkItem, DelayedWorkQueue);

        return STATUS_SUCCESS;
    }

    VOID WorkItemRoutine(PVOID Context)
    {
        PMY_WORK_ITEM pItem = (PMY_WORK_ITEM) Context;
        // Copy the pointer out first; pItem is freed before the dereference
        PDEVICE_OBJECT pDevice = pItem->DeviceObject;

        // ... do work ...

        ExFreePool(pItem);
        ObDereferenceObject(pDevice);
    }
The problem is that there is still code to execute between the ObDereferenceObject(pDevice);
call and the closing }, as seen in this disassembly, so
there is still a short window of time where your driver could be unloaded while
your driver is still executing code.
0:000> u WorkItemRoutine+0x23
WorkItemRoutine+0x23
// Put the parameter into ecx and call ObDereferenceObject
000843e3 8b4dfc mov ecx,dword ptr [ebp-4]
000843e6 ff1564a00a00 call dword ptr [wdf01000!_imp_ObfDereferenceObject (000aa064)]
// We still have to execute this code to return to the caller! It is during
// these 3 instructions that the driver can unload
000843ec 8be5 mov esp,ebp
000843ee 5d pop ebp
000843ef c20400 ret 4
To address this problem, a new work item type, PIO_WORKITEM, was added.
If the management of the reference was taken care of for the driver in another module, the driver
would not have this problem anymore. This is exactly what PIO_WORKITEM and
IoQueueWorkItem() do. Upon queueing the work
item, the I/O manager takes a reference on the device object and then releases it
after the work item routine returns back to the I/O manager. This means
that all of your driver's work item code runs while the reference is held, including
the code to return to the caller, and it is now possible to safely unload a driver
using this new work item type.
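The idea of moving the reference into another module can be sketched with a small user-mode model. This is only an illustration, not the I/O manager's actual implementation: device_t, model_queue_work_item() and model_run_work_item() are hypothetical names, and a plain integer stands in for the object reference count. The point it shows is that the count is dropped by the caller of the routine, after the routine has fully returned.

```c
#include <assert.h>
#include <stddef.h>

/* User-mode model only. device_t stands in for a DEVICE_OBJECT; while
   ref_count is above zero the "driver image" must stay loaded. */
typedef struct device {
    int ref_count;
    int ref_seen_by_routine;   /* filled in by demo_routine below */
} device_t;

typedef void (*work_routine_t)(device_t *dev, void *context);

typedef struct model_work_item {
    device_t      *dev;
    work_routine_t routine;
    void          *context;
} model_work_item_t;

/* Model of queueing: the wrapper module, not the driver, takes the reference. */
static void model_queue_work_item(model_work_item_t *item, device_t *dev,
                                  work_routine_t routine, void *context)
{
    item->dev = dev;
    item->routine = routine;
    item->context = context;
    dev->ref_count++;
}

/* Model of the trampoline that runs on the worker thread. It lives outside
   the driver, so the release happens only after the driver's routine,
   including its return sequence, has finished. */
static void model_run_work_item(model_work_item_t *item)
{
    device_t *dev = item->dev;   /* copy first: the routine may free item */
    assert(dev->ref_count > 0);
    item->routine(dev, item->context);
    dev->ref_count--;
}

/* Stand-in for the driver's work item routine. */
static void demo_routine(device_t *dev, void *context)
{
    (void)context;
    dev->ref_seen_by_routine = dev->ref_count;
}

/* Returns 1 if the reference was held for the whole routine and released after. */
static int demo_reference_held(void)
{
    device_t dev = { 0, -1 };
    model_work_item_t item;

    model_queue_work_item(&item, &dev, demo_routine, NULL);
    model_run_work_item(&item);
    return dev.ref_seen_by_routine == 1 && dev.ref_count == 0;
}
```

Calling demo_reference_held() exercises the whole sequence. In the model, no driver-side code runs after the count is dropped, which is the property the hand-rolled ObDereferenceObject() version could not provide.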
So, the problem is solved, right? Well, technically yes, but the new
PIO_WORKITEM type introduced a regression of sorts. The
actual size of the IO_WORKITEM structure is not
exposed publicly, which means you can no longer embed a work item structure in your
own structure. As a result, you have to allocate a context and allocate the
work item separately. This introduces another point of failure and makes the
initialization and cleanup code more complex. Here is the previous code snippet
modified to use the new work item type:
    typedef struct _MY_WORK_ITEM {
        PIO_WORKITEM WorkItem;
        // ...other context fields...
    } MY_WORK_ITEM, *PMY_WORK_ITEM;

    VOID IoWorkItemRoutine(PDEVICE_OBJECT DeviceObject, PVOID Context);

    NTSTATUS QueueWorkItem(PDEVICE_OBJECT DeviceObject)
    {
        PMY_WORK_ITEM pItem;

        pItem = (PMY_WORK_ITEM) ExAllocatePoolWithTag(NonPagedPool, sizeof(MY_WORK_ITEM), tag);
        if (pItem == NULL) {
            return STATUS_INSUFFICIENT_RESOURCES;
        }

        // Second allocation, and therefore a second point of failure
        pItem->WorkItem = IoAllocateWorkItem(DeviceObject);
        if (pItem->WorkItem == NULL) {
            ExFreePool(pItem);
            return STATUS_INSUFFICIENT_RESOURCES;
        }

        // ...initialize the rest of pItem...

        IoQueueWorkItem(pItem->WorkItem, IoWorkItemRoutine, DelayedWorkQueue, pItem);

        return STATUS_SUCCESS;
    }

    VOID IoWorkItemRoutine(PDEVICE_OBJECT DeviceObject, PVOID Context)
    {
        PMY_WORK_ITEM pItem = (PMY_WORK_ITEM) Context;

        // ... do work ...

        // Both allocations must be released
        IoFreeWorkItem(pItem->WorkItem);
        ExFreePool(pItem);
    }
To address the embedded work item "regression", Vista introduced
IoSizeofWorkItem() (which you can read about
in Paul's article, which I referenced at the top of this entry). In conclusion,
it is not hard to see why there are two different types of work items and so
many work item APIs in WDM. The problem set has grown over time and the OS
has evolved to solve those problems.
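To see why a run-time size query is enough to bring back the single allocation, here is a user-mode model of the pattern. It is a sketch under stated assumptions, not WDM code: model_sizeof_work_item() and model_initialize_work_item() are hypothetical stand-ins for IoSizeofWorkItem() and its companion initialization routine, struct opaque_item is an invented private layout, and malloc() replaces the pool allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* "Other module": the layout is private and only the size is exposed,
   at run time. All of these names are hypothetical. */
struct opaque_item {
    void *owner;
    int   initialized;
};

static size_t model_sizeof_work_item(void)
{
    return sizeof(struct opaque_item);
}

static void model_initialize_work_item(void *owner, void *storage)
{
    struct opaque_item *item = storage;

    assert(storage != NULL);
    item->owner = owner;
    item->initialized = 1;
}

static int model_work_item_is_initialized(const void *storage)
{
    return ((const struct opaque_item *)storage)->initialized;
}

/* "Driver": the context ends with suitably aligned storage whose length
   is known only at run time, so one allocation covers both parts. */
typedef struct my_context {
    int         request_id;
    max_align_t work_item[];   /* opaque work item storage */
} my_context_t;

static my_context_t *my_context_create(void *owner, int request_id)
{
    size_t total = offsetof(my_context_t, work_item) + model_sizeof_work_item();
    my_context_t *ctx = malloc(total);

    if (ctx == NULL) {
        return NULL;           /* the only point of failure */
    }
    ctx->request_id = request_id;
    model_initialize_work_item(owner, ctx->work_item);
    return ctx;
}
```

A caller would use my_context_create(owner, id) and later release everything with a single free(ctx). The trade-off is that the context can no longer be sized with a plain sizeof, so the allocation has to add the queried size by hand.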
Comments
- Anonymous
August 07, 2006
"This results in having to allocate a context and to allocate the work item separately. This introduces another point of failure and makes the initialization and destroy code more complex"
Sounds to me like more justification for supporting C++ as a first class kernel-mode development language.
- Anonymous
August 08, 2006
C++ doesn't reduce the actual points of failure... it just abstracts them away from the dev and turns understandable failures to less understandable ones. If the memory allocation fails, it fails regardless of the language.
- Anonymous
August 08, 2006
The comment has been removed
- Anonymous
August 08, 2006
Yesterday I wrote about the evolution of work items. Work items evolved because there was a need to...
- Anonymous
August 08, 2006
"C++ doesn't reduce the actual points of failure... it just abstracts them away from the dev and turns understandable failures to less understandable ones."
This is complete nonsense. C++ allows you to create resources that properly manage their own lifetimes. C does not.
"In fact, I would say that wrapping an io workitem in a C++ would make things more complicated. operator new() is passed the fixed size of the object, so you must overload operator new() and use IoSizeofWorkItem() to compute the right size. If you do this, you have now created a "finalized" class, you cannot derive from this class b/c it has a variable length."
This can't go in the constructor why?
- Anonymous
August 09, 2006
I don't understand where is the regression with the new IoWorkItem interface.
the "previous code snippet modified to use the new work item type" is safe and bug free as far as I can see
- Anonymous
August 09, 2006
The comment has been removed
- Anonymous
August 09, 2006
The comment has been removed