Why Your User Mode Pointer Captures Are Probably Broken

There is a problem that I suspect is widespread in driver code: the improper capturing of user mode pointers. I decided to write a blog post about it to get a feel for whether I am right. :) I figure that if people comment with “of course we knew that, you moron!” then I’ll assume that I am totally wrong. If not, then I hope this will help at least one person.

User mode pointers passed to kernel mode must point to data that is wholly contained in the user mode address space. Checking this property of a user mode pointer is called probing. Contrary to popular belief, a probe does not touch the memory that the pointer refers to; it just does an address range calculation on it. The calculation is basically “(pointer + LengthOfDataPointedTo) must be less than the highest legal user-mode (UM) address”. The reason we need to probe is to make sure that a UM component can’t use our kernel mode code to read or write kernel space.

When a user passes a pointer into a kernel mode component, the pointer is copied onto the kernel stack as part of the calling mechanism (sometimes called a “system call”, sysenter, a “trap”, etc.). The user has no way to change the value of that variable once it is passed in, so we can validate the pointer with confidence, knowing that its value won’t change underneath us. However, if that pointer points to a structure with embedded pointers, those embedded pointers can be changed asynchronously by another thread. This is problematic because we need to validate all of the embedded pointers in a passed-in structure.

This is where capturing comes into play. We capture each embedded pointer by reading it through the already validated pointer and storing its value in kernel mode space, usually the stack. Once the embedded pointer is captured, we probe it. Lather, rinse, repeat for the entire depth of the embedded pointer tree.
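To make the range calculation concrete, here is a minimal user-mode sketch of the kind of check a probe performs. It is not the real ProbeForRead: the function name, the boundary constant and the boolean return are all assumptions made for illustration (the real routine raises an exception on failure, and the real boundary is platform-specific). Note that nothing in it dereferences the address.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the highest legal user-mode address.
 * The real limit is platform-specific; this value is only illustrative. */
#define SKETCH_HIGHEST_USER_ADDRESS UINT64_C(0x00007FFFFFFEFFFF)

/* Returns true when [address, address + length) is suitably aligned and
 * lies wholly at or below the assumed user-mode limit. The memory itself
 * is never touched: this is pure address arithmetic. */
static bool sketch_probe_range(uint64_t address, uint64_t length, uint64_t alignment)
{
    uint64_t last_byte;

    /* A zero-length buffer covers no memory, so there is nothing to check. */
    if (length == 0) {
        return true;
    }

    /* alignment is assumed to be a power of two (1, 2, 4, 8, ...). */
    if ((address & (alignment - 1)) != 0) {
        return false;
    }

    /* Reject ranges whose end wraps around the top of the address space. */
    last_byte = address + (length - 1);
    if (last_byte < address) {
        return false;
    }

    return last_byte <= SKETCH_HIGHEST_USER_ADDRESS;
}
```

The wrap-around test matters as much as the boundary comparison: without it, a huge length could make the computed end of the buffer look like a small, legal address.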

Consider this code:

typedef struct _USER_DATA {
    PULONG_PTR Data1;
    PULONG_PTR Data2;
} USER_DATA, *PUSER_DATA;

NTSTATUS
Foo(
    PUSER_DATA Data
    )
{
    PULONG_PTR CapturedData1;
    ULONG_PTR Data1Value;
    PULONG_PTR CapturedData2;
    ULONG_PTR Data2Value;

    //
    // See if this is user mode. In a driver, PreviousMode
    // would normally be read from a field in the IRP and the
    // pointer would come from the Type3InputBuffer field
    //
    if (ExGetPreviousMode() != KernelMode) {

        __try {
            //
            // Probe the passed in structure
            //
            ProbeForRead(Data,
                         sizeof(USER_DATA),
                         __alignof(USER_DATA));

            //
            // Capture the embedded pointers
            //
            CapturedData1 = Data->Data1;
            CapturedData2 = Data->Data2;

            //
            // Probe the first captured pointer
            //
            ProbeForRead(CapturedData1,
                         sizeof(ULONG_PTR),
                         __alignof(ULONG_PTR));

            //
            // Probe the second captured pointer
            //
            ProbeForRead(CapturedData2,
                         sizeof(ULONG_PTR),
                         __alignof(ULONG_PTR));

            //
            // Read the value behind the first embedded pointer
            //
            Data1Value = *CapturedData1;

            //
            // Read the value behind the second embedded pointer
            //
            Data2Value = *CapturedData2;

            //
            // More of your code here that does really cool stuff…
            //
        } __except (EXCEPTION_EXECUTE_HANDLER) {
            return GetExceptionCode();
        }
    }

    return STATUS_SUCCESS;
}

Everything seems to be OK with this code. We probe the structure pointer, capture the embedded pointers to local variables and then probe them. But wait: let’s think about our ever-important capture code a little deeper. The most important attribute of our capture code is that it stores the embedded pointer in a location where the user can’t modify it. If it didn’t, then we would be in really bad shape. So the question is: does our capture code in fact guarantee that the embedded pointers will be in a location such that they can’t be modified from user mode?

Unfortunately, the answer is NO! How is this possible? We told the compiler that we wanted to store the pointers locally. However, we didn’t do anything to tell the compiler that it was critical that they were stored locally. So the compiler can freely skip the local storage of the embedded pointers and just refetch them from user mode through the original pointer upon each reference, if it so chooses. That leaves a window between the probe and the use in which another thread can replace a validated pointer with a kernel address. This is potentially disastrous for our kernel mode code and not at all what we expected or intended.
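Here is a minimal user-mode sketch of that transformation, written out by hand. All names are hypothetical, and the null check merely stands in for a real probe. The two functions return the same result whenever nothing else modifies the structure, which is why an optimizer that knows nothing about other threads may treat them as interchangeable.

```c
#include <stdint.h>

typedef struct SKETCH_USER_DATA {
    uintptr_t *Data1;
    uintptr_t *Data2;
} SKETCH_USER_DATA;

/* Hypothetical stand-in for a probe: accepts any non-null pointer. */
static int sketch_is_valid(const uintptr_t *p)
{
    return p != 0;
}

/* What the source code says: fetch the embedded pointer once,
 * then validate and use the local copy. */
static uintptr_t sketch_as_written(SKETCH_USER_DATA *Data)
{
    uintptr_t *Captured = Data->Data1;      /* the only intended fetch */

    if (!sketch_is_valid(Captured)) {
        return 0;
    }
    return *Captured;
}

/* What the compiler is allowed to produce instead: no local copy,
 * the field is re-read from the structure at every use. */
static uintptr_t sketch_as_compiler_may_emit(SKETCH_USER_DATA *Data)
{
    if (!sketch_is_valid(Data->Data1)) {    /* fetch 1: this value is validated */
        return 0;
    }
    /* Another thread can overwrite Data->Data1 right here. */
    return *Data->Data1;                    /* fetch 2: this value is used, unvalidated */
}
```

Whether a given compiler actually performs this rewrite depends on the compiler, its version and the optimization level; the point is only that nothing in the source forbids it.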

So what can we do to get the behavior we expect? Well, we have to tell the compiler the truth about the code that we are writing. That’s right: the truth. We are lying in our code. Our code has implicitly told the compiler that our embedded pointers can’t change asynchronously. This is a lie. They can change. So in order for our code to be correct, we need to change it to a truthful representation of itself.

How do we tell the compiler that the structure we point to can change? Well, there are a couple of ways. The most straightforward way is to mark the passed in parameter with the keyword volatile. The volatile keyword, when applied to a parameter or variable definition, tells the compiler that the memory location’s contents can change asynchronously, so all reads and writes to it must really happen, and in the order they are specified in the code. This facility was put into the C language to deal with code that reads and writes memory that can change in a different scope (i.e. hardware device registers, device memory, shared memory) and we can take advantage of its semantics for our user mode pointer captures. With hardware, a reordered, omitted or combined read or write could lead to real life disasters. For hardware, reads as well as writes have side effects; this is completely analogous to our code. A read can have the side effect of violating our security mechanism.

So we can fix our code by changing our routine like so:

NTSTATUS
Foo(
    volatile USER_DATA* Data
    );

By changing Data to be a pointer to a volatile structure, we force all reads and writes to the structure to really happen in our code (bonus points for explaining why we can't use "volatile PUSER_DATA" as our parameter type instead of "volatile USER_DATA*" - aren't they the same thing? :D ). However, if we have existing interfaces that we must maintain, we can't change the parameter type. What to do? Well, there is another way to get the behavior we want: we can cast at the capture site. This technique is called using “volatile glasses”. Here is an example:

     //
     // Capture the embedded pointers
     //
     CapturedData1 = ((volatile USER_DATA*)Data)->Data1;
     CapturedData2 = ((volatile USER_DATA*)Data)->Data2;

This will cause the compiler to perform the capture as if the variable Data had been declared volatile. Using this technique prevents the compiler from re-fetching from the passed in pointer because we told the compiler the truth. We said “Hey compiler – this thing that Data points to can change asynchronously, so you’d better not be playing any funny games with it”. And the compiler will honor that. It has to if it honors the volatile keyword. We would then have to do the same thing for the internal reads as well:

     //
     // Read the value behind the first embedded pointer
     //
     Data1Value = *(volatile ULONG_PTR*)CapturedData1;

     //
     // Read the value behind the second embedded pointer
     //
     Data2Value = *(volatile ULONG_PTR*)CapturedData2;

Again, we are telling the compiler the truth here – that the ULONG_PTR value can change asynchronously and it needs to really capture it locally.
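The casts can be wrapped up so they are harder to forget. What follows is a minimal user-mode sketch, assuming a C11 compiler; the macro, the types and the function are all hypothetical names, and the null checks only mark where the real probes would go. It also includes a compile-time check of the bonus question: applying volatile to a pointer typedef qualifies the pointer itself, not the structure it points to, so the cast has to be spelled against the structure type.

```c
#include <stdint.h>

typedef struct DEMO_USER_DATA {
    uintptr_t *Data1;
    uintptr_t *Data2;
} DEMO_USER_DATA, *PDEMO_USER_DATA;

/* Hypothetical helper: read *ptr through a volatile-qualified lvalue,
 * so the compiler must perform the read where it is written. */
#define DEMO_READ_ONCE(type, ptr) (*(const volatile type *)(ptr))

/* "volatile PDEMO_USER_DATA" is a volatile pointer to a plain structure
 * (DEMO_USER_DATA * volatile), not a pointer to a volatile structure. */
_Static_assert(
    _Generic((volatile PDEMO_USER_DATA *)0,
             DEMO_USER_DATA *volatile *: 1,
             volatile DEMO_USER_DATA **: 0),
    "volatile binds to the typedef'd pointer, not to the structure");

/* Capture both embedded pointers exactly once, validate the captured
 * copies, then read the values they point to exactly once. */
static int demo_capture(PDEMO_USER_DATA Data, uintptr_t *Value1, uintptr_t *Value2)
{
    uintptr_t *Captured1 = ((volatile DEMO_USER_DATA *)Data)->Data1;
    uintptr_t *Captured2 = ((volatile DEMO_USER_DATA *)Data)->Data2;

    /* The real code would probe Captured1 and Captured2 here. */
    if (Captured1 == 0 || Captured2 == 0) {
        return -1;
    }

    *Value1 = DEMO_READ_ONCE(uintptr_t, Captured1);
    *Value2 = DEMO_READ_ONCE(uintptr_t, Captured2);
    return 0;
}
```

Keeping the type as an explicit macro argument, rather than hiding it behind a pointer typedef, means the volatile qualifier always lands on the data being read.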

This is a really esoteric topic – but very important IMHO. Please let me know your thoughts on this – I am highly interested. Thanks!

Comments

  • Anonymous
    March 31, 2008
    Wow, very good point. This is yet another reason why accessing user-mode pointers on your own is so dangerous and not encouraged. I am willing to bet that currently, almost no one in the 3rd party driver community that uses embedded user-mode pointers uses the volatile modifier. Thanks, Eran.

  • Anonymous
    March 31, 2008
    You and I think exactly alike! That is why I thought I would write about this topic. :) I am hoping that most compilers aren't taking advantage of this optimization today. I am really hoping this is a future proofing thing. Thanks!

  • Anonymous
    April 01, 2008
    I agree that this needed pointing out. I even had to check the standard to see if using "volatile glasses" is really allowed (it is, though the opposite isn't).
    > bonus points for explaining why we can't use "volatile PUSER_DATA" as our parameter type instead of "volatile USER_DATA*"
    With a volatile pointer to nonvolatile target, the compiler has to make sure that the pointer will be evaluated again, but if the pointer's current value points to a target that happens to already have been fetched then the compiler can use the prefetched value without checking to see if that's still accurate. With a nonvolatile pointer to volatile target, a previous fetch of the pointer itself can be used again for bonus points, but the target has to be fetched again to get its current value. This is what we need here.

  • Anonymous
    April 01, 2008
    And Norman takes all the bonus points! :)

  • Anonymous
    April 02, 2008
    Of course we could get around all this by not embedding pointers in the first place...

  • Anonymous
    April 02, 2008
    "Of course we could get around all this by not embedding pointers in the first place..." Even a simple array of pointers will have the same problem.  Simple arrays are easier to use and understand ... except that they have the same problem so they won't really be easier to use and understand ...

  • Anonymous
    April 03, 2008
    How about offsets into the passed datablock...

  • Anonymous
    April 27, 2008
    Interesting. For comparison (and this isn't to say either way is better), the Linux kernel doesn't generally deal with this issue, as all accesses to user memory go through special copy_to_user and copy_from_user functions, which are coded in assembler. Anyone who fails to do this is greeted with a kernel OOPS and stack trace as soon as a page fault occurs - since Linux doesn't use exceptions in the kernel, the copy_*_user functions simply have magic offsets that the PF handler is aware of, and dark magic occurs to pass back an appropriate error code :)