Unsafe code, Stacks and IA64...

Sometimes the differences between platforms show up in interesting ways. Last week I was looking at a bug that was filed about a difference in failure mode between IA64 and x64/x86. The investigation led me down an interesting path, so I thought I’d share it with you.

What would you expect the thread stack layout to look like on x86, x64 and IA64? Fundamentally they all look pretty similar, something like this (artistic license has been taken):

        -- top of stack
0x9000  -- Frame A
0x8000  -- Frame B
0x7000  -- Frame C
0x6000  -- Frame D [call graph looks like: A()->B()->C()->D()]
0x5FF8  <return address>
0x5FF0  BYTE* ptr
0x5F00  BYTE[] a (stack allocated byte array, size F0)
0x5000  <space>
0x1000  Soft guard
0x0000  Hard guard

Of course that has been significantly simplified for the purposes of this discussion, and some of the addresses might be a little bogus as I just made them up. The interesting takeaways, however, are:

1) The stack “grows” down. That is, D() is called by C(), and therefore D’s frame is at a lower address than C’s.

2) At the end of the stack is a “guard” region, which is used to implement stack overflow exception handling. If you touch the soft guard, the OS raises a stack overflow exception and gives you the stack space in the guard region to deal with it.

3) After the soft guard is the hard guard. The hard guard is always unallocated memory, which will cause an AV if you touch it. In fact, if you’re handling a stack overflow caused by touching the soft guard, and you use up too much stack and touch the hard guard, you’ll get an AV that takes down the process.

4) There is no guard region at the top of the stack to protect you from stack underflow. Up there is just some random memory: it could be another thread’s stack, the managed heap, the end of memory…

If you read off the top of your stack the results are undefined, but you can safely assume that if you keep reading then at some point you’ll get an AV, or so it seems.
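
Point 1 above is easy to observe for yourself. The following is only a minimal sketch: the class and method names are made up for illustration, it has to be compiled with unsafe code enabled (the /unsafe compiler option), and it assumes the JIT honors NoInlining so the two locals really do live in different frames.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical helper, not part of the BCL.
public static class StackDirection
{
    // Returns the address of a local that lives in the callee's frame.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static unsafe ulong CalleeLocalAddress()
    {
        int local = 0;
        return (ulong)&local;
    }

    // True when the callee's local sits at a lower address than the
    // caller's local, i.e. when the stack grows down.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static unsafe bool CalleeIsBelowCaller()
    {
        int local = 0;
        ulong callerAddress = (ulong)&local;
        return CalleeLocalAddress() < callerAddress;
    }
}
```

On the platforms discussed here I would expect StackDirection.CalleeIsBelowCaller() to return true, but nothing in the language guarantees it.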

Who would read off the top of the stack, you might ask? Probably no one, but yesterday I ran into a test case that was doing just that. It would create a stack-based byte array and then pass a pointer to the first element of the array to our unsafe string constructor, which takes an sbyte*, a start index and a length. Instead of giving it the actual length of the array, the test case would pass a length like Int32.MaxValue or some other huge (and incorrect) value.

What happens behind the scenes in the BCL at this point isn’t exactly rocket science: we create a string and proceed to read bytes through the passed-in pointer. It is very similar in concept to writing the following C# code yourself:

public unsafe void EventuallyBlowUp()
{
    SByte* p1 = stackalloc SByte[256];
    SByte temp;

    for (int i = 0; i < Int32.MaxValue; i++)
    {
        temp = *(p1 + i);   // walks up past the end of the 256-byte buffer
    }
}

That’s oversimplified. Really we make a string after doing some range checks and such, and then memcpy the data from the byte* into the string (there’s a reason this code is marked as unsafe). Note that while the stack grows down, our reads from the stack-allocated buffer touch addresses that grow up. Therefore, once the offset gets big enough, we read off the top of the stack and into random memory.
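
For contrast, here is roughly what the well-behaved version of that test case would look like: the same String(sbyte*, int, int) constructor, but given the real length of the stack buffer. This is a sketch of my own (the class and method names are hypothetical), it needs the /unsafe compiler option, and it sticks to ASCII bytes so the result does not depend on the default code page.

```csharp
using System;

// Hypothetical helper, not part of the BCL.
public static class BoundedStringRead
{
    public static unsafe string FromStackBuffer()
    {
        const int Length = 5;

        // Stack-allocated buffer, like the one in the test case.
        sbyte* buffer = stackalloc sbyte[Length];
        buffer[0] = (sbyte)'h';
        buffer[1] = (sbyte)'e';
        buffer[2] = (sbyte)'l';
        buffer[3] = (sbyte)'l';
        buffer[4] = (sbyte)'o';

        // Pass the buffer's real length, so the constructor never
        // reads past the end of the allocation.
        return new string(buffer, 0, Length);
    }
}
```

BoundedStringRead.FromStackBuffer() returns "hello". Swap Length in the constructor call for Int32.MaxValue and you are back to the failure described in this post.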

In this specific test case we were looking for the “expected” AV to happen and be converted into our new AccessViolationException (I think this is new in v2.0). But on IA64 it wasn’t; instead it was coming back as a StackOverflowException. Confusion ensued… For a while I was convinced we had something weird going on: that in this random case we had two thread stacks next to each other and, instead of getting the expected AV when we hit the hard guard of the next stack, we were somehow skipping into its soft guard and getting a stack overflow. The problem, however, didn’t turn out to be nearly so convoluted.

First a little background. The IA64 platform actually has two stacks per thread: the “normal” stack and the “backing store”. I really should get around to writing up a piece on the IA64 calling convention and, by association, the interesting thing that is the backing store and the rotating register stack. For now it is enough to know that the backing store is where register values get stored to memory from the registers a function has allocated for its inputs, locals and outputs. The backing store is laid out in memory next to and contiguous with the “normal” stack, and it grows up instead of down. The picture looks something like this:

0x1a000 Backing store hard guard
0x19000 Backing store soft guard
0x14000 <space>
0x13000 -- Frame D rotating register store
0x12000 -- Frame C rotating register store
0x11000 -- Frame B rotating register store
0x10000 -- Frame A rotating register store
        -- “top” of “backing store” stack
        -- top of “normal” stack
0x9000  -- Frame A
0x8000  -- Frame B
0x7000  -- Frame C
0x6000  -- Frame D [call graph looks like: A()->B()->C()->D()]
0x5FF8  <return address>
0x5FF0  BYTE* ptr
0x5F00  BYTE[] a (stack allocated byte array, size F0)
0x5000  <space>
0x1000  Soft guard
0x0000  Hard guard

When we run code like the example above on an IA64 box, the result of running off the top of our “normal” stack (where the buffer is allocated) is different. Instead of immediately running into random memory (and presumably AVing), we consistently run into a known piece of memory: the backing store. As we continue reading up through it, we eventually hit the backing store soft guard region, which causes the OS to raise a stack overflow exception. The CLR converts that into a managed StackOverflowException and raises it in the code in EventuallyBlowUp(). Maybe EventuallyBlowUp’s caller deals with the stack overflow, maybe not; of course the same can be said for the AV.

The moral of the story: it’s difficult to completely abstract away the underlying platform. In this case we discussed whether or not to “fix the bug” in the string constructor so that it would always produce an AV, by checking whether the requested start offset and length, applied to the given pointer (if it was stack allocated), would result in stack underflow. We decided for now to leave it as it is, because it’s unsafe code and the current implementation makes the failure mode match what a programmer would see writing similar unsafe code themselves.
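
To make the difference concrete, here is a toy model of the two pictures. It only does arithmetic on the made-up addresses from the diagrams and touches no real memory. The type and member names are hypothetical, and the x86 layout assumes that whatever lies above the stack (0xA000 here, an address I invented) is unmapped, which real memory does not promise.

```csharp
using System;

public enum PageKind { Committed, SoftGuard, Unmapped }

// Hypothetical model of the diagrams, not a real OS or CLR API.
public static class StackLayoutModel
{
    // Regions at or above the stack buffer, lowest address first.
    // Each region runs up to the start of the next one.
    public static readonly (ulong Start, PageKind Kind)[] X86 =
    {
        (0x5F00UL, PageKind.Committed), // buffer, then frames D..A
        (0xA000UL, PageKind.Unmapped),  // ASSUMPTION: nothing mapped above the stack
    };

    public static readonly (ulong Start, PageKind Kind)[] IA64 =
    {
        (0x5F00UL, PageKind.Committed),  // buffer, then frames D..A
        (0x10000UL, PageKind.Committed), // backing store in use, plus free space
        (0x19000UL, PageKind.SoftGuard), // backing store soft guard
        (0x1A000UL, PageKind.Unmapped),  // backing store hard guard
    };

    // Reads upward from 'start' (assumed to lie in committed memory) and
    // names the exception the first non-committed region would produce.
    public static string FirstFault(ulong start, (ulong Start, PageKind Kind)[] layout)
    {
        foreach (var region in layout)
        {
            if (region.Start < start || region.Kind == PageKind.Committed)
                continue;
            return region.Kind == PageKind.SoftGuard
                ? "StackOverflowException"
                : "AccessViolationException";
        }
        return "no fault";
    }
}
```

Starting at the buffer address 0x5F00, FirstFault gives "AccessViolationException" for the X86 layout and "StackOverflowException" for the IA64 layout, which is the difference the bug was filed about.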
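
The check we talked about would look something like the sketch below. It is not what the BCL does. The names are hypothetical, and it assumes the caller already knows the bounds of the current thread’s stack, which is the hard part and is not shown.

```csharp
using System;

// Hypothetical helper, not part of the BCL.
public static class StackRangeCheck
{
    // True when 'pointer' lies inside [stackLow, stackHigh] and reading
    // 'length' bytes upward from it stays below the top of the stack.
    // stackLow and stackHigh are assumed to be supplied by the caller.
    public static bool ReadStaysWithinStack(
        ulong pointer, ulong length, ulong stackLow, ulong stackHigh)
    {
        if (pointer < stackLow || pointer > stackHigh)
            return false; // not a pointer into this stack at all

        // Compare against the remaining space rather than computing
        // pointer + length, which could wrap around.
        return length <= stackHigh - pointer;
    }
}
```

With the made-up addresses from the first diagram (buffer at 0x5F00, and a stack assumed to span 0x0000 to 0xA000), a length of 0xF0 passes the check and Int32.MaxValue does not.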

Fixing the general unsafe code stack underflow case is of course far from trivial, and of debatable value.

Comments

  • Anonymous
    April 12, 2004
    First off, IIRC the OS does not restore the soft guard once it's raised a STATUS_STACK_OVERFLOW exception. If you overflow again, instead of taking another STATUS_STACK_OVERFLOW, you just get a STATUS_ACCESS_VIOLATION. The C run-time has the _resetstkoflw function to fix this. This may have changed in Windows XP/Server 2003; my references relate to Win2k.

    Secondly, I'm surprised that the OS doesn't have any protection between the register backing store and the regular program stack.

    I was going to link to some information on Raymond Chen's blog about IA64 calling conventions, but then I noticed you already had a trackback to that post (from your 'Some good 64bit background info' post).
  • Anonymous
    April 12, 2004
    Mike -- you're correct about what happens if you don't reset the soft guard region. The AV occurs, however, because if you "overflow" again you run through the soft guard (which now acts just like normal stack pages) and hit the hard guard, which is still unallocated and causes an AV. In the runtime we now handle unmanaged stack overflow exceptions (such that they can be dealt with) and propagate a managed StackOverflowException that you have a chance to catch and handle. Part of that handling is to reset the soft guard region.

    I too was surprised that there isn't any protection between the backing store and the regular stack, but it does make some kind of sense given that they don't make any promises about stack underflow.

    -josh