Evil Compiler Tricks, and Checking for Pointer Math
My favorite programming geek hobby being integer overflows, this caught my eye –
"gcc silently discards some wraparound checks" https://www.kb.cert.org/vuls/id/162289
Basically, what it says is that code which looks like this:
============ snip ==============
char *buf;
int len;
gcc will assume that buf+len >= buf.
As a result, code that performs length checks similar to the following:
len = 1<<30;
[...]
if(buf+len < buf) /* length check */
[...perform some manipulation on len...]
are compiled away by these versions of gcc
============ /snip ==============
Apparently, the compiler may be allowed by the C/C++ standard to do this, as pointers wrapping are not defined. However, the standard also says that optimizations may not change externally observable behavior in an application (aside from making it go faster), so I'm not so sure this is completely legal, but I Am Not A Standards Geek (IANASG), so I'll let them fight it out. The CERT advisory says to do this:
if((uintptr_t)buf+len < (uintptr_t)buf)
being alarmed by this problem, I then consulted what I do in SafeInt, which is:
template < typename T, typename U, typename E >
T*& operator +=( T*& lhs, SafeInt< U, E > rhs )
{
// Cast the pointer to a number so we can do arithmetic
SafeInt< uintptr_t, E > ptr_val = reinterpret_cast< uintptr_t >( lhs );
// Check first that rhs is valid for the type of ptrdiff_t
// and that multiplying by sizeof( T ) doesn't overflow a ptrdiff_t
// Next, we need to add 2 SafeInts of different types, so unbox the ptr_diff
// Finally, cast the number back to a pointer of the correct type
lhs = reinterpret_cast< T* >( (uintptr_t)( ptr_val + (ptrdiff_t)( SafeInt< ptrdiff_t, E >( rhs ) * sizeof( T ) ) ) );
return lhs;
}
So let me explain what exactly we're doing here, and why the CERT advice is bad (except where sizeof(T) == 1). First thing is to cast the pointer to an unsigned type that can always hold the bits in a pointer, and store that in a SafeInt. This happily coincides with the fact that unsigned int overflow behavior is actually specified. The second thing to do is to ensure that the offset multiplied by sizeof(T) does not overflow, and store the result in a ptrdiff_t (we could be moving backwards in the array), and finally check whether the addition ends up overflowing.
If you write code that only checks to see if the result of a pointer addition is greater than what you started with, your compiler might just remove the check, but what is much worse is that your code might not be working even if the compiler does not remove it. To write really correct code, you have to check BOTH the multiplication and the addition operations. As an aside, a problem I like to give to people as a puzzle is to write the correct code to determine if a + b * c yields a valid result or not. Surprisingly few people get it, but I think we can see here just how often we need to get exactly this operation right. Here's a trick that works for 32-bit unsigned code:
Unsigned __int64 result = (unsigned __int64)a + (unsigned __int64)b * (unsigned __int64)c;
If( (unsigned __int32)(result >> 32) )
return error;
Comments
Anonymous
April 04, 2008
David LeBlanc and I (and a bunch of others) just had a little email exchange about some fascinating integerAnonymous
April 04, 2008
The gcc optimization is legal because 5.7.5 says "If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, [then...]; otherwise, the behavior is undefined." Since the first part is not satisfied (you can't have an array that straddles 0), the second half is operative: The behavior is undefined. According to 1.2.3.12, the standard imposes no requirements on undefined behavior. They even call out ignoring the situation entirely as an acceptable response. In other words, the language requires you to validate the parameters to the + operator BEFORE you perform the operation, not after.Anonymous
April 05, 2008
One question I have is whether that check is ever sufficient for real code. What I am thinking is that you really need to know that 'len' is not greater then the size of the buffer so you don't read/write some random memory that is not part of the buffer. Furthermore, I neither know of any valid way to get a buffer where the buffer wraps around the end of memory nor do I think there is any reason that could possibly be needed. So, based on that, if you check that 'len' is less then the size of buffer, the check for wraparound is redundant and unnecessary. [dcl] that's another way of doing it. you still have to make sure that the multiplication doesn't wrap in either case. This is why I'm ambivalent about fully supporting pointer math within SafeInt - it is more properly done by using another class we have internal to Office, which is BoundedPointer - what it does should be obvious from the name. STL iterators now also solve the problem nicely. There could be cases where it's more efficient to do this first, then other checks later. [dcl] You could also hit a case where I check to make sure the result is in the buffer, but it actually wrapped, and I'm not at the spot in the buffer I ought to be. So you always have to check both that you got there the right way AND that you're still in the buffer. If either condition fails, you have a problem. It is possible to write checks for some code that will do both at once, and that's obviously desirable.Anonymous
April 05, 2008
While already a nice entry point for exploits, how about 16-bit targets? Hitting 2^16 is pretty easy. [dcl] 16-bit code is always a great place to look when auditing. tends to be easy pickings. You also often run into programmer error because they didn't take into account int promotion.