다음을 통해 공유


Ptrdiff_t is evil

Well, not really, but here's a code problem that confounded some really smart devs – and it looks so simple!

void IncPtr( unsigned int cElements )

{

    if( m_pMax - m_pCurrent > cElements )

        m_pCurrent += cElements;

    else

        throw;

}

OK, so here's the question – if an error has happened, and m_pCurrent is > m_pMax, which implies the difference in the pointers is negative, which code branch do we execute? Assume cElements is a reasonably small number.

Hmmm – the immediate answer would be that the difference gets cast to an unsigned int to be compared with cElements, if it is negative, then it becomes large, and it is not less than cElements, so we throw, so this code is safe, right?

The answer, unfortunately, is a solid maybe. Back in engineering school, I got acquainted with something called dimensional analysis where you worked something out based on dimensions. For example, if you want to know how to get gallons from some number of miles and miles/gallon, figuring (miles / (miles/gallon)) shows the answer is in gallons. A similar approach for integers is often helpful. Let's look at the types involved in the if statement. First, what is the type of a pointer difference? That's a ptrdiff_t – which is a signed number that is the same number of bytes as a pointer, which means that on a 64-bit build, it is an __int64, and on a 32-bit build, it is a 32-bit int.

What we now have is:

If( ptrdiff_t < unsigned int)

If you have a 32-bit build, this works out to:

If( int < unsigned int)

Which then implies:

If((unsigned int)int < unsigned int)

The cast gives you an implied check for the lhs being less than zero (assuming reasonably small values for the unsigned number), and negative numbers will now fail, and since this is an error, this is what we want, and life is good. Do note that if you're compiling with all the warnings on, this will cause a warning, which you'd probably ignore, or cast away, being oblivious to the impending doom that is approaching.

Now consider 64-bit:

If(__int64 < unsigned int)

This gets cast very differently…

If( __int64 < (__int64)unsigned int)

You won't get a warning on your 64-bit build because the cast from unsigned int (like the cast from unsigned short to int) preserves value, and the assumption is that the comparison will always be correct. Under the error condition outlined here, the problem is that we're now not catching the error, we'll add to a pointer that's already probably bogus, and things will get worse from here.

As you can see, which branch gets executed depends on whether you're building 32 or 64-bit!

The solutions and lessons are –

  • Be wary of pointer math in porting code to 64-bit – ptrdiff_t is negative, and changes size.
  • Using the bit flipping of negative to very large positive effect is really programming with side effects, and being clever, neither of which are good practices.
  • Explicitly determine which path to take when dealing with negative numbers in a comparison

BTW, SafeInt will be available very shortly on CodePlex – I have just one more internal hoop to jump before I can post it.

Comments

  • Anonymous
    September 02, 2008
    PingBack from http://blog.a-foton.ru/2008/09/ptrdiff_t-is-evil/

  • Anonymous
    April 05, 2009
    The comment has been removed

  • Anonymous
    June 25, 2009
    Someone wrote me, and pointed out: You give the following example: void IncPtr( unsigned int cElements ) {    if( m_pMax - m_pCurrent < cElements )        m_pCurrent += cElements;    else        throw; } The logic of this conditional statement seems messed up.  Suppose the variables have been initialized like this: static int array[10]; m_pCurrent = array + 5; m_pMax = array + 10; i.e. m_pMax points to the end of the array and m_pCurrent points somewhere in the middle.  From the names, I expect this is how the variables would be used. Then, something calls IncPtr(2).  Now, m_pMax - m_pCurrent is 5, and 5 < 2 is false, so the function throws (assuming it's called from a catch clause) as if there had been an error.  However, it would have been perfectly safe to increment m_pCurrent by 2. On the other hand, if something calls IncPtr(99), then 5 < 99 is true, and the function happily does m_pCurrent += cElements, incrementing the pointer far past the end of the array. Thus, the test should be inverted.  And if you do that, then the behaviour in 64-bit builds, where the unsigned int gets converted to ptrdiff_t without any compiler warnings, will actually become correct; and the 32-bit builds that trigger a warning will exhibit the bug. [dcl] This has now been fixed. The overall point that the branch is different depending on build is the important bit.

  • Anonymous
    July 15, 2010
    Which is why ptrdiff_t and size_t should always go hand in hand.  The error is that cElements should be of type size_t. [dcl] That and overloading the bounds check for the pointer difference to rule out negative value by using a 'clever' side-effect. Also, I thought that, technically, it's possible for sizeof(ptrdiff_t) != sizeof(void*), but rather you're only guaranteed that sizeof(intptr_t) == sizeof(void*).  In practice, though, you'll probably get by with either. [dcl] I think you're right, but as you point out, in practice they're the same.

  • Anonymous
    August 06, 2010
    A better comparison is to test the sum vs the max:  if ( (m_pCurrent + cElements) > m_pMax ) throw; [dcl] Except that any pointer addition outside of the buffer isn't defined by the standard, and could be optimized out, which is exactly what gcc did a while back. Regardless of what the compiler does, we shouldn't write non-standard code.   Your issue with  If ( ptrdiff_t < unsigned int ) becoming  If (__int64 < unsigned int ) is the result of Microsoft's decision to use the LLP64 Data Model, not because ptrdiff_t is evil. [dcl] That's incorrect. It's the result of the C standard and how operator casts work.

  • Anonymous
    August 22, 2013
    Interesting point. But the condition should instead be coded like this: if (m_pCurrent + cElements < m_pMax)        m_pCurrent += cElements;