Jaa


Working with inexact error contracts

Last time I lamented that the fact that std::map::erase() is not permitted to fail made it hard to come up with an example.

But the vector pattern has the basic problem I wanted to discuss.  First I'll write the obviously correct C code and then we'll see how it translates into a world where you're using helper types/functions instead of plain ol' C arrays.

void reverse_array(size_t count, int array[]) {
size_t i, j;
i = 0;
j = count;
while (i < j) {
j--;
int t = array[i];
array[i] = array[j];
array[j] = t;
i++;
}
}

Pretty simple, right?  Now let's use some helpers and see how much it helps.

int reverse_array(vector_t *array) {
int err; // hey, what's going on? Why do we need this?
size_t count, i, j;
vector_get_size(array, &count);
i = 0;
j = count;
while (i < j) {
int t, t1;
j--;
// let's do a literal transcription here of what we would do if we used operator []
// from any of the popular class libraries.
if ((err = vector_get_at(array, i, &t)) != 0) return err;
if ((err = vector_get_at(array, j, &t1)) != 0) return err;
if ((err = vector_put_at(array, i, t1)) != 0) return err;
if ((err = vector_put_at(array, j, t)) != 0) return err;
i++;
}
return 0;
}

Pretty darned non-transactional, eh?  A trivial transformation to remove some of the failure modes would be to capture the array indexing once each for i and j and then just have a few pointer dereferences...

int reverse_array(vector_t *array) {
int err; // hey, what's going on? Why do we need this?
size_t count, i, j;
vector_get_size(array, &count);
i = 0;
j = count;
while (i < j) {
int t;
int *pi, *pj;
j--;
if ((err = vector_at(array, i, &pi)) != 0) return err;
if ((err = vector_at(array, j, &pj)) != 0) return err;
t = *pi; *pi = *pj; *pj = t;
i++;
}
return 0;
}

But if we're going to go that far, we can just get the array pointer from the vector:

void reverse_array(vector_t *array) {
size_t i, j, count;
int *elements;
vector_get_size(array, &count);
vector_get_array(array, &elements);
i = 0;
j = count;
while (i < j) {
j--;
int t = elements[i];
elements[i] = elements[j];
elements[j] = t;
i++;
}
}

Woah.  Hey, it doesn't fail any more, does it?  But who write the code this way?  Consider the C++ equivalent; something like (yes, someone better might use iterators but I'm not one with the zen of STL... at least I recognize my naivety which I doubt most people would)

void reverse_array(std::vector<int> &array) {
size_t i, j;
i = 0;
j = array.size();
while (i < j) {
j--;
int t = array[i];
array[i] = array[j];
array[j] = t;
i++;
}
}

How do you feel about that?  What exactly is the contract for the indexing operator in various class libraries?  The STL contract seems to indicate that the results of operator [] are defined when the index is between 0 and size-1; maybe this is correct.  Since the CLR makes no restrictions on the failure contract for functions, you have to assume they can fail.  Java seems to appropriately lock down the array access failures; they go beyond the STL contract to specifically call out that an IndexOutOfBoundsException is raised.  (You can debate which contract is better...)

Maybe the happy translation is:

void reverse_array(vector_t *array) {
size_t count, i, j;
vector_get_size(array, &count);
i = 0;
j = count;
while (i < j) {
int t;
int *pi, *pj;
j--;
if ((err = vector_at(array, i, &pi)) != 0) bugcheck(err);
if ((err = vector_at(array, j, &pj)) != 0) bugcheck(err);
t = *pi; *pi = *pj; *pj = t;
i++;
}
return 0;
}

Paranoia?  Good engineering?  I've been focussing on writing the code in C; writing in other languages can make the algorithm look simpler but hides a lot.

A few key points to note:

  • The old fashioned code has no obvious failure modes.  (Of course the hardware can fail but hey what are you going to do about this?)
  • Fancy new languages are hiding a lot of failure points
  • What's the contract of the reverse_array() function?  Can it fail or not?  Should try { } catch(...) { bugcheck(); } be added?

Comments

  • Anonymous
    June 19, 2005
    Michael,

    you seem to be misinterpreting things. The C version can fail the same way the C++ or Java version can. All versions fail when the count argument exceeds actual array size. The only difference is:

    - the C++ and Java versions exhibit failfast behavior: if the index is out of bounds, they throw an exception

    - the C version is failslow - it will shuffle around the bytes in memory even though they are in the wrong places, leading to catastrophic failure later on.

    There is no such thing as a no-fail piece of code. Just forget about that concept. Your article above doesn't really have a point. Sorry.