Precedence vs order, redux

Once more I'm revisiting the myth that order of evaluation has any relationship to operator precedence in C#. Here's a version of this myth that I hear every now and then. Suppose you've got a field arr that is an array of ints, and some local variables index and value:

int index = 0;
int value = this.arr[index++];

When all is said and done, value will contain the value that was in this.arr[0], and index will be 1, right?

Right. Now, the myth. I often hear this explained as "because the ++ comes after the variable, the increment happens after the array is dereferenced."

Wrong! "After" implies a relationship based on a sequence of events in time, and the logical sequence of events is extremely well-defined. The sequence of events for this program fragment is:

1) store zero in "index"
2) fetch a reference to this.arr and remember the result
3) fetch the value in "index" and remember the result
4) add one to the result of step 3 and remember the result
5) store the result of step 4 in "index"
6) look up the value in the reference from step 2 at the index from step 3 and remember the result
7) store the result of step 6 in "value"
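
As a rough illustration, the seven steps can be written out with explicit temporaries. This is only a sketch: the class name StepByStep, the array contents and the temporary names (arrayRef, oldIndex, incremented, element) are made up for the example, and the compiler is not obliged to generate code that looks like this. It assumes C# 7 or later for the tuple return types.

```csharp
using System;

class StepByStep
{
    // Hypothetical contents; the post does not say what is in arr.
    private int[] arr = { 10, 20, 30 };

    // The fragment exactly as written in the post.
    public (int value, int index) Compact()
    {
        int index = 0;
        int value = this.arr[index++];
        return (value, index);
    }

    // The same fragment with each logical step spelled out.
    public (int value, int index) Expanded()
    {
        int index = 0;                     // 1) store zero in index
        int[] arrayRef = this.arr;         // 2) fetch a reference to this.arr
        int oldIndex = index;              // 3) fetch the value in index
        int incremented = oldIndex + 1;    // 4) add one to the result of step 3
        index = incremented;               // 5) store the result of step 4 in index
        int element = arrayRef[oldIndex];  // 6) look up the element at the index from step 3
        int value = element;               // 7) store the result of step 6 in value
        return (value, index);
    }

    static void Main()
    {
        var demo = new StepByStep();
        Console.WriteLine(demo.Compact());   // prints (10, 1)
        Console.WriteLine(demo.Expanded());  // prints (10, 1)
    }
}
```

Note that the store to index in step 5 comes before the array lookup in step 6; the lookup simply uses the value remembered in step 3.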

I emphasize that the logical sequence is well-defined because the compiler, jitter and processor are all allowed to change the actual order of events for optimization purposes, subject to the restriction that the optimized result must be indistinguishable from the required result in single-threaded scenarios. Reorderings are sometimes observable in multithreaded scenarios. Analyzing the consequences of that fact is insanely complicated. I might blog about those issues at some point if I feel brave. But for normal scenarios, you should assume that the ordering of events in time precisely follows the rules laid out in the specification.

In fact, you can demonstrate that the increment happens before the indexing with this little self-referential gem:

int[] arr = {0};
int value = arr[arr[0]++];

What happens? First we fetch arr[0], which is 0, and remember that. Then we increment arr[0], so arr[0] becomes 1. Then we index arr with the remembered 0, which fetches the current value of arr[0], and we get 1. If we had done the increment after the (outer) indexing then the result would be zero, because the increment would not happen until after the value had been fetched.
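
Here is a small sketch that puts the two orderings side by side. The class and method names (SelfReferential, AsWritten, Expanded, MythOrder) are invented for the example. MythOrder is not something C# ever does; it is a hand-written version of the "increment happens after the indexing" story, included only to show that it would produce a different answer.

```csharp
using System;

static class SelfReferential
{
    // The expression from the post, unchanged.
    public static int AsWritten()
    {
        int[] arr = { 0 };
        return arr[arr[0]++];
    }

    // The same expression with temporaries, in the order described in the post.
    public static int Expanded()
    {
        int[] arr = { 0 };
        int oldValue = arr[0];   // fetch arr[0], which is 0, and remember it
        arr[0] = oldValue + 1;   // increment: arr[0] is now 1
        return arr[oldValue];    // index with the remembered 0; reads 1
    }

    // Hypothetical ordering implied by the myth: index first, increment last.
    public static int MythOrder()
    {
        int[] arr = { 0 };
        int oldValue = arr[0];
        int result = arr[oldValue];  // reads 0, because nothing has been incremented yet
        arr[0] = oldValue + 1;
        return result;
    }

    static void Main()
    {
        Console.WriteLine(AsWritten());  // prints 1
        Console.WriteLine(Expanded());   // prints 1
        Console.WriteLine(MythOrder());  // prints 0
    }
}
```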

-- Eric is on vacation; this posting was prerecorded --

Comments

  • Anonymous
    August 10, 2009
    You appear to be defining "order of operations" to be "the order in which side effects from evaluating an expression become visible", instead of the more common meaning of "the order in which you apply substitution rules, defining operator precedence". You might want to substitute a different phrase to avoid confusion with the more common meaning. For example, "evaluation order vs precedence", or even the slightly humorous "order in which operations take place vs order of operations", which plays on different meanings of "order" and "operations". Or maybe I'm the only one who was confused by the "X vs X" title.

  • Anonymous
    August 10, 2009
    Footnote: Raymond Chen used the phrase "order of evaluation", which might have been what you intended.  That sounds better than what I suggested.  Citation: http://blogs.msdn.com/oldnewthing/archive/2007/08/14/4374222.aspx

  • Anonymous
    August 10, 2009
    Eric himself used "order of evaluation" in the linked article.

  • Anonymous
    August 10, 2009
    I like to think of the postfix operators as first incrementing/decrementing, then returning the original value. Pretty easy concept to grasp, no magic here. (In C++ this is pretty obvious, because that's just what you do if you overloaded the ++ operator.)

  • Anonymous
    August 10, 2009
    The one that confused me is this code, which gives different results in C and C#: int[] data = {11, 22, 33}; int i = 1; data[i++] = data[i] + 5; I couldn't find a bit in the C# spec that said anything about when the increment should happen. I had presumed that the rvalue would be evaluated first, followed by the lvalue. Any ideas where I missed it?

  • Anonymous
    August 11, 2009
    Related question on stackoverflow.com: http://stackoverflow.com/questions/1260227/int-arr0-int-value-arrarr0-value-1

  • Anonymous
    August 11, 2009
    @Peter Ibbotson: The statement:  data[i++] = data[i] + 5; isn't valid C, as the order of evaluation is implementation defined (or undefined, I forget which).  It may work, it might not work, and the behavior may change between compiler vendors (and/or between different compiler versions from the same vendor). In short, don't do that in C and C++. It's fine in C# (and Java, among others) because C# explicitly specifies order of evaluation.  C and C++ do not.

  • Anonymous
    August 12, 2009
    I really recommend that you see the question on stackoverflow: http://stackoverflow.com/questions/1260227/int-arr0-int-value-arrarr0-value-1 I posted this question after I read the article. I always used to think that post increment happens after everything else. I stand corrected: post increment happens IMMEDIATELY. Yes! The variable to be incremented is incremented immediately, but its old value is still used for evaluating the next expression. If that next expression dereferences the value of the variable, it will use the incremented value, not the old value. For example: int x = 0, y = 0; int z = x + (y++) + y; What will happen first is that y will be incremented (to be 1 in this case), and its old value will be stored somewhere (a temp variable, for example). Afterwards we use the old value (stored in the temp) to evaluate the next expression, adding x to the return value of (y++), which is 0 as mentioned previously. This expression will now return 0 (0 + 0 = 0). We take this value (0) and use it to evaluate the next part of the expression (adding y). We add 0 to y (what is the value of y now? Yes! It's 1). So the result of z will be 0 + 1 = 1.

  • Anonymous
    August 12, 2009
    Even more fun: just use LINQPad and run the following: int i = 0; string.Format("{0} - {1} - {2}", i, i++, i++).Dump("i-i-i"); i.Dump("i-After"); This gives you "0 - 0 - 1" for i-i-i and 2 for i-After.

  • Anonymous
    August 31, 2009
    We tried this exact experiment once in college, as a survey of C compilers. Some gave 0, some gave 1, it was quite interesting. We also tried: int x = 0; int value1 = x++ + ++x; This wasn't totally reliable either. It gives either 1 or 2, for much the same reason. (This should be 2 in C#.) Combining these two experiments we constructed a program that gave anything from 0 to 3, depending on how the compiler was feeling. Unspecified semantics are great!