Jaa


Precedence vs Associativity vs Order

Raymond has written about this, I have written about Raymond writing about it, but I still frequently get questions from people who are unclear on the difference between precedence, associativity and evaluation order.

I suspect that this confusion arises from the difference between how most people are trained to evaluate arithmetical expressions versus how compilers generate code to evaluate arithmetical expressions. As a child it was drilled into me that the way to evaluate an arithmetical expression was to recursively apply the PEDMAS rule. That is, evaluate anything in parentheses first. Then evaluate any exponentiations. Then divisions, multiplications, additions and subtractions, in that order. So if you had 4 x 5 x (20 - 12 / 3) you start by evaluating what's in parentheses: 20 - 12 / 3. In there, there are no parens or exponents, so start with the division. Replace 12 / 3 with 4 to get 20 - 4. Then evaluate the subtraction to get 16. Now we have the value for the parens, and we are down to 4 x 5 x 16. Evaluate one of the multiplications -- but wait, we do not know what order to evaluate the multiplications in. But we can do it in any old order, so lets say 5 x 16 is 80, so we have 4 x 80, which is 320, done.

You'll notice that the algorithm that I was taught emphasizes that you do the work on whatever is in the deepest set of parentheses first, no matter what. Many people believe that arithmetical expressions in programming languages work the same way. They do not.

Rather, in programming languages, parentheses indicate how the results of evaluations are combined, but not necessarily the order in which the calculations are carried out. In languages with no side effects, the order of evaluation is irrelevant. But in languages where side effects might occur, order becomes relevant.

The evaluation of an arithmetical expression is controlled by three sets of rules: precedence rules, associativity rules, and order rules.

Precedence rules describe how an underparenthesized expression should be parenthesized when the expression mixes different kinds of operators. For example, multiplication is of higher precedence than addition, so 2 + 3 x 4 is equivalent to 2 + (3 x 4), not (2 + 3) x 4.

Associativity rules describe how an underparenthesized expression should be parenthesized when the expression has a bunch of the same kind of operator. For example, addition is associative from left to right, so a + b + c is equivalent to (a + b) + c, not a + (b + c). In ordinary arithmetic, these two expressions always give the same result; in computer arithmetic, they do not necessarily. (As an exercise can you find values for a, b, c such that (a + b) + c is unequal to a + (b + c) in C#?)

Now the confusing one.

Order of evaluation rules describe the order in which each operand in an expression is evaluated. The parentheses just describe how the results are grouped together; "do the parentheses first" is not a rule of C#. Rather, the rule in C# is "evaluate each subexpression strictly left to right".

The expression F() + G() * H() is equivalent to F() + (G() * H()), but C# does NOT evaluate G() * H() before F(). Rather, this is equivalent to:

temp1 = F();
temp2 = G();
temp3 = H();
temp4 = temp2 * temp3;
result = temp1 + temp4;

Another way to look at it is that the rule in C# is not "do the parentheses first", but rather to parenthesize everything then recursively apply the rule "evaluate the left side, then evaluate the right side, then perform the operation".

This is not a rule of C++. In C++, F(), G() and H() can be evaluated in whatever order the compiler chooses, so long as it combines the results in the right way. A legal C++ compiler might do left to right, right to left, parentheses first, whatever the compiler writer felt like.

The way this topic usually comes up is when someone has an expression chock full of side effects -- assignments, increments, decrements, pointer stores and so on, which they are attempting to convert from C++ to C#, and report the "bug" in the C# compiler to me that C# does not follow the "rules" of C++. Which is ironic, because since there are not actually any rules in C++ about order of evaluation between two sequence points. Thus the bug actually is that their code was never portable C++, and only worked because the code author happened to know (or guess) what ordering the compiler writer chose.

Comments

  • Anonymous
    May 23, 2008
    a = "Test"; b = 3; c = 6; (a + b) + c == "Test36" a + (b + c) == "Test9"

  • Anonymous
    May 23, 2008
    Not sure the exact value of x necessary for this in c#, but I'ma guess... a = 10 ^ x b = -a c = 5 (a+b) + c == 5 a + (b+c) == 0

  • Anonymous
    May 23, 2008
    Double a = 1; Double b = -1; Double c = Double.Epsilon; (a + b) + c == Double.Epsilon a + (b + c) == 0

  • Anonymous
    May 23, 2008
    Associativity is significant even when computing by hand: (3-2)-1 is different from 3-(2-1).

  • Anonymous
    May 23, 2008
    substraction is a non-associative operation as is division that's why the above example does not evaluate to the same result

  • Anonymous
    May 23, 2008
    a = -1, b= 1, c=Int32.MaxValue

  • Anonymous
    May 24, 2008
    The comment has been removed

  • Anonymous
    May 25, 2008
    The comment has been removed

  • Anonymous
    May 26, 2008
    I've seen it stated before, but does anyone really believe in PEDMAS as stated?  3-1+1 is 3, not 1. Similarly, 4*3/2 is 6, not 4, even under integer math. Essentially, addition and subtraction have equal precedence, as do division and mutliplication.

  • Anonymous
    June 03, 2008
    I know it's been a few days, but I still felt the need to reply. fwiw i'm not a math wiz either, but i know some bits. @Grant, PEDMAS is just a mnemonic but not the whole story, D & M have equal priority (left to right) as do Addition and Subtraction (left to right) - the grade 7 math blog agrees with me :) P E DM AS Addition is associative and commutative. Treat the -1 as being + (-1) and then apply the rule. Some examples of commenters above made the mistake of putting the brackets in the wrong place and ignoring the negative quantities. When done incorrectly, the answer is incorrect. Instead, I think you'll agree that the following are equivalent: 3 + (-1) + 1 = 1 + (-1) + 3 = 1 + 3 + (-1) similarly @ oldnewthing, you're correct that the statements you made are incompatible, because fundamentally hey are two different expressions. You mistakenly wrote 3 + -2 + -1 to an non-equivalent form: 3 + -1* (2 + -1) whereas correct bracketing could be 3 + -1 * (2 + 1) --> (distributive property) ax + bx = x(a+b) where x = -1 also equivalent expressions: 3-2-1 3 + -2 + -1 = 3 + -1 + -2 = -2 + 3 + -1 = 3 + (-2 + -1) = (3 + -2) + -1 ... Additionally, incorrect answers by computing devices with limited precision shouldn't be inferred as a mathematical proof of how numbers work. Computers can lie, and often do if you don't watch what you're doing. In a computer: 4000000000.0F + 1.0F = 4E+09 In real life: 4,000,000,000,000 + 1 = 4,000,000,000,001 When in doubt, use brackets. It lets both the compiler and programmers know what you mean. @steve - nice one. very funny.

  • Anonymous
    June 06, 2008
    Hi, Eric. Look at this example: using System; class A {    static void Main()    {        var x = F(1) + F(2) + F(3);    }    static A F(int x)    {        Console.WriteLine(x);        return null;    }    public static A operator +(A x, A y)    {        Console.WriteLine("+");        return null;    } } It prints: 1 2

3 + So, the first + operator evaluates before the F(3) operand. Is it required by the specification, or it is not specified? In other words, can a conforming implementation give the following result? 1 2 3 + + Thanks, nikov

  • Anonymous
    June 09, 2008
    The observed order is required. Look at it this way.   F(1) + F(2) + F(3) is equivalent to ((F(1) + F(2)) + F(3) which is equivalent to (A.op+(F(1), F(2)) + F(3) which is equivalent to A.op+(A.op+(F(1), F(2)), F(3)) And now the order of operations is clear. The values of A, op+ and A.op+ are resolved by the compiler and have no side effects so we'll ignore those.  The other operations are resolved in order from left to right. So first we compute F(1), then we compute F(2), then we compute the result of the inner op+, because that expression is to the left of F(3).  F(3) is computed, and now we have enough information to pass arguments to the outer op+.

  • Anonymous
    June 13, 2008
    "(As an exercise can you find values for a, b, c such that (a + b) + c is unequal to a + (b + c) in C#?)" Well, this isn't exactly what you was after, but I thought it was a humorous misinterpretation of your request :)  Here A + B + C isn't even the same as A + B + C. Regards, Matt    public class Launcher    {        static int a = 1;        static int A        {            get            {                return a *= a;            }        }        static int B        {            get            {                return a++;            }        }        static int C        {            get            {                return a += 5;            }        }        static void Main(string[] args)        {            int ans1 = (A + B) + C;            int ans2 = (A + B) + C;            int ans3 = A + (B + C);        }    }