The C abstract machine
I mentioned yesterday that the C/C++ language was defined to operate on an abstract machine. At the time I didn't know of an online reference to the C or C++ language standard, but a little birdie pointed me to this, which is a draft of the C language specification.
In section 5.1.2.3, you find:
The semantic descriptions in this International Standard describe the behavior of an
abstract machine in which issues of optimization are irrelevant.Accessing a volatile object, modifying an object, modifying a file, or calling a function
that does any of those operations are all side effects, which are changes in the state of
the execution environment. Evaluation of an expression may produce side effects. At
certain specified points in the execution sequence called sequence points, all side effects
of previous evaluations shall be complete and no side effects of subsequent evaluations
shall have taken place. (A summary of the sequence points is given in annex C.)In the abstract machine, all expressions are evaluated as specified by the semantics. An
actual implementation need not evaluate part of an expression if it can deduce that its
value is not used and that no needed side effects are produced (including any caused by
calling a function or accessing a volatile object).When the processing of the abstract machine is interrupted by receipt of a signal, only the
values of objects as of the previous sequence point may be relied on. Objects that may be
modified between the previous sequence point and the next sequence point need not have
received their correct values yet.The least requirements on a conforming implementation are:
* At sequence points, volatile objects are stable in the sense that previous accesses are
complete and subsequent accesses have not yet occurred.
* At program termination, all data written into files shall be identical to the result that
execution of the program according to the abstract semantics would have produced.
* The input and output dynamics of interactive devices shall take place as specified in
7.19.3. The intent of these requirements is that unbuffered or line-buffered output
appear as soon as possible, to ensure that prompting messages actually appear prior to
a program waiting for input.What constitutes an interactive device is implementation-defined.
More stringent correspondences between abstract and actual semantics may be defined by
each implementation.<The standard goes on and gives examples and clarifications associated with those examples>
There are more details scattered throughout the document, but this is the section that defines the machine. A couple of things that I find quite cool in it:
Section 2 which states that side effects of previous evaluations must be complete at sequence points (for example at the ; at the end of a statement) What that means is that a compiler is free to reorder operations in any way it chooses but it can't reorder them across sequence points IF the operations have side effects.
So if you have:
*p++ = p[i++];
printf("p is: %p", p++);
the compiler could generate about 4 different versions (depending on what order in which the post increments occur), but the first statement can't affect the printf statement.
Section 3 states that a compiler can reorder code if there are no side-effects. It also says that the compiler can discard code if it figures it's not used.
What's also fascinating is that the standard says nothing about concurrency or multiple execution units - the C language definition defines what happens within ONE execution unit. It is also explicitly mute about other execution units. Why is this important?
Consider the following code:
static int myFirstSharedValue, mySecondSharedValue;
myFirstSharedValue = 5;
mySecondSharedValue = 42;
if (mySharedValue == 5)
{
<do stuff>
}
The compiler could optimize the assignment of mySecondSharedValue to occur AFTER the if test (assuming that nothing in <do stuff> depends on the value of mySecondSharedValue. The compiler also might reorder the assignment to put the second assignment first!
What's worse, the processor might chose to reorder how the data was saved in memory. As long as the read of mySecondSharedValue for that execution unit returns the value of 42, it doesn't actually matter when the value of 42 is saved in memory. It might easily be before the first value is written. As long as you've only got one thread running, it doesn't matter.
On the other hand, if you have multiple threads that read those values, it would be easy to depend on the write to myFirstSharedValue happening before the write to mySecondSharedValue - after all, that's what the code says, and that's what the language defines.
But the language is defined for the abstract execution unit above, and that might not match the real system.
This is why people who try to write correct lock-free programming end up tearing their hair out, and why it's so hard to write lock free code.
Btw, Herb Sutter's been chairing a working group that's chartered to define the abstract machine definition for Windows programs, some of his work can be found here.
Comments
Anonymous
May 16, 2007
> So if you have: > p++ = p[i++]; > printf("p is: %p", p++); > the compiler could generate about 4 different versions No. If you had a program which obeyed the rules of the standard[] then the compiler could generate about 4 different versions. When you have a program which violates the standard, the standard says that results are undefined. The standard imposes no limits on what kind of generated code can result from your program. Look more closely at p++ = p[i++] There is a sequence point before that expression starts executing and another sequence point after that expression finishes. In between, your program modifies p once, which by itself would be allowable. But also in between, your program accesses the old[**] value of p for a purpose other than computing the new value of p, so you are violating the standard. Results are undefined. [ Some people proposed the phrase "strongly conforming" but it never caught on. Anyway this condition is more stringent than simply conforming to one implementation, but far less stringent than strictly conforming and producing identical results on all implementations.] [** If you wish you can propose an ordering of operations in which your expression accesses the new value there instead of the old value. However, the compiler doesn't have to choose the same ordering. Your expression allows code to be generated to access the old value and wipe your hard disk.]Anonymous
May 16, 2007
The comment has been removedAnonymous
May 16, 2007
The comment has been removedAnonymous
May 16, 2007
BTW: But do you think that using code like *p++ = p[i++] is good way of writing code? :-) IMHO robust quality code means avoiding unreadable or ambiguous statements; I do prefer splitting that statements into smaller and more readable ones...Anonymous
May 17, 2007
The comment has been removedAnonymous
May 17, 2007
My memory is fuzzy, but I thought assignment was also a sequence point and, since it was right associative, it would always evaluate the right side first. Thus, despite being cryptic, Larry's example would be well-defined. *p++ = p[i++]; The rhs (p[i++]) would be fully evaluated (including the side effect), THEN the lhs (*p++), THEN the assignment. Perhaps I'm suffering from false memories and wishful thinking. Aside from that, isn't that volatile storage class supposed to be the Band-Aid that keeps all this working in a multi-threaded environment? If you tag myFirstSharedValue and mySecondSharedValue with volatile, doesn't that solve the threading problems that an aggressive optimizer might have introduced?Anonymous
May 17, 2007
The comment has been removedAnonymous
May 17, 2007
Sys64738: I don't think anyone reading (or writing, for that matter) this blog would consider writing such code.Anonymous
May 17, 2007
So does this type of problem go away when you use a functional programming style? I know very little about FP, but have heard that it makes concurrency "easy" to deal with. Can somebody in the know make some comments on this? I'd be interested in what you have to say.Anonymous
May 17, 2007
Jonathan, I'm suspect that the problems are orthagonal. In many procedural languages (C# for example), the concept of specified but not defined or unspecified but defined don't happen, for functional languages, I suspect there are languages that have defined but not specified behaviors.Anonymous
May 17, 2007
Thursday, May 17, 2007 9:59 AM by Andrew Rogers > Furthermore, the prior value shall be read only to determine > the value to be stored. Your quotation is correct. > Here's an example of an expression that's unspecified but not undefined: > p = (++q) + (2 * q); No. The compiler is free to read the prior value of q for the purpose of computing 2*q and computing the new value of p, before the compiler writes the new value of q as a side effect of computing ++q. You correctly quoted why this results in undefined behaviour.Anonymous
May 17, 2007
Norman, You're right; I should have been more careful. To demonstrate the unspecified nature of subexpression evaulation giving rise to non-deterministic behaviour, I believe we can use a function call and so not run foul of the rule in 6.5 p2, as functions have a sequence point before the call: === Section 6.5.2.2 (Function calls): Semantics 10 The order of evaluation of the function designator, the actual arguments, and subexpressions within the actual arguments is unspecified, but there is a sequence point before the actual call. === static int q = 0; int inc_q() { return ++q; } int p = (inc_q()) + (2 * q); Is my toy example right now? :-)Anonymous
May 17, 2007
Friday, May 18, 2007 2:21 AM by Andrew Rogers > Is my toy example right now? :-) That's about the point where I stopped trying to interpret the standard and returned to practical programming ^_^Anonymous
May 17, 2007
> Is my toy example right now? :-) OK, yes I think it's right. But my previous answer to this question was better ^_^Anonymous
May 21, 2007
sigh So many hairs to split, so little time... Seriously: I consider this knowledge important when reading/debugging existing code, not writing new code. Code whose function depends on this, even if correct, is not code that I want to maintain or debug.Anonymous
May 22, 2007
The comment has been removedAnonymous
May 23, 2007
Wednesday, May 23, 2007 2:56 AM by PavelS > Can you, please, show me on some real-life example what can > go wrong if the volatile specifier is used as Adrian suggested? The standard leaves it to the implementation to define the meaning of volatile. An unfriendly but valid implementation can still let a lot of things go wrong. If an implementor wants to be friendly to programmers who want volatile to be most meaningful, then the implementor has to figure out how to persuade each CPU how to override its normal caching procedures, including making sure that each other CPU will lose its cache of an obsolete value. There are rumours that some programmers want their code to be slowed down by a factor of 5 instead of 100, so some implementors have to choose what features will be provided by volatile.Anonymous
May 27, 2007
The comment has been removedAnonymous
May 29, 2007
Ah, I found very nice forum conversation in which Jim Dempsey clearly explains on example what's going on with volatile from compiler point of view and from CPU point of view in VS2005 and in pre-VS2005. Then there were only few google searches about memory barriers and things are much clearer now. Link: http://softwarecommunity.intel.com/isn/Community/en-US/forums/30231929/ShowThread.aspx P.Anonymous
July 30, 2007
Hello, Did ...
- you mean to declare mySharedValue
- (likely) use myFirstSharedValue or
- I completely miss the point? Toby
- Anonymous
July 30, 2007
#2. Sorry about that.