Why no ++?
Oddly enough, my posting on how to handle a mad crush has become the third most popular article I've written so far. Who knew there were so many 10 year old girls interested in programming language design? I may have to turn this into an agony column. I've posted some more techniques for handling a mad crush, for the benefit of all you 10 year old girls reading my blog.
In other news, I was going through my email archive the other day and I found a discussion from back in the days when we were designing the VB.NET syntax that I thought might give some insight into the kinds of considerations that go into some of the smaller decisions.
A lot of people have opined that VB.NET and C# are in many ways the "same language", just with different syntax. If you were to take a chess board and change it from black and white to red and blue and you renamed all the pieces, but didn't change the more fundamental rules about the ways the pieces moved, the new game would in a very real sense be "the same game as chess," right? Whether VB.NET and C# really are in that sense mere syntactic variations on each other is a debatable point -- I think it is not nearly so cut-and-dried as some people think. There are a lot of reasons I think that, and I don't want to go into all of them today. One of the most interesting and important to me though is the most ineffable -- the "spirit" of a language in many ways both transcends its syntax and suffuses some small decisions in interesting ways.
Let's consider just one small example. VB6 did not have the += operator familiar to all C-like-language developers. In VB6 if you wanted to add ten to a number you said
x = x + 10
Looking at that, there's some textual redundancy there. Clearly the intention of the programmer is "increment the variable by ten", so why do we have to state the variable twice? This is an extremely common, simple operation so we can have a short-cut syntax for it that expresses it more compactly. (Such a shortcut, whereby a relatively cumbersome but legal syntax is replaced by a simpler syntax that adds no additional real representational power to the language is called a syntactic sugar.)
x += 10
And indeed, since incrementing a variable by one is also an extremely common operation, C-like languages have an even more compact syntactic sugar for this operation:
x++ or ++x
When the decision was made to add the += operator to VB.NET, one of our consultants commented
I am pleased to see the += construct (and I assume the other similar constructs) but the increment and decrement operators ++ -- are missing; this seems like an oversight. […] if the += etc. are allowed then it is natural that the ++ be allowed.
Actually, it isn't as natural as you might think, because VB.NET and C# are NOT mere syntactic variations on each other, and nor should they be. The reason why VB.NET doesn't have the increment operator illustrates the difference in spirit between VB.NET and C#. In fact, it is NOT the case that a C-like k++ is syntactic sugar for a VB-like k += 1. This is because in VB there has always been a strong line drawn between statements and expressions. Not so in languages like C. This is a perfectly legal C (and, for that matter, JScript) fragment, for instance:
{
2 + 2;
}
Not a particularly useful fragment, one that would probably produce a compiler warning, but legal -- because in C, a bare expression with no side effects is a legal statement.
In C it is also legal to go to the other end of the spectrum and say
x = (k += 1);
k += 1 is an expression in C which returns the assigned value and as a side effect assigns the value to k (and then to x). Most sensible people never use this fact about the += operator, but it is nonetheless true in C because C makes weak distinctions between statements and expressions.
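As a minimal sketch of that point (the helper names here are made up for illustration and are not from any library), the following C shows the value of the compound assignment being assigned onward:

```c
#include <assert.h>

/* Hypothetical helper: performs x = (k += 1) and reports both results. */
void assign_through_compound(int start, int *k_out, int *x_out)
{
    int k = start;
    int x;

    x = (k += 1);   /* the value of the expression (k += 1) is the new k */

    *k_out = k;
    *x_out = x;
}

/* Self-check: starting from 5, both k and x end up as 6. */
void check_compound_assignment(void)
{
    int k, x;
    assign_through_compound(5, &k, &x);
    assert(k == 6);
    assert(x == 6);
}
```

Calling check_compound_assignment() from a main function should run without tripping either assertion.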
That's not the case in VB.NET. In VB k += 1 is not an expression, it is a statement, and ne'er the twain shall meet. Such a bizarre construction is a syntax error in VB.NET.
But k++ is an expression. You can say in C
x = (k++) * (++k);
And get both the side effects (two increments), the multiplication and the assignment to x. This is an extremely common usage in C-like languages, so if we were going to add the ++ operator, then developers would expect that k++ is an expression in VB.NET, not a statement. (The fact that they are expressions explains why there are two forms -- one form returns the value of the variable before it is incremented, the other after.)
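Here is a small C sketch of the difference between the two forms. The function names are hypothetical, and each increment is kept in its own statement so that the result does not depend on evaluation order:

```c
#include <assert.h>

/* Postfix form: yields the value k had before the increment. */
int value_of_postfix(int *k)
{
    return (*k)++;
}

/* Prefix form: yields the value k has after the increment. */
int value_of_prefix(int *k)
{
    return ++(*k);
}

/* Self-check: walk k from 100 through both forms. */
void check_increment_forms(void)
{
    int k = 100;

    assert(value_of_postfix(&k) == 100);  /* old value comes back */
    assert(k == 101);                     /* but k has moved on */

    assert(value_of_prefix(&k) == 102);   /* new value comes back */
    assert(k == 102);
}
```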
If you make the increment operator only legal in statements -- alone on a single line -- then saving that keystroke is clearly not worth the testing effort that would go into this -- much less the dev effort, the documentation effort, the localization effort, the specs that would have to be written, etc.
But if you make it a legal expression in VB.NET, you run into all kinds of problems. This operator causes problems in C-like languages, and it would cause the same problems in VB.NET.
Consider a hypothetical world in which k++ and ++k are legal expressions. We would have to come up with some definition of what this code does:
Function Foo(ByRef x, ByVal y)
Foo = x * y
x = 10
End Function
k = 100
z = Foo(k++, ++k)
First of all, what numbers get passed to Foo? Is this the same as
z = Foo(100, 101)
-- stick k on the stack, do ++k, stick k on the stack, do k++
or this?
z = Foo(100, 102)
-- stick k on the stack, do k++, do ++k, stick k on the stack
or this?
z = Foo(101, 101)
-- do ++k, stick k on the stack, stick k on the stack, do k++
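The three candidate orderings can be spelled out one step per statement. The following C is only a sketch of those orderings, not of what any real compiler does; the struct and function names are invented for illustration:

```c
#include <assert.h>

/* Hypothetical record of what a call Foo(k++, ++k) would receive. */
struct call_args {
    int first;    /* value passed for the k++ argument */
    int second;   /* value passed for the ++k argument */
    int k_after;  /* value of k once both increments have run */
};

/* Stick k on the stack, do ++k, stick k on the stack, do k++. */
struct call_args ordering_a(int k)
{
    struct call_args r;
    r.first = k;
    ++k;
    r.second = k;
    k++;
    r.k_after = k;
    return r;
}

/* Stick k on the stack, do k++, do ++k, stick k on the stack. */
struct call_args ordering_b(int k)
{
    struct call_args r;
    r.first = k;
    k++;
    ++k;
    r.second = k;
    r.k_after = k;
    return r;
}

/* Do ++k, stick k on the stack, stick k on the stack, do k++. */
struct call_args ordering_c(int k)
{
    struct call_args r;
    ++k;
    r.first = k;
    r.second = k;
    k++;
    r.k_after = k;
    return r;
}
```

Starting from k = 100, these give (100, 101), (100, 102) and (101, 101) respectively, and in every case k has been incremented twice by the end.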
Second, does the k++ pass a reference to k to the byref parameter, or does it pass it by value? Suppose it passes it by reference -- then what does k equal when Foo returns? Do we do it like this:
Stick k on the stack, do ++k, stick k on the stack, do k++, call Foo, assign 10 to k
or this:
Stick k on the stack, do ++k, stick k on the stack, call Foo, assign 10 to k, do k++, so it's 11.
Or, if it passes by value then k is never set to 10, so we're fine -- k is definitely 102 when we're done as it is incremented twice.
We could come up with some answers to these questions, but they would be rather arbitrary. The answers are apparently sufficiently inobvious that the C++ standard leaves several of them unanswered, and therefore such code is not portable! The question about function parameter list evaluation order is dismissed as "The order of evaluation of arguments is undefined; take note that compilers differ. The order of evaluation of the postfix expression and the argument expression list is undefined." The question about whether k++ runs before or after the function runs is answered: "All side effects of argument expressions take effect before the function is entered." And with regard to whether k++ returns a reference to k -- in C++, the ++ and -- operators take lvalues and return lvalues, so yes, if VB.NET worked like C++ in this regard, evaluating k++ as an argument would have to pass a reference to k to a function expecting a byref argument.
Clearly these problems are solvable -- we solved them in C#, obviously! -- but you have to ask yourself how valuable saving that keystroke is if it adds these kinds of complexities to the language semantics. Adding complexity isn't necessarily bad, but you should get value for your additional complexity, not a single keystroke saved!
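The one question the standard does answer, that argument side effects take effect before the function is entered, can be observed directly. This is a sketch in C, which has the same rule (there is a sequence point between evaluating the arguments and making the call); the global and function names are hypothetical:

```c
#include <assert.h>

int k;                 /* the variable being incremented */
int k_seen_in_callee;  /* what the callee observed in k */

/* Hypothetical callee: records the current k and returns its argument. */
int observe(int arg)
{
    k_seen_in_callee = k;
    return arg;
}

/* Passes k++ as the only argument and returns what the callee received. */
int pass_postfix(int start)
{
    k = start;
    return observe(k++);
}

/* Self-check: the callee receives the old value but sees the new k. */
void check_side_effect_timing(void)
{
    int passed = pass_postfix(100);

    assert(passed == 100);            /* argument carries the old value */
    assert(k_seen_in_callee == 101);  /* increment already happened */
    assert(k == 101);
}
```

With a single argument there is no ordering ambiguity, so this behaves the same on any conforming compiler; the trouble described above only starts when two side effects on k share one argument list.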
k++ simply is not very BASIClike. Maybe that marks me as hopelessly old-fashioned that I'm saying that anything is not BASIClike when we have a BASIC that has object polymorphism! Call me old-fashioned then -- but I think that VB.NET is not and should not be "C# without braces" and that there are expression idioms which don't particularly make sense in VB. Heck, I would argue that ++ is an idiom which does not work particularly well in C, C++, Java, JScript or C#. Sure, ++ in C lets you write very dense code that can only be read if you understand the particular idiom of C -- but dense code is not necessarily fast code or maintainable code or readable code or correct code. Personally I only ever use ++ in loop incrementers and when walking strings a byte at a time -- I'll gladly change ++'s to +=1's.
I'm all for adding syntactic sugar that makes code less verbose. += is a great example of that as it adds nothing new to the language, it just makes an existing common operation lexically shorter. It's a real sugar. But increment operators are not mere sugar; they add new functionality, opening up immense cans of worms at huge cost for small gain.
Comments
Anonymous
July 19, 2004
The comment has been removed
Anonymous
July 19, 2004
Prefix, Dan. Prefix.
Anonymous
July 19, 2004
I learned BASIC on an Altair and very much appreciate all efforts to retain the "spirit" of the language.
The main difference, in my opinion, between the two languages is the balance between power and complexity. Making the optimum choice for both is not an easy thing.
Anonymous
July 19, 2004
The comment has been removed
Anonymous
July 19, 2004
Hey, did you just send me a mash note? :-)
Thanks, that's a nice thing to say.
So what do you do when you have a mad crush on a boy?
Anonymous
July 19, 2004
I wonder if a post on "what to do when you have a mad crush on a girl" might actually be more useful to the demographic for your blog. :-) With all respect to readers such as "A girl, but not 10 years old."
Anonymous
July 19, 2004
> Prefix, Dan. Prefix
Whoops, my bad. Never post after midnight ...
Anonymous
July 19, 2004
>>Note that this means that nonsense like k += --(--k++)++); is legal C++
Nope, it's not. ++/-- operators require l-value but do not return them. So --k++ is not legal. You also can't pass k++ by reference. Also using several side-effects on the same variable in one statement is hardly used by anyone in his right mind, because the result is unpredictable.
Don't know how this behaves in C# though.
Anonymous
July 19, 2004
When I did support for Borland I remember someone having a problem with converting a two character string to an integer.
char *x; int y;
y=(*(x++)-'0')*10 + *(x++)-'0';
Since the ++ bit is only guaranteed to happen after the statement executes this actually turned into:
y=(*x-'0')*10 + *x -'0'; x+=2;
Perfectly legal according to the ANSI spec.
Anonymous
July 19, 2004
Are you sure += is only syntactic sugar?
What about:
x.longCalculation().y += 1
I sure hope longCalculation isn't called twice.
Anonymous
July 19, 2004
This reminds me of a discussion I had a couple of years ago when I suggested that the various "End <construct>" statements could have been reduced to simple "End" statements in VB.NET. It was quite possible to do, of course, would save a few characters when typing, and would even simplify automated code generation.
It was a very un-Basiclike idea though; that small added redundancy can make a huge difference to comprehension of code.
Anonymous
July 20, 2004
Some1: Whoops. You are correct. I've removed the error. Thanks!
Anonymous
July 20, 2004
The comment has been removed
Anonymous
July 20, 2004
The comment has been removed
Anonymous
July 20, 2004
You're on the right track.
A more rigorous way to think about it though would be to consider the grammar of the language. Let me give you a quick example of what I mean. Part of the grammar for a simple language might be something like
* A PROGRAM is a list of STATEMENTS.
* A STATEMENT is either an ASSIGNMENTSTATEMENT or a CONDITIONALSTATEMENT
* An ASSIGNMENTSTATEMENT is VARIABLENAME = EXPRESSION
* An EXPRESSION is an ADDEXPRESSION or a MULTIPLYEXPRESSION or a NUMBER or a VARIABLENAME
* An ADDEXPRESSION is an EXPRESSION followed by + or -, followed by another EXPRESSION
* etc.
In this grammar "x = 1 * 3 + z * 4" is a PROGRAM that consists of one STATEMENT, an ASSIGNMENTSTATEMENT, which consists of an ADDEXPRESSION that consists of two MULTIPLYEXPRESSIONS... etc.
So what does this have to do with VB vs C? Well, in C, a STATEMENT can be a lot of things, including just an expression. In VB, a STATEMENT cannot be just an EXPRESSION.
This is particularly important in VB because in a STATEMENT, = means assignment, but in an EXPRESSION it means comparison. (That then explains why C has a different operator for comparison, ==.)
Anonymous
July 20, 2004
>In VB, a STATEMENT cannot be just an EXPRESSION.
Thanks! I get it.
Anonymous
July 20, 2004
The comment has been removed
Anonymous
July 20, 2004
Different thoughts on the operators.
Anonymous
July 20, 2004
Such an argument -- that there is a relatively simple morphism between C# and VB which preserves computations -- is certainly interesting from a computer science perspective.
However, the vast majority of users of both C# and VB are not professional computer scientists and they really don't care much about notions like expressivity!
As you point out, the most expressive class is that of the Turing Complete languages. The fact that any computable problem can be solved in either language is not particularly germane to the designed-in differences between the languages! The designed-in differences have a lot more to do with practical differences between different developer constituencies than any particular class of computational tasks.
I'm hoping Paul Vick talks about this a bit in his blog later this week. It's a big topic.
Anonymous
July 20, 2004
Thanks for the reply... My argument was that, while all TC languages can do "the same" thing, they do not share the same expressiveness. For example, if I were to translate some pure lambda calculus to Scheme, it would be fairly simple, as Scheme is a superset (of sorts) of it. Likewise, any Scheme program could be turned into pure lambda calculus. But now suppose that we take the program converted to Scheme, and then throw some assignments into it, giving it side-effects here and there. It could still be translated to lambda calculus, but it would require global changes to the code. Thus, Scheme is more expressive than lambda calculus, though computationally they can do the same thing.
With the C# and VB models, changes from one to the other are easy, and changes made to the converted programs are easy and always local (however, I'm not an expert with VB.NET, so there may be a special feature I'm missing). Thus, there really only is a superficial difference between the two.
You're absolutely right that the common programmer doesn't care about these finer language points: I know so many people who think inner-classes in Java are a pain because they require "too much typing." The semantics of Java inner-classes are superb: they are basically lexical closures from Scheme; but if you were to use them as you would lambda in Scheme, the code would look quite large, and the meaning would be less apparent. So, we end up with iterators and less expressive features that are, at least, easier to type.
Anonymous
July 21, 2004
I've missed the syntax:
{ref. expression} := * {op} {expression}
That is available in Burroughs Algol (Unisys ClearPath NX/LX family machines).
The "*" basically tells the compiler to re-use the reference on the stack that resulted from resolving the reference expression whether a scalar entity or an array element, etc.
More flexible than +=/-= because it can be used to set/clear bitfields, etc. as well as merely inc/dec. Also it might be a bit more "Basic-like" than +=/-=, though it doesn't (necessarily) touch the issues surrounding ++var or var++, et al.
After all, that Algol compiler allows embedded assignments in expressions though their use is frowned on in most instances:
{ref1} := {exp1} {op1} ({ref2} := * {op2} {exp2})
... and so on.
Anonymous
July 22, 2004
I always hate it when someone says "syntactic sugar" because it is usually condescending and dismissive, and indicates that I'm not going to get the language feature I want! :)
Anonymous
July 22, 2004
Eric, based on what you've already written you are probably not going to agree with me, but...
I hear what you are saying about the spirit of the BASIC language, but I would argue that the spirit should not be maintained such that you lose sight of the real goals. For example, just because an SUV is not a sports car does not mean it shouldn't incorporate technologies from sports cars that improve handling and performance if and when it helps it achieve its own goals.
For me, I think VB.NET should be the easiest tools to create readable, robust, and maintainable business systems, with business systems being anything a business might want to implement for internal use and an ISV might want to implement for a vertical (yes, there are holes in this definition, but it's late so forgive me). Operating systems and printer drivers come to mind as being not business systems, but almost everything above that is. VB should be able to be used to implement robust and powerful server software, for example.
So the fact that VB originally maintained statements and expressions as separate and distinct should not keep you from evolving the language. I don't see any reason why VB can't be evolved to allow more powerful expressions. The dBase language was very much a statement vs. expression language, and one of the best programming languages I've ever used was a dBase compiler called Clipper that extended the language in exactly that way (I know, I wrote a ~1000 page book on it!) Clipper v5.0 added support for ++, --, and inline assignment using the ":=" operator as well as a lot of other "syntactic sugar" <shudder; I hate that phrase>.
So I actually agree that ++ and -- don't make sense to be added on their own. But what I think would make sense would be to support richer expressions in VB such as an inline assignment operator like ":=" and to also support ++ and --. Allowing inline assignment in an expression would not require that you treat all statements as if they were expressions. As you've already made quite clear, VB.NET and C# are not exactly alike so this aspect would not have to behave exactly alike. Look at Clipper; it was very much a statement language and they made it work very nicely (too bad 10 years ago Computer Associates bought Clipper and not Microsoft... :)
Well I tried...
Anonymous
July 23, 2004
> I think VB.NET should be the easiest tools to create readable, robust, and maintainable business systems
I 100% agree. C-style operators work against that goal by making the language more dense. More dense means less readable and maintainable, and less maintainable means less robust. Short cuts make long delays!
Anonymous
July 28, 2004
The comment has been removed
Anonymous
May 17, 2007
Eric, you are wrong. Again! You wrote "And with regard to whether k++ returns a reference to k -- in C++, the ++ and -- operators take lvalues and return lvalues, so yes, if VB.NET worked like C++ in this regard, evaluating k++ as an argument would have to pass a reference to k to a function expecting a byref argument." Prefix increment/decrement operators in C++ return an l-value, postfix increment/decrement return an r-value. Therefore, it is illegal to write i++ ++ in C/C++. An r-value cannot be bound to a non-const reference variable in C++. Continuing this reasoning, it is also illegal to call f( k++ ) if f has been declared like this: void f( int& ); but it is legal for the cases void f( int ); or void f( const int& ); VC8 produces a warning for the above code. I don't remember the exact warning level, but compile with the /W4 option to see it.
Anonymous
September 24, 2009
Actually I found this problem existing both in C and C++. When I wrote int a=5; int b=++a + (++a + a--); the answer of b was 21. When I wrote int b,a=5; b=++a + (++a + a--); the value of b came out to be 18, and when I tried this same problem in the gcc compiler the answer was 20. Can anyone explain why the answers are varying? Please, it's important.
The ++ operator both returns a value and causes a side effect. The C# specification carefully defines the order in which expressions that have side effects are processed, so that every implementation of C# gives you the same answer. However, the C and C++ specifications explicitly do not specify the order in which side effects occur. A conforming C compiler is allowed to run the side effects in your expression in any order it chooses. And therefore any two compilers can disagree on the meaning of this expression. You should therefore avoid such expressions, because you cannot know that the compiler will actually do what you want. If you are interested in this topic, you should look up "sequence point" on Wikipedia; that will get you started on understanding exactly what is specified in C and what is left up to the compiler. -- Eric