Foolish consistency is foolish

Once again today's posting is presented as a dialogue, as is my wont.

Why is var sometimes required on an implicitly-typed local variable and sometimes illegal on an implicitly typed local variable?

That's a good question but can you make it more precise? Start by listing the situations in which an implicitly-typed local variable either must or must not use var.

Sure. An implicitly-typed local variable must be declared with var in the following statements:

var x1 = whatever;
for(var x2 = whatever; ;) {}
using(var x3 = whatever) {}
foreach(var x4 in whatever) {}

And an implicitly-typed local variable must not be declared with var in the following expressions:

from c in customers select c.Name
customers.Select(c => c.Name)

In both cases it is not legal to put var before c, though it would be legal to say:

from Customer c in customers select c.Name
customers.Select((Customer c) => c.Name)

Why is that?

Well, let me delay answering that by criticizing your question further. In the query expression and lambda expression cases, are those in fact implicitly typed locals in the first place?

Hmm, you're right; technically neither of those cases has a local variable. In the lambda case, that is a formal parameter. But a formal parameter behaves almost exactly like a local variable, so it seems reasonable to conflate the two in casual conversation. In the query expression, the compiler is going to syntactically transform the range variable into an untyped lambda formal parameter regardless of whether the range variable is typed or not.
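
To make the lambda half of this concrete, here is a minimal sketch showing that the implicitly typed and explicitly typed parameter forms both compile and produce the same result. The Customer class and the LambdaParameterDemo helper are hypothetical names invented for illustration; they are not from any real codebase.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type, made up for this illustration.
public sealed class Customer
{
    public string Name { get; }
    public Customer(string name) { Name = name; }
}

public static class LambdaParameterDemo
{
    // Implicitly typed formal parameter: the type of c is inferred,
    // and no var keyword appears.
    public static List<string> ImplicitlyTyped(IEnumerable<Customer> customers) =>
        customers.Select(c => c.Name).ToList();

    // Explicitly typed formal parameter: also legal.
    public static List<string> ExplicitlyTyped(IEnumerable<Customer> customers) =>
        customers.Select((Customer c) => c.Name).ToList();

    // The form under discussion, customers.Select((var c) => c.Name),
    // is deliberately left out because the compiler rejects it.

    public static void Main()
    {
        var customers = new List<Customer> { new Customer("Ada"), new Customer("Grace") };
        Console.WriteLine(string.Join(", ", ImplicitlyTyped(customers)));
        Console.WriteLine(string.Join(", ", ExplicitlyTyped(customers)));
    }
}
```

Both calls print the same two names, which is the point: the type annotation on the lambda parameter is optional, but var is not an accepted way to spell "no annotation".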

Can you expand on that last point a bit?

Sure. When you say

from Customer c in customers select c.Name

that is not transformed by the compiler into

customers.Select((Customer c) => c.Name)

Rather, it is transformed into

customers.Cast<Customer>().Select(c => c.Name)

Correct. Discussing why that is might be better left for another day.

Indeed; the point here is that regardless of whether a type appears in the query expression, the lambda expression in the transformed code is going to have an untyped formal parameter.
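
The transformation described above can be sketched as follows. This is only an illustration under assumed names: Customer and QueryTranslationDemo are hypothetical, and the non-generic ArrayList is used simply because it makes the typed range variable do visible work.

```csharp
using System;
using System.Collections;
using System.Linq;

// Hypothetical type, made up for this illustration.
public sealed class Customer
{
    public string Name { get; }
    public Customer(string name) { Name = name; }
}

public static class QueryTranslationDemo
{
    // Query expression with an explicitly typed range variable.
    public static string[] ViaQuery(IEnumerable customers) =>
        (from Customer c in customers select c.Name).ToArray();

    // The method-call form the query is transformed into:
    // a Cast<Customer>() call followed by a lambda whose
    // formal parameter carries no type annotation.
    public static string[] ViaMethods(IEnumerable customers) =>
        customers.Cast<Customer>().Select(c => c.Name).ToArray();

    public static void Main()
    {
        var customers = new ArrayList { new Customer("Ada"), new Customer("Grace") };
        Console.WriteLine(string.Join(", ", ViaQuery(customers)));
        Console.WriteLine(string.Join(", ", ViaMethods(customers)));
    }
}
```

Note that the sequence here is a non-generic IEnumerable, so c => c.Name could not be bound without the Cast<Customer>() call; the type written in the query ends up in the cast, not on the lambda parameter.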

So now that we've clarified the situation, what is your question?

C# is inconsistent; var is required on an implicitly-typed local variable (regardless of the "container" of the local declaration), but var is illegal on an implicitly-typed lambda parameter (regardless of whether the lambda parameter is "natural" or is the result of a query expression transformation). Why is that?

You keep on asking "why?" questions, which I find difficult to answer because your question is secretly a "why not?" question. That is, the question you really mean to ask is "I have a notion of how the language obviously ought to have been designed; why is it not like that?" But since I do not know what your notion is, it's hard to compare its pros and cons to the actual feature that you find inconsistent.

The problem you are raising is one of inconsistency; I agree that you have identified a bona fide inconsistency, and I agree that a foolish, unnecessary inconsistency is bad design. Our language is designed to be easy to comprehend; foolish inconsistencies work against comprehensibility. But what I don't understand is how you think that inconsistency ought to have been addressed.

I can see three ways to address that inconsistency. First, make var required on lambdas and query expressions, so that it is consistently required. Second, make var illegal on all locals, so that it is consistently illegal. And third, make it optional everywhere. What is the real "why not?" question?

You're right; I've identified an inconsistency but have not described how I think that inconsistency could be removed. I don't know what my real "why not?" is so let's look at all of them; what are the pros and cons of each?

Let's start by considering the first: require var everywhere. That would then mean that you have to write:

from var c in customers join var o in orders...

instead of

from c in customers join o in orders...

And you have to write

customers.Select((var c) => c.Name)

instead of

customers.Select(c => c.Name)

This seems clunky. What function does adding var in these locations serve? It does not make the code any more readable; it makes it less readable. We have purchased consistency at a high cost here. The first option seems untenable.

Now consider the second option: make var illegal on all locals. That means that your earlier uses of var on locals would become:

x1 = whatever;
for(x2 = whatever; ;) {}
using(x3 = whatever) {}
foreach(x4 in whatever) {}

The last case presents no problem; we know that the variable declared in a foreach loop is always a new local variable. But in the other three cases we have just added a new feature to the language; we now have not just implicitly typed locals, we now have implicitly declared locals. Now all you need to do to declare a new local is to assign to an identifier you haven't used before.

There are plenty of languages with implicitly declared locals, but it seems like a very "un-C#-ish" feature. That's the sort of thing we see in languages like VB and VBScript, and even in them you have to make sure that Option Explicit is turned off. The price we pay for consistency here is very different, but it is still very high. I don't think we want to pay that price.

The third option -- make var optional everywhere -- is just a special case of the second option; again, we end up introducing implicitly declared locals.

Design is the art of compromising amongst various incompatible design goals. In this case, consistency is a goal but it loses to more practical concerns. This would be a foolish consistency.

Is the fact that there is no good way out of this inconsistency an indication that there is a deeper design problem here?

Yes. When I gave three options for removing the inconsistency, you'll notice that I made certain assumptions about what was and was not allowed to change. If we were willing to make larger changes, or if we had made different design decisions in C# 1.0, then we wouldn't be in this mess in the first place. The deeper design problem here is the fact that a local variable declaration has the form

type identifier ;

This is not a great statement syntax in the first place. Suppose C# 1.0 had instead said that a local variable must be declared like this:

var identifier : type ;

I see where you are going; JScript.NET does use that syntax, and makes the type annotation clause optional. And of course Visual Basic uses the moral equivalent of that syntax, with Dim instead of var and As instead of :.

That's right. So in typing-required C# 1.0 we would have

var x1 : int = whatever;
for(var x2 : int = whatever; ;) {}
using(var x3 : IDisposable = whatever) {}
foreach(var x4 : int in whatever) {}

This is somewhat more verbose, yes. But this is very easy to parse, both by the compiler and the reader, and is probably more clear to the novice reader. It is crystal clear that a new local variable is being introduced by a statement. And it is also nicely reminiscent of base class declaration, which is a logically similar annotation. (You could have just as easily asked "Why does the constraining type come to the left of the identifier in a local, parameter, field, property, event, method and indexer, and to the right of the identifier in a class, struct, interface and type parameter?" Inconsistencies abound!)

Then in C# 3.0 we could introduce implicitly typed locals by simply making the annotation portion optional, and allowing all of the following statements and expressions:

var x1 = whatever;
for(var x2 = whatever; ;) {}
using(var x3 = whatever) {}
foreach(var x4 in whatever) {}

from c in customers select c.Name
from c : Customer in customers select c.Name
customers.Select(c => c.Name)
customers.Select((c : Customer) => c.Name)

If this syntax has nicer properties then why didn't you go with it in C# 1.0?

Because we wanted to be familiar to users of C and C++. So here we have one kind of consistency -- consistency of experience for C programmers -- leading a decade later to a problem of C# self-consistency.

The moral of the story is: good design requires being either impossibly far-sighted, or accepting that you're going to have to live with some small inconsistencies as a language evolves over decades.

Comments

  • Anonymous
    June 25, 2012
    Great post.  Your moral of the story, in my opinion, really applies to any software system.

  • Anonymous
    June 25, 2012
    Interesting, I never noticed this inconsistency before... This is probably a good sign: if you don't notice it, it's not really an issue.

  • Anonymous
    June 25, 2012
    I /really/ want the "var" keyword to come to VB.  As a VB programmer I hate having to turn Option Infer on for either the entire project, or the entire file.  It would be so much better to be able to turn type inference on at the variable level only when you need it, like you can in C#, and leave it off everywhere else.

  • Anonymous
    June 25, 2012
    This is a pretty interesting post.  Thanks, Eric.  Like Thomas Levesque, I never really paid attention to the inconsistency and agree with his verdict: not an issue.  After reading your forcing the hypothetical questioner to sort through the response 'himself', I think I'd be wary of asking you a vague question!

  • Anonymous
    June 25, 2012
    @Bradley Uffner I don't understand what you are asking for. Even with Option Infer On I can still write "Dim x As Integer = 5" if I want to. I can't think of any reason to not use Option Infer at the project level, especially if you are using Option Strict.

  • Anonymous
    June 25, 2012
    "C# is inconsistent; var is required on an implicitly-typed local variable (regardless of the "container" of the local declaration), but var is illegal on an implicitly-typed lambda parameter" Is it just me, or is there no inconsistency here? "implicitly-typed local variable" -> "implicitly-typed lambda parameter", "variable" -> "parameter"

  • Anonymous
    June 25, 2012
    I like this other guy. He writes in a very pleasant font colour

  • Anonymous
    June 25, 2012
    Another great, thought-provoking post.  Creating an appeal for C and C++ programmers surely still plays a significant role in technology adoption.

  • Anonymous
    June 25, 2012
    Google's Go solves this by introducing a new operator, :=, as follows: var x = BlaBla.Bla() would become x := BlaBla.Bla(). That's quite nice.

  • Anonymous
    June 25, 2012
    glad you didn't go with the last option, first thing I would have said is "what's with all those redundant var's!"

  • Anonymous
    June 25, 2012
    @Eamon Nerbonne: That looks more like Pascal than C.

  • Anonymous
    June 25, 2012
    The last alternative would also seem to be excessively repetitive if you want to declare multiple variables in one declaration statement. Instead of int a, b, c; you'd now have something along the lines of var a : int, b : int, c : int; which most would probably argue is acceptable for int. Suppose however that it's a long type name (perhaps fully qualified, starting right at global::) with a few generic type parameters. What then? And no fair bringing up C's "char* p, q;" here!

  • Anonymous
    June 26, 2012
    @MK: Then I'd argue for typedefs in C# (something I'd argue for anyway - along with partially constrained generics aka generic typedefs).

  • Anonymous
    June 26, 2012
    > var a : int, b : int, c : int;
    In practically every language that uses this syntax, you can (or have to) write: var a, b, c: int; Indeed, if you restrict var such that it only allows declaring variables of a single type in a single var, the above is unambiguous - if the type is there, it applies to all listed variables, and if not, then it means "infer types" (which would then be illegal for the lack of initializers here).
    Indeed, though sometimes this changes. In VB6, "Dim Curly, Larry, Moe as Stooge" meant to annotate Moe as Stooge and Curly and Larry were variants. In Visual Basic .NET, all three are stooges. -- Eric

  • Anonymous
    June 26, 2012
    @Carl Daniel: The 'using alias = class;' syntax fulfills most typedef scenarios already, e.g. using ShortName = NameSpace.OthernameSpace.IncrediblyLongClassNameNoOneWantsToTypeFiftyTimes; :)

  • Anonymous
    June 27, 2012
    @Jonathan Allen I want Option Infer Off, and still have the ability to make the compiler infer a single variable.  I don't like programming with "option infer on", it feels sloppy and makes the code harder to read and understand, but there are some places, mostly linq, where it is required for a specific variable.

  • Anonymous
    June 27, 2012
    RaceProUK: Yes, the type aliasing ability of "using" is great, but it only works within a single file. The value of typedef is greatly amplified by C's #include mechanism, making it really easy to have a single typedef apply throughout your project. If you want to alias a type globally (possibly to make it easy to change), there's no easy way to do that.

  • Anonymous
    June 28, 2012
    @Gabe I agree with the linux styleguide and most other C/C++ I've seen there: 95% of all typedefs only make the code much harder to read for no purpose whatsoever. Especially in C# where you can use well named classes anyhow and don't have void* everywhere there's almost never a reason for them in a well written code base (i.e. the situations where a typedef would make good sense in C/C++ rarely ever come up in C#).

  • Anonymous
    July 02, 2012
    You are talking about implicit parameters, not variables. And from that point of view it is consistent. You can't write: public void Something(var value) {} Also, in the matter of delegates it does make sense not to be able to use var, since a C# delegate is not simply a function / method pointer as in C++ for example; people should know that.

  • Anonymous
    July 03, 2012
    How can one answer "Why is var sometimes required?" without at least mentioning unnameable types (I think Eric's already written several posts elaborating on these, linking is of course better than repeating the explanation)?

  • Anonymous
    July 03, 2012
    But why doesn't a 'var' declaration allow initialization to null?

  • Anonymous
    August 15, 2012
    Just for the record - the original quote seemed already good enough: A foolish consistency is the hobgoblin of little minds. I do wish Emerson had also thrown in the opposite: A foolish inconsistency is the hobgoblin of arrogant designers. :) Either way, the operative word there is 'foolish'...
