C# 3.0 is still statically typed, honest!

Since LINQ was announced I've seen a lot of really positive feedback and a lot of questions and concerns. Keep 'em coming! We need this feedback so that we can both correct misconceptions early and get the design right now.

One of the misconceptions that I've seen a lot over the last few days in forums, blog posts and private emails is a confusion about what the new "type inferencing" feature implies for the type safety of the language. Apparently we have not been sufficiently clear on this point: C# 3.0 will be statically typed, just like C# 1.0 and 2.0. The var declaration style does not introduce dynamic typing or duck typing to C#.

I think the confusion may arise from familiarity with other languages such as JScript. In JScript this is perfectly legal:

var foo = new Blah();
foo = 123;
foo = "hello";

JScript is a dynamically typed language. You can assign any value of any type to a var.

In C# 3.0, the var statement means "look at the type of the thing assigned to the variable, and act as though the variable was declared with that type." In other words, in C# the code above is just syntactic sugar for

Blah foo = new Blah();
foo = 123;
foo = "hello";

which of course would produce a type error on the second and third lines.
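
To make the point concrete, here is a minimal sketch of how one might check the inferred type. Blah is the placeholder class from the snippet above; TypeProbe, StaticTypeOf and Program are hypothetical names introduced only for this illustration. The generic helper reports the compile-time type the compiler chose, and the commented-out lines are the assignments it would reject.

```csharp
using System;

class Blah { }

static class TypeProbe
{
    // T is bound to the static (compile-time) type of the argument,
    // so typeof(T) reveals what the compiler inferred for a var.
    public static Type StaticTypeOf<T>(T value)
    {
        return typeof(T);
    }
}

class Program
{
    static void Main()
    {
        var foo = new Blah();    // declared exactly as if written: Blah foo
        // foo = 123;            // would not compile: int is not convertible to Blah
        // foo = "hello";        // would not compile: string is not convertible to Blah

        var n = 123;             // declared exactly as if written: int n
        n = n + 1;               // fine, still an int

        Console.WriteLine(TypeProbe.StaticTypeOf(foo).Name);  // prints "Blah"
        Console.WriteLine(TypeProbe.StaticTypeOf(n).Name);    // prints "Int32"
    }
}
```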

If you take a look at section 26.1 of the C# 3.0 specification you'll see that the var statement has a lot of restrictions on it to ensure that the compiler always has enough information to make the correct type inference. Namely:

  • the declarator must include an initializer, so that we can infer the type of the variable from the type of the initializer
  • the initializer has to be something that we can figure out the type of – not null or a collection initializer
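
The two restrictions above can be sketched as follows. This is an illustration only: VarRules, Run and the variable names are hypothetical, and the rejected declarations are left as comments so that the rest compiles.

```csharp
using System;
using System.Collections.Generic;

static class VarRules
{
    public static string Run()
    {
        var count = 42;                    // OK: initializer has type int
        var names = new List<string>();    // OK: initializer has type List<string>
        var nothing = (string)null;        // OK: the cast gives null a type

        // var a;                          // rejected: no initializer to infer from
        // var b = null;                   // rejected: null by itself has no type
        // var c = { 1, 2, 3 };            // rejected: a bare { ... } initializer has no type

        names.Add("hello");
        return (count + names.Count) + " " + (nothing == null);
    }
}

class Program
{
    static void Main()
    {
        Console.WriteLine(VarRules.Run());  // prints "43 True"
    }
}
```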

Compare this to JScript .NET, which has a much stronger type inference mechanism. JScript .NET does not require initializers in var statements; the compiler tracks all assignments to the variable and infers the best type. If, say, only strings are assigned to a variable then it will infer the string type. JScript .NET also infers return types of functions by a similar mechanism. But the goal of the JScript .NET type inference mechanism was to increase the performance of legacy dynamically typed code. If we can infer a type and thereby generate faster, smaller code, we do so. If not, we don't.

Then why introduce this syntactic sugar in C# 3.0? C# doesn't have a body of legacy dynamic code like JScript and already generates efficient code.

There are two reasons: one that exists today, and one that will crop up in 3.0.

The first reason is that this code is incredibly ugly because of all the redundancy:

Dictionary<string, List<int>> mylists = new Dictionary<string, List<int>>();

And that's a simple example – I've written worse. Any time you're forced to type exactly the same thing twice, that's a redundancy that we can remove. Much nicer to write

var mylists = new Dictionary<string, List<int>>();

and let the compiler figure out what the type is based on the initializer.

Second, C# 3.0 introduces anonymous types. Since anonymous types by definition have no names, you need to be able to infer the type of the variable from the initializing expression if its type is anonymous.
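
As a rough sketch of that second case, assuming the C# 3.0 anonymous type syntax: the property names and values below are made up for illustration, and AnonymousDemo, Describe and Program are hypothetical names. There is no type name that could be written in place of var here, yet member access is still checked at compile time.

```csharp
using System;

static class AnonymousDemo
{
    public static string Describe()
    {
        // The type of 'customer' has no name the programmer can write,
        // so the declaration has to be inferred from the initializer.
        var customer = new { Name = "Alice", Age = 30 };

        // customer.Phone would not compile: the anonymous type has no such property.
        // customer = 123 would not compile: int is not convertible to the anonymous type.

        return customer.Name + " is " + customer.Age;
    }
}

class Program
{
    static void Main()
    {
        Console.WriteLine(AnonymousDemo.Describe());  // prints "Alice is 30"
    }
}
```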

We'll discuss the reasoning behind anonymous types in another post.

Comments

  • Anonymous
    September 27, 2005
    The comment has been removed
  • Anonymous
    September 27, 2005
    Any way we can get 'var' replaced with 'dim' and the '=' replaced with 'as'? :)
  • Anonymous
    September 27, 2005
    That's hilarious! I will share your suggestion with the language design committee, but I don't think they'll go for it.
  • Anonymous
    September 27, 2005
    The comment has been removed
  • Anonymous
    September 27, 2005
    The more minimal

    foreach (var x in "") ;

    shows this as well.
  • Anonymous
    September 27, 2005
    Here's a different way to get an ICE with inferencing

    static void Main(string[] args)
    {
    var x = new[]{};
    }
  • Anonymous
    September 27, 2005
    And the program

    static void Main(string[] args)
    {
    var x = (object[]) new[]{null};
    }

    compiles but gives the extraordinarily mysterious "bad image format exception".
  • Anonymous
    September 27, 2005
    Any good reason a function can't be 'var' typed and have its return type inferred from its return statement?

    I would like to do something like this:

    var divmod = Div(x, y);

    var Div(int x, int y)
    {
    return new { Quotient = x / y, Remainder = x % y };
    }

    In other words, this would make it really easy to return multiple values from a function without having to declare a type ahead of time or use cumbersome out parameters.
  • Anonymous
    September 27, 2005
    Gabe: Separate compilation is an obvious limit to the amount of type inferencing that can occur (unless you always plan to compile your entire program at one go, which doesn't scale well).
  • Anonymous
    September 27, 2005
    Why can I do

    int[] x = {1, 2, 3};

    but I have to do

    var x = new[] {1, 2, 3};

    when I want to do

    var x = {1, 2, 3};

    ?

    The new[] seems like a bit of syntactic cruft in this case since it just adds type information that is already inferrable.
  • Anonymous
    September 27, 2005
    Is there any reason you've chosen the 'var' keyword unlike the C++ standards people who are doing much the same thing with the 'auto' keyword?

    Is it because you were first, or you were doing it simultaneously or is it just to be different?

    I only ask because it can be a pain if you have to use both languages where there is a different keyword for the same thing.

    On the flipside I suppose javascript people would ask the same question if you'd used "auto", but from my point of view it represents a different concept so should maybe have a different name. Also I suppose "auto" isn't a very good name, but the C++ people went with it to avoid adding a new keyword since they already had one that no one uses.
  • Anonymous
    September 28, 2005
    ++ to Stewart's comment.

    While I'm not jumping for joy over the term 'auto' it's a better term than 'var' (IMO).

    Raimond Brookman mentioned the term 'infer' over at http://blogs.msdn.com/danielfe/archive/2005/09/22/472884.aspx . I'd like this over 'auto'.
  • Anonymous
    September 28, 2005
    1. If your code initializes to a return value of a method you can no longer tell what type it is:

    var item = MyCollection[key];

    what type is item? Is it object (MyCollection defined as IDictionary) or a string (MyCollection is StringCollection) or any other type (including primitives) if MyCollection is a generic collection? Is it what it's supposed to be?

    Even worse:

    var item = MyService.LookupItem(param);

    Without knowing the MyService class the reader of the code has simply no way of knowing the type of item.

    2. Your code will strongly type to the actual type even though a supertype would be more appropriate.

    var collection = new SortedDictionary();

    would type the collection as SortedDictionary even though IEnumerable is what you wanted.
  • Anonymous
    September 28, 2005
    It's possible to write hard-to-read code in any language. In your first example, I would say "if it's hard to figure out what the type is by reading the initializer, and it is important that the reader know the type, then call out the type."

    In your second example, I would say that if you want the variable to be typed as a less-derived type, then nothing is stopping you from typing it however you want.

    Remember, inference is a convenience feature. You don't have to use it if you don't want to or if you feel that it makes your code less clear, or if it doesn't have the semantics you want.

    The argument that some people will misuse it and therefore it's bad doesn't hold much water with me. C# is an "enough rope" programming language -- there are many, many idioms in C# that can be abused, and we trust that our developers are professional enough not to do so.

  • Anonymous
    September 29, 2005
    Just because there are things that can be abused does not mean we have a green light to add more :) Personally I don't share the view that a programming language should be designed to require as little typing as possible. It should be designed to be unambiguous so there's as little guessing involved as possible.

  • Anonymous
    September 29, 2005
    The comment has been removed

  • Anonymous
    September 30, 2005
    Eric, good explanation of why type inference is a good thing. I agree with the previous comment that "infer" would be a better keyword than "var", but it doesn't matter very much.

  • Anonymous
    December 07, 2005
    A couple of things I would like to see. Currently, with anonymous delegates, a call such as List<>.ForEach is written:

    myList.ForEach(delegate(int x)
    {
    //dosomething
    });

    This would be much clearer if we had something like this: myList.ForEach { |x| doSomething };


    Secondly... Duck typing does make for better generic code. MS needs to catch up on this or face another Java problem when dealing with languages like Ruby and Python.

    My third comment: dynamic types, adding functions at runtime. As a developer I WANT to do this. Why? Look at Ruby's ActiveRecord and then tell me it's not needed.

    Sincerely,

    Paul

  • Anonymous
    May 05, 2006
    Slightly off topic, but as you still refer to JScript.Net - is it still a viable language for Microsoft in the near future?

    Personally I love the language for coding both sides of the client/server divide. I just have more options on the server side (and get to type my variables, build classes etc), but still retain the same overall syntax.

    Also, coding on a Mac doesn't offer the Visual Studio route!

  • Anonymous
    July 06, 2006
    I'm really not happy with this 'var' stuff.

    Just give the type that it is.

    Is this one of those 'gee we better make this work for the scripters' moves?

  • Anonymous
    July 06, 2006
    No, I explained the two reasons why we are adding type-inferred declarations in the post: to make redundant declarations shorter and to enable anonymous types. Making the language easy for scripters to use is not a design goal of C#.

    "Just give the type that it is." doesn't help if the type doesn't have a name.  Now, we could force all tuple types to be declared, but then you potentially have to declare a new boilerplate type for every query result.

    Look at it this way.  You have a query:

    AgeAndPhoneNumber results = from c in customers where c.city=="London" select new {c.Age, c.PhoneNumber};

    and now you want to add address to that.  Do you really want to have to define a new type AgeAndPhoneNumberAndAddress?  Or do you just want to say

    var results = from c in customers where c.city=="London" select new {c.Age, c.PhoneNumber, c.Address};

    and let the compiler take care of the typing for you?

    That's the consequence of forcing users to declare types -- they have to declare types, and that's a pain.

  • Anonymous
    September 28, 2006
    I'd suggest borrowing a page from Nemerle on this one: use 'def' to mean 'readonly var', and sidestep the squick factor of reassigning something that isn't manifestly typed.

  • Anonymous
    January 11, 2007
    We interrupt the discussion of how the difference between lambda expression and anonymous method convertibility

  • Anonymous
    June 28, 2010
    @Eric Lippert: Why can't you have a way to write the tuple type explicitly, like Haskell's "(Int, [String])" and so on?