Type inference woes, part two

So what's the big deal anyway? The difference between the spec and the implementation is subtle, affects only a few specific and rather unlikely scenarios, and can always be worked around by inserting casts. Fixing the implementation would be a breaking change, whereas amending the specification looks like a small and simple change. So why don't we just update the specification the next time we get the chance and be done with it?

The big deal is that this is a small, isolated, corner case problem for C# 2.0, but it becomes much more visible in C# 3.0. Essentially the question here is "given a set of expressions of various types, how do we infer a unique unified type?" In C# 2.0 this question comes up only in the context of the ?: operator, and the set always has two elements. In C# 3.0, this question comes up all over the place and the sets can be arbitrarily large. For example:

  • When an implicitly typed local variable declaration statement contains several declarations, does

    const short s = 123;
    var x = 0, y = s;

    infer that x and y are short, or int, or is this an error?

  • Does the implicitly typed array initializer

    const short s = 123;
    var x = new[] {0, s};

    infer x to be short[], int[], or is this an error?

  • When a lambda expression is passed to a generic method we must infer the return type of the lambda from the set of expressions returned from its body:

    public static IEnumerable<T> Select<S, T>(IEnumerable<S> collection, Func<S, T> projection) { // ...
    ...
    const short s = 123;
    var x = Select(blahCollection, c => { if (c.Foo > c.Bar) return 0; else return s; });

    If S is inferred to be Blah, is x inferred to be IEnumerable<short>, or IEnumerable<int>, or is this an error?

  • When multiple lambda expressions are passed to a generic method, can we unify unequal inferred types?

    public static IEnumerable<R> Join<O, I, K, R>(
        IEnumerable<O> outer,
        IEnumerable<I> inner,
        Func<O, K> outerkey,
        Func<I, K> innerkey,
        Func<O, I, R> selector) { // ...

    var x = Join(customers, orders, c => c.Id, o => o.CustId,
        (c, o) => new {c.Name, o.Amount});

    Suppose Customer.Id is int and Order.CustId is Nullable<int>. Do we infer that K is Nullable<int>, or produce an error?
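If you want to see what a particular compiler does with the array and lambda cases above, one option is to ask it which static type it picked. What follows is only a minimal sketch under stated assumptions: Blah's int members, and the names InferenceProbe, StaticTypeOf and Describe, are hypothetical and chosen purely for illustration. It reports what the compiler in front of you infers and says nothing about what the specification ought to require. The multiple-declarator case is left out because released compilers reject it (error CS0819).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the post's "Blah"; the int members are an assumption.
class Blah
{
    public int Foo { get; set; }
    public int Bar { get; set; }
}

static class InferenceProbe
{
    // Same signature as the Select in the post; the body is a plain iterator.
    public static IEnumerable<T> Select<S, T>(IEnumerable<S> collection, Func<S, T> projection)
    {
        foreach (S item in collection)
        {
            yield return projection(item);
        }
    }

    // Hypothetical helper: returns the compile-time type inferred for the argument.
    public static Type StaticTypeOf<T>(T value)
    {
        return typeof(T);
    }

    // Builds one line per scenario naming the type the compiler chose.
    public static string[] Describe()
    {
        const short s = 123;

        // Implicitly typed array mixing an int literal and a short constant.
        var array = new[] { 0, s };

        // Lambda that returns an int literal on one path and a short constant on the other.
        var blahCollection = new List<Blah> { new Blah { Foo = 2, Bar = 1 } };
        var projected = Select(blahCollection, c => { if (c.Foo > c.Bar) return 0; else return s; });

        return new[]
        {
            "array:  " + StaticTypeOf(array),
            "lambda: " + StaticTypeOf(projected),
        };
    }

    static void Main()
    {
        foreach (string line in Describe())
        {
            Console.WriteLine(line);
        }
    }
}
```

Because StaticTypeOf is generic, the value it returns is the type argument the compiler inferred at the call site, not the runtime type of the object, which is the thing the questions above are really asking about.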

Enquiring minds want to know the answers to these questions, and it seems sensible that we should come up with a single algorithm that answers all of them. And if we're going to do that, then it seems desirable that ?: ought to use the same algorithm we come up with for all of the above.

After the Memorial Day break I'll discuss some of the algorithms that we're considering, and what benefits and drawbacks they have. Have a good weekend!

Comments

  • Anonymous
    May 26, 2006
    The comment has been removed
  • Anonymous
    May 27, 2006
    I'd agree with James on the first point. Since "var" doesn't imply a type, variable initializer lists shouldn't restrict the type and should be considered shorthand for "var x = <value>; var y = <anothervalue>; ..."

    For point two, I would expect it to be consistent with the way the language currently handles literals and type coercion: if the literal fits within the type of the variable it is assigned to, then just cast it down, e.g. Int16 x = 0; zero will obviously fit within Int16 despite technically being Int32. Since var has no type, it must be implied through its initialization; with arrays I would expect it to be consistent with the comma operator's associativity rules, evaluating from left to right. In your example, I would expect x to be an array of ints without error because a short value will not overflow (whereas "Int64 g = 0; var x = new[] {0, g}" would produce an error because x would be inferred to be int[] and g could overflow).

    Point 3 is a little trickier because there really are no associativity rules to use as a baseline, other than top to bottom. Personally, I would expect top-down evaluation. Since the first return is a zero literal, which is an int, the result should be implicitly considered IEnumerable<int>.

    Point 4: does a lambda always compile into IL, like generics, or is it more like C++ templates?