Jaa


Covariance and Contravariance, Part Eleven: To infinity, but not beyond

UPDATE: Andrew Kennedy, author of the paper I reference below, was good enough to point out some corrections and omissions, which I have addressed. Thanks Andrew!

As I've discussed at length in this space, we are considering adding covariance and contravariance on delegate and interface types parameterized with reference types to a hypothetical future version of C#. (See my earlier articles in this series for an explanation of the proposed feature.)

Variance is highly useful in mainstream cases, but exposes a really irksome problem in certain bizarre corner cases. Here's just one example.

Consider a "normal" interface contravariant in its sole generic type parameter, and a "crazy" invariant interface which extends the normal interface in a strange way:

public interface IN<in U> {}
public interface IC<X> : IN<IN<IC<IC<X>>>> {}

This is a bit weird, but certainly legal.

Before I go on, I want to digress and talk a bit about why this is legal. Most people when they first see such a beast immediately say "but an interface cannot inherit from itself, that's an illegal circularity in the inheritance chain!"

First off, no, that is not correct. Nowhere does the C# specification make this kind of inheritance illegal, and in fact, a weak form of it must be legal. You want to be able to say:

interface INumber<X> : IComparable<INumber<X>> { ... }

that is, you want to be able to express that one of the guarantees of the INumber<X> contract is that you can always compare one number with another one. Therefore, it must be legal to use a type's name in a type argument of a generic base type.

However, all is not rosy. This particularly gross kind of inheritance that I give as an example is in fact illegal in the CLR, even though it is not illegal in C#. This means that it is possible to have the C# compiler generate an interface type which then cannot be loaded by the CLR. This unfortunate mismatch is troubling, and I hope in a future version of C# to make the type definition rules of C# as strict or stricter than those of the CLR. Until then, if it hurts when you do that, don't do it.

Second, unfortunately, the C# compiler presently has numerous bugs in its cycle detector such that sometimes things which kinda look like cycles but are in fact not cycles are flagged as cycle errors. This just makes it all the more difficult for people to understand what is a legal cycle and what isn't. For example, the compiler today will incorrectly report that this is an illegal base class cycle, even though it clearly is not:

public class November<T> {}
public class Romeo : November<Romeo.Sierra.Tango> {
public class Sierra {
public class Tango {}
}
}

I have devised a new (and I believe correct!) cycle detection algorithm implementation, but unfortunately it will not make it into the service release of the C# 3 compiler. It will have to wait for a hypothetical future release. I hope to address the problem of bringing the legal type checker into line with the CLR at the same time.

Anyway, back to the subject at hand: crazy variance. We have the interfaces defined as above, and then give the compiler a little puzzle to solve:

IC<double> bar = whatever;
IN<IC<string>> foo = bar; // Is this assignment legal?

I am about to get into a morass of impossible-to-read generic names, so to make it easier on all of us, I am going to from now on abbreviate IN<IC<string>> as NCS. IC<double> will be abbreviated as CD. You get the idea I'm sure.

Similarly, I will notate "is convertible to by implicit reference conversion" by a right-pointing arrow. So the question at hand is true or false: CD→NCS ?

Well, let’s see. Clearly CD does not go to NCS directly. But (the compiler reasons) maybe CD’s base type does.

CD’s base type is NNCCD. Does NNCCD→NCS? Well, N is contravariant in its parameter so therefore this boils down to the question, does CS→NCCD ?

Clearly not directly. But perhaps CS has a base type which goes to NCCD. The base type of CS is NNCCS. So now we have the question does NNCCS→NCCD ?

Well, N is contravariant in its parameter, so this boils down to the question does CCD→NCCS ?

Let’s pause and reflect a moment here.

The compiler has “reduced” the problem of determining the truth of CD→NCS to the problem of determining the truth of CCD→NCCS! If we keep on “reducing” like this then we’ll get to CCCD→NCCCS, CCCCD→NCCCCS, and so on.

I have a prototype C# compiler which implements variance – if you try this, it says “fatal error, an expression is too complex to compile”.

I considered implementing an algorithm that is smarter about determining convertibility; the paper I reference below has such an algorithm. (Fortunately, the C# type system is weak enough that determining convertibility of complex types is NOT equivalent to the halting problem; we can find these bogus situations both in principle and in practice. Interestingly, there are type systems in which this problem is equivalent to the halting problem, and type systems for which the computability of convertibility is still an open question.) However, given that we have many other higher priorities, it’s easier to just let the compiler run out of stack space and have a fatal error. These are not realistic scenarios from which we must sensibly recover.

This is just a taste of some of the ways that the type system gets weird. To get a far more in-depth treatment of this subject, you should read this excellent Microsoft Research paper.

Comments

  • Anonymous
    May 07, 2008
    "expression is too complex to compile" is actually what my brain said, too. Indeed, the whole subject is fascinating and your posts on the issue are on the "toread" list. For now, I just can't manage to assign null to a struct - a feat that Nullable<T> seems to be perfectly capable of. How can I implicitly convert a null to a struct? the closest I get gives me CS0553...sorry I ranted here, but it was bugging me and your post right from the compiler front hit me on my Feed :) Thanks for your valuable and brain-crunching posts!

  • Anonymous
    May 07, 2008
    Nullable<T> does not assign a null to a struct. Nullable<T> is nothing more nor less than simply a struct something like: struct Nullable<T> where T : struct {  readonly T t;  readonly bool hasValue;  ... constructors, accessors, and so on } That's all folks. The rest is just compiler and runtime magic.  The compiler turns "if (n == null)" into "if (n.HasValue == false)", turns "int n = null" into "n = new Nullable<int>();", etc. Perhaps what you want to do is not "convert a null to a struct" but rather "get the default value for a struct"?  In that case you can just say: T t = default(T); If T is numeric, default(T) is zero.  If T is a reference type, default(T) is null.  bool -- false.  Nullable type -- nullable type with hasValue=false. struct -- struct with all its fields set to THEIR default types.

  • Anonymous
    May 07, 2008
    The comment has been removed

  • Anonymous
    May 07, 2008
    I think Church shows that this kind of the problem,i.e. to decide which expressions are equivalent, is an unsolvable problems.

  • Anonymous
    May 07, 2008
    Hi Eric, thank you for your answers. I think I expressed myself too bland yesterday, I was more or less aware of the issues... I was looking for a way as a .NET developer to get an analogous code statement to compile (In the wake of me having a lot of fun with the implicit operator). Anyway, most importantly for me was the info that the compiler does some magic, so there is no way to get the effect via developer-made .NET code. That's fine by me, but good to know. Cheers and thanks :)

  • Anonymous
    May 09, 2008
    Is it not so that since double and string have no hierarchical relation, co- and contravariance are not relevant for determining A<double> -> B<string>, whatever A and B may be? Covariance preserves ordering, contravariance reverses ordering, but neither creates ordering, as far as I see.

  • Anonymous
    May 10, 2008
    Joren, it is in fact so, but that is not exactly the question at hand. The question is not whether or not the cast is valid. The question is how to write the compiler to know that. On the same note, think of as example for the halting problem (even though it is not equivalent, as Eric said): write a program that can tell if the following method stops or not: int thing(int x) { return thing(x + 1); } Surely, both you and I can see that this method will never stop (disregarding the overflow exception). But for a computer program, it would not be that obvious.

  • Anonymous
    May 11, 2008
    Re: The question is how to write the compiler to know whether or not the cast is valid. Yes, I understand that. Eric said the problem is not solvable in general, so returning the error 'too complex to compile' is a fine approach. But I thought: is it not at least possible to write a compiler that can follow the reasoning from the string-double relation (or other simple relations) to the interface relation, in stead of only trying to reduce the generic interface relation to a simple relation (which, as my previous post implied, can fail while the other method works)? It seems that trying the problem A<X> -> B<Y> from both sides (both from the relation between X and Y, and the relation between A and B) will lead to more classes of cases that the compiler can handle without giving up. Of course the question remains whether anything whose validity can't be ascertained from the relation between A and B alone will ever be practically useful. I'd say 'no' is most likely, but I've hardly investigated the problem any further.

  • Anonymous
    May 22, 2008
    Rather than place the links to the most recent C# team content directly in Community Convergence , I

  • Anonymous
    June 19, 2008
    I wonder if some future CSC could allow implicit interface implementation in variant way? Like this: class MyPOD : IClonable {  public MyPOD Clone() { ... } // this could implicitly implement IClonable.Clone } Eric, can such variance be specified on IClonable interface?

  • Anonymous
    June 19, 2008
    That kind of variance is called "return type covariance". As I mentioned early on in this series, (a) this series is not about that kind of variance, and (b) we have no plans to implement that kind of variance in C#.  

  • Anonymous
    July 03, 2008
    Hello Eric, After my last re-reading of your co[ntra]variance series, I believe I came to a new point. After all there are only 3 possibilites:

  1. Clear covariance (IEnumerator<T>).
  2. Clear contravariance (IComparer<T>).
  3. Undecidable (IList<T>). All the 3 possibility are easily detectable by compiler and I believe for the first two we would like the compiler will decide the variance automatically. We need a solution for the third case only. To see the solution, I would like first analyze how we selected which case the given interface belongs to. We analyzed the method signatures in which T is involved:
  4. If in all of them T is 'out' parameter, the interface is covariant.
  5. If in all of them T is 'in' parameter, the interface is contravariant.
  6. If there is a mix - undecidable. My solution is to say the compiler how I'm going to use my variable and enable either covariant or contravariant invocations: IList<Animal> aa = whatever; IList<in Giraffe> ag = aa; IList<out Giraffe> agX = aa; //fails to compile ag user sees: 'int ag.IndexOf(Giraffe)' and 'object ag[int]'. IList<Giraffe> ag = whatever; IList<out Animal> aa = ag; IList<in Animal> aaX = ag; //fails to compile aa user sees: 'int aa.IndexOf(<only null can be passed>)' and 'Animal aa[int]'. Now I can say in a type safe manner: class X<T> { T _t; void Test(IList<out T> at) { T t = at[0]; //clearly t is 'out' parameter } void Test(IList<in T> at) { //overloaded method! int i = at.IndexOf(_t); //clearly _t is 'in' parameter } } ... new X<Animal>().Test(new List<Giraffe>()); //first method is called new X<Giraffe>().Test(new List<Animal>()); //second method is called Kosta
  • Anonymous
    July 07, 2008
    I found a very interesting presentation, which shows different options, pros and cons for the implementation: http://research.microsoft.com/~akenn/generics/ECOOP06.ppt. It's clear that my approach is a variation on Java theme. Interestingly enough that in order to create a covariant/contravariant projection in C#, currently one will need to define a special read/write interface. While this is somehow acceptable with one parameter,  what is suggested if I have 2 or more, i.e.: MyClass<T1,T2,T3>{} To cover all the possibilities I need to define 2^3 interfaces. Yes, may be some of them are not useful, but I easily may fail into a situation where I need declare 4-5 interfaces. How is it addressed?

  • Anonymous
    July 13, 2008
    does this cover situations like ObservableCollection<T1<T2>> ?

  • Anonymous
    July 26, 2008
    I'm curious, on what basis does the CLR decide that public interface IN<in U> {} public interface IC<X> : IN<IN<IC<IC<X>>>> {} is illegal, and does it become legal if U is made invariant? OT: why the heck isn't return type covariance on the table? In terms of usefulness, it's in the same ballpark as variance in generics but a vastly simpler concept and dead easy to implement (except for compatibility issues). I've wanted return type covariance for years :(

  • Anonymous
    September 09, 2008
    A post by Jeremy Miller caught my eye this morning in regards to extension methods in Javascript . While

  • Anonymous
    October 11, 2008
    Eric, your blog is fabulous.  I may be too late for the next version of C#.  So far Covariance/Contravariance seems to encompas type parameters.  If possible expand the scope of Covariance/Contravariance to include delegates. --Avi

  • Anonymous
    October 19, 2008
    One thing i miss from C# is the C++ TypeDef statement. That would make the whole thing a bit more readable!, without causing 'expression too complex to parse' in my brain.

  • Anonymous
    November 14, 2008
    On the subject of return-type covariance, it initially seemd to me like you can try to get the benefit of such a feature for a specific case, like cloning, as follows: // covariant interface for cloning: interface IClonable<+T> { T clone(); } // an example use case in a hypothetical compiler: interface IExpression : IClonable<IExpression> {} class Add : IExpression, IClonable<Add> { public Add clone(); } The clone() method on class Add should be able to satisfy both the IClonable<Expression> and IClonable<Add> interfaces. Clients can clone an arbitrary expression, but if they have something statically known to be an Add then they know that cloning will return another Add. Looking at that case, though, it is clearly similar to the case in a previous blog post here that was almost universally agreed to be a compile-time error. The Add class has two different variations of IClonable in its inheritance hierarchy. You could claim that Add isn't a problem, since it satisfies the two interfaces with an identical implementation, so casting an Add to an IClonable<object> wouldn't be ambiguous. But what if there had been an Expression base class (instead of an IExpression interface) which supplied a default implementation for IClonable<Expression>.

  • Anonymous
    December 18, 2008
    So nicely step by step blogged by Eric Lippert for &quot;Covariance and Contravariance&quot; as &quot;Fabulous

  • Anonymous
    February 13, 2011
    The comment has been removed

  • Anonymous
    February 13, 2011
    The comment has been removed