Covariance and Contravariance in C#, Part Nine: Breaking Changes

Today, in the last entry in my ongoing saga of covariance and contravariance, I’ll discuss what breaking changes adding this feature might cause.

Simply adding variance awareness to the conversion rules should never cause any breaking change. However, the combination of adding variance to the conversion rules and making some types have variant parameters causes potential breaking changes.

People are generally smart enough to not write:

if (x is Animal)
    DoSomething();
else if (x is Giraffe)
    DoSomethingElse(); // never runs

because the second condition is entirely subsumed by the first. But today in C# 3.0 it is entirely sensible to write

if (x is IEnumerable<Animal>)
    DoSomething();
else if (x is IEnumerable<Giraffe>)
    DoSomethingElse();

because there used to be no conversion between IEnumerable<Animal> and IEnumerable<Giraffe>. If we turn on covariance in IEnumerable<T> and the compiled program containing the fragment uses the new library, then its behaviour when given an IEnumerable<Giraffe> will change. The object will be assignable to IEnumerable<Animal>, and therefore the “is” will report “true”.
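To make the break concrete, here is a minimal sketch of the runtime difference (assuming an Animal/Giraffe class hierarchy and the fragment above):

object x = new List<Giraffe>();
bool a = x is IEnumerable<Animal>;  // false while IEnumerable<T> is invariant, true once it is covariant
bool g = x is IEnumerable<Giraffe>; // true either way
// So in the fragment above, DoSomethingElse() runs today, but DoSomething()
// runs instead once the program is handed a covariant IEnumerable<T>.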

There is also the issue of existing source code changing semantics or turning compiling programs into erroneous programs. For example, overload resolution may now fail where it used to succeed. If we have:

interface IBar<T>{} // From some other assembly
...
void M(IBar<Tiger> x){}
void M(IBar<Giraffe> x){}
void M(object x) {}
...
IBar<Animal> y = whatever;
M(y);

Then overload resolution picks the object version today because it is the sole applicable choice. If we change the definition of IBar to

interface IBar<-T>{}

and recompile then we get an ambiguity error because now all three are applicable and there is no unique best choice.
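One possible mitigation, sketched under the declarations above: a caller that hits the new ambiguity can restore the old behaviour, or deliberately pick one of the interface overloads, with an explicit cast.

M((object)y);      // forces the object overload, preserving today's behaviour
M((IBar<Tiger>)y); // legal under contravariance: IBar<Animal> converts to IBar<Tiger>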

We always want to avoid breaking changes if possible, but sometimes new features are sufficiently compelling and the breaks are sufficiently rare that it’s worth it. My intuition is that by turning on interface and delegate variance we would enable many more interesting scenarios than we would break.
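For a taste of what would be enabled, here is a sketch in the hypothetical annotation syntax used earlier in this series (not shipping syntax; FeedAll is an invented method):

interface IEnumerable<+T> { ... } // the variance is declared once, by the library author
...
void FeedAll(IEnumerable<Animal> animals) { ... }
...
IEnumerable<Giraffe> giraffes = whatever;
FeedAll(giraffes); // now legal: every sequence of Giraffes is a sequence of Animals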

What are your thoughts? Keep in mind that we expect that the vast majority of developers will never have to define the variance of a given type argument, but they may take advantage of variance frequently. Is it worth our while to invest time and energy in this sort of thing for a hypothetical future version of the language?

Comments

  • Anonymous
    November 02, 2007
    I think breaking scenarios would be rare. And the scenarios where it does break probably make a bug visible at compile time, and arguably the code could have been written differently so it won't be affected by these changes. I agree that a feature like this is sufficiently compelling to be worth the breaking changes. ...otherwise, you could never introduce co-/contra-variance...

  • Anonymous
    November 02, 2007
    I'm really conflicted. I think the feature is worthwhile to have, but I'm not sure that by itself it meets the -100 bar in terms of value versus added conceptual complexity. In combination with other variance-related features it becomes more of a no-brainer, but I've played that broken record enough by now ;) Having said that, I don't think the breaking changes by themselves are enough of a reason to not include this feature.

  • Anonymous
    November 02, 2007
    I think that covariance of interfaces is one of the biggest missing features currently in C# -- when trying to design elegant type-safe libraries, we currently have to revert to having our interfaces derive from a non-generic version simply to put them in a collection... This leads to lots of kludgy code that could be eliminated with variance.
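    A sketch of the kludge this comment describes (all names invented): without variance, a non-generic base interface is a common workaround for storing differently-parameterized instances in one collection:

    interface IValidator { }                        // exists only so instances can share a list
    interface IValidator<T> : IValidator { bool Validate(T item); }
    ...
    List<IValidator> validators = new List<IValidator>(); // can hold IValidator<Animal>, IValidator<Giraffe>, ...
                                                          // but only by erasing the type argument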

  • Anonymous
    November 02, 2007
    I can imagine several cases where overload resolution might pick a different method or become ambiguous. E.g.

    class BaseClass {}
    class DerivedClass : BaseClass {}
    void Test(IEnumerable<BaseClass> a) {}
    void Test(IEnumerable b) {} // non-generic "fallback" method
    // call with:
    Test(new List<DerivedClass>());

    However, in the places where I've seen such things, it was always done to work around C# not supporting variance. The implementation of the non-generic method would often be

    void Test(IEnumerable b) { Test(b.Cast<BaseClass>()); }

  • Anonymous
    November 02, 2007
    This is a fascinating look into language design, but I'm having a hard time coming up with real-world uses for variance.  Does anyone have a brief, real-world example of code that would be improved by variance?

  • Anonymous
    November 02, 2007
    Do it. Most people won't even see it's there because it will just work as they consume classes and interfaces that used the feature. But the point is, today people see it's not there when they hit problems. I think it's the mark of a truly great feature when it simplifies your life without you even noticing. And if somehow you can spot the broken old code at compile-time, bonus points. Of course, if existing assemblies suddenly start breaking that could be quite puzzling to debug.

  • Anonymous
    November 02, 2007
    Could the IDE perhaps spot this code during migration of the project to C#4? "This code will behave differently from prior versions because an IEnumerable<Giraffe> is now an IEnumerable<Animal>". Catching all the conceivable outcomes of an "is" check is probably halting-problem complete (and worse, because you may not even have access to the callers of a public API to know what types might be passed in) but it should certainly be possible to catch cases where overload resolution will have different results - just evaluate the overload resolution with and without taking variance into account, and if the results differ, flag the code for the user to examine.

  • Anonymous
    November 02, 2007
    I think the feature should make it into a future version of the language. C++ has had covariant return types for ages, and that feature is useful for library designers most of the time. Your plan is much more ambitious. There will be huge hurdles even if it does make it past the -100 point mark. But on the other side, I have seen code written by inexperienced people who have never heard of virtual functions and love 'as' and 'is'; in the end their code is littered with type casts.

  • Anonymous
    November 03, 2007
    I'd like to see it done. Imagine we would not have variance in arrays, how much casting and copying would we have to do? Now if I get the chance to have the same for IEnumerable (only without array covariance's problems), that alone would be worth the trouble in the context of LINQ. I've missed variance before. (And I'll probably keep missing it until you have it implemented for generic classes too, instead of just interfaces.)

    Breaking changes might occur, I'd take the risk. Flagging the places in the source code for a one-time review, as Stuart suggested, is a good idea.

    Now here's another suggestion: I thought about how you could handle the problem of array covariance and came to think about the "const" keyword (which C# doesn't have, and which is probably too hard to add now, but stay with me for a moment).

    Another option I was thinking about is an IReadOnlyList<T> interface. I'd have liked that before, because I'd find it more elegant than testing some IsReadOnly property. But with covariance it would make so much more sense: I could get the (hypothetical) covariance of IEnumerable and the simpler/faster list access using an indexer, Count etc.

    How could those work together? Most functions that take arrays as parameters treat them as read-only constructs. Now changing their parameters from T[] to IEnumerable<T> would make everything slower, plus you'd have to rewrite your code. But changing them to IReadOnlyList<T> probably would do nothing bad to performance. We could even have IArray<T>, which could provide a "Length" property instead of Count. (Additional thinking required for n-dimensional arrays, but maybe interfaces for ranks 2 and 3 would be sufficient.)

    OK, I change all my "constant" T[] input parameters to IArray<T>. I might even use a tool that supports this. What next? I could turn on a compiler warning that jumps in my face every time I assign a Giraffe[] to an Animal[] variable or parameter! Assuming that arrays are either used as read-only references (in which case they should be declared as IArray<T>) or as modifiable arrays (in which case they should be considered invariant), this might fly. Admittedly, I haven't thought that through.

    Anyway, this would be a nice opt-in for people who care about the problems of array covariance, pretty easy to implement in the compiler, and it should not affect anybody who just wants to compile their old code. On the cost side, you'd probably have to modify a lot of BCL method signatures to make this useful. When you propagate those warnings to errors and mark fully checked assemblies, the JIT might be able to skip type verification on assignments too. Although I worry less about this.

    (I'd also consider an invariant syntax for declaring array parameters and variables, like "invariant Animal[]", or alternatively, a modifiable interface like IModifiableArray; or call the covariant, constant version IReadOnlyArray and the invariant modifiable version IArray. Whatever.)
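    A minimal sketch of the read-only, covariant interface proposed above, using the hypothetical +T annotation from this series (invented member set, not shipping syntax):

    interface IArray<+T>      // covariant: T appears only in output positions
    {
        T this[int index] { get; }
        int Length { get; }
    }
    ...
    IArray<Giraffe> giraffes = whatever;
    IArray<Animal> animals = giraffes; // safe, because nothing can be written through IArray<T>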

  • Anonymous
    November 03, 2007
    No, wait, the part about performance is wrong. Arrays are treated natively by the JIT, which should be much faster than calling an indexer via a vtable. That probably costs more than type checking on assignments... so we'd need "real" C++-like "const" support in order to make this usable in a general sense. Which is unlikely. Have there been any discussions about introducing const in the CLR/C# lately? I know that many people think that const doesn't pull its own weight in C++, and I tend to agree. But the CLR could enforce it, prevent casting-away of const in sandboxed scenarios via CAS policies, so this might be a really interesting security feature. Also, considering how functional programming favors immutable objects, this might make things easier for parallel processing (PLINQ).

  • Anonymous
    November 04, 2007
    I just got through reading this 9-part article series. I haven't read many MSDN articles or blogs, so I have to ask: what is the -100 test? Aside from that, I like that each article is short - as this can be a very complex subject.

    I feel that the syntax would be far simpler, and far more familiar, to simply use IFoo<+R, -A>. It's already like that in the CLR, and everyone who knows what variance is knows the +/- syntax. However, I'd bet that everyone has to puzzle over the alternatives as they are all new.

    For anyone who's been annoyed by C#'s lack of variance, adding it will be a major improvement. For those who have no idea what variance is, I suspect they won't be the least bit bothered by it. I always thought IAction<Mammal> could be assigned to IAction<Animal> when the type parameter is an argument. I was very confused the first time I tried compiling such code and C# stated the types were incompatible. I had to stare at the error for a long time, then go ask someone why my code wouldn't compile! In short, the answer was: "While obvious, C# does not support variance. You probably have no idea what variance is, but all you have to know is you can't do that even though it seems like you should be able to. I've been harping on Microsoft for years to fix this problem."

    Meaning, including variance should be the logical default, and not supporting variance seems more like the exception. Experts hate the lack of variance, beginners are confused by the lack of variance (they have to "learn" that it is unsupported, as opposed to "learning" about how to use variance). All this discussion, and the current design of C#, seems to imply that variance is this "fancy new feature". I argue that the lack of variance support is an anti-feature -- something the language designers went out of their way to annoy you with!

    And finally, I think the +/- syntax is simple, intuitive for those who will write such code, and won't get in the way. It seems like something that was supposed to be there all along (as proven by the CLR support), and C# just took it out to "baby" you. Although really it's just plain annoying, and this discussion wouldn't even be happening if it was included to begin with.
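    A sketch of the assignment described above, in the +/- syntax this comment favours (IAction and the Mammal hierarchy are invented, and this is not shipping syntax; note the conversion runs from the more general type argument to the more specific one):

    class Mammal : Animal { }
    interface IAction<-T> { void Do(T item); } // contravariant: T appears only as an input
    ...
    IAction<Animal> general = whatever;
    IAction<Mammal> specific = general; // fine: code that can act on any Animal can act on any Mammal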

  • Anonymous
    November 04, 2007
    "-100 points" refers to this article by former C# team member Eric Gunnerson: http://blogs.msdn.com/ericgu/archive/2004/01/12/57985.aspx

  • Anonymous
    November 04, 2007
    Re: "Your plan is much more ambitious. There will be huge hurdles even if it does make it past the -100 point mark." Actually, it's just the opposite. Variance on interfaces and delegates will be easy to implement because the CLR already supports it natively and has since generics were introduced; C# is just not taking advantage of it yet. The CLR does NOT support variance on virtual overrides natively, so implementing that would require a lot more work on both the design and implementation side.

  • Anonymous
    November 04, 2007
    The lack of variance support is one of my top two complaints of C#. It is so lacking that I regularly need to drop down to the IL level to do work. Whidbey's introduction of generics was great; however, it only went part way. Variance is absolutely needed, even if it may break some code in very rare cases. PS My number two complaint is the inability to declare generic overloaded operators because there is no way to know what is addable, subtractable etc.

  • Anonymous
    November 04, 2007
    Separation of concerns. You have two changes you'd like to make:

  1. A language change, to allow variance to be defined for generics. Correct me if I'm wrong, but no current C# code is broken by this (if you're using an assembly from another language which takes advantage of variance, I'm assuming you already get the variance behaviour in C#).
  2. A library change. In this post, you talk about a breaking change to IEnumerable, and a breaking change to a hypothetical IBar. Obviously this is badness. Breaking changes are badness almost by definition.

    Now, as far as I (a non-C#-using guy) am concerned, (1) is a good thing. As far as I can see, the only downsides are a) the -100 points, and b) the added cost to C# developers of understanding variance (and usually they won't have to). But (2) is nowhere near as clear-cut. Contrary to what others have said, assuming my assumption above is valid, you can't just say "this problem is trivially solvable by ruling that variance is off if it's a CLR2 assembly", since this may break C# code which already uses generics with variance from other languages (including hand-coded MSIL, I guess). (2) has -100 points of its own, and I don't see it getting the necessary +100 to justify it.

  • Anonymous
    November 06, 2007
    I notice that you seem to mix 2 things when discussing the breaking change:
  1. Adding co- & contravariance itself does NOT break anything.
  2. Changing existing classes / interfaces to include variance IS a breaking change.

    So what to do? Add variance and refrain from changing existing interfaces. Just as the introduction of generics delivered a generic version of IEnumerable, we would now additionally get a variant version of IEnumerable...

  • Anonymous
    November 06, 2007
    mbuzina, what would you call this new interface? IEnumerable2<T>? (recalling the horror of COM...) IEnumerable and IEnumerable<T> are easily separable (in fact, they have different names under the hood). IEnumerable<T> and IEnumerable<+T> are not. Creating two interfaces would also mean that you have to understand the difference going forward (i.e., everyone must understand covariance). It would make the language a great deal uglier. All this just to prevent a few breaking lines of code that can easily be fixed by bringing the tests into an order that would have been more logical in the first place? I don't know... I say let's have those bugs and fix them. Hopefully, they are very rare anyway!

  • Anonymous
    November 07, 2007
    Yeah, way too late considering they've committed to releasing C# 3.0 this month.

  • Anonymous
    November 07, 2007
    Eric, I never seriously considered it possible to do this with .NET 3.5. But what do you think about the variance problem we already have when we mix arrays with IEnumerable<T>? (As opposed to generic collections.) Would a breaking change really make this any worse? There are probably fewer than 5 people in the world who have produced such code and are aware that this behaves completely differently for List<T> and T[]...
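    For readers following along, the existing inconsistency meant here is the one below (as of C# 3.0; both lines view a sequence of strings as a sequence of objects):

    IEnumerable<object> a = new string[0];       // compiles today, thanks to array covariance
    IEnumerable<object> b = new List<string>();  // compile-time error today: IEnumerable<T> is invariant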

  • Anonymous
    November 11, 2007
    Jon, I'd rather learn a few new language features than a heap of libraries and their individual ways of dealing with what's left out of the language. The language is at the core of what we're doing, and a lot of people are dedicating way too little time to learning it (as opposed to learning IDEs and designers, frameworks and libraries, VS guidance automation stuff and VSTS, ...). C# is the language for code-based productivity. (I'm sure I read that somewhere.) People who prefer the complexity in the stuff orbiting around the language are better served with VB.NET - the languages are finally beginning to actually enter their different roadmaps instead of looking like the same language with two different syntax-skins (Sun liked to say that, and I'm glad it's no longer true).

    For code-based productivity, you need powerful features. Sooner or later I hope we'll even see some meta-programming or AOP mechanisms in C#. You really don't have to understand them at every level just to benefit; most of the stuff is just for sophisticated library developers anyway. Which is true for variance too, btw.

    You complain that it was hard to explain why a List<string> could not be returned as an IEnumerable<object>. Now what makes you think that it's harder to explain why this is now possible? People who don't like to think about stuff like that are not going to complain that they don't get compiler errors anymore. Just like they don't complain that covariance works for arrays now. How hard do you think it is to explain that while it won't work for List<string>, you can return a string[] as an IEnumerable<object> today, and make people not only understand the difference, but also be aware that they could write code that works differently for arrays and for collection classes? Just bugs waiting to happen.

    And then I imagine this in the context of LINQ, where most of the people will have no idea of what a from/select statement is transformed into, have no real idea of how IEnumerable/IQueryable, extension methods and lambdas work together to actually compile this, but will find it to "just work" most of the time. Except when they run across the missing covariance of IEnumerable, that is.

  • Anonymous
    November 19, 2007
    While it seems to me all the focus is on List<T> (the so-called generic co/contra-variance), please keep in mind that some of us are also wanting the simpler co/contra-variance for overriding methods:

    public class SubC : SuperC {}
    public class A
    {
        public virtual SuperC Method1(SubC subc) {...}
    }
    public class B : A
    {
        public override SubC Method1(object subc) {...}
    }

    Which, if I understand the concept correctly, does not require any additional syntax. Certainly B has always been allowed to return a more restrictive subset of values. It's just a matter of letting us express it so that the type system is made aware of that fact. E.g.

    B b = GetB();
    b.Method1(whatever).SomeMethodOnlyAvailableOnSubC();

    As for letting B handle wider parameters, I'm not sure if that would require any extra syntax. Seems like it wouldn't; it should just be allowed. B has to meet the contract that A defines. If B does anything above and beyond that, there is no harm; it has not violated A's contract by doing more.

    I was hoping that VS2008 would have override co/contra-variance, but it's not there in beta 2. Any chance that the final release will? Or is that logic pretty much all tied in with the generic co/contra-variance (i.e. C# 4.0)?

  • Anonymous
    November 19, 2007
    There will be no features in C# in the final release that were not in the final beta. Adding features after final beta means shipping features that have never been beta tested, and we try very hard not to do that.

    In this series I explicitly did NOT discuss "override variance". I am well aware that a lot of people want this feature, and I may do another series on it in the future, but that's not what I've been talking about here. Override variance is completely orthogonal to interface/delegate variance. They have nothing to do with each other (except insofar as interface variance might make more scenarios eligible for override variance.)

    And there is no such thing as C# 4.0. Remember, this is all hypothetical discussion at this point. We have not announced any such product, so it is premature to be discussing specifics of its feature set!

  • Anonymous
    December 18, 2008
    So nicely step by step blogged by Eric Lippert for "Covariance and Contravariance" as "Fabulous