You did it!

08/13/2005

As many of you may know, we recently announced a pretty big change to the C# 2.0 language. The full details of the change can be found at Soma's blog, but I'll include the information here.

We designed the Nullable type to be the platform solution, a single type that all applications can rely on to uniformly represent the null state for value types. Languages like C# went ahead and built in further language features to make this new primitive feel even more at home. The idea was to blur the subtle distinction between this new value-type null and the familiar reference-type null. Yet, as it turns out, enough significant differences remained to cause quite a bit of confusion.

We soon realized the root of the problem sat in how we chose to define the Nullable type. Generics were now available in the new runtime and it seemed quite simple to use this feature to build up a new parameterized type that could easily encode both a value type and an extra flag to describe its null state. And by defining the Nullable type also as a value type we retained both the runtime behaviors and most of the performance of the underlying primitive. No need to special case anything in the runtime. We could handle it all as just an addition to the runtime libraries, or so we thought.

As several of you pointed out, the Nullable type worked well only in strongly-typed scenarios. Once an instance of the type was boxed (by casting to the base ‘Object’ type), it became a boxed value type, and no matter what its original ‘null’ state claimed, the boxed value-type was never null.

    int? x = null;
    object y = x;
    if (y == null) { // oops, it is not null?
        ...
    }

It also became increasingly difficult to tell whether a variable used in a generic type or method was ever null.

    void Foo<T>(T t) {
        if (t == null) { // never true if T is a Nullable<S>?
        }
    }

Clearly this had to change. We had a solution in Visual Studio 2005 Beta 2 that gave users static methods that could determine the correct null-ness for nullable types in these more or less ‘untyped’ scenarios. However, these methods were costly to call and difficult to remember to use. The feedback you gave us was that you expected it to simply work right by default.

So we went back to the drawing board. After looking at several different workarounds and options, it became clear to all that no amount of tweaking of the languages or framework code was ever going to get this type to work as expected.

The only viable solution was one that needed the runtime to change. To do that, it would require concerted effort by a lot of different teams working under an already constrained schedule. This was a big risk for us because so many components and products depend on the runtime that it has to be locked down much sooner than anything else. Even a small change can have significant ripple effects throughout the company, adding work and causing delays. Even the suggestion of a change caused quite a bit of turmoil. Needless to say, many were against the proposal for very credible reasons. It was a difficult decision to make.

We were fortunate that so many here were willing to put in the extra work it took to explore the change, prototyping it and testing it, that a lot of the uncertainty and angst was put to rest, making the decision to go ahead all that much easier.

The outcome is that the Nullable type is now a new basic runtime intrinsic. It is still declared as a generic value-type, yet the runtime treats it specially. One of the foremost changes is that boxing now honors the null state. A Nullable int now boxes to become not a boxed Nullable int but a boxed int (or a null reference, as the null state may indicate). Likewise, it is now possible to unbox any kind of boxed value-type into its Nullable type equivalent.

    int x = 10;
    object y = x;
    int? z = (int?) y; // unbox into a Nullable<int>

Together, these changes allow you to mix and match Nullable types with boxed types in a variety of loosely typed APIs such as reflection. Each becomes an alternative, interchangeable representation of the other.
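To make the new boxing rules concrete, here is a minimal sketch of the behavior described above, written against the runtime as it shipped. The class and method names (NullableBoxingSketch, BoxesToNull, BoxedType, UnboxToNullable, IsNull) are made up for illustration and are not part of any framework API.

```csharp
using System;

// Illustrative helpers only; none of these names come from the framework.
public static class NullableBoxingSketch
{
    // Boxing an int? with no value yields a null reference,
    // not a boxed Nullable<int>.
    public static bool BoxesToNull(int? value)
    {
        object boxed = value;
        return boxed == null;
    }

    // Boxing an int? that has a value yields a boxed int.
    // Returns null when the int? is empty, since there is no box to inspect.
    public static Type BoxedType(int? value)
    {
        object boxed = value;
        return boxed == null ? null : boxed.GetType();
    }

    // A boxed int (or a null reference) unboxes into an int?.
    public static int? UnboxToNullable(object boxed)
    {
        return (int?)boxed;
    }

    // For an unconstrained T, the null comparison now treats an
    // empty Nullable<S> as null.
    public static bool IsNull<T>(T t)
    {
        return t == null;
    }
}
```

With these helpers, BoxesToNull(null) is true, BoxedType(10) is typeof(int), UnboxToNullable applied to a boxed 10 gives back 10, and IsNull<int?>(null) is true, which is what the first two snippets expected all along.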

The C# language was then able to introduce additional behaviors that make the difference between the Nullable type and reference types even more seamless. For example, since boxing now removes the Nullable wrapper, boxing instead the enclosed type, other kinds of coercions that also implied boxing became interesting. It is now possible to coerce a Nullable type to an interface implemented by the enclosed type.

    int? x = 0;
    IComparable<int> ic = x; // implicit coercion
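As a small sketch of how that interface coercion can be used, the helper below compares an int? against an int through IComparable<int>. The names NullableInterfaceSketch and CompareViaInterface are hypothetical, and returning null for the empty case is just one reasonable choice, not something the language mandates.

```csharp
using System;

// Illustrative helper only; the names are not part of the framework.
public static class NullableInterfaceSketch
{
    // Returns the CompareTo result, or null when 'left' has no value.
    public static int? CompareViaInterface(int? left, int right)
    {
        // Implicit conversion: boxes the enclosed int, or yields a
        // null reference when 'left' is empty.
        IComparable<int> comparable = left;
        if (comparable == null)
        {
            return null;
        }
        return comparable.CompareTo(right);
    }
}
```

The null check matters: because an empty int? converts to a null interface reference, calling CompareTo on it directly would throw a NullReferenceException.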

The reason I'm bringing this up is that I wanted to call out something specific that Soma mentions:

In the past, I have talked about how your feedback is a critical part of us building the right product. Recently, we took a big DCR (Design Change Request) into Visual Studio 2005 that was in response to your feedback. This was a hard call, because it was a big change that touched many components including the CLR. Nonetheless, we decided to take this change at this late stage in the game because a) this was the right product design and I always believe in optimizing for the long-term and b) I had confidence in the team(s) to be able to get this work done in time for Visual Studio 2005. This is a classic example of how we are listening to your feedback that results in a better product for all of us.

I cannot stress to you how true and honest a statement this is. This issue would not have been addressed had it not been for the amazing feedback we received from some amazingly helpful people. There were several that I can think of, but I definitely wanted to call out one person in particular:

Stuart Ballard took the time on several occasions to send us the message that our Nullable solution was unsatisfactory. However, instead of just saying "it sucks" and leaving it at that, he willingly engaged us and took quite a lot of time to write up a full and detailed explanation of why it sucked, and why he felt that it was an unacceptable solution for him and the rest of the development community. He even wrote up a great blog post on the subject that drilled down into many different areas where our Nullable implementation was unsatisfactory. This page was sent out to the entire language design group where we discussed it on many occasions. While we were aware of the limitations of our original Nullable implementation, we had previously existed in a sort of limbo state where we felt the problems were unfortunate, but acceptable. And, when we were considering the cost of "doing it right", we felt that this might be a case where it was OK to get it slightly wrong since we could do it so cheaply. Great community members like Stuart told us, unequivocally, that it wasn't.

Thanks Stuart! Thanks for letting us know that you wouldn't let us settle for "good enough." With your help we'll have made the VS2005 release that much better for everybody. When it comes to C# 3.0 I hope that we'll be doing a lot more of this since the benefits are so fantastic to all.

Comments

Anonymous
August 12, 2005
To encourage us to give more feedback, maybe excellent feedback (like Stuart's) deserves rewards such as this:

[Image: Icon Stix and Magnet Stonz, http://www.googlestore.com/images/products/GO0135.jpg]

I got 2 sets from Microsoft and can't get tired of them.
Anonymous
August 12, 2005
While I appreciate Stuart's efforts to assemble a complete chart of the issues that went wrong, I was one of the first who reported this :-)

http://lab.msdn.microsoft.com/productfeedback/viewfeedback.aspx?feedbackid=FDBK19417

Shame on Peter ( http://blogs.msdn.com/peterhal/archive/2005/01/19/356577.aspx ) who closed the original report with a wrong code sample and was not willing to answer my comments! :-(
Anonymous
August 12, 2005
While you're all in the mood to listen to customer feedback, why not continue this great effort by listening to the large chunk of feedback given to you about ASP.NET projects?
Anonymous
August 13, 2005
What's never been demonstrated throughout the discussion of nullable types is why we need a nullable stack-allocated type; what scenario makes this actually useful, and why is that scenario so significant that it needs all this mess to support?

All the example code using nullable types suggests that they're rarely useful, and on those occasions that one would need such a thing, could more than satisfactorily be satisfied by simply allowing references to valuetypes, such as through reference type wrappers.
Anonymous
August 13, 2005
The comment has been removed
Anonymous
August 13, 2005
I saw from Soma's blog that VS is taking a DCR to fix the issues about Nullable types that is being talked...
Anonymous
August 13, 2005
"DrPizza, I've written reference type wrappers to value types and used them intensively for 2 years now."
What for? In Java the only time I really use the wrappers is when I absolutely have to (for example, to use them as a key or value in a Map or similar situation).

"They're an absolutely essential part of the work I do on a day to day basis. The per-type wrapper classes are okay, but they suffer from some of the same problems that the original platform nullable type did (although not as bad) - especially with regard to casting them through Object. "
I take this to mean that the "stack allocated" property is thus not very important?

What I don't understand is why .NET has this perverse distinction between "value types" and "reference types" and, more specifically, encodes the distinction as part of the type of the class (rather than using for example an annotation at variable declaration time, as C++). In C++ one can choose semantics one wants (value, nullable pointer (*), non-nullable pointer (&)) for any type, using consistent syntax for doing so (values are always unadorned, nullable pointers are always *, non-nullable pointers are always &).
Anonymous
August 13, 2005
DrPizza, you won't like my answer because we've already established that you don't like nulls in database schemas either, but I use them as part of an O-R mapping tool to represent database columns that are nullable ints, etc.

I still believe that null is the right way to represent a situation where there really is no value, and that that situation is fairly common. I also believe that it's far too common for developers to use 0 or -1 or other "special" values, and that's a bad idea because it doesn't let the runtime help you by throwing an exception when you try to use it as if it's not special.

Given those beliefs, the ability to represent a nullable instance of any type is pretty vital :)

You're correct that, for me, the stack allocated part is completely irrelevant. However, it's been pointed out to me that the memory overhead of a reference-based Nullable is a factor of four over a value-based one on a 64-bit platform, and so I sympathize with the decision that that was unacceptable.

I always found that the C/C++ behavior of making referenceness versus valueness part of the variable, rather than the value, was very confusing, and leads to the need for every API to specify how its return value needs to be freed, or how long it can live for. I find that the CLR (and Java which does exactly the same thing except the set of value types is hardcoded) providing a universal model for this is very valuable. You are, of course, free to disagree - but a lot of people seem to like it, as I do.
Anonymous
August 13, 2005
The comment has been removed
Anonymous
August 13, 2005
Frans: "While you're all in the mood to listen to customer feedback, why not continue this great effort by listening to the large chunk of feedback given to you about ASP.NET projects? "

So far, no one has given me feedback about ASP.Net projects.

Have you felt that your feedback on other ASP.Net blogs has not been heard?
Anonymous
August 13, 2005
Damien: "The one thing I like about the nullable types mess is the ?? operator - I can use it on non-nullable types to save keystrokes. "

Really? I thought ?? only works on nullable types. I'll have to check that on Monday.
Anonymous
August 13, 2005
DrPizza: "such as through reference type wrappers"

In the Java world this is made possible as there is a finite number of value types (int, boolean, etc.), and it's possible to simply provide reference wrappers for each one (Integer, Boolean, etc.). However, in the .Net world it's unbounded and it wouldn't really be acceptable to customers to have to provide a reference alternative to each value type.

The nullable type allows each value type to be treated as a reference type without things like a 4x overhead on some platforms. It's able to do this by being a special intrinsic type and taking advantage of known facets of value types.
Anonymous
August 13, 2005
Tag: You absolutely were helpful here. And if I tried to list everyone then I knew I would miss some people. So I just accepted that and decided to go with one person this time.

Please don't take it personally! You know that I value all this feedback :)
Anonymous
August 13, 2005
"DrPizza, you won't like my answer because we've already established that you don't like nulls in database schemas either, but I use them as part of an O-R mapping tool to represent database columns that are nullable ints, etc. "
I'm hardly unique in not liking nulls in databases. Nulls have screwy arithmetic rules and generally make the database's abstractions much less useful.

"I still believe that null is the right way to represent a situation where there really is no value, and that that situation is fairly common. I also believe that it's far too common for developers to use 0 or -1 or other "special" values, and that's a bad idea because it doesn't let the runtime help you by throwing an exception when you try to use it as if it's not special. "
Oh, I quite agree. But I'm not suggesting you should do that either. No, I'm rather suggesting that you shouldn't use nulls in databases if at all possible.

"You're correct that, for me, the stack allocated part is completely irrelevant. However, it's been pointed out to me that the memory overhead of a reference-based Nullable is a factor of four over a value-based one on a 64-bit platform, and so I sympathize with the decision that that was unacceptable. "
I would think that the overhead would rather depend on the size of the value type, wouldn't you?

"I always found that the C/C++ behavior of making referenceness versus valueness part of the variable, rather than the value, was very confusing, and leads to the need for every API to specify how its return value needs to be freed, or how long it can live for."
No, not really. You're conflating location (heap, stack, static memory, whatever) with semantics. C and C++ do this a bit, so the conflation is understandable, but they're not as closely tied together as people think; I can get stack allocation with nullable pointer semantics with the address-of operator, I can get heap allocation with value semantics with the dereference operator, and so on and so forth; semantics and storage are orthogonal. And the issue of lifetimes largely disappears in an environment such as .NET anyway.

"I find that the CLR (and Java which does exactly the same thing except the set of value types is hardcoded) providing a universal model for this is very valuable. You are, of course, free to disagree - but a lot of people seem to like it, as I do. "
But it doesn't provide a universal model any more. Now the semantics are encoded as part of the variable declaration; just in an irregular manner. For value types, int means "value semantics", int? means "nullable pointer semantics". For reference types, the only choice you have is nullable pointer semantics.

As such C# becomes extremely unclear. Sometimes the semantics are defined by the class (reference types). Sometimes they're defined by the declaration. And it uses the same syntax to mean different things (RefType bar means "nullable pointer semantics" whereas ValType bar means "value semantics").
Anonymous
August 13, 2005
"In the java world this is made possible as there is a finite number of value types (int, boolean, etc.), and it's possible to simply provide reference wrappers for each one (Integer, Boolean, etc.). However, in the .Net world it's unbounded and it wouldn't really be acceptable to customers to have to provide a reference alternative to each value type. "
But why would they need to?

In Stuart's database scenario, his database wouldn't be emitting arbitrary user-defined value types anyway. It'd be emitting the basic primitives. So the built-in supplied wrappers would be perfectly sufficient.

Like I said before, MS has provided no compelling demonstration of the value of nullable types, so perhaps there's some other useful scenario that I'm missing where the ability to have nullable arbitrary value types is useful. But no-one's explained what that might be, and in the situation Stuart suggested, there doesn't seem to be any necessity.

"The nullable type allows each value type to be treated as a reference type without things like a 4x overhead on some platforms."
Then--if that really matters, and it's not clear that it does--implement them better.
Anonymous
August 14, 2005
The comment has been removed
Anonymous
August 14, 2005
Oh, and yes, you're right that the overhead depends on the size of the value types. But since int will be easily the most commonly nullabled type, the overhead on that is especially important to worry about.
Anonymous
August 14, 2005
Sorry for answering your points in so many separate posts, I keep thinking of new things to add.

I agree that semantics, storage and lifetime are largely orthogonal (although obviously stack allocation precludes lifetime beyond the enclosing scope).

However, in actual practical coding I don't find that there's any real need for most of the possible combinations. The only variations I ever use are reference, immutable non-nullable value and immutable nullable value.

Note that things that are immutable behave identically regardless of whether under the hood they're implemented as reference or value, which is why "wrapping" a value type in a reference type is normally an acceptable way to implement "immutable nullable value". String is an example of an immutable type that's already a reference type and nullable.

What it comes down to is that in an ideal world I want nullable and non-nullable variables of every type (yes, at the variable level), but mutability versus immutability is a property of the type itself. An immutable type can be value or reference and that's purely an implementation detail; a mutable type should always be reference, in my book.

This isn't because there are really strong reasons against using other combinations, but rather that I've never found compelling reasons why we do need them. In the interests of keeping complexity down, therefore, we should leave them out. If you want a complex language with all the flexibility in the world, use C++ - it's a perfectly valid language choice but it's not C#. I think there's a place for a language in between the (IMHO) dumbed down level of VB and the highly advanced level of C++, and that it's right for such a language to expose only the more commonly needed combinations of semantics.
Anonymous
August 17, 2005
"Then--if that really matters, and it's not clear that it does--implement them better. "

We did. That's what System.Nullable is.

A "better" implementation of the Nullable value concept that has the performance that our customers wanted.
