Nullable syntax

Had a long talk with Renaud today about nullable types and the interesting ideas they've been pushing through the language. Specifically, we've added a few niceties in the compiler to make using nullable value types as easy as using the actual value type.

For example, in C# you can type:

 
    int i = GetSomeInt();
    int j = GetSomeInt();
    int k = i + j;

    short a = GetSomeShort();
    int b = a;

If you were to use Nullables then you'd have to write:

 
    Nullable<int> i = GetSomeNullableInt();
    Nullable<int> j = GetSomeNullableInt();
    Nullable<int> k = i.HasValue && j.HasValue ? new Nullable<int>(i.Value + j.Value) : (Nullable<int>)null;

    Nullable<short> a = GetSomeNullableShort();
    Nullable<int> b = a.HasValue ? new Nullable<int>(a.Value) : (Nullable<int>)null;

Pretty verbose and unwieldy. In C# 2.0 you can now write that as:

 
    int? i = GetSomeNullableInt();
    int? j = GetSomeNullableInt();
    int? k = i + j;

    short? a = GetSomeNullableShort();
    int? b = a;

Far far far easier than the version where you have to write out Nullable, and pretty close to the code you would have written in the original non-null case. The two additions to the language are "nullable conversions" and "lifted operators". A "nullable conversion" is where we allow predefined and user-defined conversions on a type T to be used on a Nullable<T>; i.e., since there is a predefined conversion from byte to int, there is automatically a conversion from byte? to int?. Similarly, for all operators on value types (like op_Addition, etc.) there is automatically an operator on the nullable type that does the appropriate null checking. So if you have "int +(int i, int j)" there is now "int? +(int? i, int? j)". If any of the arguments are null, then the result is null. If all the arguments are non-null, then the underlying operation is applied to the actual values.
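To make the two rules concrete, here is a minimal sketch of a lifted `+` and a nullable conversion. The names `LiftedDemo`, `Add`, and `Widen` are made up for illustration and are not part of any library.

```csharp
using System;

static class LiftedDemo
{
    // Lifted operator: int + int is available as int? + int?.
    // The result is null if either operand is null.
    public static int? Add(int? i, int? j)
    {
        return i + j;
    }

    // Nullable conversion: the predefined short -> int conversion
    // is also available as short? -> int?.
    public static int? Widen(short? a)
    {
        return a;
    }

    static void Main()
    {
        Console.WriteLine(Add(2, 3));               // prints 5
        Console.WriteLine(Add(2, null).HasValue);   // prints False
        Console.WriteLine(Widen((short)7));         // prints 7
        Console.WriteLine(Widen(null).HasValue);    // prints False
    }
}
```

Neither method contains an explicit HasValue check; the compiler generates the null handling.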

We were thinking about how that made using nullable types far easier and how it allowed you to think of a nullable struct as the underlying struct... except in one way. Operators and conversions are made available to you automatically; however, methods/properties/fields are not. i.e. you can't do this:

 
    public struct Vector {
        public Vector Normalize { get; }
        public double Dot(Vector v);
    }

    ...

    // Fine with the plain value type:
    Vector v1 = GetSomeVector();
    Vector v2 = v1.Normalize;
    double d = v1.Dot(v2);

    ...

    // Not allowed with the nullable type:
    Vector? v1 = GetSomeNullableVector();
    Vector? v2 = v1.Normalize;
    double? d = v1.Dot(v2);

This is because properties and methods aren't lifted up into the nullable type. This is kind of a shame (IMO). I have an entire graphics library written around primitives like colors/points/vectors/matrices, none of which will get the benefit. So I'll have to write:

 
    Matrix? m = GetSomeMatrix();
    Vector? v = m.HasValue ? m.Value.EigenVector : (Vector?)null;

    ... //instead of

    Vector? v = m.EigenVector;
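Short of new syntax, the HasValue boilerplate can at least be factored into a helper. The sketch below is only an illustration: `Vec`, `NullableLift`, and `Lift` are hypothetical names, and it relies on the `Func` delegate and lambda expressions, which arrived in versions of C# later than 2.0.

```csharp
using System;

// Hypothetical stand-in for a graphics primitive.
struct Vec
{
    public double X, Y;
    public Vec(double x, double y) { X = x; Y = y; }
    public double Dot(Vec o) { return X * o.X + Y * o.Y; }
}

static class NullableLift
{
    // Applies 'member' to the underlying value, or yields null
    // when the nullable itself is null.
    public static TResult? Lift<T, TResult>(T? value, Func<T, TResult> member)
        where T : struct
        where TResult : struct
    {
        return value.HasValue ? member(value.Value) : (TResult?)null;
    }

    static void Main()
    {
        Vec? v = new Vec(3, 4);
        Vec? none = null;

        double? d1 = Lift<Vec, double>(v, x => x.Dot(x));
        double? d2 = Lift<Vec, double>(none, x => x.Dot(x));

        Console.WriteLine(d1);            // prints 25
        Console.WriteLine(d2.HasValue);   // prints False
    }
}
```

This keeps the null check in one place, but every member access still has to be routed through the helper, which is the overhead a dedicated syntax would remove.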

This would be especially bad if I had to do something like "m.EigenVector.Normalize", checking for 'HasValue' every time. One of the issues with lifting properties is the ambiguities that would arise. Consider the following:

 
     struct S {
           bool HasValue { get; }
     }

     ...

     S? s;
     bool b = s.HasValue;

Does the 'HasValue' access refer to Nullable<S>.HasValue or S.HasValue? Renaud, Luke and I talked about this and were considering possible syntax to take care of this situation. One thing Renaud thought was that we could use the -> syntax to accomplish this. For example, you could write things out like:

 
   Vector? v1 = GetSomeNullableVector();
   Vector? v2 = v1->Normalize;
   double? d = v1->Dot(v2);

Because you could never have a "T?*" (a Nullable<T>*), the -> operator would never be ambiguous. However, Luke thinks that bringing in the arrow operator would be incredibly bad because of all the weight it carries from its C#/C/C++ meaning of pointer dereferencing. Still, I see a certain symmetry between pointers and nullable types. For example, you have:

 
     int* i;
     int j = (*i).CompareTo(4);
     int k = i->CompareTo(4);

    ...

    int? i;
    int j = (?i).CompareTo(4);
    int k = i->CompareTo(4);

In the second-to-last line, the (?i) acts to pull the value out of the nullable (or throw if it's null). This is similar to how (*i) will dereference the pointer to the int, or throw if the pointer is null. An alternative to the -> syntax would be something like =>. It's concise, unambiguous, and doesn't seem to carry any baggage along with it (well, Perl probably uses it for something, but then again Perl has a construct for anything). The addition of this syntax would allow nullable types to be interchanged with regular value types with very little syntactic overhead. What do you think?

Edit: Daniel raised a good point about the niceties of having specialized syntax to lift a method call onto the actual underlying value. Specifically, say you had the following definition of Nullable:

 
    public class Nullable<T> where T : struct {
        public readonly T Value;
        public Nullable(T value) {
            this.Value = value;
        }
    }

Then if you did:

 
     MyStruct? nullableMe;
     MyStruct me = nullableMe.Value;
     me.DoSomethingThatChangesMyState();
     MethodWithNullableMyStructParameter(nullableMe);

Then you'd have a problem because the changed state in "me" wouldn't be visible in "nullableMe". However, if you had:

 
     MyStruct? nullableMe;
     nullableMe->DoSomethingThatChangesMyState();
     MethodWithNullableMyStructParameter(nullableMe);

Then everything would be OK (I think). Mutable structs are still really confusing and you should definitely _not_ use them :-) However, if you do, then it will work in the nullable system as well. I don't have a copy of the spec in front of me, but I'm pretty sure that accessing a struct through a field does not give you a copy of the struct but a reference to the actual struct itself.
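Here is a small sketch of the copy-versus-in-place difference. `Counter` and `Holder` are hypothetical names; `Holder` stands in for a class-based wrapper that exposes its value as a public field. The field is deliberately not readonly here: as far as I know, outside of a constructor C# invokes a method on a readonly struct field through a temporary copy, so the mutation would be lost in that case.

```csharp
using System;

// A deliberately mutable struct (not recommended in real code).
struct Counter
{
    public int Count;
    public void Increment() { Count++; }
}

// Hypothetical wrapper class exposing the struct as a plain field.
class Holder
{
    public Counter Value;
}

static class MutableStructDemo
{
    // Copies the struct into a local first, so the wrapper never sees the change.
    public static int MutateCopy(Holder h)
    {
        Counter copy = h.Value;
        copy.Increment();
        return h.Value.Count;
    }

    // Calls the method directly through the field, so the stored struct changes.
    public static int MutateInPlace(Holder h)
    {
        h.Value.Increment();
        return h.Value.Count;
    }

    static void Main()
    {
        Console.WriteLine(MutateCopy(new Holder()));      // prints 0
        Console.WriteLine(MutateInPlace(new Holder()));   // prints 1
    }
}
```

A lifting syntax that operates on the stored value directly would behave like MutateInPlace rather than MutateCopy.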

Comments

  • Anonymous
    June 15, 2004
    It would be interesting syntax, to say the least. Adding the ?i syntax for direct dereferencing would potentially be overkill, but a member accessor would be nice.

    I am slightly concerned about the pre-existing notion of -> being a pointer dereferencing operator, it would be better if it was currently documented as just a dereferencing operator. However I think it would work fairly well.

    => could work, but I would worry about people confusing it with a comparison operator( if (x=>y<=z) looks kind of funny to me, ;). I think overloading -> would be a better choice overall.

    Being able to hoist members directly would be valuable, though I'm not sure it's necessary. One could copy the value of Value into a local instead of trying to access it directly, although that is pretty inconvenient.


    Heh, looks like I was pretty neutral after all, wasn't I?

  • Anonymous
    June 15, 2004
    Bah! It seems the same thing is happening to C# as happened to C/C++: the language gets "polluted". I really dislike this trend (except generics, of course). Really!

  • Anonymous
    June 15, 2004
    Uwe: How did C get polluted?

    What issues do you have with the nullable syntax? How would you prefer to see the language go?

  • Anonymous
    June 15, 2004
    Err, actually, even with a dereference we have the same problem.

    int? x = 10;

    x->ToString() will work ok
    but say x->Add(10); (it's imaginary, I'm too tired to be more creative) most certainly won't. The Nullable structure would have to support a mechanism by which a dereference operation could modify the value directly, requiring new syntax as a matter of course (can't have Nullable::Value be the only special property, after all)... I'm not sure this will work after all.

  • Anonymous
    June 15, 2004
    daniel: I wasn't serious. There are already enough languages with messed up syntax like http://www.muppetlabs.com/~breadbox/bf/ (Brainfuck) and http://www.catb.org/~esr/intercal/ (Intercal) and we don't need to make C# into that.

    Great point about value copying. I'm going to add that to the main post.

  • Anonymous
    June 15, 2004
    Daniel: What "macros" are you referring to? What parts of C# do you like/not like?

  • Anonymous
    June 15, 2004
    Cyrus: I meant C/C++ macros. One of my biggest problems with C/C++ was the macros and templates and other stuff that looked like code but was really something that turned into other code... it was confusing, to say the least.

    As for C#, I like most of it, though I have issues with a few things. For one thing, not being able to apply readonly constraints to locals or parameters is kind of annoying. In fact, I agree with the notion that in parameters should probably have been read-only in the language, while out and ref params are read-write.

    My biggest complaint, however, would be the cast syntax. While it does hold true to C/C++ and Java, I dislike using the same syntax for conversions and casting. I can understand why the designers decided to use it; however, I would have by far preferred separation of the casting and conversion operators, plus I would have preferred a simpler syntax for cast & member access combinations.

    I modified the mono C# compiler to support this notion, specifically a cast<type>(expression) and convert<type>(expression). (They are based on C++ cast operator syntax. I had considered another operator, the name I forget now, which would be the same semantically as a cast is today, just using the new syntax ideas.) The biggest upsides for this were that you always know whether you are performing a cast or a conversion and that the syntax
    cast<string>(myObject).Split() works like most other code (typeof, default, method calls, etc.), whereas the equivalent C# code is the less attractive (IMHO, anyway) and common: ((string)myObject).Split()

  • Anonymous
    June 15, 2004
    daniel: If you look at my implementation of Nullable<T> above you'll see that it is a field, not a property.

    I think we definitely want to avoid pointers here :-)

    I agree that casts are ugly as sin too. Sigh... at least generics gets rid of a lot of casts in my code.

    I like your notion of readonly being the default for locals/parameters. Jay and I think that readonly should be the default for everything, and mutability should be something you explicitly declare.

    Ok. Nighttime for me.

  • Anonymous
    June 16, 2004
    Cyrus,

    Personally, I think that the -> syntax is not appropriate. First, because it means something else in C++ and unmanaged C# code, and second, because I see it as a C++ thingy. Again, in my opinion (and this should not be attributed to my employer ;-)) it is an ugly operator, because in many fonts - and > are not tightly coupled and it doesn't look like an arrow at all (assuming that this was the intent).

    I'm not sure what other operator syntax to propose, but I think => is also bad. Maybe you can do it some other way like:

    struct A
    {
    ...
    bool HasValue() {...}
    }

    A? a;
    a.HasValue() // this is the Nullable<A>'s HasValue
    a?.HasValue() // this is the A.HasValue and a? to be de-null-ed A :-)

    I know that ? can be interpreted as the start of the ? : operator in the parser, but with some work ?. can be another operator that does just that. And the ? sign is more appropriate to a nullable type-related operation, since ? means nullable when it follows a type...

    Hope it helps!

  • Anonymous
    June 16, 2004
    Perhaps nullability should be delayed until it can be concretely added to the CLR and not be a kludgy add-on.

    Why oh why do we have to have HasValue? Wouldn't a simple comparison to null be sufficient, because (null == null) should always return true?

    This is why I'm promoting the idea that value-types have true nullability built in and not this workaround. My fear is that anything that isn't truly built in is going to have a host of problems. To give a few:

    1) ?? versus ?
    2) conversion and casting issues (there are lots)
    3) the required existence of HasValue

    There are more and I'm afraid the next few weeks are going to be coming up with lots of kludgy solutions that are hard to remember and are going to artificially raise the difficulty of the language.

    Perhaps it's time to start thinking about major revisions to the CLR before it gets to be too late in the development cycle. Every year of delay makes it that much harder to justify major changes.

    Orion Adrian

  • Anonymous
    June 16, 2004
    Small correction:

    We don't complain because i and x aren't equal any longer or that i isn't aware of x changing value.

    Larger addition:

    > MyStruct? nullableMe;
    > nullableMe->DoSomethingThatChangesMyState()
    > MethodWithNullableMyStructParameter(nullableMe);
    >
    > Then everything would be ok. (I think).

    (Language syntax aside) Yes, because you're actually changing the 'value' inside nullableMe. This is significantly different than trying to use reference type thinking on value types. To make your original code work, you need

    MyStruct? nullableMe;
    MyStruct me = nullableMe.Value;
    me.DoSomethingThatChangesMyState();
    nullableMe.Value = me; //update the value inside nullableMe to process the new value on the following method call
    MethodWithNullableMyStructParameter(nullableMe);

  • Anonymous
    June 16, 2004
    Oh, and Cyrus: While your implementation may use a field, Nullable<T> uses a property for a value. Because of this, a specialized system would have to be used for dereferencing.

    Avoiding pointers would have its ups, but using a managed pointer isn't so bad, and it's the only way to directly access a structure (when you are using a field, the compiler emits a field address load instruction).

  • Anonymous
    June 16, 2004
    Orion: While the CLR doesn't provide an instruction to compare nullability, it doesn't provide instructions to compare a lot of things. C# as a language will allow you to compare nullable types to null using ==; Cyrus just tends not to use it.

    I don't really see any point in using two instructions, one of which checks a value type for null and also has to be added to the runtime, when a three-instruction sequence will do the job quite easily (load the struct, call HasValue, compare to true or false). HasValue exists for other languages; you don't really need it in C#.

    It's easily expressible in C# using normal operations; what is your particular problem?

    Frankly, this is as close to integration as you are going to see. The CLR is not going to change to support nullability in any way that doesn't use an object model. It's just not in its best interests when the compilers can hide those particular issues.

    The simplest solution, of course, would be simply to remove access to Value and HasValue from C# and only allow operators to work with it. That does away with comparison problems. However, without a member hoisting solution, it's pretty moot.

  • Anonymous
    June 16, 2004
    Daniel: I'm not really following what you just said, primarily because I said a simple comparison to null should be all you ever need. I'm not promoting additional operators, quite the opposite actually.

    What I'd like is consistent syntax for all variables. If the variable doesn't contain anything, then comparing it to null will return true.

    int? a = null;
    if( a == null )
    {
    //this should always run, because a == null is always true
    }

    And you should be able to put any type in place of int and get the same result. Heck I'd like to see the same thing be true when arrays are empty. But that's way outside of how the CLR does things.

    The only issue becomes comparing the values of variables versus the identity of variables. Is one variable truly the same as another. This pops up anywhere you have reference type parameters or variables. Though I did like javascript's === for identity checking.

    Currently C# tends to be very inconsistent in its treatment of value-type and reference-type variables. Casting issues versus conversion issues are also wonky.

    That and C# builds on the same problem that all C-based languages have, the concept of a single return value. That and that return value being always returned. There are also some issues with exceptions, but I'm thinking that exceptions could be dramatically improved given time.

    Orion Adrian

  • Anonymous
    June 16, 2004
    >You have to generate an entirely new Nullable because the Nullable structure is immutable.

    Blah. I entirely forgot this point. Of course, an instance of Nullable could be mutable under the covers to accommodate this functionality (and still remain publicly immutable). It just leads to more special code.

  • Anonymous
    June 16, 2004
    Orion: "Why-oh-why do we have to we have HasValue"

    Who said we do? My solution doesn't have HasValue. It just does simple "== null" checks.

  • Anonymous
    June 16, 2004
    Sorin: "a?.HasValue() " would be ambiguous. Are you referring to a static method on the type "a?" or are you referring to an instance method on the de-nulled variable 'a'?

  • Anonymous
    June 16, 2004
    Ron: "I'm also confused by the arguments presented for the case of changing a struct when using a nullable version of the struct. If I have "

    If i had:

    MyStruct? s;
    s.Mutate();

    Would you expect s to change or for s to stay the same?

    What if you had:

    MyRefType r;
    r.Mutate();

    Is your answer different? If so, do you think that that's good to have two different behaviors between reference types and nullable value types?

    Nullable value types try to unify their place along with regular nullable types (reference types), so we want to keep their semantic meaning as close as possible.

  • Anonymous
    June 16, 2004
    Omer: "KiSS. "

    I am attempting to find the simplest solution possible. A solution that needs no warnings because it avoids ambiguity would be the simplest IMO. Do you disagree?

  • Anonymous
    June 16, 2004
    Orion: "This is why I'm promoting the idea that value-types have true nullability built in and not this workaround. My fear is that anything that isn't truly built in is going to have a host of problems. To give a few:

    1) ?? versus ?
    2) conversion and casting issues (there are lots)
    3) the required existence of HasValue "

    Did you see my solution involving "public sealed class Nullable<T>"? It addresses your points. There is no way to have ??, the conversions and casts are clear, and there is no 'HasValue'.

    What would be your issues with this solution?

  • Anonymous
    June 16, 2004
    Daniel: "Oh, and Cyrus: While your implementation may use a field, Nullable<T> uses a property for a value. Because of this, a specialized system would have to be used for dereferencing. "

    Who says it has to stay that way?

  • Anonymous
    June 16, 2004
    Daniel: "You have to generate an entirely new Nullable because the Nullable structure is immutable.

    However, that approach is rather expensive as every call results in a read and write. The compiler could optimize this by only copying when it needs to, but its still pretty limited. "

    I have some interesting performance numbers that I will try to post soon about the cost of this. It's rather surprising :-)

    Also, remember that in my above implementation the nullable is immutable, but you can affect the state of the internal value field.

  • Anonymous
    June 16, 2004
    Cyrus: <quote>'"a?.HasValue() " would be ambiguous. Are you referring to a static method on the type "a?" or are you referring to an instance method on the de-nulled variable 'a'?</quote> (couldn't use '' or "" to quote because you used both in the string ;-))

    I understand your concern, and I admit I didn't think about what happens when you already have a type with the same name as that of the variable.

    However, to remove that ambiguity I suggest to use the same behavior as for this code (C# 1.0):

    using System;

    namespace Test
    {
        struct a
        {
            public static void Met()
            {
                Console.WriteLine("a");
            }
        }

        class c
        {
            class b
            {
                public void Met()
                {
                    Console.WriteLine("b");
                }
            }

            b a = new b();

            public void test()
            {
                a.Met(); //prints 'b'
                Test.a.Met(); //prints 'a'
            }
        }
    }

    So, when a?. is used, if a is a defined variable within the same code context, it will mean de-null-ed a, even if another a type exists. Otherwise, if you need to call the type a's static method, you would need to use the full namespace path to that class.

    Or am I still wrong about something?

  • Anonymous
    June 17, 2004
    Cyrus (comments inline): "If i had:

    MyStruct? s;
    s.Mutate();

    Would you expect s to change or for s to stay the same?"

    If you really mean this (yes, I'm being intentionally picky about this code), I would expect this to not compile with an error similar to CS0165.

    But suppose

    MyStruct? s;

    is shorthand for

    MyStruct? s = null;

    Then I would expect

    s.Mutate();

    to throw a runtime exception of NullReferenceException.

    Getting past all the compiler and runtime errors, I would like it to behave the same as

    MyStruct s = new MyStruct();
    s.Mutate();

    Assuming the internal values of s are not marked readonly, then I would expect its values to change.

    "What if you had:

    MyRefType r;
    r.Mutate();

    Is your answer different? If so, do you think that that's good to have two different behaviors between reference types and nullable value types?"

    I would expect the same behavior.

    "Nullable value types try to unify their place along with regular nullable types (reference types), so we want to keep their semantic meaning as close as possible."

    True, but Nullable value types are still supposed to behave just like value types regardless of the implementation details (just like System.String). Thus, if it means there is a difference in semantics (though I don't see it in this situation), it's something to be dealt with. I'm not looking for Nullable to give me a way to treat value types as an object. I already have that. It's there to give me one additional state for concrete data.

  • Anonymous
    June 17, 2004
    Ron: Sorry, I meant:

    MyStruct? s = new MyStruct();
    s.Mutate();

    I agree. Nullable value types should behave the same as nullable types, except that they can now also be null.

  • Anonymous
    June 18, 2004
    Cyrus,

    Simplest indeed, but this usage shouldn't be disallowed, but avoided. If you fail to avoid it, you should correct yourself in the most intuitive of ways. Adding things to the language will never be an intuitive solution.

  • Anonymous
    September 02, 2005
    nice to be seen
