Difficulties with non-nullable types (part 2)
The fundamental difficulty in implementing non-nullable types is actually enforcing that the value is never null. Value types sidestep the problem with a strict rule: every value type has a public no-arg constructor that does nothing beyond leaving all fields at their zeroed defaults. That restriction is ok for certain value types (the core primitives such as integer, boolean, etc.), but it is often quite aggravating when dealing with more complex value types. In those cases you often have to code to this pattern:
    public struct ComplicatedValue
    {
        bool initialized;

        public ComplicatedValue(SomeArguments args)
        {
            // initialize this struct
            initialized = true;
        }

        public void DoSomething()
        {
            if (initialized == false)
            {
                throw new InvalidOperationException("You can only call this method once you have initialized this struct");
            }
            // Do Something
        }
    }
With that pattern you’ve moved all the checking right back to runtime. The pattern does have the nice benefit that the checking is contained within the struct itself (as opposed to falling on whoever consumes ComplicatedValue), but it’s still not very pretty. If we were to require this for reference types people would go nuts. So we need to coexist with the current ways that people implement reference types.
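As a rough illustration of why the flag is needed at all, here is a minimal sketch of the pattern in today’s C#. The int payload is a hypothetical stand-in for SomeArguments, and the Program class is only there to drive the example. The point is that default(ComplicatedValue) hands you a zeroed struct without ever running the constructor we wrote:

```csharp
using System;

public struct ComplicatedValue
{
    private readonly bool initialized;
    private readonly int payload; // hypothetical stand-in for SomeArguments

    public ComplicatedValue(int payload)
    {
        this.payload = payload;
        initialized = true;
    }

    public int DoSomething()
    {
        if (!initialized)
        {
            throw new InvalidOperationException(
                "You can only call this method once you have initialized this struct");
        }
        return payload;
    }
}

public static class Program
{
    public static void Main()
    {
        var ok = new ComplicatedValue(42);
        Console.WriteLine(ok.DoSomething()); // prints 42

        // No constructor of ours runs here: every field is zeroed,
        // so 'initialized' is false.
        var blank = default(ComplicatedValue);
        try
        {
            blank.DoSomething();
        }
        catch (InvalidOperationException)
        {
            Console.WriteLine("uninitialized struct rejected at runtime");
        }
    }
}
```

The same zeroed value shows up in freshly allocated arrays and unassigned fields, which is why the check cannot be pushed to compile time for structs.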
So, now let’s look at a scenario involving non-nullable types:
    public class Person
    {
        string! name;

        public Person(string! name)
        {
            this.PrintSomeDebugStuff();
        }

        public void PrintSomeDebugStuff()
        {
            Console.WriteLine("Debugging Info: " + name.ToString());
        }
    }
Well, that’s not good! We haven’t assigned a value to name yet, so we’re going to throw an exception because “name” is null even though we’ve declared that it’s never null. The problem here is that the “this” instance was allowed to be used for general execution before all variables in “this” were set up according to the constraints that were listed. If a class lists constraints on its members then it’s imperative that we ensure they are fulfilled before allowing general execution to continue. Note: we could ask for a looser type of restriction. Specifically, we could say that all constraints need to be satisfied before executing any code that depends on that constraint. Unfortunately, determining the set of code that depends on a constraint is extremely difficult, so we settle for the more restrictive system.
So, the compiler would flag the above code as illegal because not all non-null fields had been initialized before other code was executed that used the “this” reference. Instead, you would have to write something like:
    public Person(string! name)
    {
        this.name = name;
        // or: this.name = "foo";
        // or: this.name = SomeExpressionThatReturnsANonNullStringButDoesntUseThis;
        this.PrintSomeDebugStuff();
    }
Ok. Seems pretty simple, right? Well, there are a couple of little “gotchas” to be aware of. Consider the following code:
    public abstract class Base
    {
        public Base()
        {
            this.Setup();
        }

        public abstract void Setup();
    }

    public class Derived : Base
    {
        string! name;

        public Derived(string! name)
        {
            this.name = name;
        }

        public override void Setup()
        {
            Console.WriteLine("Debugging Info: " + name.ToString());
        }
    }
If you just look at “Derived” it all looks good. The constructor ensures that all fields are initialized before any other code is executed. Right? Nope. In C# the body of a constructor does not run until the supertype’s constructor has run, i.e. what you really have is:
    public Derived(string! name) : base()
    {
        this.name = name;
    }
and Base’s constructor is executed before Derived’s is. So in the above example “Setup” will be called and will try to access “name” before it is actually initialized.
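To make that ordering concrete, here is a minimal sketch in today’s C#, using a plain string since “string!” is only proposed syntax. The NameWasNullDuringSetup property and the Program class are my own additions for illustration; they just record what “Setup” sees instead of crashing:

```csharp
using System;

public abstract class Base
{
    protected Base()
    {
        // Virtual call made before the derived constructor body has run.
        Setup();
    }

    public abstract void Setup();
}

public class Derived : Base
{
    // Plain string: the "string!" annotation from the post does not exist in C#.
    private readonly string name;

    // Hypothetical helper that records what Setup observed.
    public bool NameWasNullDuringSetup { get; private set; }

    public Derived(string name) // implicitly ": base()"
    {
        this.name = name;
    }

    public override void Setup()
    {
        NameWasNullDuringSetup = name == null;
    }
}

public static class Program
{
    public static void Main()
    {
        var d = new Derived("some name");
        Console.WriteLine(d.NameWasNullDuringSetup); // prints True
    }
}
```

Even though the caller passed a perfectly good string, “Setup” ran while the field still held its default null value.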
Now, in order to prevent this we would have to add an extension to C# constructors. Basically, we would need to give you a way to establish all class invariants before being allowed to call the supertype’s constructor. Perhaps something like this:
    public Derived(string! name)
    {
        this.name = name;
        base(); // base constructor call has moved to after the initialization of fields
    }
There would be special restrictions in place in the code region before “base()” is called: no access to the “this” pointer, except to assign into fields, and no access to non-null fields of the supertype (since they haven’t been initialized yet).
Now we’re all set. We can enforce our class constraints and ensure that if anyone has access to the “this” pointer that our invariants have been met.
Ok. So that’s one problem with non-null types addressed. A few more yet to come!
---
Edit: I forgot to mention this in my post (which is usually what happens, since I don't plan these and just write in a free-flow manner), and DrPizza astutely noticed it: these modifications would bring C#'s initialization model more in line with C++'s. Like him, I find that model to be far more sane. One thing that makes a whole lot of sense to me is that while initializing a base type, you do not have access to the members of the derived type. And, if I'm not mistaken, there are guidelines out there that a constructor should not call virtual methods in .NET (because of some security concern, if I'm not mistaken). So, if we're recommending against using that capability, I'm not sure why it's there in the first place. One thing I don't like is how C++ initialization lists look. I'd like to come up with a nicer-looking syntax for that.
Comments
- Anonymous
April 24, 2005
http://research.microsoft.com/SpecSharp/
Spec# is an extension of C#. It extends the type system to include non-null types and checked exceptions. It provides method contracts in the form of pre- and postconditions as well as object invariants.
Vote #1 for Spec# - Anonymous
April 24, 2005
Just a minor correction - Derived doesn't actually derive from Base. - Anonymous
April 25, 2005
Udi: "Just a minor correction - Derived doesn't actually derive from Base. "
Nice catch. I've corrected it. - Anonymous
April 25, 2005
Damien: The problem with solutions like Spec# is that they don't necessarily solve the problems that I'm going to be outlining here. So you can end up with NullReferenceExceptions at runtime. They set up a nice type system, but don't enforce that it always be maintained.
I'd like to include non-nullable types in C#, but actually have them completely working (which will take a lot more effort). - Anonymous
April 25, 2005
Stuart: "It's not how I'd have designed the language in the first place, but it's how the language stands today, and naturally you don't want to break existing legal code, no matter how strange it is to write code that relies on such an obscure (mis)feature of initialization order. "
This wouldn't be breaking any existing legal code. :)
"So IMO it's a little hard to justify why that should be permitted but it should be illegal if the strings turn into string!s. "
Not really. The above is permitted because we have guaranteed that all constraints have been met before the use of the variables. Now, the ! just says that it needs to be initialized to something that isn't null before being used. This is identical to how locals must be proven to be initialized before they're used.
"See my last few comments on the SDR post for more such scenarios and thoughts on this kind of issue."
Sure! - Anonymous
April 25, 2005
(in addition to the rule that you can't access 'this' until all members are initialized) - Anonymous
April 26, 2005
DrPizza: Absolutely right. I had meant to make that part of my post, but totally forgot about it. I've added a small edit at the bottom to reflect this. - Anonymous
April 26, 2005
So far i've discussed two interesting problems that arise when you try to add non-nullable types to C#... - Anonymous
July 25, 2007
There is no-doubt that the C#2 nullable-types is a cool feature. However I regret that C# don't support