Calling constructors in arbitrary places

01/28/2010

C# lets you call another constructor from a given constructor, but only before the body of the calling constructor runs:

public C(int x) : this(x, null)
{
    // …
}

public C(int x, string y)
{
    // …
}

Why can you call another constructor at the beginning of a constructor block, but not at the end of the block, or in the middle of the block?

Well, let's break it down into two cases. (1) You're calling a "base" constructor, and (2) you're calling a "this" constructor.

For the "base" scenario, it's quite straightforward. You almost never want to call a base constructor after a derived constructor. That's an inversion of the normal dependency rules. Derived code should be able to depend on the base constructor having set up the "base" state of the object; the base constructor should never depend on the derived constructor having set up the derived state.
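To make the ordering concrete, here is a minimal sketch. The types `Shape` and `Circle` and the `ConstructionLog` field are hypothetical names invented for this illustration. It shows that the base constructor body has finished by the time the derived constructor body starts, so the derived code can safely rely on the base state.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical types for illustration only.
public class Shape
{
    // Records the order in which constructor bodies run.
    public readonly List<string> ConstructionLog = new List<string>();

    public Shape()
    {
        ConstructionLog.Add("Shape");
    }
}

public class Circle : Shape
{
    public readonly double Radius;

    public Circle(double radius) : base()
    {
        // The base constructor has already finished here, so the base
        // state (ConstructionLog) is safe to use.
        Radius = radius;
        ConstructionLog.Add("Circle");
    }
}
```

Constructing `new Circle(2.0)` leaves the log holding "Shape" followed by "Circle".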

Suppose you're in the second case. The typical usage pattern for this scenario is to have a bunch of constructors that take different arguments and then all "feed" into one master constructor (often private) that does all the real work. Typically the public constructors have no bodies of their own, so there's no difference between calling the other constructor "before" or "after" the empty block.
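A sketch of that pattern might look like the following. The `Connection` class, its fields, and its default values are assumptions made up for the example. The public constructors have empty bodies and feed into one private constructor that does the real work.

```csharp
using System;

// Hypothetical class for illustration; none of these names come from the article.
public sealed class Connection
{
    public readonly string Host;
    public readonly int Port;
    public readonly TimeSpan Timeout;

    // Public constructors only supply defaults and forward the call.
    public Connection(string host)
        : this(host, 80, TimeSpan.FromSeconds(30)) { }

    public Connection(string host, int port)
        : this(host, port, TimeSpan.FromSeconds(30)) { }

    // The private "master" constructor validates and assigns everything.
    private Connection(string host, int port, TimeSpan timeout)
    {
        if (host == null) throw new ArgumentNullException("host");
        Host = host;
        Port = port;
        Timeout = timeout;
    }
}
```

Because the forwarding constructors have empty bodies, it makes no observable difference whether the master constructor runs "before" or "after" them.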

Suppose you're in the second case and you are doing work in each constructor, and you want to call other constructors at some point other than the start of the current constructor.

In that scenario you can easily accomplish this by extracting the work done by the different constructors into methods, and then calling the methods in the constructors in whatever order you like. That is superior to inventing a syntax that allows you to call other constructors at arbitrary locations. There are a number of design principles that support this decision. Two are:

1) Having two ways to do the same thing creates confusion; it adds mental cost. We often have two ways of doing the same thing in C#, but in those situations we want the situation to "pay for itself" by having the two different ways of doing the thing each be compelling, interesting and powerful features that have clear pros and cons. (For example, "query comprehensions" vs "fluent queries" are two very different-looking ways of building a query.) Having a way to call a constructor the way you'd call any other method seems like having two ways of doing something -- calling an initialization method -- but without a compelling or interesting "payoff".

2) We'd have to add new language syntax to do it. New syntax comes at a very high cost; it's got to be designed, implemented, tested, documented -- those are our costs. But it comes at a higher cost to you because you have to learn what the syntax means, otherwise you cannot read or maintain other people's code. That's another cost; again, we only take the huge expense of adding syntax if we feel that there is a clear, compelling, large benefit for our customers. I don't see a huge benefit here.

In short, achieving the desired construction control flow is easy to do without adding the feature, and there's no compelling benefit to adding the feature. No new interesting representational power is added to the language.
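As a rough sketch of the method-extraction approach, consider the class below. `Widget` and its step names are hypothetical. The shared initialization lives in private methods, and each constructor calls them at whatever point in its body it likes.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class for illustration; the step names are assumptions.
public class Widget
{
    private readonly List<string> steps = new List<string>();

    // Exposes the order in which initialization steps ran.
    public IList<string> Steps { get { return steps; } }

    public Widget()
    {
        LoadDefaults();
        RegisterHandlers();
    }

    public Widget(string name)
    {
        // Work specific to this constructor runs first...
        steps.Add("name:" + name);
        // ...the shared work runs in the middle...
        LoadDefaults();
        RegisterHandlers();
        // ...and more constructor-specific work follows.
        steps.Add("done");
    }

    private void LoadDefaults() { steps.Add("defaults"); }

    private void RegisterHandlers() { steps.Add("handlers"); }
}
```

Note that instance methods like these cannot assign readonly fields; only constructors and field initializers of the declaring class can.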

Comments

Anonymous
January 28, 2010
The one situation where I've found this rule awkward is when:

1) I have to perform processing on my arguments, and
2) I need to pass the result of that processing to my base, and also use it in my ctor.

Example:

1) I have a class whose ctor takes in a XamlReader, which is a streaming (forward-only) interface.
2) I need to deserialize the XAML, and pass the result to my base ctor.
3) But I also need to buffer the XAML, so I can deserialize it again multiple times.

#2 or #3 are each trivially easy to do on their own; but since I can only read the stream once, and there's no way to pass a buffer from my base invocation into the body of my constructor, I can't do both. It's not the end of the world; I just changed my constructor to take a buffer, and I can always add a static factory method that takes a stream. But it makes the API a little more complex, and the factory method can't be used by derived classes.

Anonymous
January 28, 2010
The comment has been removed
Anonymous
January 28, 2010
Initializing read-only fields has always been the biggest problem with having initialization methods, especially since C# doesn't support parallel assignment and multiple return values. Maybe an InitMethodAttribute attribute that relaxes the readonly restriction, applied to methods that can only be called in a constructor?
Anonymous
January 28, 2010
As Robert said, extracting initialization logic into a commonly called method requires you to remove all of the readonly modifiers from your member variables. So it's not really accurate to say that "no new interesting representational power is added to the language." On the other hand, I think a better solution to that problem would be to allow an "initonly" modifier on methods. Such methods can write to readonly variables as if they were in a constructor, but can only be called from constructors or other initonly methods.
Anonymous
January 28, 2010
I was just trying to avoid introducing new keywords into the language. An attribute would affect verification, not parsing.
Anonymous
January 28, 2010
The comment has been removed
Anonymous
January 28, 2010
Could it also be that you are required to call this(...) at the beginning just in case the chain of local constructors invoked ends with one constructor that calls base(...)?
Anonymous
January 28, 2010
I don't understand why new syntax would have to be created. They already created special syntax in the form of base() and this(), so why not just allow that in the body of a ctor?
Anonymous
January 28, 2010
I really don't understand why new syntax would be needed. Why not allow the same syntax we use now, base(...) or this(...), but anywhere in the constructor body? The biggest pain of not being able to do this is how to set up certain readonly variables. Sometimes you just have to give up on them because of this limitation. There are more cases where this design decision gets in the way, but you can usually get around it with API modifications. An InitOnly attribute sounds pretty good, but the learning curve is IMO steeper than just reusing existing syntax. How many of us, when learning C#, have actually tried this(...) or base(...) somewhere inside the constructor body and frowned when it didn't work (I did at least :p)? I actually think the learning curve would be pretty small in this case. Of course, I have no idea about the implementation process, and it is probably pretty expensive.
Anonymous
January 28, 2010
It seems perfectly reasonable to me, from a correctness perspective, to allow base/delegating constructor calls in the middle of a constructor, so long as any preceding code does not reference "this" or "base" in any way, either explicitly or implicitly. This can even be made to use the existing language rules for local variable initialization, and the associated reachability analysis, by saying that "this" is treated as an uninitialized variable, and the base/delegating constructor call initializes it. That said, Eric doesn't seem to be making an argument that it is impossible, or even unreasonable; merely that it is not a cost-effective investment of the team's resources. Which is interesting; on one hand, I'd very much prefer to see "readonly class" (or something similar) in C# 5.0. On the other hand, as others have rightly noted above, readonly fields are precisely the case which is unnecessarily complicated by the existing, callable-only-at-method-entry constructor invocation syntax, so it could be treated as part of the same problem.
Anonymous
January 28, 2010
The comment has been removed
Anonymous
January 28, 2010
configurator, I think it would make more sense to have a private constructor that takes an int and have it called by the constructor which takes a string:

private Derived(int a) : base(a, a) { ... }
public Derived(string s) : this(s.GetHashCode()) { ... }

For the second situation I am imagining you mean something like this:

Derived(Thingie t) : base(t.GetPoint().X, t.GetPoint().Y) { ... }

The same solution applies:

private Derived(Point p) : base(p.X, p.Y) { ... }
public Derived(Thingie t) : this(t.GetPoint()) { ... }

Honestly I am having a difficult time imagining a situation where a series of constructors like this doesn't solve most of the problems mentioned in the comments. Perhaps more tortured examples can be found, but I would begin to suspect the validity of the inheritance relationship in those cases.
Anonymous
January 28, 2010
How do we validate arguments before calling the base class constructor? That has always bothered me in both C# and VB.
Anonymous
January 28, 2010
Jonathan, Create a validator function, and pass its return value (True if it validated successfully, False otherwise) to the constructor, along with the parameters. Obviously, you have to pass each parameter twice, but it could be worse.
Anonymous
January 28, 2010
Jonathan, I would ask why the derived constructor needs to do additional validation on parameters before passing them to the base class constructor. The base class should be validating its own parameters and throwing on bad values. If it is not, then it is broken. If the derived class has additional restrictions, then it should be acceptable to validate these after the base constructor has run and throw from there if the need arises. If that is not acceptable (for example, if the base constructor is time and/or resource intensive), then a factory method might be appropriate, although rewriting the base class to use lazier construction is probably a better idea in that case.
Anonymous
January 28, 2010
@Pavel Minaev You missed my point. Eric points out that introducing the new feature would imply a learning curve for the user, because he would have to learn the new syntax in order to understand, debug or write new code. I don't agree; anyone who knows how to code in C# nowadays would understand base(...) or this(...) when encountered midway through a constructor body. So IMO Eric's reasoning does not stand on this particular issue. I do mention, however, that the implementation of such a feature is surely expensive, and I am fully aware that it's not by any means trivial.
Anonymous
January 29, 2010
@Grico, The new syntax still carries a learning curve for the developer, because he has to learn at least some of the rules that would have to go along with it, as Pavel explained.
Anonymous
January 29, 2010
The comment has been removed
Anonymous
January 29, 2010
The comment has been removed
Anonymous
January 29, 2010
Tom, You mentioned: "...I have not tested this, can you set a protected readonly field in a derived constructor?" No, you cannot. The readonly field must be initialized in the base constructor or an initializer. Could someone post an example of how not being able to call a constructor in an arbitrary location affects readonly variables? I just can't see it.
Anonymous
January 29, 2010
Tom: The way I see it, the new operator creates an instance, and the ctor just initializes it. Remember, it's not functionality that's being crammed in, it's initialization code. If initialization isn't supposed to be a special operation, why are constructors so special in the first place?
Anonymous
January 29, 2010
Of course, someone pointed out off-thread that instead of using a static factory, I could just have the public ctor call a private overload. So scratch my comment above.
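A minimal sketch of that private-overload workaround, for readers following along: `ParsedDocument`, `ReplayableDocument`, and the UTF-8 "deserialization" are hypothetical stand-ins, not the real XAML API. The public constructor reads the forward-only stream once, and the private overload receives the buffer as a parameter, so it can both pass it to the base constructor and store it.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical stand-ins for the commenter's types; not a real XAML API.
public class ParsedDocument
{
    public readonly string Text;

    protected ParsedDocument(string text)
    {
        Text = text;
    }
}

public class ReplayableDocument : ParsedDocument
{
    private readonly byte[] buffer;

    // Public entry point: reads the forward-only stream exactly once.
    public ReplayableDocument(Stream source) : this(ReadAll(source)) { }

    // Private overload: the buffer is a parameter here, so it can be handed
    // to the base constructor and also kept by this constructor body.
    private ReplayableDocument(byte[] buffer) : base(Decode(buffer))
    {
        this.buffer = buffer;
    }

    // Decodes again from the stored buffer, as often as needed.
    public string DecodeAgain() { return Decode(buffer); }

    private static byte[] ReadAll(Stream source)
    {
        using (var copy = new MemoryStream())
        {
            source.CopyTo(copy);
            return copy.ToArray();
        }
    }

    private static string Decode(byte[] bytes)
    {
        return Encoding.UTF8.GetString(bytes);
    }
}
```

Making the byte[] overload protected rather than private would presumably let derived classes reuse it as well.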
Anonymous
January 30, 2010
The comment has been removed
Anonymous
January 30, 2010
Mark, The whole notion of doing the work in methods and returning values to be assigned to readonly fields seemed obvious to me. I guess that's why I don't see any problem. I certainly would consider a base class depending on a derived implementation to be a serious problem and refactor it immediately. I think some limitations on the way we code are actually helpful and should not be circumvented. They encourage better code by forcing the author to put more thought into the dependencies between different parts. Allowing code to work in every way imaginable leads to spaghetti...
Anonymous
January 31, 2010
@Mark, Mike, Of course you can extract the algorithm into a method and assign the results to a readonly variable. But what if you have multiple readonly variables, and you add a constructor? I hope you remembered to assign all of the variables correctly. Or what if the initialization algorithm normally generates the values for all of the variables in a single pass? Now you have to run the algorithm multiple times instead of just once, or create a structure for the sole purpose of returning the results to the constructor so they can be assigned. The point is that there are numerous cases today where you have to sacrifice immutability or readability because of the restrictions of the language on constructors. I am NOT advocating allowing constructors to be called within a method body, even another constructor body, as I believe that would cause more problems than it would solve. However, I do think initialization methods such as I described earlier would be very useful. But I am not holding my breath.
Anonymous
February 01, 2010
David, I still disagree. I'm not sure I understand your comment about remembering to assign all the variables correctly. If you are adding a new constructor...why not just call the existing constructor from yours? As for an initialization algorithm that generates values in a single pass: Either the algorithm should be split into independent methods that return individual values or, if the values and the algorithm are so intertwined as to preclude this, then I would make the argument the result should indeed be a struct as the values are obviously tightly related. I realize you are not advocating the current restriction be changed, but I don't see that anyone has made a compelling case that the existing restrictions are even an impediment...providing the design is carefully considered. Can I come up with a class that is difficult to initialize properly given the current language restrictions? Sure. Can I make a good case for designing such a class? Probably not.
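The struct-based approach argued for here could be sketched as follows. `Summary`, `Stats`, and the statistics chosen are assumptions for illustration only. A static method computes all values in a single pass, and a private constructor assigns them to readonly fields.

```csharp
using System;

// Hypothetical example: the names and the statistics chosen are assumptions.
public sealed class Summary
{
    public readonly int Min;
    public readonly int Max;
    public readonly int Sum;

    // Small result type whose only job is to carry the single-pass results.
    private struct Stats
    {
        public int Min, Max, Sum;
    }

    public Summary(int[] values) : this(Compute(values)) { }

    // All readonly fields are assigned in one place.
    private Summary(Stats stats)
    {
        Min = stats.Min;
        Max = stats.Max;
        Sum = stats.Sum;
    }

    // Runs the algorithm once and returns every result together.
    private static Stats Compute(int[] values)
    {
        if (values == null || values.Length == 0)
            throw new ArgumentException("values must be non-empty", "values");

        var stats = new Stats { Min = values[0], Max = values[0], Sum = 0 };
        foreach (int v in values)
        {
            if (v < stats.Min) stats.Min = v;
            if (v > stats.Max) stats.Max = v;
            stats.Sum += v;
        }
        return stats;
    }
}
```

Since the static method runs inside the constructor initializer, it also gives a place to validate arguments before any constructor body executes.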
Anonymous
February 01, 2010
A derived constructor calling the base constructor fails our code review. It's clever code, but hard to debug when combined with similar techniques throughout a large system. Such a system gets a reputation for being difficult to maintain and debug.
