Calling static methods on type parameters is illegal, part three

There were lots of good comments on my previous entries in this series. I want to address some of them, but first I want to wrap this up by considering how a small change to the scenario makes it plausible to choose a different option.  Consider now the non-static, non-virtual instance method:

public class C { public void M() { /*whatever*/ } }
public class D : C { public new void M() { /*whatever*/ } }
public class E<T> where T : C { public static void N(T t) { t.M(); } }

I hope you agree that in a sensibly designed language exactly one of the following statements has to be true:

1) This is illegal.
2) E<T>.N calls C.M no matter what T is.
3) E<C>.N calls C.M but E<D>.N calls D.M .

As we've discussed, with static methods we chose the first option. But with instance methods, we choose the second! Our earlier objection to it -- that the user clearly meant for the more derived method to be called -- melts away. Why? Because as far as we are concerned, that might as well have been public static void N(C t) { t.M(); } which you would reasonably expect to always call the less derived method, since it's not virtual.
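A minimal sketch of the second option in action. It assumes, purely for illustration, that M returns a string naming the method that ran instead of returning void; the Demo class is likewise hypothetical.

```csharp
using System;

public class C { public string M() { return "C.M"; } }
public class D : C { public new string M() { return "D.M"; } }

public class E<T> where T : C
{
    // All the compiler knows about T is that it derives from C,
    // so this call is bound to C.M when E<T> is compiled.
    public static string N(T t) { return t.M(); }
}

public static class Demo
{
    public static void Main()
    {
        D d = new D();
        Console.WriteLine(d.M());            // D.M: the compile-time type of d is D
        Console.WriteLine(((C)d).M());       // C.M: the compile-time type is now C
        Console.WriteLine(E<C>.N(new C()));  // C.M
        Console.WriteLine(E<D>.N(d));        // C.M, even though T is D
    }
}
```

The last two lines are the point: the generic method behaves just as a non-generic method taking a C would.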

Why not #3? Again, it has to do with static analysis. What it really comes down to is that in both the static and the instance cases, C.M and D.M are entirely different methods. The "new" calls that out; these are two different methods which just happen to share the same name. You can think of every method as having a "slot" in an object; in both the static and instance cases, we have defined two slots, not one. Had this been a virtual override then there would have been just one slot, and the contents of that slot would be determined at runtime. But in the non-virtual case there are two slots.
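The one-slot versus two-slot distinction can be sketched side by side. The names Base, Derived, SlotDemo, CallV and CallM below are hypothetical, and the methods return strings only so that the result is visible.

```csharp
using System;

public class Base
{
    public virtual string V() { return "Base.V"; }  // introduces a virtual slot
    public string M() { return "Base.M"; }          // introduces a non-virtual slot
}

public class Derived : Base
{
    public override string V() { return "Derived.V"; }  // refills the existing slot
    public new string M() { return "Derived.M"; }       // a second, unrelated slot
}

public static class SlotDemo
{
    // One slot: its contents are looked up at run time.
    public static string CallV<T>(T t) where T : Base { return t.V(); }

    // Two slots: the compiler picks Base.M, the only M it can see through the constraint.
    public static string CallM<T>(T t) where T : Base { return t.M(); }

    public static void Main()
    {
        Derived d = new Derived();
        Console.WriteLine(CallV(d));  // Derived.V
        Console.WriteLine(CallM(d));  // Base.M
    }
}
```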

When the compiler generates the generic code it resolves all the slots at compilation time. The jitter does not change them. Indeed, the jitter does not know how! The jitter has no idea that D.M has anything to do with C.M ; again, they are completely different methods that just coincidentally share a name. They have different slots so they are different methods.
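One way to see that the metadata records no relationship between C.M and D.M is to ask reflection which type introduced each method. This is only an illustrative sketch, not a description of what the jitter does internally; the virtual method V, the SlotOwner helper and the MetadataDemo class are assumptions added for contrast.

```csharp
using System;
using System.Reflection;

public class C
{
    public virtual string V() { return "C.V"; }  // hypothetical virtual method, for contrast
    public string M() { return "C.M"; }
}

public class D : C
{
    public override string V() { return "D.V"; }
    public new string M() { return "D.M"; }
}

public static class MetadataDemo
{
    // Finds the method called 'name' declared directly on 'type' and reports
    // which type first introduced it, according to MethodInfo.GetBaseDefinition.
    public static Type SlotOwner(Type type, string name)
    {
        MethodInfo method = type.GetMethod(
            name, BindingFlags.Public | BindingFlags.Instance | BindingFlags.DeclaredOnly);
        if (method == null)
            throw new MissingMethodException(type.FullName, name);
        return method.GetBaseDefinition().DeclaringType;
    }

    public static void Main()
    {
        Console.WriteLine(SlotOwner(typeof(D), "V").Name);  // C: the override points back to C.V
        Console.WriteLine(SlotOwner(typeof(D), "M").Name);  // D: D.M is its own definition
    }
}
```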

I hope that all makes sense. Now to answer a few selected questions from the comments:

First, a number of readers noted that there are languages (Delphi, Smalltalk) which have implemented the concept of "class" methods. That is, methods which are not associated with an instance (so they are neither instance nor virtual), but nevertheless, which method is called is determined at runtime (so they are not static.) How the .NET versions of these languages do the codegen, I do not know; my guess would be that they emit code that does late binding via reflection or some similar mechanism.

Second, this raises the question of whether C# ought to support some of the dynamic features which languages like JScript, Ruby, Python, etc., support. The VB team's motto as far as this question is concerned has always been "early binding when possible, late binding when necessary." The C# team, by contrast, has always cleaved to the principle that we do early binding when possible, late binding when the user explicitly writes umpteen dozen lines of ugly reflection code.
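To give a flavor of that explicit reflection code, here is a minimal sketch that picks a static method by the run-time type, which is roughly what option 3 would require for statics. It is not a claim about how the .NET versions of Delphi or Smalltalk generate code; the LateBound class and its CallStaticM helpers are hypothetical, and M is assumed to return a string.

```csharp
using System;
using System.Reflection;

public class C { public static string M() { return "C.M"; } }
public class D : C { public new static string M() { return "D.M"; } }

public static class LateBound
{
    // Looks up a public, static, parameterless method named "M" declared
    // directly on 'type' and invokes it. Everything is resolved at run time.
    public static string CallStaticM(Type type)
    {
        MethodInfo method = type.GetMethod(
            "M",
            BindingFlags.Public | BindingFlags.Static | BindingFlags.DeclaredOnly,
            null,             // default binder
            Type.EmptyTypes,  // no parameters
            null);            // no parameter modifiers
        if (method == null)
            throw new MissingMethodException(type.FullName, "M");
        return (string)method.Invoke(null, null);
    }

    // Generic convenience wrapper: the method is chosen from T at run time.
    public static string CallStaticM<T>() where T : C
    {
        return CallStaticM(typeof(T));
    }

    public static void Main()
    {
        Console.WriteLine(CallStaticM<C>());  // C.M
        Console.WriteLine(CallStaticM<D>());  // D.M
    }
}
```

Note what is lost: a type that lacks a suitable M is discovered only when the call throws at run time, not when the code is compiled.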

However, we do recognize both the power of dynamic language features, and the ugliness and pain of explicit reflection. We are considering ways to make late-bound code easier in C# without making it nigh-indistinguishable from early-bound code (as VB sometimes does.)  These are very preliminary thoughts; we are still finishing up C# 3.0 here; we've hardly begun thinking about C# 4.0. But when we do, dynamic language features will be high on the list of things to think about. Anyone with bright ideas, by all means, send them my way.

Third, a number of readers noted that the word "static" has been consistently used in C# to mean "associated with a class", and not "determined at compile time".  I agree; this is a good example of a poorly named abstraction. Language features should be named based on how the user is intended to use the feature, rather than on how we high-falutin' compiler designers classify different kinds of method calls. We allowed the "static" keyword to be used to mean "associated with a class". Why should "static" mean "associated with a class?"  That makes no sense at all. It's really an accident. But it is one we are stuck with now. Part of the reason why I write this blog is to dig into some of these weird historical corners and the associated unfortunate artefacts of less-than-pure design choices.

And fourth, completely off topic, a reader asks us to expose the internals of the compiler's abstract syntax tree. Again, we are in the very, very early stages of designing C# 4.0, so I cannot comment on specific features or timetables. But I will say that many people have asked for this. I also note that in C# 2.0 and 3.0, we concentrated almost entirely on adding language features, so much so that we have fallen behind somewhat on the tools support for them.

Exposing the internals of the compiler so that third parties could more easily create analysis-and-rewriting tools would be a great way to combine the power of the C# semantic analysis engine with the power of community involvement. I encourage anyone who has ideas about what sort of tools they would like to develop, and what sort of API the compiler team could provide to make that easier, to drop me a note. We need all the real-world use cases we can get in order to decide how to invest our time and energy in the next version.

Comments

  • Anonymous
    June 21, 2007
    Access to C# compiler internals would really help me accomplish what I'm trying to do. Basically, I have a working compile-time metaprogramming system for C#, which postprocesses an assembly to e.g. add code to check for nulls for method parameters marked with a [Required] attribute. This is limited, however, because it can only implement transforms that do not change the interface of the class in any meaningful way: this means that I cannot, for example, create a [Property] macro which could be attached to a class field and spit out the getter/setter at compile time for you. However, I'm not sure if you guys are great fans of metaprogramming: it would certainly be a bit "out there" compared to the features C# has at the moment. And people can always use Boo, especially with the new meta methods feature (http://blogs.codehaus.org/people/bamboo/archives/001593_boo_meta_methods.html).

  • Anonymous
    June 21, 2007
    Here's another one which I've been mulling over ... nowhere near a use case, or for that matter even really a complete thought, but I've been wondering about the possibility of enforcing immutability contracts over types. My personal preference is to keep mutability to an absolute minimum, for the obvious simplifications it brings to concurrency and avoidance of client mutation of shared state exposed by APIs. However, in some cases there are patterns, usually dealing with extensibility, where I would like to be able to enforce immutability, either as a constraint for internal developers or possibly for third-party developers. Not sure if this would really play with what you're indicating as far as exposing compiler internals. So far, I'd been thinking of it as more likely to be a reflection and decompilation activity at runtime, which would be far too hairy for me to attempt (I have a real job!) and of course would at best only deal with managed code. Sorry this isn't really a use case. Just some musings.

  • Anonymous
    June 22, 2007
    Stuart: That's certainly a possibility, though I find obj."Method"() weird. If you know the name already then why are you doing it dynamically?

    In your examples, either it's a string literal or a parenthesized expression. The string literal scenario is, I think, sufficiently implausible that we need not optimize for it -- just put the string literal into the parens if you really want to do that for some strange reason. Then we have only parenthesized expressions to worry about. This would then have the nice property that you could do obj.(blah)(whatever) for any expression blah of type string.

    Having parentheses, or some other token to call out dynamism, is probably necessary if we want the semantics to be both unambiguous and not brittle. We do not want to get into this unfortunate situation:

        string Method2 = "Method1";
        obj.Method2();   // If I then add Method2 to my object, does this call Method1 or Method2? brittle!
        obj.(Method2)(); // unambiguously calls Method1.

    This also has the nice property of not being legal syntax now, since the thing that follows the dot must be an identifier.

  • Anonymous
    September 11, 2007
    I came across this blog when I was trying to find out why C# didn't allow virtual static methods or virtual constructors. I'm used to using virtual constructors, virtual class methods and class references in Delphi and I've been really frustrated by their absence in C#. This blog has clarified the difference between static methods and class methods.

    You said earlier in your blog: "[people ask me] why C# does not support 'virtual static' methods. I am always at a loss to understand what they could possibly mean, since 'virtual' and 'static' are opposites! 'virtual' means 'determine the method to be called based on run time type information', and 'static' means 'determine the method to be called solely based on compile time static analysis'."

    I would like to argue the benefits of virtual class methods. When a base class has a virtual constructor and various virtual class methods, each subclass can implement its own versions of the constructor and class methods. It's largely class references that make virtual class methods and virtual constructors so useful, but they would also be very useful with generics. (Generics are not implemented in the current version of Delphi.) Class references in Delphi are a little bit like the Type class in C#, but they can be restricted to just the types that derive from a particular base class, and you can call any class methods or constructors that have been defined in that base class from the class reference. When the class methods are virtual you get the version appropriate for the class reference. (When you're used to this functionality, you wonder why anyone would need to use factory patterns. I like the comment about most design patterns being necessary when the language has failed to provide a needed feature natively.)

    For example, virtual class methods could be used to return a description of the class, to determine whether the class could do some required action, or ... These classes can all be registered with some manager object, which can then use the class methods to determine which class is appropriate for some action, then create an instance of that class. In a C#-like syntax (and using static with a slightly changed meaning), it would look like:

        public abstract class Animal
        {
            public abstract static Animal Create();
            public abstract static bool CanFly();
            public abstract static string Description();
            public abstract static int LegCount();
        }

        public class Eagle : Animal
        {
            public override static Animal Create()
            {
                return new Eagle();
            }
            public override static bool CanFly()
            {
                return true;
            }
            public override static string Description()
            {
                return "Large raptor";
            }
            public override static int LegCount()
            {
                return 2;
            }
        }

        // unknown syntax declaring AnimalClass = class of Animal;
        private List<AnimalClass> animalClassList;

        animalClassList.Add(Eagle);
        foreach (AnimalClass animalClass in animalClassList)
        {
            if (animalClass.CanFly())
                aviary.Add(animalClass.Create());
        }

    This sort of thing is very simple to do in Delphi, and requires much more convoluted code in C#.

  • Anonymous
    September 16, 2007
    Here's a potential use case for access to the AST (abstract syntax tree). We'd like to be able to completely parse the source code, including pre-processing directives, so that we can intelligently merge source code changes with regenerated code, even if the user re-orders or renames classes, methods, etc. Current parsers (or at least the ones we've tested to date) struggle with pre-processing directives.
