Why are anonymous types generic?

Suppose you use an anonymous type in C#:

var x = new { A = "hello", B = 123.456 };

Ever taken a look at what code is generated for that thing? If you crack open the assembly with ILDASM or some other tool, you'll see this mess in the top-level type definitions:

.class '<>f__AnonymousType0`2'<'<A>j__TPar','<B>j__TPar'>

What the heck? Let's clean that up a bit. We've mangled the names so that you are guaranteed that you cannot possibly accidentally use this thing "as is" from C#. Turning the mangled names back into regular names, and giving you the declaration and some of the body of the class in C#, that would look like:

[CompilerGenerated]
internal sealed class Anon0<TA, TB>
{
    private readonly TA a;
    private readonly TB b;
    public TA A { get { return this.a; } }
    public TB B { get { return this.b; } }
    public Anon0(TA a, TB b)
    { this.a = a; this.b = b; }
    // plus implementations of Equals, GetHashCode and ToString
}

And then at the usage site, that is compiled as:

var x = new Anon0<string, double>("hello", 123.456);
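
If you don't have ILDASM handy, reflection shows the same thing. The following is a minimal sketch, assuming the code generation strategy described above (which is a compiler implementation detail, not something the language specification promises). The names GenericShapeDemo and AnonymousTypeOf are invented for illustration.

```csharp
using System;

static class GenericShapeDemo
{
    // Returns the runtime type of an anonymous object that has a
    // string property A and a double property B.
    public static Type AnonymousTypeOf()
    {
        var x = new { A = "hello", B = 123.456 };
        return x.GetType();
    }

    static void Main()
    {
        Type t = AnonymousTypeOf();

        // Reports whether the generated class is a constructed generic type.
        Console.WriteLine(t.IsGenericType);

        // Lists the type arguments the compiler supplied at the usage site.
        foreach (Type arg in t.GetGenericArguments())
        {
            Console.WriteLine(arg);
        }
    }
}
```

Under that assumption the type should report itself as generic, with System.String and System.Double as its type arguments, which corresponds to Anon0<string, double> above.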

Again, what the heck? Why isn't this generated as something perfectly straightforward, like:

[CompilerGenerated]
internal sealed class Anon0
{
    private readonly string a;
    private readonly double b;
    public string A { get { return this.a; } }
    public double B { get { return this.b; } }
    public Anon0(string a, double b)
    { this.a = a; this.b = b; }
    // plus implementations of Equals, GetHashCode and ToString
}

Good question. Consider the following.

Suppose you have a library assembly, not written by you, that contains the following types:

public class B
{
    protected class P {}
}

Now, in your source code you have:

class D1 : B
{
    void M() { var x = new { P = new B.P() }; }
}

class D2 : B
{
    void M() { var x = new { P = new B.P() }; }
}

We need to generate an anonymous type, or types, somewhere. Suppose we decide that we want the two anonymous types - which have the same types and the same property names - to unify into one type. (We desire anonymous types that are structurally identical to unify within an assembly because that enables scenarios where multiple methods use generic type inference to infer the same anonymous type; you want to be able to pass instances of that anonymous type around between such methods. Perhaps I'll do an example of that in the new year.)
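
To make the inference scenario concrete, here is a small sketch of the well-known "cast by example" trick. It relies only on the rule that structurally identical anonymous types in the same program are the same type. The names UnificationDemo, MakePerson, CastByExample and ReadName are hypothetical, chosen for this illustration.

```csharp
using System;

static class UnificationDemo
{
    // The anonymous type has no name we can write as a return type,
    // so the object leaves this method typed as object.
    public static object MakePerson()
    {
        return new { Name = "Alice", Age = 30 };
    }

    // Type inference binds T to the anonymous type of 'example'.
    // The cast succeeds only if 'o' is an instance of that same type.
    public static T CastByExample<T>(object o, T example)
    {
        return (T)o;
    }

    // A second method builds a structurally identical anonymous object
    // purely so that the compiler can infer T from it.
    public static string ReadName()
    {
        var person = CastByExample(MakePerson(), new { Name = "", Age = 0 });
        return person.Name;
    }

    static void Main()
    {
        Console.WriteLine(ReadName());
    }
}
```

If the two anonymous types did not unify, the cast in CastByExample would throw an InvalidCastException at runtime instead of handing back a strongly typed object.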

Where do we generate that type? How about inside D1:

class D1 : B
{
    [CompilerGenerated]
    ??? sealed class Anon0 { public P P { get { ... } } ... }
    void M() { var x = new { P = new B.P() }; }
}

What is the desired accessibility of Anon0? It cannot be private or protected, because then D2 cannot see it. It cannot be either public or internal, because then you'd have a public/internal type with a public property that exposes a protected type, which is illegal. (Nor can it be either "protected and internal" or "protected or internal" by similar logic.) It cannot have any accessibility! Therefore the anonymous type cannot go in D1. Obviously by identical logic it cannot go in D2. It cannot go in B, because B lives in a library assembly that you did not write. The only remaining place it can go is in the global namespace. But at the top level an internal type cannot refer to P, a protected type. P is only accessible inside B and classes derived from B.

But we can put the anonymous type at the top level if it never actually refers to P. If we make generic class Anon0<TP> and construct it with P for TP, then P only ever appears inside D1 and D2, and yet the types unify as desired.
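
Here is a hand-written sketch of that arrangement. Two assumptions to flag: B is declared in the same file rather than in a separate library, purely so that the example is self-contained, and Anon0<TP> is a stand-in that I wrote by hand for the compiler-generated class. The Make methods and the Program class are invented for illustration.

```csharp
using System;

// Stand-in for the library type; in the scenario above it lives in another assembly.
public class B
{
    protected class P { }
}

// Hand-written stand-in for the compiler-generated type.
// It sits at the top level and never mentions P by name.
internal sealed class Anon0<TP>
{
    private readonly TP p;
    public TP P { get { return this.p; } }
    public Anon0(TP p) { this.p = p; }
}

class D1 : B
{
    // P is named only here, inside a class derived from B.
    public static object Make() { return new Anon0<P>(new P()); }
}

class D2 : B
{
    // The same construction in a different derived class.
    public static object Make() { return new Anon0<P>(new P()); }
}

static class Program
{
    static void Main()
    {
        // Both methods construct the same closed generic type.
        Console.WriteLine(D1.Make().GetType() == D2.Make().GetType());
    }
}
```

The program prints True: P is named only inside D1 and D2, where it is accessible, and yet both usages end up with the very same type.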

Rather than coming up with some weird heuristic that determined when anonymous types needed to be generic, and making them normally typed otherwise, we simply decided to embrace the general solution. Anonymous types are always generated as generic types even when doing so is not strictly necessary. We did extensive performance testing to ensure that this choice did not adversely impact realistic scenarios, and as it turned out, the CLR is really quite buff when it comes to construction of generic types with lots of type parameters.
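
One observable consequence, sketched below under the assumption that the compiler uses the strategy described in this article: two anonymous types with the same property names in the same order, but different property types, are different constructed types built from one shared generic type definition. The names SharedDefinitionDemo, SameDefinition and SameConstructedType are hypothetical.

```csharp
using System;

static class SharedDefinitionDemo
{
    // Compares the open generic definitions behind two anonymous types
    // that agree on property names but not on property types.
    public static bool SameDefinition()
    {
        var first = new { A = "hello", B = 123.456 };
        var second = new { A = 42, B = true };
        return first.GetType().GetGenericTypeDefinition()
            == second.GetType().GetGenericTypeDefinition();
    }

    // Compares the constructed types themselves.
    public static bool SameConstructedType()
    {
        var first = new { A = "hello", B = 123.456 };
        var second = new { A = 42, B = true };
        return first.GetType() == second.GetType();
    }

    static void Main()
    {
        Console.WriteLine(SameDefinition());
        Console.WriteLine(SameConstructedType());
    }
}
```

With the always-generic strategy, the first comparison should be true and the second false. A compiler that emitted non-generic classes would not get as far as the comparison, because GetGenericTypeDefinition throws on a non-generic type.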

And with that, I'm off for the rest of the year. Air travel is too expensive this year, so I'm going to miss my traditional family Boxing Day celebration, but I'm sure it'll be delightful to spend some time in Seattle for the holidays. I hope you all have a safe and festive holiday season, and we'll see you for more fabulous adventures in 2011.

Comments

  • Anonymous
    December 19, 2010
    > We've mangled the names so that you are guaranteed that you cannot possibly accidentally use this thing "as is" from C#.
    And here was I, thinking this was just part of an ongoing campaign to increase C# literacy by giving out lessons via error messages (along with the Expression<Func<...>> notation and the usage of fully qualified class names). BTW, I think C# errors should have levels, not like warnings, but like courses. I got some level 300 errors recently trying to put a facade around IQueryable, and it didn't even have anonymous types! Should I suggest that via Connect? ;-)

  • Anonymous
    December 19, 2010
    Does this mean that on the (perhaps rare) occasion of having two anonymous types with the same property names but not the same property types, you need to generate fewer types?

  • Anonymous
    December 20, 2010
    Why are anonymous types, eh... anonymous? Until now I have never found that particularly useful. The useful part lies in the Creation-On-The-Fly, mostly within LINQ. But why not allow me to name them on that very spot? I wouldn't mind specifying the access modifiers too!

  • Anonymous
    December 20, 2010
    Ferdinand: Just use a regular class for that. You'll get the full flexibility of regular classes and be able to define more closely in which namespace it lies, its access modifiers, its constructors and so on. Either the on-the-fly syntax for "named" types would have to include every single detail that the real class/struct definitions do, and then what's the point, or you risk running into a wall. And besides, as soon as you pass these types to other methods or create them in many places, now you'd have to merge the extra metadata from the two places or risk creating incompatible types. You can create and set up ordinary classes in interesting ways with object initializers, which has almost the same syntax as the anonymous type creation syntax, just add the type name after "new".

  • Anonymous
    December 20, 2010
    Jesper, what you are saying is the consequence of anonymous classes being anonymous, but not the reason for their anonymity. I just can't think of a good reason why these classes must be anonymous. They seem like pretty simple classes, not like anonymous methods, which have to deal with closures. It seems like a wasted opportunity that you have a strong type, which has a name, but you can't use its name. (The article explains that they are in fact not so simple, but even then the resulting type could be given a name.)

  • Anonymous
    December 21, 2010
    Hi, I've been reading here for a while, but I've only recently taken the plunge from VB to C#, so forgive the slightly naive question. I guess it means that in the following, x and y are different / incompatible types (Anon0<string, double> vs. Anon1<double, string>)?
    var x = new { A = "hello", B = 123.456 };
    var y = new { B = 123.456, A = "hello" };
    Cheers, Mike

  • Anonymous
    December 21, 2010
    For those who want to do more with anonymous types, there's always Tuple, where you specify the members' types instead of their names. The problem with Tuples is that there's no language support for them, making them rather cumbersome to use. One of the features I'm hoping for in the next version of C# is support for tuple packing and unpacking to make use of anonymous types (in general, not the kind in today's article) much nicer.

  • Anonymous
    December 21, 2010
    @Shuggy: I agree about that. Tuples should be as light as possible and thus should be structs. In my implementation of tuples (used for a project in .NET 3.5), they are structs, and have a factory Tuple.Create<T>() just for type inference.

  • Anonymous
    December 21, 2010
    Have a happy holiday Mr. Lippert. :)

  • Anonymous
    December 22, 2010
    @Gabe: It all depends on how they're constructed and what they contain. If you've got code like this (using the hypothetical new syntax): int x, y, z = (1, 2, 3); then allocating an object would be very bad. If you've got code like this, however: Tuple<long, long, long> SomeMethod(); then a class would mean the return value is a 32-bit (or 64-bit) reference, while a struct would mean it is 192 bits. Pass the tuple around a bit and your code is suddenly slow because of all this copying around.

  • Anonymous
    December 22, 2010
    @Mike C: they are different and incompatible, but that is not an artifact of the specific implementation described here. It is simply the way the language itself is defined: "Within the same program, two anonymous object initializers that specify a sequence of properties of the same names and compile-time types in the same order will produce instances of the same anonymous type."

  • Anonymous
    December 28, 2010
    Jeroen, Visibility is a fairly meaningless concept from the runtime's point of view, but not from the verifier's point of view, and by default all assemblies must be verified before they are loaded. If you disassemble Eric's example, and change the anonymous type so that it is not generic, it will compile, but running PEVerify will result in several "not visible" errors. These errors would prevent the assembly from loading at runtime.