CLR eye for C# guy : On proverbial trees and custom attributes
So... if a tree falls in the forest, and no one is there to hear it, does it make a sound? Very new and original question, eh? Do I know the answer? Does anyone? Does this have anything to do with custom attributes? You'd be surprised to know that yes, in fact it does. But fortunately - as far as attributes are concerned - I do know the answer. Sort of.
This entry was supposed to be introductory. In fact, I wanted it to be very short and neat, just to start off with something simple. Well, as I started to review my notes, I realized that I would have no such luck.
As we know, attributes - more specifically custom attributes - are an extensibility mechanism which makes it possible to add bits of custom metadata in a non-intrusive way. This metadata is stored in a special metadata table which can then be looked at by various tools via Reflection or the Metadata API. Attributes are extensively used by compilers, validation tools, execution environments, and parts of the CLR execution engine - for instance the Remoting and Interop subsystems. They are great, because they don't get in your way if you don't want them to, and they are right there if you are interested.
So... let's have a closer look and see what exactly happens when we write something like this:
[AttributeUsage( AttributeTargets.Class )]
public class ClassDescriptionAttribute : Attribute
{
public ClassDescriptionAttribute( string description, bool localizable ) {...}
public string Description {...}
public bool Localizable {...}
public string ResourceID {...}
}
[ClassDescription("Class Foo", true, ResourceID = "id1") ]
[Serializable]
public class Foo
{
}
As you can see, everything looks pretty straightforward. There is a custom attribute called "ClassDescriptionAttribute" (I have omitted method bodies and private members because they are pretty standard - primitive getters and setters), and a class, which I have inventively called "Foo", that has two attributes associated with it - the custom one we have declared, and the standard system attribute "Serializable", which is commonly used by the Remoting subsystem. The semantics of what's going on are pretty clear as well - we want class Foo to have a localizable description with ResourceID = "id1". Something in our application will then extract this information and use it accordingly - the typical use would be to display it together with the other class information.
Let's compile this thing, ILDASM it and see what the declaration of Foo looks like in IL.
.class public auto ansi serializable Foo extends [mscorlib]System.Object
{
.custom instance void Attribs.ClassDescriptionAttribute :: .ctor (string, bool) =
( 01 00 09 43 6C 61 73 73 20 46 6F 6F 01 01 00 54 0E 0A 52 65 73 6F 75 72 63 65 49 44 03 69 64 31 )
// ...Class Foo...T..ResourceID.id1
...
}
Huh, so this is interesting. There are several puzzling things there, which we will try to figure out one by one.
1. Where did "Serializable" go? On pseudo-attributes
Very good question. If you look closer, you will see some very explicit signs of the presence of our freshly-defined custom attribute, but [Serializable] is not there. Well, it turns out that's because SerializableAttribute is what is called a "pseudo-attribute", which means that it's not a "real" attribute, but something the C# compiler recognizes and translates into the class flag "serializable". Make no mistake, there is actually a class called "SerializableAttribute" that is defined in mscorlib.dll - it's just that it never gets emitted by the compiler as such.
The reason for this is pretty obvious - extraction of custom attributes is rather expensive, so the CLR execution engine tries to avoid doing that as much as possible. Having "serializable" as a class flag, as opposed to a "real" custom attribute, significantly improves performance of the Remoting subsystem.
There are other pseudo-attributes, the full list of which can be found in the CLI specification (ECMA-335, Partition II, under pseudo custom attributes). There are two main things to remember about pseudo-attributes:
· They are very special and mostly intended for CLR use - as you see, there are only a handful of them, and adding new ones would require a change to the CLR metadata specification
· They cannot be extracted using Type.GetCustomAttributes() (at least in .NET v1.1); instead they need to be accessed via the appropriate metadata item properties or the Metadata API
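To make the second point concrete, here is a minimal sketch of reading the "serializable" flag through metadata item properties rather than through GetCustomAttributes(). TypeAttributes.Serializable and Type.IsSerializable are real Reflection members; the Foo and Bar classes and the HasSerializableFlag helper are made up for this illustration, and newer .NET releases may report these members as obsolete.

```csharp
using System;
using System.Reflection;

[Serializable]
public class Foo { }

public class Bar { }

public static class PseudoAttributeDemo
{
    // Hypothetical helper: true when the "serializable" class flag is set
    // in the type's metadata flags. No attribute instance is involved.
    public static bool HasSerializableFlag(Type type)
    {
        return (type.Attributes & TypeAttributes.Serializable) != 0;
    }

    public static void Main()
    {
        // Foo carries [Serializable], so the flag is set.
        Console.WriteLine(HasSerializableFlag(typeof(Foo)));
        // Convenience property that reads the same information.
        Console.WriteLine(typeof(Foo).IsSerializable);
        // Bar has no [Serializable], so the flag is clear.
        Console.WriteLine(HasSerializableFlag(typeof(Bar)));
    }
}
```

Nothing here ever constructs a SerializableAttribute object; the check is a plain bit test on the class flags.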
2. What's that ".ctor" business? What custom attributes really are
As you can see, the .custom statement - you can probably guess that it is used to declare custom attributes in IL - mentions ClassDescriptionAttribute in a rather peculiar way - it seems to be trying to "call" something called ".ctor". Why is that? As you may guess, ".ctor" means "constructor" (incidentally, ".cctor" means "static constructor"). Intuitively, this makes sense, as an attribute class can have more than one constructor, so we have to be specific as to which one we want "called".
Things get a little clearer if you consider how constructors are represented in CLR Metadata tables. You see, all constructors are stored together with other methods in the MethodDef/MemberRef tables, and while certain flags do indicate quite clearly that a particular method is in fact a constructor, these are the only two tables that contain information about them.
Another Metadata table that is of interest to us is called CustomAttribute. This one - as you may have guessed - describes custom attributes. This particular table has 3 fields - Parent, Type and Value. Parent is the item that the attribute is defined on (assembly, class etc); Value is that ugly blob we'll talk about later, and Type points to another metadata item. The CLI demands that this metadata item be stored in the MethodDef/MemberRef tables and be an instance constructor. (Incidentally, there are some indications that Type could, in fact, point to something else - the current metadata token encoding allows for that. Whether that will ever be exploited remains to be seen...)
So, based on all of the above, a custom attribute is an association between any metadata item (except another custom attribute) and an instance constructor. At least, this is what it is according to the CLI as per ECMA-335.
That's why we see ".ctor" explicitly mentioned in the IL declaration - it is, in fact, just a pointer to an entry in the MethodDef table.
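This "item plus constructor" association can be looked at from managed code too. The sketch below assumes a framework newer than the v1.1 discussed here, because it relies on CustomAttributeData (available since .NET 2.0), which exposes the attribute record without instantiating the attribute. The ConstructorCalls counter, the DescribeConstructor helper and the trimmed-down attribute body are made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

[AttributeUsage(AttributeTargets.Class)]
public class ClassDescriptionAttribute : Attribute
{
    // Hypothetical counter, added only to show whether the constructor ran.
    public static int ConstructorCalls;

    public ClassDescriptionAttribute(string description, bool localizable)
    {
        ConstructorCalls++;
    }

    public string ResourceID { get; set; }
}

[ClassDescription("Class Foo", true, ResourceID = "id1")]
public class Foo { }

public static class AttributeDataDemo
{
    // Describes the constructor recorded for the first ClassDescriptionAttribute
    // found on the type, or returns null when there is none.
    public static string DescribeConstructor(Type type)
    {
        foreach (CustomAttributeData data in CustomAttributeData.GetCustomAttributes(type))
        {
            ConstructorInfo ctor = data.Constructor;
            if (ctor.DeclaringType != typeof(ClassDescriptionAttribute))
                continue;

            List<string> parameterTypes = new List<string>();
            foreach (ParameterInfo parameter in ctor.GetParameters())
                parameterTypes.Add(parameter.ParameterType.Name);

            return ctor.DeclaringType.Name + "::" + ctor.Name
                + "(" + string.Join(", ", parameterTypes.ToArray()) + ")";
        }
        return null;
    }

    public static void Main()
    {
        Console.WriteLine(DescribeConstructor(typeof(Foo)));

        // The high byte of a metadata token is the table index; 0x06 is MethodDef.
        // This is the token in the module that defines the constructor.
        ConstructorInfo definition = typeof(ClassDescriptionAttribute).GetConstructors()[0];
        Console.WriteLine("table 0x{0:X2}", definition.MetadataToken >> 24);

        // Inspecting the record did not run the attribute constructor.
        Console.WriteLine(ClassDescriptionAttribute.ConstructorCalls);
    }
}
```

The constructor's reflected name really is ".ctor", which is the same name ILDASM shows in the .custom statement.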
3. What's that binary goop there? On custom attribute value encoding.
So what in the world is that scary blob that is being "assigned" to the constructor? It's clear it has something to do with attribute initialization - you can kind of see some hints of that in the character dump of the blob - but what exactly?
Well, as you would expect, that is information necessary to initialize the attribute class properly. This is not - as one may assume - a binary in-memory representation of the attribute class instance itself, but instead encoded constructor arguments and property name/value pairs, allowing for subsequent attribute initialization.
You can get very detailed information on how to read this blob in Serge Lidin's Inside IL Assembler, but in a nutshell, here's what it says:
· 01 00 - prolog
· 09 43 6C 61 73 73 20 46 6F 6F - first ctor argument; string "Class Foo" prefixed by its length (0x09).
· 01 - second ctor argument; bool "true"
· 01 00 - number of name/value pairs that follow (a 16-bit value, here 1)
· 54 0E 0A 52 65 73 6F 75 72 63 65 49 44 03 69 64 31 - name/value pair representing "ResourceID"="id1".
0x54(SERIALIZATION_TYPE_PROPERTY) tells us this is a property name/value pair
0x0E(ELEMENT_TYPE_STRING) specifies property type
0x0A is the length of "ResourceID";
0x03 is the length of "id1"
As you can see, while the constructor argument types are extracted from its signature and not explicitly encoded in the blob, property name/value pairs explicitly specify property/field type.
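As a rough illustration of the layout, here is a minimal decoder sketch for this one blob. It is not a general-purpose reader: it assumes the constructor signature is (string, bool), that all strings are shorter than 128 bytes (so the length prefix fits in one byte), and that every name/value pair holds a string. The BlobDecoder class and its output format are made up for this example.

```csharp
using System;
using System.Text;

public static class BlobDecoder
{
    // The blob from the IL listing above.
    public static readonly byte[] Sample = new byte[]
    {
        0x01, 0x00,
        0x09, 0x43, 0x6C, 0x61, 0x73, 0x73, 0x20, 0x46, 0x6F, 0x6F,
        0x01,
        0x01, 0x00,
        0x54, 0x0E,
        0x0A, 0x52, 0x65, 0x73, 0x6F, 0x75, 0x72, 0x63, 0x65, 0x49, 0x44,
        0x03, 0x69, 0x64, 0x31
    };

    // Reads a length-prefixed UTF-8 string.
    // Assumption: the length is below 0x80 and therefore takes a single byte.
    private static string ReadString(byte[] blob, ref int pos)
    {
        int length = blob[pos++];
        string value = Encoding.UTF8.GetString(blob, pos, length);
        pos += length;
        return value;
    }

    public static string Decode(byte[] blob)
    {
        int pos = 0;

        // Prolog: always 01 00.
        if (blob[pos++] != 0x01 || blob[pos++] != 0x00)
            throw new FormatException("Unexpected prolog");

        // Constructor arguments; their types come from the signature (string, bool).
        string description = ReadString(blob, ref pos);
        bool localizable = blob[pos++] != 0;

        // 16-bit little-endian count of name/value pairs.
        int pairCount = blob[pos] | (blob[pos + 1] << 8);
        pos += 2;

        StringBuilder result = new StringBuilder();
        result.Append("ctor(\"" + description + "\", " + localizable + ")");

        for (int i = 0; i < pairCount; i++)
        {
            byte kind = blob[pos++];        // 0x53 = field, 0x54 = property
            byte elementType = blob[pos++]; // 0x0E = ELEMENT_TYPE_STRING
            if (elementType != 0x0E)
                throw new NotSupportedException("Only string values are handled in this sketch");

            string name = ReadString(blob, ref pos);
            string value = ReadString(blob, ref pos);
            result.Append("; " + (kind == 0x54 ? "property" : "field")
                + " " + name + " = \"" + value + "\"");
        }
        return result.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Decode(Sample));
    }
}
```

Note how the decoder has to be told the constructor argument types up front, while the name/value pair carries its own type byte.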
So that blob represents a sequence of actions that need to be carried out to create an attribute instance. Who gets to create an attribute class? Read on...
4. What was that bit on proverbial trees about? Who really creates a custom attribute.
So as we can see, there is more than enough information to create an instance of ClassDescriptionAttribute - we know the class, necessary constructor signature, its arguments and additional properties to assign.
So what does it get created by? The compiler? It actually emits all the information necessary to create an instance of the attribute class, but no, it doesn't create it in a conventional sense.
The CLR execution engine? That's a possibility, but how would that work - would it just go ahead and create all custom attributes on startup? Keep in mind, we actually need to execute some IL code to create an instance of an attribute class - namely the constructor - and surely doing all that would be an awful waste of time, especially if no one bothers to actually access the attributes?
You have probably guessed it - instances of attribute class are actually created by Reflection when someone asks - typically via Type.GetCustomAttributes(). I suspect that this is considered to be an implementation detail of the Reflection API though - all you really need to know is that your attribute class gets created, and you can't really control when and how.
While we are on the subject of implementation details, you will find that - at least in .NET v1.1 - attribute classes are in fact created every time you ask for them.
Let's update the constructor of ClassDescriptionAttribute as follows:
public ClassDescriptionAttribute( ... )
{
Console.WriteLine( "ClassDescriptionAttribute created");
...
}
And then run the following code snippet:
Console.WriteLine("Extracting custom attribute...");
ClassDescriptionAttribute descriptionAttribute = typeof(Foo).GetCustomAttributes( typeof( ClassDescriptionAttribute ), false )[0] as ClassDescriptionAttribute;
Console.WriteLine("Extracting custom attribute yet again...");
ClassDescriptionAttribute descriptionAttribute2 = typeof(Foo).GetCustomAttributes( typeof( ClassDescriptionAttribute ), false )[0] as ClassDescriptionAttribute;
Console.WriteLine( descriptionAttribute == descriptionAttribute2 );
Basically we are extracting the same attribute twice and then checking whether the objects that get returned are in fact the same. If you run this code, you may be surprised to find out that it prints:
Extracting custom attribute...
ClassDescriptionAttribute created
Extracting custom attribute yet again...
ClassDescriptionAttribute created
False
Interesting, huh? Every time you ask for an attribute, Reflection creates another instance of it, which of course means that the returned values will have different identities (by the way, you can make them "look" the same by redefining operator "==" and the methods "Equals" and "GetHashCode" based on attribute fields rather than reference identity).
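Here is one way that redefinition could look. This is only a sketch: the original listing omits the property bodies, so the auto-properties below are an assumption, and the Fetch helper is made up for the demo.

```csharp
using System;

[AttributeUsage(AttributeTargets.Class)]
public class ClassDescriptionAttribute : Attribute
{
    public ClassDescriptionAttribute(string description, bool localizable)
    {
        Description = description;
        Localizable = localizable;
    }

    // Assumed bodies; the original listing elides them.
    public string Description { get; private set; }
    public bool Localizable { get; private set; }
    public string ResourceID { get; set; }

    // Value-based equality: two instances are equal when all fields match.
    public override bool Equals(object obj)
    {
        ClassDescriptionAttribute other = obj as ClassDescriptionAttribute;
        return !ReferenceEquals(other, null)
            && Description == other.Description
            && Localizable == other.Localizable
            && ResourceID == other.ResourceID;
    }

    // Must agree with Equals: equal instances produce equal hash codes.
    public override int GetHashCode()
    {
        return (Description ?? "").GetHashCode()
            ^ Localizable.GetHashCode()
            ^ (ResourceID ?? "").GetHashCode();
    }

    // ReferenceEquals is used for the null checks to avoid calling this operator recursively.
    public static bool operator ==(ClassDescriptionAttribute a, ClassDescriptionAttribute b)
    {
        return ReferenceEquals(a, null) ? ReferenceEquals(b, null) : a.Equals(b);
    }

    public static bool operator !=(ClassDescriptionAttribute a, ClassDescriptionAttribute b)
    {
        return !(a == b);
    }
}

[ClassDescription("Class Foo", true, ResourceID = "id1")]
public class Foo { }

public static class EqualityDemo
{
    // Hypothetical helper: asks Reflection for the attribute on Foo.
    public static ClassDescriptionAttribute Fetch()
    {
        return (ClassDescriptionAttribute)typeof(Foo)
            .GetCustomAttributes(typeof(ClassDescriptionAttribute), false)[0];
    }

    public static void Main()
    {
        ClassDescriptionAttribute first = Fetch();
        ClassDescriptionAttribute second = Fetch();

        // Two separate objects...
        Console.WriteLine(object.ReferenceEquals(first, second));
        // ...that compare as equal through the redefined operator.
        Console.WriteLine(first == second);
    }
}
```

With these overrides in place the earlier comparison would report the two attributes as equal, even though Reflection still hands back two distinct objects.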
So let's get back to the original question - "does a custom attribute get created if no one asks for it?". The answer is "no". There is a bit of a twist as far as pseudo-attributes are concerned, but then again those things never get created, so the answer still stands.