Too much type information, or welcome back System.Object and boxing
We all know that generics are good: they promote code reuse, enable static type checking by the compiler, improve runtime performance, allow more flexible OOP designs, lay the foundation for LINQ, help the IDE provide more helpful IntelliSense, and have tons and tons of other vital advantages. "var" is another good feature, which (unlike "object") also helps to preserve full static type information.
However I hit a rare case recently where I had too much static type information about my code, so I had to use System.Object (and boxing) to get the desired effect. I had a method that used reflection to set a property on a type, similar to this:
static void SetProperty(object f)
{
    Type type = f.GetType();
    PropertyInfo property = type.GetProperty("Bar");
    property.SetValue(f, 1, new object[0]);
}
I also had a struct like this:
struct Foo
{
    public int Bar { get; set; }
}
Now, I tried to set the Bar property on an instance of the struct:
static void Main(string[] args)
{
    var f = new Foo();
    SetProperty(f);
    Foo foo = (Foo)f;
    Console.WriteLine(foo.Bar);
}
It didn't work! It printed out 0! I was puzzled, and then I realized what was happening. Since Foo is a struct, and f (thanks to var!) is statically typed as Foo, the call to SetProperty(object) boxes a copy of the struct and passes that copy to the method. The copy is modified, but the original f is not.
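To make the copying visible, here is a minimal sketch (not my original code; the BoxingDemo class and the box1/box2 names are made up for illustration). It assumes the same Foo struct as above and shows that every conversion of a struct to object produces a separate boxed copy, and that the reflection call only touches the box it is given:

```csharp
using System;

struct Foo
{
    public int Bar { get; set; }
}

static class BoxingDemo
{
    public static void Main()
    {
        var f = new Foo();

        // Each conversion of a struct to object allocates a separate box
        // that holds its own copy of the value.
        object box1 = f;
        object box2 = f;
        Console.WriteLine(ReferenceEquals(box1, box2)); // False

        // Setting the property through reflection changes only box1's copy.
        typeof(Foo).GetProperty("Bar").SetValue(box1, 1, null);

        Console.WriteLine(((Foo)box1).Bar); // 1
        Console.WriteLine(((Foo)box2).Bar); // 0
        Console.WriteLine(f.Bar);           // 0
    }
}
```

Calling SetProperty(f) with a Foo-typed f behaves like the box1 case, except that nobody keeps a reference to the box, so the modified copy is simply lost.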
One simple change and it started working fine:
static void Main(string[] args)
{
    object f = new Foo();
    SetProperty(f);
    Foo foo = (Foo)f;
    Console.WriteLine(foo.Bar);
}
I changed var to object, so the struct was boxed into an object on the heap, the reference to this same object was passed to the SetProperty method, the method set the property on the boxed instance, and (Foo) unboxed the same modified instance. The code now prints out 1 and everything is OK again.
"var" provided too much type information to the compiler: it knew that the variable was a struct and stored it as a plain value, so only a temporary copy was boxed at the call and I lost the modified value. By declaring the variable as object, we hid the extra information from the compiler and got uniform behavior for both value types and reference types.
In my original code where I encountered this peculiar behavior (a custom deserializer that reads XML and uses reflection to set properties on objects), I was so focused on working with all types that I forgot that those can be value types as well. Since I had everything strongly typed with generics, type inference, vars and other modern goodness, the kind hardworking compiler preserved all the information for me and kept the values unboxed where I was expecting to get reference type behavior. Thankfully, unit tests revealed the error 10 minutes after it was introduced (I definitely need to post about the usefulness of unit tests and TDD in the future), so it was a quick fix to box the value into object before filling its properties.
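For illustration only, the "box first, then fill" fix might look roughly like the sketch below. This is not the actual deserializer: PropertyFiller, Create and the dictionary of values are hypothetical names, and the real code reads its values from XML.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

struct Foo
{
    public int Bar { get; set; }
}

static class PropertyFiller
{
    // Hypothetical helper: creates a T and sets the named public properties.
    public static T Create<T>(IDictionary<string, object> values) where T : new()
    {
        // Box once up front, so that for a value type every SetValue call
        // below targets the same boxed copy.
        object boxed = new T();

        foreach (KeyValuePair<string, object> pair in values)
        {
            PropertyInfo property = typeof(T).GetProperty(pair.Key);
            if (property != null && property.CanWrite)
            {
                property.SetValue(boxed, pair.Value, null);
            }
        }

        // Unbox (or cast, for a reference type) only after all properties are set.
        return (T)boxed;
    }
}

static class Program
{
    public static void Main()
    {
        var values = new Dictionary<string, object> { { "Bar", 1 } };
        Foo foo = PropertyFiller.Create<Foo>(values);
        Console.WriteLine(foo.Bar); // 1
    }
}
```

The same helper works unchanged for classes, because for a reference type the "boxed" variable is just an ordinary reference to the new instance.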
It was an amusing experience.
Comments
Anonymous
September 20, 2008
Yes, mutable structs are evil. However the requirements for our object deserializer are such that it has to be universal and be able to deserialize both classes and structs. To deserialize an immutable struct, we would need to know about a constructor and actually create the value, not just set properties.
Anonymous
September 22, 2008
You wrote "We all know that generics are good...". I started with BASIC, worked my way to C, C++, and Java, then VB (I know, a really oblique turn), before I got to C#. While I agree in part that generics are good, I don't believe var is a good thing at all. It's an evil little thing that says I don't know what I'm working with, so I'll blindly go forward. The C family of languages is strongly typed to avoid such dangerous ideas. Since everything in C# is derived from the Object class, even value types, there should be no reason to use var outside of being assigned an anonymous type, which can only be used safely in the function it is declared in.
Anonymous
September 22, 2008
Why were you using a struct in the first place?
Anonymous
September 22, 2008
If you know it's a struct, because you designed it as a struct, you know you have to pass the parameters as ByRef (VB) / ref (C#). There's no mistake there. Three letters make the difference and you won't have any problem, even if you pass an object.
Anonymous
September 23, 2008
And that's exactly why I hate mutable structs so much. See my post on enforcing immutability at http://blogs.msdn.com/kevinpilchbisson/archive/2007/11/20/enforcing-immutability-in-code.aspx
Anonymous
September 23, 2008
@Luis, Passing the argument to SetProperty by reference is not appropriate, because it implies that after the method call the reference passed in could be pointing to a completely different instance! Which is clearly not what the method is trying to accomplish.
Anonymous
September 23, 2008
@Luis, The whole point is that you don't know in advance whether you have a struct or an object... Kirill has stated that this example has been pulled from an automatic serializer that he is working on... in that case, you need to be able to pass it various objects without actually knowing what they are.
Anonymous
September 26, 2008
@Kirill Certainly var has its place as a 'type' name for variables holding the return values of LINQ queries that will be anonymous types generated by the compiler, but in situations where the programmer knows the type in advance, it seems like it invites issues like this. I would personally recommend coding standards that explicitly forbade its use in those situations for that reason.
Anonymous
September 27, 2008
Well, as I said, I've never hit this case before, so I thought var could do no harm. I used to think carefully every time I needed to declare a local variable, and now I will think even more carefully. But I still love var and I expect to keep using it in the future where appropriate (I'll just have to be more careful). Not necessarily for anonymous types (which I almost never use), but also where it increases readability and the type is clear from the variable name/ambience.
Anonymous
September 29, 2008
"To deserialize an immutable struct, we would need to know about a constructor and actually create the value, not just set properties." I don't understand this. If you are using reflection, can you not still deserialize a struct by setting the fields rather than the properties? After all, the framework somehow knows how to deserialize "immutable" value types... Since inheritance is not an issue, you know all the fields in a value type will be DeclaredOnly, and there will always be a parameterless default constructor suitable for Activator.CreateInstance(). At the very least, if you want to use properties and make it immutable by normal means, just use a "private set" and then let reflection find that.
Anonymous
September 29, 2008
Hi Bruce, all your comments are very valid. From a couple of hints I see that you clearly know what you're talking about. However we have a couple of requirements:
- We want to keep our serializer/deserializer very simple, maintainable (500 lines of code for deserializer and 150 lines of code for serializer) and keep full control over it
- We only serialize public writable instance properties, we don't even look at fields
- The list of participating properties is returned by a piece of common reflection logic that we want to keep really simple/trivial
- This is not shipping code, so we just want to get it working and move on - my solution turned out to be the best in terms of cost/benefit - quick, maintainable and does the job.
Anonymous
September 30, 2008
Sure, that makes sense now. I just wanted to make sure I wasn't missing something in my own mapper / deserializer code. Thanks for the reinforcement. I've been bitten by the value-type bug before, so that's why I viewed this in the first place. Good info, thanks.
Anonymous
September 30, 2008
Var is good! It reduces the "noise" in your code (especially when you use generics a lot), thus making it more readable. For me readability is much more important than these potential stupid simple issues when you mix up var with object.
Anonymous
October 02, 2008
@Justin, No, I am not confusing them. 'out' is merely a specialized form of 'ref', in which the argument does not have to be initialized before being passed in (as opposed to 'ref' in which the argument does have to be initialized before being passed in). In fact the CLR does not even support 'out' parameters; the C# compiler simply implements them as ref parameters in IL (try calling a method with an out parameter that was created in C# from a VB.NET application). In both cases, the called method can set the reference to point to a different instance than what was passed in. The only difference is that an 'out' parameter is required to be set by the called method, since it may not have been initialized before the method was called; whereas setting a 'ref' parameter is optional, because it was required to already have a value before it was passed.
Anonymous
October 02, 2008
I looked up "const ref" because that seems like what is being desired by some in the above comments, and it appears that this doesn't exist in C# (unlike in C++). I had to look this up to get a better understanding. I found this link helpful: http://channel9.msdn.com/forums/Coffeehouse/255508-const-ref-in-C/ (I need to read more about "var" as well....)
Anonymous
October 23, 2008
I would like to ask another question: why did you even choose to have your own serializer, if there are a number of them already? (Xml Serializer, Xaml Serializer, Soap Formatter, WCF Formatters). Was the major intention "to reinvent the wheel"?