Building Tuple [Matt Ellis]

07/07/2009

For readers who are interested in the work that goes into designing a feature, I wrote an article for MSDN Magazine that appears in this month’s issue. Check out CLR Inside Out: Building Tuple, which introduces the new Tuple type and discusses the design work behind it.

I’d love to hear feedback on what you think about the article and if you’d like to see more behind the scenes design articles in the future. I’d also love to answer any questions about the design or why we made the decisions we did.

Also, there is one change we are thinking about making between Beta 1 and Beta 2 around tuple. In Beta 1 we have a factory method, Tuple.Create, which builds tuples and has some nice type inference properties. In Beta 1 the overload of this method which takes eight arguments requires that the last element is a tuple and builds an extended tuple. For example:

Tuple.Create(1, 2, 3, 4, 5, 6, 7, Tuple.Create(8));

Will build an eight element tuple that looks like [1, 2, 3, 4, 5, 6, 7, 8].

Tuple.Create(1, 2, 3, 4, 5, 6, 7, Tuple.Create(8, 9));

Will build a nine element tuple that looks like [1, 2, 3, 4, 5, 6, 7, 8, 9]

For Beta 2 we hope to change this so that the eight-argument version of Tuple.Create always builds an eight element tuple. In this case:

Tuple.Create(1, 2, 3, 4, 5, 6, 7, 8);

Will build an eight element tuple that looks like [1, 2, 3, 4, 5, 6, 7, 8]

Tuple.Create(1, 2, 3, 4, 5, 6, 7, Tuple.Create(8));

Will build an eight element tuple that looks like [1, 2, 3, 4, 5, 6, 7, [8]].

Tuple.Create(1, 2, 3, 4, 5, 6, 7, Tuple.Create(8, 9));

Will build an eight element tuple that looks like [1, 2, 3, 4, 5, 6, 7, [8, 9]].

If you want to build tuples with more than eight elements and your language doesn’t have special tuple syntax, you’ll have to use the Tuple constructors directly. If we see lots of people doing this we’ll add more overloads to Tuple.Create.
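
As a minimal sketch of what using the constructors directly might look like, the snippet below builds a nine element tuple by passing a nested tuple as the eighth argument. It assumes the Tuple<T1, ..., T7, TRest> shape and its Rest property as they appear in the released framework, and it is written as C# top-level statements (C# 9 or later) purely for brevity. The variable names are illustrative.

```csharp
using System;

// Nine elements: the eighth type argument (TRest) must itself be a tuple.
var nine = new Tuple<int, int, int, int, int, int, int, Tuple<int, int>>(
    1, 2, 3, 4, 5, 6, 7, Tuple.Create(8, 9));

// The first seven elements are Item1..Item7; the remainder lives under Rest.
int seventh = nine.Item7;
int eighth = nine.Rest.Item1;
int ninth = nine.Rest.Item2;

Console.WriteLine(seventh + ", " + eighth + ", " + ninth); // prints "7, 8, 9"
```

Note that the nesting is visible in the static type, so reaching the ninth element means going through Rest rather than an Item9 property.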

Thanks for reading. I hope everyone enjoys the article, and I look forward to your questions and comments. Cheers!

Comments

Anonymous
July 07, 2009
The comment has been removed
Anonymous
July 07, 2009
This is a good change you made. From a usability perspective the old behavior is really confusing. It's actually a pity that we need this factory method. It would be much nicer if C# and VB supported type inference on constructors, but I understand the trouble we’d get into with such a language feature.
Anonymous
July 07, 2009
I'd really like to build tuples like this: var tuple2 = new { "element1", "element2" }; var e1 = tuple2.Item1; var e2 = tuple2.Item2; The factory API could then be called behind the scenes.
Anonymous
July 07, 2009
In my opinion, the Beta 2 version of Tuple.Create() is better, since the outcome of the first one would really make me dumbstruck when first time using it. I am also really glad that you rejected the idea of naming the tuple properties with English numeral names - there are a lot of non-English folks out there working with .NET and expecting them to really understand the numeral word permutations is a risky bet. Why not use an indexed or a collection property instead of the individual ItemX properties?
Anonymous
July 08, 2009
I've read the article. I still don't like the idea of Tuple being a reference type. It means that there's one more type for which I will have to do recurrent pointless null checks for no good reason. In my opinion, Tuple should be a canonical example of a type that is absolutely, clearly a value type and nothing else. In any case, overloading == and != for Tuple is a must regardless of whether it's a value or reference type. It's a general rule of thumb when overriding Object.Equals (doesn't the C# compiler warn you about this?), it is a clear indication to the user that the type has value semantics, and other BCL and FCL reference-but-really-value types do it (e.g. System.Uri, or System.Xml.Linq.XName).
Anonymous
July 08, 2009
I was very surprised to read in the article that Tuple will be a reference type. The article focused on the performance aspects, and concluded that there was no significant performance loss to making it a reference type. My question is, why would you want it to be a reference type in the first place? As the Framework Design Guidelines point out, and MS DevDiv members such as Eric Lippert have blogged about repeatedly, the choice between value type and reference type should be about semantics, not implementation, since implementations change. Surely Tuple fits the semantics of a value type better than those of a reference type? MichaelGG's example illustrates that perfectly. I am forced to ask: if Tuple is not a value type, what is? Why do value types exist at all? "...we were unable to find compelling reasons for Tuple to implement interfaces like IEquatable<T> and IComparable<T>, even though it overrides Equals and implements IComparable." I would think that the compelling reason is the one that was just stated: the type already implements IComparable, so for consistency it should implement IComparable<T>. Again, semantics and usability should be the primary design goal, not performance. Implementing the generic interface should not be rejected unless it can be proven that doing so will cause hard performance goals to not be met. This is what Microsoft designers have been preaching for years. The article gives the impression that the decision to not implement the generic interfaces was made out of fear, not after testing the actual implementation against pre-defined metrics. Perhaps the article gave the wrong impression?
Anonymous
July 08, 2009
@Alex O The reason we didn't use an indexed or collection property is that the only sensible return type for it would be Object, so you'd lose the nice type-safe properties of Tuple when you pulled items back out.
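
To illustrate the type-safety point above, here is a rough sketch that contrasts the typed ItemN properties with Object-typed indexed access. Tuple has no indexer, so the indexed access is simulated with an object[] array; that array and the variable names are assumptions made for illustration only. The snippet uses C# top-level statements for brevity.

```csharp
using System;

var pair = Tuple.Create(1, "one");

// Typed properties: the compiler knows Item1 is int and Item2 is string.
int number = pair.Item1;
string name = pair.Item2;

// Hypothetical indexed access, simulated with object[]: every element comes
// back as Object, so each read needs a cast that is only checked at run time.
object[] loose = { pair.Item1, pair.Item2 };
int numberAgain = (int)loose[0];

bool castFailed = false;
try
{
    int wrong = (int)loose[1]; // compiles, but element 1 is actually a string
}
catch (InvalidCastException)
{
    castFailed = true;
}

Console.WriteLine(castFailed); // prints "True"
```

With the ItemN properties, the equivalent mistake (assigning pair.Item2 to an int) is rejected at compile time instead.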
Anonymous
July 08, 2009
@MichaelGG, Regarding the value vs. reference type decision: as I pointed out in the article, we did consider a split design where two and three element tuples would be value types and the rest would be reference types, but there was strong pushback from the language teams because of the confusing semantic issues. @MichaelGG, @pminaev, With respect to overloading == and !=, there's a comment in the design guidelines that addresses this: Section 8.10.2, which deals with equality operators on reference types. In my book copy it says to consider not overloading equality operators on reference types, even if you override Equals or implement IEquatable<T>, and to avoid overloading the operators if the implementation would be significantly slower than that of reference equality. On the relevant MSDN page [1] it says: "Most languages do provide a default implementation of the equality operator (==) for reference types. Therefore, you should use care when implementing the equality operator (==) on reference types. Most reference types, even those that implement the Equals method, should not override the equality operator (==)." Now perhaps it makes sense to break this guideline if we want the type to feel more like a value type. I'll discuss this issue with the team and see if we want to make that change. [1]: http://msdn.microsoft.com/en-us/library/7h9bszxx.aspx
Anonymous
July 08, 2009
The comment has been removed
Anonymous
July 08, 2009
My feedback: I had to check that a) I didn't actually work for Microsoft, b) I hadn't designed and built a Tuple class and c) I hadn't written an article for msdn. It took a moment, but I got there. Matt "not that one" Ellis
Anonymous
July 08, 2009
I understand the reasoning behind making it a reference type now, thank you. The reason why Tuple should get an overloaded operator== regardless is simply because the default reference-equality version is nonsensical for a Tuple (as it is for any other immutable reference type that represents a value). There's absolutely no benefit in being able to determine that two Tuple variables reference the same instance - there's nothing useful you can derive from that information. I would again like to point out classes such as Uri, XName and XNamespace that do that correctly. If FDG does not cover this case, it is a fault in FDG. As a side note, anonymous classes in C# do not redefine operator== to mean structural equality, even though they do redefine Object.Equals. However, when I created a Connect ticket about it (https://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=349014), Mads Torgersen replied that, in retrospect, they should indeed have made it so, even though they cannot change it now for back-compat reasons. Please don't fall into the same trap! ;)
Anonymous
July 08, 2009
Good article, I like this kind of in-depth article. I agree with the other commenters - Tuple should be a struct, and Tuple.Create(1, 42) == Tuple.Create(1, 42) should be true.
Anonymous
July 08, 2009
C# 3.0 added anonymous types that have a pretty straightforward way of defining class member names and use type inference, plus it does not have a built-in limitation on the number of members that can be specified. So, instead of creating a tuple as yet another language abstraction, why not just enhance the anonymous type mechanism by allowing the anonymous types to expose the required interfaces (e.g. IStructuralComparer etc.) and add ability to pass instances of such types around as strongly typed entities? Cheers, Alex
Anonymous
July 09, 2009
Alex, that (passing around anonymous types) would require structural type equivalence if you want to be able to do that seamlessly between assemblies. And CLR is very much centered around the notion of nominal typing, though NoPIA is a very restricted form of structural typing (which anonymous classes won't be able to reuse). So this would require a fairly major change to the CLR to implement.
Anonymous
July 09, 2009
"So this would require a fairly major change to the CLR to implement." Unless anonymous types in C# used Tuple underneath, which is essentially what F# already does, and would make a lot of sense. Unfortunately it would probably be a breaking change at this point.
Anonymous
July 09, 2009
The comment has been removed
Anonymous
July 09, 2009
@Robert Bullen Regarding your first point: I don't think we thought much about that case, but it is interesting feedback. I wonder, however, whether needing to do this a lot is a sign that your data should be enclosed in a first-class type instead of a Tuple. Regarding the second point: we actually did think a lot about doing this. However, we currently only support co- and contravariance on interfaces and delegates, and we ultimately felt it wasn't worth including a corresponding set of ITuple interfaces right now. We are thinking about variance when we design new data structures; it just didn't seem worth the extra level of plumbing right now.
Anonymous
July 10, 2009
The comment has been removed
Anonymous
July 12, 2009
Hello BCL Team! Could you discuss the interaction between features like this one and the new collectible dynamic assemblies feature (AssemblyBuilderAccess.RunAndCollect)? Specifically, if I dynamically emit a type definition T which would now be eligible for garbage collection, and then make and use instances of, say, Tuple<int, string, T, string, T>, will T still be eligible for garbage collection when all instances of T and all other references to T (including generic types using T, and instances of those generic types, ...) are gone? Does the same answer apply if T is a value type? I know this changes the generic specialization process under the hood, or at least used to. Thanks!
Anonymous
July 17, 2009
Matt, Please reconsider the Equals/== decision. Is it because you don't know what == will do for each element of the Tuple?
Anonymous
July 20, 2009
An update on the equals operator and value equality:
First, thanks everyone for your feedback on this issue. It’s always great to get this level of feedback from all of you. We’ve spent the past week talking with architects from the C#, VB and F# teams about adding an equals operator to Tuple, and after a lot of discussion on both sides of the issue, we’ve decided not to do this. I’ll do my best to explain the reasoning here and answer any questions you might have.
In general, the .Equals() method is intrinsic to the type, while the equals operator is very much tied to the language. For most brand new types, the distinction isn’t necessary to make. But for a tuple, which can contain other types that already have special equality semantics in a language, the story gets much more complicated. In the end, we decided that we can’t enforce a semantics on the equals operator unless that semantics behaves as expected from any language.
Originally we thought that it would make sense for the equals operator to just call the Equals method, but it turns out this leads to a slightly bizarre semantics (at least in C#) where you have something like this:
Double.NaN == Double.NaN -> False
Tuple.Create(Double.NaN).Equals(Tuple.Create(Double.NaN)) -> True (since Double.NaN.Equals(Double.NaN) is true)
Some languages also have a different operator semantics for = with Strings. For example, in VB the empty string and null compare as equal under =, and VB can sometimes coerce strings representing numbers (like “5”) into numbers, so you can have “5” = 5. What happens when you wrap these things in Tuples? Do you still get the correct behavior?
We’ve decided that languages, and not the BCL team, should decide what operators do unless we have a very good reason to think that we can get the correct semantics across all languages. In this case, we don’t think we can, so we won’t be adding operators.
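
The difference between the operator and the Equals method described above can be sketched in C# as follows. This assumes the Tuple type as it eventually shipped, with no == overload, so that == on two tuples falls back to reference comparison. It is written as top-level statements for brevity, and the variable names are illustrative.

```csharp
using System;

var a = Tuple.Create(1, 2);
var b = Tuple.Create(1, 2);

// With no operator== defined on Tuple, C# compares object references here.
bool sameReference = a == b;
// Equals is overridden to compare the elements.
bool structurallyEqual = a.Equals(b);

double x = double.NaN;
double y = double.NaN;
bool nanOperator = x == y;      // the C# == operator on doubles: NaN never equals NaN
bool nanEquals = x.Equals(y);   // Double.Equals treats two NaN values as equal
bool nanTuples = Tuple.Create(x).Equals(Tuple.Create(y));

Console.WriteLine(sameReference);     // prints "False"
Console.WriteLine(structurallyEqual); // prints "True"
Console.WriteLine(nanOperator);       // prints "False"
Console.WriteLine(nanEquals);         // prints "True"
Console.WriteLine(nanTuples);         // prints "True"
```

An operator== that simply forwarded to Equals would make the wrapped NaN values compare equal even though the bare doubles do not under ==, which is the mismatch the update describes.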
Anonymous
July 20, 2009
Thanks for explaining it all, Matt. It makes a bit more sense now, from the BCL's point of view. So, the bigger question is, will C# get fixed to do the right thing? Or will people using tuples in C# get hit with the completely illogical tuple(1,2) != tuple(1,2)?
Anonymous
July 20, 2009
@MichaelGG, I can't comment on the plans of the C# team. During our internal discussions about this issue the point was raised that C# could do something here in their compiler to give the semantics that C# developers would want, but I'm not sure what their release plans are and don't want to commit them to anything. My recommendation would be to file an issue on Connect asking them to do this.
Anonymous
July 21, 2009
I can understand the difficulty, but I am still very concerned that the semantics of Tuple equality will be completely unintuitive. I simply can't imagine a case where Tuple.Create(1, 2) == Tuple.Create(1, 2) should return false! This is just setting up developers for failure, and I urge you to reconsider what you can do to mitigate this problem before .NET 4.0 is released. Of course, if Tuples were value types instead of reference types, this wouldn't be an issue. After all, the entire point of having value types is that semantically they have value equality, which is exactly what we want for Tuples. I still have not seen an explanation of why Tuples are reference types to begin with.
Anonymous
July 22, 2009
@Douglas McClean, RE: Collectible dynamic assemblies (AssemblyBuilderAccess.RunAndCollect) Tuple is a generic type. Collectible assemblies interact with Tuple in the same way they interact with generics in general: You're free to mix and match. You can have generic types and methods in collectible and regular (non-collectible) assemblies, and you can instantiate them with types defined in either collectible or non-collectible assemblies. The lifetime rules for these instantiations are the same as for the collectible types involved, and they keep alive the assembly involved where applicable. (By "collectible type" I mean "type defined in a collectible assembly"). The assembly will be collected when nothing refers to it anymore (essentially object instances, Reflection.Emit objects referring to that assembly, method frames on thread stacks). This works for value types too. Please let us know via Connect if you run into bugs, or even things that contradict your intuition. I'll try to put together a few blog posts for the CLR team blog in the near future. Ovidiu Platon, developer, CLR type system team
Anonymous
July 23, 2009
@Ovidiu Platon, Please see feedback item 476776 at https://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=476776 for detailed repro steps of the question I am trying to ask. I also have repro code available, which I was unable to post to Connect for whatever reason; I'd be happy to share it by email if you'd like.
Anonymous
July 24, 2009
Thank you for taking the time to report this, Douglas. I was able to reproduce the problem you mention with a fairly simple program. I'll post updates on this bug on Connect. Please feel free to post your repro code directly in the bug description if attaching doesn't work. Ovidiu Platon, developer, CLR type system team
Anonymous
August 06, 2009
If operator== is not well-defined for Tuple, then at the very least I'd expect the compiler to complain (at least a warning), and the implementation to be there but unconditionally throw. Anything else is a recipe for submarine bugs. As a side note, if Tuple was a value type to begin with, it wouldn't have any default operator== (just like KeyValuePair does not), and thus this wouldn't be an issue.
