I Don't Like Arrays

The number one reason that I dislike arrays in .NET is the fact that they implement IList<T> explicitly, thereby burying useful members like IndexOf behind a cast or equally ugly calls to static methods on System.Array, and needlessly renaming the Count property to Length. As a result, it's unduly difficult to change code that operates on an array to use a more versatile collection in its place.
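To make the complaint concrete, here is a minimal sketch of the same lookup written against an array and against a List<string>. The class and method names (ArrayVersusList, FindInArray, FindInList) are made up for illustration; only Array.IndexOf, the IList<T> cast, Length and Count come from the framework.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class, for illustration only.
static class ArrayVersusList
{
    public static int FindInArray(string[] items)
    {
        // IndexOf is not a public instance member of string[], so the post's
        // two workarounds are the static method and the interface cast.
        int viaStatic = Array.IndexOf(items, "b");
        int viaCast = ((IList<string>)items).IndexOf("b");

        // Both routes report the same position.
        if (viaStatic != viaCast) throw new InvalidOperationException();
        return viaStatic;
    }

    public static int FindInList(List<string> items)
    {
        // The same operation on List<string> is a plain instance call.
        return items.IndexOf("b");
    }

    public static int SizeOfArray(string[] items)
    {
        return items.Length; // arrays call it Length
    }

    public static int SizeOfList(List<string> items)
    {
        return items.Count; // collections call it Count
    }
}
```

Switching a parameter from string[] to List<string> (or back) means touching every one of these call sites, even though nothing about the intent has changed.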

In fact, I don't like explicit interface implementation (EII) much at all. In my opinion, there are only two valid reasons to use EII: to hide members which unconditionally throw NotSupportedException, or to implement a poor man's return type covariance or parameter type contravariance. I consider EII harmful in all other cases. [Update: I overstated things, there are other cases where EII is acceptable, see the comments for details.]
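As a sketch of what I mean by poor man's return type covariance, assume a hypothetical Point class that implements ICloneable. The interface method has to return object, so it is implemented explicitly and simply forwards to a public, strongly typed Clone.

```csharp
using System;

// Hypothetical type, used only to illustrate the pattern.
sealed class Point : ICloneable
{
    public readonly int X;
    public readonly int Y;

    public Point(int x, int y)
    {
        X = x;
        Y = y;
    }

    // Public member with the precise return type; callers that know
    // they hold a Point never need a cast.
    public Point Clone()
    {
        return new Point(X, Y);
    }

    // Explicit implementation keeps the weakly typed version off the
    // public surface and delegates to the typed one.
    object ICloneable.Clone()
    {
        return Clone();
    }
}
```

Code that holds a Point gets a Point back, while code that only knows about ICloneable still works through the interface.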

The second reason that I dislike arrays is that they show up too often in public API, which further exacerbates their impedance mismatch with other collections. (To be fair, this isn't an issue with arrays themselves so much as how they are used in the wild.)

I pointed out to colleagues this morning that the same arguments that we provide for avoiding List<T> in public API, mainly that it's designed for performance rather than extensibility, can be applied to T[].

As a follow-up to that conversation, we decided to change one of our APIs, which returned string[], to return Collection<string> instead. This saved us a truly unnecessary call to List<string>.ToArray() and allows us to further customize the API in the future, perhaps by making it lazy and more efficient in the exceedingly common case where it is used only for enumeration.

I was happy with this change until it broke one of our test cases which called String.Join on the resulting array. There's no reason why String.Join couldn't take IEnumerable<string> or at least IList<string> in place of string[], but it doesn't, and we were forced to write our own Join method to work around the problem.
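Our actual helper isn't shown here, but a minimal sketch of such a method might look like the following. StringUtilities is a hypothetical name, and the behavior (nulls appended as empty strings, separator only between elements) is my assumption about what a replacement should do rather than a copy of our code. Later framework versions did add a String.Join overload that accepts IEnumerable<string>, so this is only needed where that overload is unavailable.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical utility class; not part of the framework.
static class StringUtilities
{
    // Concatenates the values, placing the separator between
    // consecutive elements. Requires only enumeration, not an array.
    public static string Join(string separator, IEnumerable<string> values)
    {
        if (values == null) throw new ArgumentNullException("values");

        StringBuilder builder = new StringBuilder();
        bool first = true;
        foreach (string value in values)
        {
            if (!first)
            {
                builder.Append(separator);
            }
            builder.Append(value); // a null value appends nothing
            first = false;
        }
        return builder.ToString();
    }
}
```

Because the parameter is typed as IEnumerable<string>, the same method accepts string[], List<string> and Collection<string> without any conversion at the call site.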

One of the main arguments in favor of typing parameters as arrays is to enable the addition of the params keyword. I don't dispute the convenience of params, but there's a way to have your cake and eat it too:

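    // Convenience overload: lets callers write DoSomething("a", "b", "c").
    public void DoSomething(params string[] arguments) {
        // The cast selects the IEnumerable<string> overload instead of recursing.
        DoSomething((IEnumerable<string>)arguments);
    }

    // The real work happens here, against the most general collection type.
    public void DoSomething(IEnumerable<string> collection) {
        ...
    }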
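Here is a self-contained sketch of the same pattern, so you can see which overload each call site binds to. ParamsOverloads and CountAll are hypothetical names, and the counting body is just a stand-in for whatever the real method would do.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class illustrating the params + IEnumerable<T> overload pair.
static class ParamsOverloads
{
    // Accepts a comma-separated argument list and forwards it.
    public static int CountAll(params string[] arguments)
    {
        // Without the cast this call would bind to itself and never return.
        return CountAll((IEnumerable<string>)arguments);
    }

    // Accepts anything that can be enumerated as strings.
    public static int CountAll(IEnumerable<string> collection)
    {
        int count = 0;
        foreach (string item in collection)
        {
            count++;
        }
        return count;
    }
}
```

A call such as ParamsOverloads.CountAll("a", "b", "c") goes through the params overload, while passing a List<string> binds directly to the IEnumerable<string> overload, so callers get the convenient syntax without the API committing to arrays.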

In fact, it would be nice if C# supported params IEnumerable<T> and implemented it as above. Even better, we could save the unnecessary overloads by training our compilers to honor ParamArrayAttribute on any trailing parameter type to which there exists an implicit conversion from the strongest array type that can hold the trailing arguments.

This actually brings me to another topic which gets a lot of press these days: dynamic vs. static typing. I've recently rekindled a fondness that I developed in university for the Scheme programming language and I've also been familiarizing myself with Ruby, which I like a lot. The most important thing that I've learned from Ruby (which might be obvious to the Smalltalkers of the world, but was news to a Java-educated punk like me), is that static typing can actually get in the way of polymorphism and object-orientation. For example, imagine if String.Join didn't declare the type of its array argument, but instead just used the features that it needed. I could then pass it a collection which happens to quack just like the array that the developer had in mind and everything would just work...

That's not to say that there's no value in static typing. For one thing, it can help produce faster code. For another, it helps drive statement completion features like IntelliSense. And finally, it makes my day job writing static code analysis much easier. :)

Let me summarize my thoughts as follows:

  1. If we ever build a new framework, let's make sure that arrays and collections are syntax-compatible from day one.
  2. Don't abuse explicit interface implementation.
  3. Where possible, prefer abstractions like IEnumerable<T> and IList<T> over T[] for parameter types in public API.

I smell some new FxCop rules lurking. What do you think?

Comments

  • Anonymous
    June 27, 2006
    The comment has been removed
  • Anonymous
    June 27, 2006
    I agree with most of the post. In fact, many of the recommendations are more or less included in the Framework Design Guidelines. The only thing I would debate is that I think there are several other good reasons to use EII besides what you mentioned. For example, we often implement a member explicitly when there is no reason to call it other than through the interface: there is no reason to call IsReadOnly on an array or on List<string>.
  • Anonymous
    June 27, 2006
    Dave, I agree that GetObjectData is a special case since ISerializable.GetObjectData is only intended to be called during serialization and would only serve to clutter up the public surface area.

    Krzysztof, you're absolutely right about List<T>.IsReadOnly. I suppose that my example of hiding members which throw NotSupportedException is just a special case of the more general idea of hiding all members which become redundant or useless once you already know the concrete type.

    I've just re-read the Framework Design Guidelines section on EII: section 5.1.2 on page 111 of the book. There are 3 "Consider" clauses. I think the first two are right on the mark, but if it were up to me, I'd change the last one about renaming members from "Consider" to "Do not". I can't think of a single instance where it was helpful that concrete types and interfaces used different names to indicate the same operation.
  • Anonymous
    June 27, 2006
    Today, when the concept of disposing is relatively well known among .NET developers, it might seem fine to have FileStream.Dispose and not Close. But when we designed the first version of the Framework, we were afraid that there was just too much history behind the concept of “closing a file” and so we felt it was really important to have methods called Close on such types. Also, there is a nice symmetry in “Open” and “Close.” In the case of “Open” and “Dispose,” the symmetry is not as obvious. In the end, I think both approaches have pros and cons.
  • Anonymous
    June 27, 2006
    The comment has been removed