How do you decide what goes in an interface?
I've been having an interesting talk with Doug McClean about where methods belong. We're discussing whether the ICollection<A> interface should include the Transform method. Specifically, should the interface look like this:
public interface ICollection<A> {
...
ICollection<B> Transform<B>(IBijection<A,B> bijection);
...
}
Doug thinks (feel free to correct me, Doug, and I'll update this page) that Transform isn't really appropriate on the main collection interface because it places a burden on consumers of the ICollection interface to understand it. Instead, it should be pushed down into a specialized subinterface (like ITransformableCollection), with a fallback implementation on a helper class, i.e.:
public static class CollectionHelpers {
public static ICollection<B> Transform<A,B>(ICollection<A> collection, IBijection<A,B> bijection) {
ITransformableCollection<A> tc = collection as ITransformableCollection<A>;
if (tc != null) {
return tc.Transform<B>(bijection);
}
return new TransformedCollection<A,B>(collection, bijection);
}
}
(where TransformedCollection is a specialized class that uses the bijection to pass values to and from the underlying collection).
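To make that concrete, here is a minimal sketch of what such a wrapper might look like. Everything in it is an assumption for illustration: ISimpleCollection, ListCollection, and IntToText are hypothetical names, the Forward/Backward members on IBijection are guesses at its shape, and the real collection interface has far more members than the two shown.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical, pared-down stand-ins for the interfaces in the post.
public interface IBijection<A, B>
{
    B Forward(A value);   // A -> B
    A Backward(B value);  // B -> A, the inverse of Forward
}

public interface ISimpleCollection<A> : IEnumerable<A>
{
    void Add(A item);
    bool Contains(A item);
}

// A trivial concrete collection backed by List<A>, only here for the demo.
public sealed class ListCollection<A> : ISimpleCollection<A>
{
    private readonly List<A> items = new List<A>();
    public void Add(A item) { items.Add(item); }
    public bool Contains(A item) { return items.Contains(item); }
    public IEnumerator<A> GetEnumerator() { return items.GetEnumerator(); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

// A live view: every call is translated through the bijection and forwarded
// to the underlying collection, so nothing is copied.
public sealed class TransformedCollection<A, B> : ISimpleCollection<B>
{
    private readonly ISimpleCollection<A> inner;
    private readonly IBijection<A, B> bijection;

    public TransformedCollection(ISimpleCollection<A> inner, IBijection<A, B> bijection)
    {
        this.inner = inner;
        this.bijection = bijection;
    }

    public void Add(B item) { inner.Add(bijection.Backward(item)); }
    public bool Contains(B item) { return inner.Contains(bijection.Backward(item)); }

    public IEnumerator<B> GetEnumerator()
    {
        foreach (A item in inner) yield return bijection.Forward(item);
    }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

// Example mapping; bijective only between ints and their canonical decimal spellings.
public sealed class IntToText : IBijection<int, string>
{
    public string Forward(int value) { return value.ToString(CultureInfo.InvariantCulture); }
    public int Backward(string value) { return int.Parse(value, CultureInfo.InvariantCulture); }
}
```

With this in place, adding "3" through a TransformedCollection<int, string> view stores the integer 3 in the backing collection, which is why the mapping has to be invertible and not just a one-way function.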
I was thinking about this and trying to determine how you decide what should go on an interface. One could make the argument that you should keep the interface as simple as possible and only provide the bare minimum of methods that give you full functionality. However, if I were to take that argument to its logical conclusion, then the entire ICollection interface would look like:
public interface ICollection<A> {
B Fold<B>(Function<B,Function<A,B>> function, B accumulator);
}
Using that, one could get every other bit of functionality that is in the list interface. You could implement 'ForAll', 'Exists', 'Find', 'Count', 'Iterate', 'Transform', 'FindAll', 'Filter', and everything else in the current interface on top of Fold. However, this wouldn't be a very convenient interface to use. All you'd ever see would be folds, and you'd always be asking yourself "what is this fold doing?" Chances are that a lot of the time you'd be doing something like 'counting' or 'finding an element'. And so one could argue that "certain actions that people commonly do on a collection should be pushed into the interface". So what happens when you run into a method like 'Transform'? I've already run into cases where I've needed it. However, is it common enough that other people will run into it? Or is it something that will be used all of 0.01% of the time? Is that enough to keep it in the interface? Should that even be part of the criteria?
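As a rough illustration of the "everything on top of Fold" idea, here is a sketch that derives a few of those operations from a single Fold method. It makes several assumptions: IFoldable and ListFoldable are hypothetical names, the fold is uncurried and uses System.Func instead of the curried Function type above, and the derived operations are written as extension methods purely for convenience.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical one-method collection interface.
public interface IFoldable<A>
{
    // Combines the accumulator with each element in turn, starting from seed.
    B Fold<B>(Func<B, A, B> step, B seed);
}

// Simple list-backed implementation, only here so the sketch can be exercised.
public sealed class ListFoldable<A> : IFoldable<A>
{
    private readonly List<A> items;

    public ListFoldable(IEnumerable<A> source) { items = new List<A>(source); }

    public B Fold<B>(Func<B, A, B> step, B seed)
    {
        B accumulator = seed;
        foreach (A item in items) accumulator = step(accumulator, item);
        return accumulator;
    }
}

// Everything below is expressed purely in terms of Fold.
public static class FoldExtensions
{
    public static int Count<A>(this IFoldable<A> c)
    {
        return c.Fold((n, item) => n + 1, 0);
    }

    public static bool Exists<A>(this IFoldable<A> c, Func<A, bool> predicate)
    {
        return c.Fold((found, item) => found || predicate(item), false);
    }

    public static bool ForAll<A>(this IFoldable<A> c, Func<A, bool> predicate)
    {
        return c.Fold((ok, item) => ok && predicate(item), true);
    }

    public static List<A> FindAll<A>(this IFoldable<A> c, Func<A, bool> predicate)
    {
        return c.Fold((matches, item) =>
        {
            if (predicate(item)) matches.Add(item);
            return matches;
        }, new List<A>());
    }
}
```

One thing the sketch makes visible: a fold-based Exists still visits every element even after it has found a match, so these derived versions are not always a free substitute for dedicated methods.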
I'd love thoughts on this problem, addressing this issue if possible, or other cases where you've run into this.
Comments
- Anonymous
June 20, 2004
The comment has been removed
- Anonymous
June 20, 2004
Doug: Actually, my collections don't have a Count property. Otherwise they wouldn't be able to represent infinitely large collections. I've left out count because of that.
- Anonymous
June 20, 2004
Yeah, I've had issues with infinitely large collections too. Maybe ICollection just doesn't exist, and we should mix and match IEnumerable, IFiniteCollection, ITransformableCollection, ...
One abstraction at a time.
Also, this isn't a big deal, but my name is McClean not McLean (in the article).
- Anonymous
June 20, 2004
Something else to think about here (which could go on the language feature request thread too) is that at this level of granularity, which I think is conceptually useful, it is difficult to properly type your parameters if you require some features of more than one interface. This can sometimes be worked around with generic methods and constraints, but I'm wondering if there isn't a more general solution.
In general, a set of interface types A = {I1, I2, ..., IN} defines another type TypeA that only allows values whose types implement every interface in A. Could we use types like that as parameter types?
- Anonymous
June 20, 2004
Doug: Could you explain the last bit about the set of interface types?
- Anonymous
June 20, 2004
Sure Cyrus. You were actually talking about it before in a really old post, about how having lots of small interfaces is unwieldy if you need access to features from some set of them in order to write a method.
Postulate these types:
interface IEnumerable<T> // as defined in mscorlib now
interface IFiniteCollection<T> : IEnumerable<T> {
int Count {get;}
}
interface IOrderedCollection<T> : IEnumerable<T> {
IOptional<T> MaximumValue {get;}
IOptional<T> MinimumValue {get;}
IEnumerator<T> GetOrderedEnumerator();
}
Suppose you want to write a method that will return the third quartile of a collection of some numeric type that can be subtracted and divided. Since we haven't worked out the operators yet, let's just suppose the type is double, so that we have this:
SomeFiniteOrderedCollectionType<double> theList; // initialized somehow
// note that SomeFiniteOrderedCollectionType<T> : IFiniteCollection<T>, IOrderedCollection<T>, ...
Our problem is that we need access to the count, from IFiniteCollection<T>, and the maximum and minimum values, from IOrderedCollection<T>, but we don't want to require that the concrete type of the parameter be anything specific. Oops. Now we are up a creek without a paddle. We could type the parameter as one or the other, and throw an exception if it wasn't both. Or we could type it as IFiniteCollection<T> and fall back to searching for the Min and Max if it happened not to be IOrderedCollection<T>, but suppose that we don't want either of those solutions. (I am trying to make a simple example of a case where we need access to two distinct interfaces.)
Suppose we had a syntax like this:
double ComputeThirdQuartile({IFiniteCollection<double>,IOrderedCollection<double>} list) {
// in here, the compiler knows that list implements both IFiniteCollection<double> and IOrderedCollection<double>, and requires callers to pass something that is both those things
return (list.MaximumValue - list.MinimumValue) / list.Count;
// TODO: handle the case of an empty list, and hence unwrapping the IOptional<double> instances. Consider this pseudocode to get my point across.
}
This obviously could be refined, or we could change the syntax, or whatever, but my point is that we should think about this because it removes the number one drawback to using very granular interfaces, namely that they can't be freely mixed and matched.
- Anonymous
June 20, 2004
Doug: I see exactly what you mean now. Another alternative is this:
double ComputeThirdQuartile<T>(T list) where T : IFiniteCollection<double>, IOrderedCollection<double> {
}
But it's certainly verbose.
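Filled in, the constrained version might look something like the sketch below. It rests on a few assumptions: SortedDoubles is a hypothetical implementing type, MaximumValue and MinimumValue return plain T and not IOptional<T>, an empty collection simply throws, and the arithmetic is the placeholder from the pseudocode above and not a real quartile computation.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Simplified versions of the granular interfaces from the comment above.
public interface IFiniteCollection<T> : IEnumerable<T>
{
    int Count { get; }
}

public interface IOrderedCollection<T> : IEnumerable<T>
{
    T MaximumValue { get; }
    T MinimumValue { get; }
}

// Hypothetical type that happens to implement both interfaces.
public sealed class SortedDoubles : IFiniteCollection<double>, IOrderedCollection<double>
{
    private readonly List<double> sorted;

    public SortedDoubles(IEnumerable<double> values)
    {
        sorted = new List<double>(values);
        sorted.Sort();
    }

    public int Count { get { return sorted.Count; } }
    public double MinimumValue { get { return sorted[0]; } }
    public double MaximumValue { get { return sorted[sorted.Count - 1]; } }

    public IEnumerator<double> GetEnumerator() { return sorted.GetEnumerator(); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public static class Stats
{
    // The two constraints let the body use members of both interfaces
    // without naming any concrete collection type.
    public static double ComputeThirdQuartile<T>(T list)
        where T : IFiniteCollection<double>, IOrderedCollection<double>
    {
        if (list.Count == 0)
            throw new ArgumentException("The collection must not be empty.", "list");

        // Placeholder arithmetic carried over from the pseudocode.
        return (list.MaximumValue - list.MinimumValue) / list.Count;
    }
}
```

A caller just writes Stats.ComputeThirdQuartile(new SortedDoubles(...)) and T is inferred, so the verbosity is confined to the declaration.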
I think what I'd prefer is if C# moved to having a system where we combined type inference and structural subtyping so that you could just do:
double ComputeThirdQuartile(list) {
uint count = list.Count;
IOptional<double> max = list.Max;
}
and then we'd figure out all the types.
- Anonymous
June 20, 2004
Yeah, that's what I had in mind when I wrote "This can sometimes be worked around with generic methods and constraints, but I'm wondering if there isn't a more general solution." Maybe it can always be worked around, but I have a feeling I had a counter-example once. I might find it in my notes somewhere. Declaring a return type like this would be one complication.
I like the type inference, to a point. But there are serious versioning issues, I think, as well as issues with abstract and virtual declarations. It could be a good approach for some languages, but it doesn't seem in keeping with the C# style in some ways.
This area seems worth some more thought. I'm not sure any of our three approaches so far is the way to go entirely.
- Anonymous
June 21, 2004
The comment has been removed
- Anonymous
June 28, 2004
Another thing enabled by types like {I1, I2} is a language syntax like this:
SomeType x; // init somehow
x.DoStuff();
x.DoOtherStuff();
when(x is I1) {
// variable x has the type {SomeType, I1} in this scope
x.DoStuffDefinedByI1();
}
This would be much easier to use and more readable than the ubiquitous:
I1 xAsI1 = x as I1; // it's usually difficult to name this new variable
if(xAsI1 != null) {
xAsI1.DoStuffDefinedByI1();
}
Code of this type is also a common source of mistakes or inefficiencies in beginners' C# programs, and syntax support for this scenario would go a long way toward changing that.
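For comparison, C# 7.0 and later include a type pattern that covers much of this scenario, although it introduces a new, narrower variable instead of refining the type of x itself. A small sketch, where SomeType, SomeTypeWithI1, and the member bodies are hypothetical stand-ins for the names used above:

```csharp
using System;

public interface I1
{
    string DoStuffDefinedByI1();
}

public class SomeType
{
    public string DoStuff() { return "stuff"; }
}

// Hypothetical subtype that additionally implements I1.
public class SomeTypeWithI1 : SomeType, I1
{
    public string DoStuffDefinedByI1() { return "I1 stuff"; }
}

public static class PatternDemo
{
    public static string Describe(SomeType x)
    {
        // The test and the typed variable are declared in one expression;
        // asI1 is only definitely assigned where the test succeeded.
        if (x is I1 asI1)
        {
            return asI1.DoStuffDefinedByI1();
        }
        return x.DoStuff();
    }
}
```

This removes the separate as-cast and null check, though the extra variable name is still needed.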