Set Collection [Ryan Byington]
So the BCL thinking of implementing a set collection for a future release and we would like to get you our blog reading public opinion on what scenarios are important to you for a Set collection. The collection will likely have the standard ICollection<T> APIs like Add, Remove, Clear, and Count and we would likely support the following scenarios:
Test for membership (ie does element A exist in set T)
Test for subset (ie is set T a subset of set U)
Union
Intersection
Complement
But what other functions should a Set collection be able to perform?
Comments
- Anonymous
May 03, 2006
I'd like a static property on it for null set, much like String.Empty - Set.NullSet or Set.Empty or some such notation.
If there will be a way to check if Set A is a subset of Set B, there should also be a way to check if Set A is a superset of Set B - admittedly subset technically covers all the ways of checking for a superset (just reverse the operands), but having both methods available would be valuable.
In the subset-checking operation, there should be a way to check if a subset is a proper subset, or an improper subset. - Anonymous
May 03, 2006
This would be a very useful addition. Will there be an analog to STD C++'s multiset?
It would also be useful to have an analog of multimap unless there is some way to get that behavior that we have missed? - Anonymous
May 03, 2006
Hmm. I thought the whole point of the Power Collections project was to test out different types of collections. These collections could then be implemented in the CLR if they proved "successful".
If so, what's wrong with incorporating the Power Collections Set? - Anonymous
May 03, 2006
While building in the common base functionality is crucial, it will of course will impossible to cover everyone’s’ specific needs. Thus, I hope for an extendable design. There are a variety of ways to implement a set data structure, such as basic arrays, hashes, linked lists, trees, and others. Then there are ordered sets as well. And that’s just the start. I would like to see an ISet<T> interface to allow for customization. - Anonymous
May 03, 2006
Test for Overlap... Overlaps or IsDisjoint - Anonymous
May 03, 2006
I would suggest that you look at the Wintellect.PowerCollections.Set<T> class and provide that functionality along with an ISet<T> interface. I have not yet found any methods that I need that Wintellect.PowerCollections.Set<T> does not provide.
But it would be really nice to have the class as part of the BCL. - Anonymous
May 03, 2006
+1 for just slurping in the PowerCollections. - Anonymous
May 03, 2006
One question is whether the set operations operate in-place, or always return a resultant set.
In our own experience, we find that the in-place set operations are usually the most efficient, while still remaining intuitive. We use static class methods to return the results of operations on two sets. The only problem is that the static methods couldn't be defined in ISet<T>, but could be defined in Set<T> : ISet<T>.
Also, we find it very useful to make the arguments to all set operations to be IEnumerable<T> instead of ISet<T>. That way you can compose sets from sets, lists, collections, LINQ results, etc. When performing the union of an ISet<T> with an IEnumerable<T>, it is not an exception for the IEnumerable<T> to contain multiple instances of the same T, since any instance of T is only ever added to the set once.
eg: class ISet<T> : ICollection<T>
{
void Union(IEnumerable<T> collection);
void Intersection(IEnumerable<T> collection);
void Complement(IEnumerable<T> collection);
}
class Set<T> : ISet<T>, ICollection
{
public static ISet<T> Union<IEnumerable<T> a, IEnumerable<T> b> { ... }
public static ISet<T> Intersection<IEnumerable<T> a, IEnumerable<T> b> { ... }
public static ISet<T> Complement<IEnumerable<T> a, IEnumerable<T> b> { ... }
} - Anonymous
May 03, 2006
Oops! "class ISet<T>" should obviously have been "interface ISet<T>". Mea culpa!
Would also second the suggestion of having an interface IOrderedSet<T> : IList<T>. - Anonymous
May 03, 2006
Difference (A-B = all the elements in A which are not in B).
What about bags (i. e., duplicates allowed)? If there are bags, then conversions to/from set/bag (bag to set could fail if the bag has duplicates). Sequences (ordered bags) too.
See also http://www.jfsowa.com/logic/math.htm - Anonymous
May 03, 2006
If you are going to have an interface and allow multiple implementations of sets, then this would be a good way to do it:
interface ISet<T> : ICollection<T> {...}
class HashSet : ISet<T> {...}
class TreeSet : ISet<T> {...}
class ListSet : ISet<T> {...}
static class Set {...} //any static methods that apply to sets but can be implemnted efficiently in terms of ISet<T> should be put here (since interfaces cannot have static methods):
Otherwise, a simple class Set<T> : ICollection<T> would suffice. - Anonymous
May 03, 2006
Find, FindAll, & ConvertAll - ala List<T>
Love those new methods. I go back and forth on whether ForEach would make sense on a Set in actual practice (most people think ordered indexing when they think of for loops)... - Anonymous
May 04, 2006
Difference, I agree with David, — it is a MUST. Or you may call it subtraction. - Anonymous
May 04, 2006
I'll toss in another vote for a Subtraction function. - Anonymous
May 04, 2006
From past experiences the code for actions on multiple sets is ugly. It would be nice to have something like the following for each of the ISet methods on the Set class:
public static Set<T> Union<T>( params ISet<T>[] sets ) { ... } - Anonymous
May 08, 2006
Extending the set to different providers? Basically somehow to figure it it exits/add etc
Set a = new SetFactory("MY DB");
if(!("User" isAnElementOf a))
a.Add("User");
helps for DB/file related programing styles with providers. - Anonymous
May 12, 2006
Thanks for all of the great feedback. We will take all of your feedback into consideration in the design. This has been a big help.
To answer DJ's questions yes, we are considering multiset analogous to STL's class for a future release.