Udostępnij za pośrednictwem


Set Collection [Ryan Byington]

So the BCL thinking of implementing a set collection for a future release and we would like to get you our blog reading public opinion on what scenarios are important to you for a Set collection. The collection will likely have the standard ICollection<T> APIs like Add, Remove, Clear, and Count and we would likely support the following scenarios:

Test for membership (ie does element A exist in set T)

Test for subset (ie is set T a subset of set U)

Union

Intersection

Complement

But what other functions should a Set collection be able to perform?

Comments

  • Anonymous
    May 03, 2006
    I'd like a static property on it for null set, much like String.Empty - Set.NullSet or Set.Empty or some such notation.

    If there will be a way to check if Set A is a subset of Set B, there should also be a way to check if Set A is a superset of Set B - admittedly subset technically covers all the ways of checking for a superset (just reverse the operands), but having both methods available would be valuable.

    In the subset-checking operation, there should be a way to check if a subset is a proper subset, or an improper subset.
  • Anonymous
    May 03, 2006
    This would be a very useful addition.  Will there be an analog to STD C++'s multiset?

    It would also be useful to have an analog of multimap unless there is some way to get that behavior that we have missed?
  • Anonymous
    May 03, 2006
    Hmm.  I thought the whole point of the Power Collections project was to test out different types of collections.  These collections could then be implemented in the CLR if they proved "successful".

    If so, what's wrong with incorporating the Power Collections Set?
  • Anonymous
    May 03, 2006
    While building in the common base functionality is crucial, it will of course will impossible to cover everyone’s’ specific needs. Thus, I hope for an extendable design. There are a variety of ways to implement a set data structure, such as basic arrays, hashes, linked lists, trees, and others. Then there are ordered sets as well. And that’s just the start. I would like to see an ISet<T> interface to allow for customization.
  • Anonymous
    May 03, 2006
    Test for Overlap... Overlaps or IsDisjoint
  • Anonymous
    May 03, 2006
    I would suggest that you look at the Wintellect.PowerCollections.Set<T> class and provide that functionality along with an ISet<T> interface.  I have not yet found any methods that I need that Wintellect.PowerCollections.Set<T> does not provide.  

    But it would be really nice to have the class as part of the BCL.
  • Anonymous
    May 03, 2006
    +1 for just slurping in the PowerCollections.
  • Anonymous
    May 03, 2006
    One question is whether the set operations operate in-place, or always return a resultant set.  

    In our own experience, we find that the in-place set operations are usually the most efficient, while still remaining intuitive. We use static class methods to return the results of operations on two sets.  The only problem is that the static methods couldn't be defined in ISet<T>, but could be defined in Set<T> : ISet<T>.

    Also, we find it very useful to make the arguments to all set operations to be IEnumerable<T> instead of ISet<T>.  That way you can compose sets from sets, lists, collections, LINQ results, etc.  When performing the union of an ISet<T> with an IEnumerable<T>, it is not an exception for the IEnumerable<T> to contain multiple instances of the same T, since any instance of T is only ever added to the set once.

    eg:  class ISet<T> : ICollection<T>
        {

           void Union(IEnumerable<T> collection);
           void Intersection(IEnumerable<T> collection);
           void Complement(IEnumerable<T> collection);

        }

        class Set<T> : ISet<T>, ICollection
        {

    public static ISet<T> Union<IEnumerable<T> a, IEnumerable<T> b> { ... }

    public static ISet<T> Intersection<IEnumerable<T> a, IEnumerable<T> b> { ... }

    public static ISet<T> Complement<IEnumerable<T> a, IEnumerable<T> b> { ... }

        }

  • Anonymous
    May 03, 2006
    Oops!  "class ISet<T>" should obviously have been "interface ISet<T>".  Mea culpa!

    Would also second the suggestion of having an interface IOrderedSet<T> : IList<T>.
  • Anonymous
    May 03, 2006
    Difference (A-B = all the elements in A which are not in B).

    What about bags (i. e., duplicates allowed)?  If there are bags, then conversions to/from set/bag (bag to set could fail if the bag has duplicates).  Sequences (ordered bags) too.

    See also http://www.jfsowa.com/logic/math.htm
  • Anonymous
    May 03, 2006
    If you are going to have an interface and allow multiple implementations of sets, then this would be a good way to do it:

    interface ISet<T> : ICollection<T> {...}
    class HashSet : ISet<T> {...}
    class TreeSet : ISet<T> {...}
    class ListSet : ISet<T> {...}
    static class Set {...} //any static methods that apply to sets but can be implemnted efficiently in terms of ISet<T> should be put here (since interfaces cannot have static methods):

    Otherwise, a simple class Set<T> : ICollection<T> would suffice.
  • Anonymous
    May 03, 2006
    Find, FindAll, & ConvertAll - ala List<T>

    Love those new methods.  I go back and forth on whether ForEach would make sense on a Set in actual practice (most people think ordered indexing when they think of for loops)...
  • Anonymous
    May 04, 2006
    Difference, I agree with David, — it is a MUST. Or you may call it subtraction.
  • Anonymous
    May 04, 2006
    I'll toss in another vote for a Subtraction function.
  • Anonymous
    May 04, 2006
    From past experiences the code for actions on multiple sets is ugly. It would be nice to have something like the following for each of the ISet methods on the Set class:

    public static Set<T> Union<T>( params ISet<T>[] sets ) { ... }
  • Anonymous
    May 08, 2006
    Extending the set to different providers? Basically somehow to figure it it exits/add etc

    Set a = new SetFactory("MY DB");
    if(!("User" isAnElementOf a))
        a.Add("User");

    helps for DB/file related programing styles with providers.
  • Anonymous
    May 12, 2006
    Thanks for all of the great feedback. We will take all of your feedback into consideration in the design. This has been a big help.

    To answer DJ's questions yes, we are considering multiset analogous to STL's class for a future release.