.NET: Best Approach to Implementing Equality Comparison

Comparison and equality are among the things that are done in a strange way in .NET.


Difference between equality and comparison

The first thing to note is that there is a difference between equality comparison and less than/greater than comparison. On the whole, equality is used in searching and lookup, while less than/greater than comparison is used in sorting. Equality comparison is always possible: you check whether two things are equal. Less than/greater than comparison, however, is not necessarily meaningful in all situations.

There are also two general ways to compare two things (whether for equality or less than/greater than):

  • The objects themselves perform the comparison on each other
  • Another object, an external one, performs the comparison on them

In the first approach, we call the Equals() method on an object, passing it the other object we want to compare it with, and the former object performs the equality comparison itself and decides whether it is equal to the given object. While the Equals() method gives us a good place to override comparison, the limitation is that we can override Equals() only once in a class, while there might be several different equality contexts in which we want to compare.

For example, in a collection of Person objects, one time we might wish to compare equality based on LastName and another time based on BirthDate. Using the Equals() method regrettably gives us only one chance for comparison. It also changes the equality rule for every instance of the class, everywhere it is used. We usually don't want to replace the default equality algorithm of our objects, which for classes is the reference-based comparison performed intrinsically by the Object base class. And what if we don't have access to the source code of the class whose instances we are using in our application? Let's not dwell on such a frightening situation (in that case, inheritance is a last resort, though not the only option, as we will see soon). This leads us to another approach.

In the second approach we use another object as a judge that performs comparison (whether it be equality or less than/greater than) and returns the result. Because the judge object is external and can be any object, we will potentially have numerous options to use for such comparison. One time we might use a LastNameEqualityComparer object, another time use a BirthDateEqualityComparer and another time use whatever equality comparer we want. We have total freedom.
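
To make the "judge" idea concrete, here is a minimal sketch of two such external comparers. The Person class and its properties are hypothetical and exist only for illustration; LastNameEqualityComparer and BirthDateEqualityComparer are the names used above, but their bodies are just one plausible way to write them.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical Person type, used only for illustration.
// Note that it does not override Equals() at all.
public sealed class Person
{
    public string FirstName { get; set; } = "";
    public string LastName { get; set; } = "";
    public DateTime BirthDate { get; set; }
}

// Judge #1: two people are equal when their last names match (ordinal, case-sensitive).
public sealed class LastNameEqualityComparer : IEqualityComparer<Person>
{
    public bool Equals(Person x, Person y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return string.Equals(x.LastName, y.LastName, StringComparison.Ordinal);
    }

    public int GetHashCode(Person obj)
    {
        return obj?.LastName == null ? 0 : StringComparer.Ordinal.GetHashCode(obj.LastName);
    }
}

// Judge #2: two people are equal when they were born on the same calendar day.
public sealed class BirthDateEqualityComparer : IEqualityComparer<Person>
{
    public bool Equals(Person x, Person y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.BirthDate.Date == y.BirthDate.Date;
    }

    public int GetHashCode(Person obj)
    {
        return obj is null ? 0 : obj.BirthDate.Date.GetHashCode();
    }
}

public static class Program
{
    public static void Main()
    {
        var a = new Person { FirstName = "Ada", LastName = "Smith", BirthDate = new DateTime(1990, 1, 1) };
        var b = new Person { FirstName = "Bob", LastName = "Smith", BirthDate = new DateTime(1985, 6, 15) };

        Console.WriteLine(new LastNameEqualityComparer().Equals(a, b));  // True
        Console.WriteLine(new BirthDateEqualityComparer().Equals(a, b)); // False
        Console.WriteLine(a.Equals(b)); // False: Object's reference equality is untouched
    }
}
```

The same two Person instances are equal under one judge and different under the other, and the class itself never had to be modified.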

Now, we get to the point where I said there is strange or anomalous behavior in .NET collections regarding comparison.

Dictionary

Some collections, such as Dictionary<TKey, TValue>, provide a way to pass an equality comparer object to their constructors when we create an instance of them.

public Dictionary(IEqualityComparer<TKey> comparer)
  
example: (from https://msdn.microsoft.com/en-us/library/ms132072(v=vs.110).aspx)
using System;
using System.Collections.Generic;

public class Example
{
    public static void Main()
    {
        // Create a new Dictionary of strings, with string keys
        // and a case-insensitive comparer for the current culture.
        Dictionary<string, string> openWith =
            new Dictionary<string, string>(
                StringComparer.CurrentCultureIgnoreCase);

        // Add some elements to the dictionary.
        openWith.Add("txt", "notepad.exe");
        openWith.Add("bmp", "paint.exe");
        openWith.Add("DIB", "paint.exe");
        openWith.Add("rtf", "wordpad.exe");

        // Try to add a fifth element with a key that is the same
        // except for case; this would be allowed with the default
        // comparer.
        try
        {
            openWith.Add("BMP", "paint.exe");
        }
        catch (ArgumentException)
        {
            Console.WriteLine("\nBMP is already in the dictionary.");
        }

        // List the contents of the dictionary.
        Console.WriteLine();
        foreach (KeyValuePair<string, string> kvp in openWith)
        {
            Console.WriteLine("Key = {0}, Value = {1}", kvp.Key,
                kvp.Value);
        }
    }
}

But some collections do not provide a way to pass a custom comparer to their constructors. One example is Stack.

If we read the MSDN documentation of the Contains() method in the non-generic Stack and generic Stack<T> classes, we find the following statements, which reveal everything:

  • non-generic Stack.Contains(): this method determines equality by calling Object.Equals.
  • generic Stack<T>.Contains(): this method determines equality using the default equality comparer EqualityComparer<T>.Default for T, the type of values in the list.

If we use the non-generic Stack class, our only choice is overriding Equals() in the class of our object. But if we use the generic Stack<T>, the .NET team generously granted us one other small choice. We can implement the IEquatable<T> interface in our class, which has an Equals(T) method, and put the algorithm of our new equality comparison in that method. Why should we do that? Because that is what EqualityComparer<T>.Default looks for! See the MSDN documentation again:

https://msdn.microsoft.com/en-us/library/ms224763(v=vs.110).aspx

The Default property checks whether type T implements the System.IEquatable<T> interface and, if so, returns an EqualityComparer<T> that uses that implementation. Otherwise, it returns an EqualityComparer<T> that uses the overrides of Object.Equals and Object.GetHashCode provided by T.

Although this explanation is a little vague, simply put it says that the Default property returns a comparer object that checks whether type T implements the IEquatable<T> interface. If so, the comparer uses the Equals() method of that interface; otherwise it falls back to the intrinsic Equals() method that every object inherits from Object, the base type of all types.

Together, EqualityComparer<T>.Default and the IEquatable<T> interface help us avoid corrupting the innate Equals() methods of our classes.
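
As a rough sketch of this mechanism, the hypothetical Person class below implements IEquatable<Person> without overriding Object.Equals(), and Stack<T>.Contains() picks up that implementation through EqualityComparer<T>.Default. The class and its LastName property are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical Person type, used only for illustration.
// Object.Equals is deliberately left alone to mirror the point made above;
// Microsoft's own guidance generally recommends overriding Equals(object)
// and GetHashCode() as well whenever IEquatable<T> is implemented.
public sealed class Person : IEquatable<Person>
{
    public string LastName { get; set; } = "";

    // This is the method EqualityComparer<Person>.Default ends up calling.
    public bool Equals(Person other)
    {
        return other != null && LastName == other.LastName;
    }
}

public static class Program
{
    public static void Main()
    {
        var stack = new Stack<Person>();
        stack.Push(new Person { LastName = "Smith" });

        var probe = new Person { LastName = "Smith" };

        // Stack<T>.Contains uses EqualityComparer<Person>.Default,
        // which finds the IEquatable<Person> implementation.
        Console.WriteLine(stack.Contains(probe)); // True

        // The Equals(object) inherited from Object still compares references.
        Console.WriteLine(((object)probe).Equals(stack.Peek())); // False
    }
}
```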

However, as well-intentioned as the .NET team may have been in using EqualityComparer<T>.Default in the Contains() method of the generic Stack<T> class, their solution falls short of what is expected, because it leaves us stuck with the first problem again. We have one and only one chance to define an equality comparison algorithm in our class. Naturally, we can't implement IEquatable<T> multiple times for the same T in our class.

The small problem is that they did not add a constructor to the Stack<T> class that accepts an IEqualityComparer<T>, as they did in Dictionary<TKey, TValue>. This is a shame, because it is not a rare case. The same is true for some other collections such as Queue<T>, LinkedList<T> and List<T> (HashSet<T>, like Dictionary<TKey, TValue>, does offer a constructor that accepts an IEqualityComparer<T>). I don't know whether they did this intentionally or simply forgot.

So what? What should we do if we have multiple equality testing algorithms?

Fortunately there is still hope.

Extension Methods

Even if the .NET team working on generic collections forgot to add new constructors to the generic classes, they did a good job and solved the problem at the root by adding a number of extension methods for all IEnumerable<T> collections in the System.Linq namespace, freeing themselves forever. Look at the following extension method in the System.Linq namespace:



public static bool Contains<TSource>(
    this IEnumerable<TSource> source,
    TSource value,
    IEqualityComparer<TSource> comparer
)

Do you get the idea? They defined a general Contains() method for any IEnumerable<T> collection that allows us to give it a custom equality comparer object. Hooray! Problem solved. But wait. Why should we be happy? That comparer parameter might still rely on IEquatable<T> and presume that the objects have their own Equals() method! Oh my gosh! We are back at the same point and the problem still exists. We are stuck forever! Don't freak out. Be calm.

The IEqualityComparer<T> interface is defined this way:



public interface IEqualityComparer<in T>
{
    bool Equals(T x, T y);
    int GetHashCode(T obj);
}

It says that an equality comparer must have an Equals() method, and it is in this very method that the equality comparison algorithm goes. This method receives two objects and compares them using whatever algorithm the creator of the equality comparer class intended. The comparer must also provide a GetHashCode() method that returns the same hash code for any two objects it considers equal, which hash-based collections such as Dictionary<TKey, TValue> rely on.

The good thing about IEqualityComparer<T> and the Contains<T>() extension method is that your objects are not expected to implement IEquatable<T> at all. This is more good news, because we are neither forced to override Equals() in our class and corrupt it, nor do we have to implement IEquatable<T> in it. In fact, our classes remain clean and intact, and we don't even need to have their source code.
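
Putting the two pieces together, the following sketch calls the Contains() extension method on a Stack<T> with a custom comparer. The Person class and the LastNameIgnoreCaseComparer name are hypothetical, invented for this example; Person has no Equals() override and no IEquatable<T> implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Person type: no Equals() override, no IEquatable<Person>.
public sealed class Person
{
    public string LastName { get; set; } = "";
}

// Hypothetical comparer: last names are compared ignoring case.
public sealed class LastNameIgnoreCaseComparer : IEqualityComparer<Person>
{
    public bool Equals(Person x, Person y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return StringComparer.OrdinalIgnoreCase.Equals(x.LastName, y.LastName);
    }

    public int GetHashCode(Person obj)
    {
        return obj?.LastName == null
            ? 0
            : StringComparer.OrdinalIgnoreCase.GetHashCode(obj.LastName);
    }
}

public static class Program
{
    public static void Main()
    {
        var stack = new Stack<Person>();
        stack.Push(new Person { LastName = "Smith" });

        var probe = new Person { LastName = "SMITH" };

        // Instance method Stack<T>.Contains: default comparer, reference equality here.
        Console.WriteLine(stack.Contains(probe)); // False

        // Extension method Enumerable.Contains: the external comparer decides.
        Console.WriteLine(stack.Contains(probe, new LastNameIgnoreCaseComparer())); // True
    }
}
```

Because the two-argument call has no matching instance method on Stack<T>, the compiler binds it to the System.Linq extension method.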

So, this was the final cure for the malignant issue of equality comparison. The same story is true for less than/greater than comparison.
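
For the less than/greater than side, a comparable sketch uses IComparer<T> as the external judge. The Person class is again hypothetical, and Comparer<T>.Create() is used here merely as a shortcut for writing a full IComparer<Person> class by hand.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Person type; it does not implement IComparable<Person>.
public sealed class Person
{
    public string LastName { get; set; } = "";
    public DateTime BirthDate { get; set; }
}

public static class Program
{
    public static void Main()
    {
        // Two external "judges" for ordering, built from lambdas.
        IComparer<Person> byLastName =
            Comparer<Person>.Create((x, y) => string.CompareOrdinal(x.LastName, y.LastName));
        IComparer<Person> byBirthDate =
            Comparer<Person>.Create((x, y) => x.BirthDate.CompareTo(y.BirthDate));

        var people = new List<Person>
        {
            new Person { LastName = "Young", BirthDate = new DateTime(1970, 1, 1) },
            new Person { LastName = "Adams", BirthDate = new DateTime(1995, 5, 5) },
        };

        people.Sort(byLastName);
        Console.WriteLine(string.Join(", ", people.Select(p => p.LastName))); // Adams, Young

        people.Sort(byBirthDate);
        Console.WriteLine(string.Join(", ", people.Select(p => p.LastName))); // Young, Adams
    }
}
```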

Conclusion

In conclusion, never override the intrinsic Equals() method, inherited from Object, in your classes. Instead, use the extension methods that have a comparer parameter in their signature and accept a comparer object (like Contains<T>()).

Not all extension methods have an overload with a comparer parameter. You can write the extension method you need yourself and complete what is arguably unfinished work in .NET.
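
As an illustration of writing such a method yourself, here is a sketch of a comparer-aware index lookup. The FirstIndexOf name and the EnumerableComparerExtensions class are made up for this example and are not part of .NET.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical extension class; not part of the .NET libraries.
public static class EnumerableComparerExtensions
{
    // Returns the zero-based position of the first element that the given
    // comparer considers equal to value, or -1 if there is none.
    public static int FirstIndexOf<T>(
        this IEnumerable<T> source,
        T value,
        IEqualityComparer<T> comparer)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));

        // Fall back to the default comparer when none is supplied.
        comparer = comparer ?? EqualityComparer<T>.Default;

        int index = 0;
        foreach (T item in source)
        {
            if (comparer.Equals(item, value)) return index;
            index++;
        }
        return -1;
    }
}

public static class Program
{
    public static void Main()
    {
        var queue = new Queue<string>(new[] { "txt", "bmp", "rtf" });

        Console.WriteLine(queue.FirstIndexOf("BMP", StringComparer.OrdinalIgnoreCase)); // 1
        Console.WriteLine(queue.FirstIndexOf("BMP", StringComparer.Ordinal));           // -1
    }
}
```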

See Also

Working with .NET Equality Features