C#: String comparison guidelines and common usage
The recommendations for string comparison were updated for Whidbey (Visual Studio 2005), and there is an excellent MSDN article on the subject. One of the highlights is the introduction of a clear-cut enumeration that can be passed into most string comparison methods to indicate the kind of comparison you are trying to make.
[Serializable]
[ComVisible(true)]
public enum StringComparison
{
    CurrentCulture = 0,
    CurrentCultureIgnoreCase = 1,
    InvariantCulture = 2,
    InvariantCultureIgnoreCase = 3,
    Ordinal = 4,
    OrdinalIgnoreCase = 5,
}
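The recommendation also states that for culture-agnostic comparisons you should use the Ordinal and OrdinalIgnoreCase comparisons. These are fast and also safe. They compare the strings byte for byte, that is, by the numeric value of each char (OrdinalIgnoreCase first ignores case in a culture-independent way), and they are excellent options for matching strings for internal (non-UI) processing.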
string.Compare(str1, str2, StringComparison.Ordinal);
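To make the ordinal behaviour concrete, here is a minimal sketch. The class name OrdinalCompareDemo and the Sign helper are my own names for illustration and are not part of the framework; the helper only normalises the result of string.Compare to -1, 0 or 1.

```csharp
using System;

public static class OrdinalCompareDemo
{
    // Hypothetical helper: reduces the result of string.Compare to -1, 0 or 1.
    public static int Sign(string a, string b, StringComparison kind)
    {
        return Math.Sign(string.Compare(a, b, kind));
    }

    public static void Main()
    {
        // Ordinal: 'a' (0x61) has a larger char value than 'B' (0x42).
        Console.WriteLine(Sign("a", "B", StringComparison.Ordinal));               // 1

        // OrdinalIgnoreCase: case is ignored first, so this is 'A' (0x41) vs 'B' (0x42).
        Console.WriteLine(Sign("a", "B", StringComparison.OrdinalIgnoreCase));     // -1

        // Strings that differ only in case compare as equal.
        Console.WriteLine(Sign("abc", "ABC", StringComparison.OrdinalIgnoreCase)); // 0
    }
}
```

Note that the ordinal order is not dictionary order: every upper-case ASCII letter sorts before every lower-case one, which is fine for internal matching but not for sorting text shown to users.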
With the introduction of the guidelines, developers have become defensive and have started hunting down all code that compares strings and re-coding it to meet the guidelines. Let's take a look at the most common culture-agnostic string matching constructs used in code and see whether they are safe.
string.Equals(string1, string2)
string.Equals defaults to an Ordinal comparison, so using it is fine. If you need any other kind of comparison, use something along the lines of
string.Equals(string1, string2, StringComparison.OrdinalIgnoreCase);
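A small sketch of the two overloads side by side. The class EqualsDemo, the helper SameIgnoringCase and the sample file names are made up for illustration.

```csharp
using System;

public static class EqualsDemo
{
    // Hypothetical helper wrapping the explicit overload.
    public static bool SameIgnoringCase(string a, string b)
    {
        return string.Equals(a, b, StringComparison.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        string string1 = "config.xml";
        string string2 = "CONFIG.XML";

        // Two-argument overload: ordinal and case-sensitive.
        Console.WriteLine(string.Equals(string1, string2));    // False

        // Explicit overload: ordinal, but ignoring case.
        Console.WriteLine(SameIgnoringCase(string1, string2)); // True

        // The static method tolerates null arguments instead of throwing.
        Console.WriteLine(SameIgnoringCase(null, string2));    // False
    }
}
```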
string1 == string2
In accordance with the class library design guidelines, the == operator for string is overloaded and its implementation is the same as that of string.Equals. So this is equivalent to calling string.Equals(string1, string2, StringComparison.Ordinal), which is also fine.
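The following sketch shows that == on two string-typed operands compares the characters, not the references. The class name OperatorDemo and the helper ValueEqual are illustrative only; the second string is built at run time so that it is a different object from the literal.

```csharp
using System;

public static class OperatorDemo
{
    // Hypothetical helper: both operands are typed as string, so the overloaded == is used.
    public static bool ValueEqual(string a, string b)
    {
        return a == b;
    }

    public static void Main()
    {
        string literal = "Abhinaba";
        // Constructed at run time, so this is a separate object with the same characters.
        string built = new string(new[] { 'A', 'b', 'h', 'i', 'n', 'a', 'b', 'a' });

        Console.WriteLine(ValueEqual(literal, built));             // True
        Console.WriteLine(object.ReferenceEquals(literal, built)); // False
        Console.WriteLine(ValueEqual(literal, "abhinaba"));        // False: case matters
    }
}
```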
switch(string1)
For small sized switch blocks of the form
string myStr = "Abhinaba";
// ...
switch (myStr)
{
    case "Abhinaba":
        Console.WriteLine("switch match");
        break;
    default:
        Console.WriteLine("switch did not match");
        break;
}
the code is compiled into
if ((myStr != null) && (myStr == "Abhinaba"))
{
    Console.WriteLine("switch match");
}
else
{
    Console.WriteLine("switch did not match");
}
So this is also fine: the comparison is equivalent to string.Equals(strA, strB, StringComparison.Ordinal).
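A quick way to convince yourself of this is to feed the switch a string that differs only in case, and a null. This is a sketch; the class SmallSwitchDemo and the method Classify are names I made up, wrapping the same switch so that it returns the message instead of printing it.

```csharp
using System;

public static class SmallSwitchDemo
{
    // Hypothetical wrapper around the switch from the post.
    public static string Classify(string myStr)
    {
        switch (myStr)
        {
            case "Abhinaba":
                return "switch match";
            default:
                return "switch did not match";
        }
    }

    public static void Main()
    {
        Console.WriteLine(Classify("Abhinaba")); // switch match
        Console.WriteLine(Classify("abhinaba")); // switch did not match (case-sensitive)
        Console.WriteLine(Classify(null));       // switch did not match (null goes to default)
    }
}
```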
However, if the switch block is larger then things get complicated. A dictionary is created and lookup happens through Dictionary.TryGetValue with the string as the key. Lookup happens using code similar to
int num1 = this.comparer.GetHashCode(key) & 0x7fffffff;
for (int num2 = this.buckets[num1 % this.buckets.Length]; num2 >= 0; num2 = this.entries[num2].next)
{
    if ((this.entries[num2].hashCode == num1) && this.comparer.Equals(this.entries[num2].key, key))
    {
        return num2;
    }
}
A quick look and a bit of poking around with Reflector indicates that the result will ultimately be the same as string.Equals(strA, strB, StringComparison.Ordinal).
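The dictionary-based lookup can be imitated by hand. The sketch below is not the compiler's actual output; SwitchTableDemo, Table, Describe and the colour labels are all hypothetical. It assumes only that a Dictionary with string keys and no explicit comparer uses the default string equality, which is ordinal and case-sensitive.

```csharp
using System;
using System.Collections.Generic;

public static class SwitchTableDemo
{
    // Hypothetical stand-in for the table built for a large switch:
    // each case label maps to an index. No comparer is passed, so the default is used.
    private static readonly Dictionary<string, int> Table = new Dictionary<string, int>
    {
        { "red", 0 }, { "green", 1 }, { "blue", 2 }
    };

    public static string Describe(string key)
    {
        int index;
        // TryGetValue throws on a null key, hence the explicit null check.
        if (key != null && Table.TryGetValue(key, out index))
        {
            switch (index)
            {
                case 0: return "warm";
                case 1:
                case 2: return "cool";
            }
        }
        return "unknown"; // plays the role of the default branch
    }

    public static void Main()
    {
        Console.WriteLine(Describe("red")); // warm
        Console.WriteLine(Describe("Red")); // unknown: the lookup is case-sensitive
        Console.WriteLine(Describe(null));  // unknown
    }
}
```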
InvariantCulture
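As the guidelines suggest, in most cases you should replace InvariantCulture usage with either Ordinal or OrdinalIgnoreCase. InvariantCulture still performs a linguistic comparison, so it is neither as fast nor as predictable as a plain ordinal match.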
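One place where the two can disagree is text that looks identical but is encoded differently. The sketch below compares a precomposed "é" with "e" followed by a combining accent. The class name and the helper OrdinalEqual are illustrative. I have deliberately left out an expected result for the InvariantCulture line, because it depends on the globalization data available to the runtime; a linguistic comparison will typically treat the two as equal.

```csharp
using System;

public static class InvariantVsOrdinalDemo
{
    // Hypothetical helper: plain ordinal equality.
    public static bool OrdinalEqual(string a, string b)
    {
        return string.Equals(a, b, StringComparison.Ordinal);
    }

    public static void Main()
    {
        string precomposed = "\u00E9"; // 'é' as a single char
        string decomposed = "e\u0301"; // 'e' followed by a combining acute accent

        // Ordinal looks at the char values, and these differ.
        Console.WriteLine(OrdinalEqual(precomposed, decomposed)); // False

        // Linguistic comparison; the result depends on the runtime's globalization data.
        Console.WriteLine(string.Equals(precomposed, decomposed, StringComparison.InvariantCulture));
    }
}
```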
Be on the Safe Side
The above discussion was mainly to figure out what to make of common string comparison statements in existing code. Going forward, I think it's best to be defensive and clear in code: explicitly call the comparison methods and pass the correct StringComparison enumeration constant.
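As a sketch of what being explicit can look like beyond Equals and Compare: the StringComparison overloads of EndsWith and IndexOf, and StringComparer.OrdinalIgnoreCase for a dictionary keyed by string. The class ExplicitComparisonDemo, the helpers HasExtension and NewHeaderMap, and the sample values are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class ExplicitComparisonDemo
{
    // Hypothetical helper: extension check that states its comparison explicitly.
    public static bool HasExtension(string fileName, string extension)
    {
        return fileName.EndsWith(extension, StringComparison.OrdinalIgnoreCase);
    }

    // Hypothetical helper: a dictionary whose keys are matched ordinally, ignoring case.
    public static Dictionary<string, string> NewHeaderMap()
    {
        return new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        Console.WriteLine(HasExtension("Report.PDF", ".pdf")); // True

        // "Me" starts at index 4 of "ReadMe.txt".
        Console.WriteLine("ReadMe.txt".IndexOf("me", StringComparison.OrdinalIgnoreCase)); // 4

        Dictionary<string, string> headers = NewHeaderMap();
        headers["Content-Type"] = "text/plain";
        Console.WriteLine(headers.ContainsKey("CONTENT-TYPE")); // True
    }
}
```

Passing the comparer to the dictionary once keeps every later lookup consistent, instead of relying on each caller to normalise case.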
Comments
- Anonymous
April 12, 2007
Hello. Great article on StringComparison. Do you have any recommendations on how this can be implemented in C# 1.1?
- Anonymous
June 27, 2007
So what's the recommendation if we want to use switch with a case-insensitive comparison?