What’s faster: string.Equals or string.Compare?
I just realized I was so busy lately that I haven’t blogged for a while!
Here’s a quiz that left me clueless for some time (courtesy of our C# MVP Ahmed Ilyas):
using System;
using System.Diagnostics;

public class Examples
{
    public static void Main()
    {
        string stringToTest = "Hello";

        Stopwatch equalsTimer = new Stopwatch();
        equalsTimer.Start();
        stringToTest.Equals("hello", StringComparison.OrdinalIgnoreCase);
        equalsTimer.Stop();
        Console.WriteLine("Equals Timer: {0}", equalsTimer.Elapsed);

        Stopwatch compareTimer = new Stopwatch();
        compareTimer.Start();
        String.Compare(stringToTest, "hello", StringComparison.OrdinalIgnoreCase);
        compareTimer.Stop();
        Console.WriteLine("Compare Timer: {0}", compareTimer.Elapsed);
    }
}
On my machine, this prints out:
Equals Timer: 00:00:00.0009247
Compare Timer: 00:00:00.0000012
We looked at the source code of string.Equals and string.Compare and it was essentially the same (modulo very minor details which shouldn’t cause issues).
So what’s wrong? Why would the first call be 770 times slower than the second one? Jitting? No. Cache hit/miss? No.
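After a while, we figured it out [UPDATE: So I thought!]. The first method is an instance method, so the C# compiler emits a callvirt (it does this for instance calls even when the method itself is not virtual, because callvirt also null-checks the receiver):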
callvirt instance bool [mscorlib]System.String::Equals(string, valuetype [mscorlib]System.StringComparison)
While the second method is a static one, so the call instruction is used instead:
call int32 [mscorlib]System.String::Compare(string, string, valuetype [mscorlib]System.StringComparison)
In this case, the method body was insignificant compared to the cost of a callvirt vs. a direct call. If you measure this in a loop of 1,000,000 iterations, the results average out. They also average out when you compare long strings, where the method body’s execution time dwarfs the call cost.
UPDATE: As always, Kirill jumps to conclusions too fast. Ahmed pointed out that if you swap the order of the calls, then the results are way different again! So it’s not the callvirt cost. Still puzzled, maybe it IS the JITter compiling the BCL code for the two method bodies.
Interesting...
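A minimal sketch of a fairer measurement, assuming that one untimed warm-up call plus a loop of 1,000,000 iterations is enough to push one-time costs out of the timed region. The FairerTiming class, the Measure helper and the iteration count are made up for illustration and are not part of the original quiz. The delegate call adds the same overhead to both measurements, so only the relative numbers are meaningful.

```csharp
using System;
using System.Diagnostics;

public static class FairerTiming
{
    // Hypothetical helper: calls the action once untimed, then times `iterations` calls.
    public static TimeSpan Measure(Action action, int iterations)
    {
        action(); // warm-up: any one-time cost of the first call is paid before the timer starts

        Stopwatch timer = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        timer.Stop();
        return timer.Elapsed;
    }

    public static void Main()
    {
        string stringToTest = "Hello";
        const int iterations = 1000000; // assumed count, large enough to dwarf timer overhead
        int matches = 0;                // results are consumed so the calls are not dead code

        TimeSpan equalsTime = Measure(delegate
        {
            if (stringToTest.Equals("hello", StringComparison.OrdinalIgnoreCase)) matches++;
        }, iterations);

        TimeSpan compareTime = Measure(delegate
        {
            if (String.Compare(stringToTest, "hello", StringComparison.OrdinalIgnoreCase) == 0) matches++;
        }, iterations);

        Console.WriteLine("Equals Timer:  {0}", equalsTime);
        Console.WriteLine("Compare Timer: {0}", compareTime);
        Console.WriteLine("Matches: {0}", matches);
    }
}
```

Swapping the order of the two Measure calls is a quick sanity check: if the numbers follow the position rather than the method, the measurement is still picking up something other than the method body.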
Comments
Anonymous
September 22, 2010
Are the strings interned during the second method call? Just a pre-dinner guess...
Anonymous
September 22, 2010
No, probably not... It's most likely the methods being jitted...
Anonymous
September 22, 2010
That's not really a valid test at all. There's probably common code between both methods that is being JIT'd, and processor caching taking place in the first call, both of which drastically speed up the second call. If you don't want to add loops to smooth it out, you could just do Equals, Compare, Equals, Compare and only measure the last two.
Anonymous
September 22, 2010
My guess is the first call to String.Equals loads mscorlib.dll since that would be the first place mscorlib is actually needed. Try placing a call to String.Equals before starting the stopwatch.
Anonymous
September 22, 2010
Try doing something like 1000000 iterations for each call and then dividing the total time by the number of iterations. That way the JIT will not impact your benchmark.
Anonymous
September 22, 2010
> maybe it IS the JITter compiling the BCL code for the two method bodies.
Framework code is NGENed though, so it doesn't need to be JITted.
Anonymous
September 22, 2010
It's easy to show that it's to do with caching, paging, JITting or something like that: instead of timing two different things, time the same thing twice - you'll still get the same results. My guess is that it's bringing the code into the CPU's L1 cache which is "slow". Of course, if you put in a loop of 10 million iterations or so in order to time a sensible length of operation, this difference vanishes. Alternatively, you can call the method you're about to time before you start timing it as well... although in this case, it's all so fast that it's worth doing the same thing with the Stopwatch methods, to make sure the cost of bringing Stopwatch.Stop into the cache (or whatever's slowing it down) doesn't pollute the results.
Anonymous
September 22, 2010
When running the tests 100000000 times in a loop, the timings on my machine come out as 27.24s for Equals and 25.80s for Compare, which is of course far from the performance difference that you get for the single instances. For such short methods, you cannot meaningfully micro-benchmark them by running them just once. Caches and all other kinds of stuff may interfere with your results. What is interesting though, is that Compare seems to come out as a few percent faster in all of my tests. Why that is the case, I can only guess from looking at the BCL code in Reflector. There, it seems that the code is identical with the exception of Equals using a length check to quickly return false for strings of different length. It then uses the same comparison routine as Compare and compares the result of that routine with 0, which probably adds another few cycles. My guess is that the length check, which is not helping in the present example, is what is responsible for the difference in run time. Having said all that, unless you have a really really tight loop that does nothing but billions of string comparisons, I would use what seems most natural in the program and not worry about the performance difference at all.
Anonymous
September 23, 2010
Indeed. I was just curious to know what "difference" there would be between String.Equals and String.Compare, found that both methods can take the same types of parameters, and wondered whether there would be a difference in perf (I know it's a very minor perf difference - but you wouldn't believe how poorly some developers/clients code!). But it does show that String.Compare is slightly faster than String.Equals when you pass in the parameters as shown in the example. This was one of those "right, let's investigate and learn" nights I had recently :-) So maybe better practice would be to use String.Compare in some scenarios, with the option of having culture sensitivity if required.
Anonymous
September 23, 2010
I think it simply has to do with the GC logic. OrdinalIgnoreCase needs to allocate a new string with uppercase letters to do the case-insensitive comparison. The GC needs to reallocate its Gen0 heap several times to fine-tune its size. After it has allocated enough small object heaps, the second test will run much faster since no reallocations are needed anymore. You can check this by changing it to CompareOrdinal in the comparison. Yours, Alois Kraus
Anonymous
September 23, 2010
They are quite the same if you test on more iterations:
static void Main()
{
    const string stringToTest = "Hello";
    const int cycle = 100000000;
    var equalsTimer = new Stopwatch();
    equalsTimer.Start();
    for (int i = 0; i < cycle; i++)
    {
        stringToTest.Equals("hello", StringComparison.OrdinalIgnoreCase);
        //String.Compare(stringToTest, "hello", StringComparison.OrdinalIgnoreCase);
    }
    equalsTimer.Stop();
    Console.WriteLine("Equals: {0}", equalsTimer.Elapsed);
    Console.ReadLine();
}
Anonymous
September 23, 2010
"The real original problem was finding that there are 2 methods in the .NET Framework that allow us to do the same thing..." They most certainly do not do the same thing. Compare determines the relative order of two strings based on culture and casing rules. Equals simply determines if two strings are equivalent. For example, you can definitively return false immediately from Equals if two strings differ in length. But for Compare you still need to compare the characters until their order is determined.
Anonymous
September 23, 2010
Yes, that's right, Josh. Sorry, my mind has been everywhere. I know these 2 methods work differently, but I meant that when doing an OrdinalIgnoreCase comparison with both methods on the same input, each returns the evaluated result: if the strings match, it's 0 for Compare and true for Equals. Maybe I'm looking way too much into this and confusing myself, which I shouldn't do!
Anonymous
September 24, 2010
You called each method only once and then drew a conclusion? There are so many factors that can affect the results.
Anonymous
September 27, 2010
Sounds like quantum computing, where the simple act of measuring it changes its result. :)
Anonymous
September 27, 2010
Interesting topic :) Maybe it's better to create two separate programs, one for each variant, and check the results. Btw, have you tried checking other framework methods - string.IsNullOrEmpty, str.Length == 0, comparing with "" or string.Empty (with a prior check that the string is not null)? Which is faster?
Anonymous
November 30, 2010
I have tested this exact scenario - string.Equals vs. string.Compare. The results are interesting to say the least - I'll let them speak for themselves below. There are several things in the test above that are, IMHO, undesirable methods to conduct performance tests and some questions about the test run itself:
- Don't use IL to try to diagnose any performance issues. The JIT does an incredible job of code optimization way beyond what you see in IL.
- Was the test run during low CPU activity? Context switching between threads can affect timings.
- Did you test both string being the same as well as each being different?
- What version of .NET did you run your test against?
- Was it built with optimizations turned on (in RELEASE mode)?
- Was it run from Visual Studio? Even running an app under Release mode and selecting "Start Without Debugging" (Ctrl+F5) can affect results. Sometimes it will run under *.vshost.exe, which will skew things.
All of my performance tests are done through Vance Morrison's (former CLR Architect) "MeasureIt" tool - blogs.msdn.com/.../measureit-update-tool-for-doing-microbenchmarks.aspx I've modified his tool to include many more performance measurements.
Here are the results I obtained - the details and code for each test are below (unfortunately, I can't post in HTML where the results could more easily be viewed as a table):
Each test was run 10 times with 1,000 iterations each. Stats collected were: Median, Mean, Standard Deviation. The MeasureIt tool was compiled against .NET 3.5 SP1 in Release mode and run from its .exe file during low CPU utilization on Win7 x64, Intel Core 2 Duo E8500 with 4GB RAM. All result values are in seconds. My tests below were run using strings of the same length; testing with strings of different lengths is another scenario to consider.
Test 1 - Using strings of same length (9 chars) and same values:
Method                                                             Median   Mean StdDev    Min    Max
a == b                                                               .141   .155   .045   .131   .288
string.Equals(a, b)                                                  .047   .048   .024   .031   .115
string.Equals(a, b, StringComparison.CurrentCulture)                 .183   .184   .019   .168   .236
string.Equals(a, b, StringComparison.CurrentCultureIgnoreCase)       .175   .179   .015   .168   .220
string.Equals(a, b, StringComparison.InvariantCulture)               .175   .181   .020   .168   .236
string.Equals(a, b, StringComparison.InvariantCultureIgnoreCase)     .199   .199   .014   .183   .236
string.Equals(a, b, StringComparison.Ordinal)                        .168   .179   .020   .168   .236
string.Equals(a, b, StringComparison.OrdinalIgnoreCase)              .175   .181   .020   .168   .236
string.Compare(a, b)                                                3.798  3.804   .020  3.791  3.859
a.Equals( b )                                                        .463   .470   .024   .455   .539
StringComparer - sc.Equals( a, b )                                   .168   .168   .019   .152   .220
Test 2 - Using strings of same length (9 chars) and different values:
Method                                                             Median   Mean StdDev    Min    Max
a == b                                                               .304   .301   .009   .288   .319
string.Equals(a, b)                                                  .288   .287   .012   .267   .304
string.Equals(a, b, StringComparison.CurrentCulture)                3.979  3.982   .006  3.979  3.995
string.Equals(a, b, StringComparison.CurrentCultureIgnoreCase)      3.806  3.869   .090  3.791  3.979
string.Equals(a, b, StringComparison.InvariantCulture)              3.510  3.514   .012  3.503  3.534
string.Equals(a, b, StringComparison.InvariantCultureIgnoreCase)    3.518  3.514   .007  3.503  3.518
string.Equals(a, b, StringComparison.Ordinal)                        .440   .441   .005   .440   .455
string.Equals(a, b, StringComparison.OrdinalIgnoreCase)              .880   .873   .008   .864   .880
string.Compare(a, b)                                                3.670  3.679   .010  3.670  3.691
a.Equals( b )                                                        .267   .271   .008   .267   .288
StringComparer - sc.Equals( a, b )                                  2.791  2.786   .007  2.775  2.791
Code used in Vance Morrison's MeasureIt app:
static public void MeasureStringComparisonSameValues()
{
    string a = "message 1", b = "message 1";
    stringComparisonHelper( a, b );
}

static public void MeasureStringComparisonWithDiffValues()
{
    string a = "message 1", b = "friendS X";
    stringComparisonHelper( a, b );
}

static private void stringComparisonHelper( string a, string b )
{
    timer1000.Measure( "string comparison (a == b)", 10, delegate
    {
        int x = 10, y = 5;
        if ( a == b ) { x = x * y; }
        x = x * y;
    } );
    timer1000.Measure( "string comparison - static string.Equals(a, b)", 10, delegate
    {
        int x = 10, y = 5;
        if ( string.Equals( a, b ) ) { x = x * y; }
        x = x * y;
    } );
    timer1000.Measure( "string comparison - static string.Equals(a, b, StringComparison.CurrentCulture)", 10, delegate
    {
        int x = 10, y = 5;
        if ( string.Equals( a, b, StringComparison.CurrentCulture ) ) { x = x * y; }
        x = x * y;
    } );
    timer1000.Measure( "string comparison - static string.Equals(a, b, StringComparison.CurrentCultureIgnoreCase)", 10, delegate
    {
        int x = 10, y = 5;
        if ( string.Equals( a, b, StringComparison.CurrentCultureIgnoreCase ) ) { x = x * y; }
        x = x * y;
    } );
    timer1000.Measure( "string comparison - static string.Equals(a, b, StringComparison.InvariantCulture)", 10, delegate
    {
        int x = 10, y = 5;
        if ( string.Equals( a, b, StringComparison.InvariantCulture ) ) { x = x * y; }
        x = x * y;
    } );
    timer1000.Measure( "string comparison - static string.Equals(a, b, StringComparison.InvariantCultureIgnoreCase)", 10, delegate
    {
        int x = 10, y = 5;
        if ( string.Equals( a, b, StringComparison.InvariantCultureIgnoreCase ) ) { x = x * y; }
        x = x * y;
    } );
    timer1000.Measure( "string comparison - static string.Equals(a, b, StringComparison.Ordinal)", 10, delegate
    {
        int x = 10, y = 5;
        if ( string.Equals( a, b, StringComparison.Ordinal ) ) { x = x * y; }
        x = x * y;
    } );
    timer1000.Measure( "string comparison - static string.Equals(a, b, StringComparison.OrdinalIgnoreCase)", 10, delegate
    {
        int x = 10, y = 5;
        if ( string.Equals( a, b, StringComparison.OrdinalIgnoreCase ) ) { x = x * y; }
        x = x * y;
    } );
    timer1000.Measure( "string comparison - static string.Compare(a, b)", 10, delegate
    {
        int x = 10, y = 5;
        if ( 0 == string.Compare( a, b ) ) { x = x * y; }
        x = x * y;
    } );
    timer1000.Measure( "string comparison (a.Equals( b )", 10, delegate
    {
        int x = 10, y = 5;
        if ( a.Equals( b ) ) { x = x * y; }
        x = x * y;
    } );
    StringComparer sc = StringComparer.Create( System.Threading.Thread.CurrentThread.CurrentCulture, true );
    timer1000.Measure( "string comparison - StringComparer (sc.Equals( a, b )", 10, delegate
    {
        int x = 10, y = 5;
        if ( sc.Equals( a, b ) ) { x = x * y; }
        x = x * y;
    } );
}
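For readers without MeasureIt, the statistics described above (median, mean and standard deviation over 10 runs of 1,000 iterations) can be approximated with a plain Stopwatch. This is only a sketch and not MeasureIt's code: the RunStats class and its helpers are hypothetical names, the sample (n - 1) form of the standard deviation is an assumption, and times are reported here in milliseconds per run.

```csharp
using System;
using System.Diagnostics;

public static class RunStats
{
    // Median of the samples; the input array is left unmodified.
    public static double Median(double[] samples)
    {
        double[] sorted = (double[])samples.Clone();
        Array.Sort(sorted);
        int mid = sorted.Length / 2;
        return sorted.Length % 2 == 1
            ? sorted[mid]
            : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    public static double Mean(double[] samples)
    {
        double sum = 0;
        foreach (double s in samples) sum += s;
        return sum / samples.Length;
    }

    // Sample standard deviation (divides by n - 1); this choice is an assumption.
    public static double StdDev(double[] samples)
    {
        double mean = Mean(samples);
        double sumOfSquares = 0;
        foreach (double s in samples) sumOfSquares += (s - mean) * (s - mean);
        return Math.Sqrt(sumOfSquares / (samples.Length - 1));
    }

    public static void Main()
    {
        string a = "message 1", b = "friendS X";
        const int runs = 10, iterations = 1000;
        double[] milliseconds = new double[runs];
        int matches = 0; // consumed below so the comparison is not dead code

        for (int run = 0; run < runs; run++)
        {
            Stopwatch timer = Stopwatch.StartNew();
            for (int i = 0; i < iterations; i++)
            {
                if (string.Equals(a, b, StringComparison.OrdinalIgnoreCase)) matches++;
            }
            timer.Stop();
            milliseconds[run] = timer.Elapsed.TotalMilliseconds;
        }

        Console.WriteLine("Median: {0:F4} ms  Mean: {1:F4} ms  StdDev: {2:F4} ms  (matches: {3})",
            Median(milliseconds), Mean(milliseconds), StdDev(milliseconds), matches);
    }
}
```

The median is the number to look at first: the first run usually absorbs the one-time costs and drags the mean upward.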