BinaryInsert Part2
Previously I discussed a potential missing API in List(Of T).BinaryInsert. One of the items I mentioned was it had better performance because it was O(Log N) vs Insert and Sort which is O(NLogN). Several users correctly pointed out this was incorrect and that Insert() had the additional overhead of an Array.Copy() which is O(N)ish. But most agreed O(N) + O(LogN) was better than O(NLogN).
Given that I already missed a key portion, I decided to write a test program to try out the various methods. Caveat: I'm not a performance guy. While I find performance intriguing and interesting it is by no means my specialty. Any single performance test is unlikely to capture all real world scenarios. However I did find the results a bit surprising. At the bottom of the post is the test code I wrote.
Here is the summary output.
Force Jit
BinaryInsert: 00:00:00.0051167
Insert Then Sort: 00:00:00.0000251
Range (0-99)
BinaryInsert: 00:00:00.0000266
Insert Then Sort: 00:00:00.0000316
Random 10
BinaryInsert: 00:00:00.0000053
Insert Then Sort: 00:00:00.0000034
Random 100
BinaryInsert: 00:00:00.0000294
Insert Then Sort: 00:00:00.0000235
Random 1000
BinaryInsert: 00:00:00.0004917
Insert Then Sort: 00:00:00.0001526
Random 10000
BinaryInsert: 00:00:00.0261899
Insert Then Sort: 00:00:00.0018287
Random 100000
BinaryInsert: 00:00:02.4289054
Insert Then Sort: 00:00:00.0237019
As you can see, based on my sample program, BinaryInsert is much slower than Insert and Sort. I ran the profiler against this and verified the suspicion that List(Of T).Insert() took the vast majority of the time.
Perhaps there is a reason BinaryInsert is missing.
Module Module1
Function BinaryInsert(Of T)(ByVal enumerable As IEnumerable(Of T), ByVal comp As IComparer(Of T)) As TimeSpan
Dim list As New List(Of T)
Dim watch As New Stopwatch()
watch.Start()
For Each value In enumerable
list.BinaryInsert(value, comp)
Next
watch.Stop()
Return watch.Elapsed
End Function
Function InsertAllThenSort(Of T)(ByVal enumerable As IEnumerable(Of T), ByVal comp As IComparer(Of T)) As TimeSpan
Dim list As New List(Of T)
Dim watch As New Stopwatch()
watch.Start()
For Each value In enumerable
list.Add(value)
Next
list.Sort(comp)
watch.Stop()
Return watch.Elapsed
End Function
Sub TestBoth(Of T)(ByVal title As String, ByVal enumerable As IEnumerable(Of T))
TestBoth(title, enumerable, Comparer(Of T).Default)
End Sub
Sub TestBoth(Of T)(ByVal title As String, ByVal enumerable As IEnumerable(Of T), ByVal comp As IComparer(Of T))
Dim copy = New List(Of T)(enumerable)
Dim ellapsedBinary = BinaryInsert(New List(Of T)(copy), comp)
Dim ellapsedSort = InsertAllThenSort(New List(Of T)(copy), comp)
Console.WriteLine(title)
Console.WriteLine("BinaryInsert: {0}", ellapsedBinary)
Console.WriteLine("Insert Then Sort: {0}", ellapsedSort)
End Sub
Function Range(ByVal start As Integer, ByVal count As Integer) As List(Of Integer)
Dim list = New List(Of Integer)
For i = start To count - 1
list.Add(i)
Next
Return list
End Function
Function Random(ByVal count As Integer) As List(Of Integer)
Dim rand As New Random()
Dim list = New List(Of Integer)
For i = 0 To count - 1
list.Add(rand.Next())
Next
Return list
End Function
Sub Main()
TestBoth("Force Jit", New Integer() {1})
TestBoth("Range (0-99)", Range(0, 100))
TestBoth("Random 10", Random(10))
TestBoth("Random 100", Random(100))
TestBoth("Random 1000", Random(1000))
TestBoth("Random 10000", Random(10000))
TestBoth("Random 100000", Random(100000))
End Sub
End Module
Comments
Anonymous
April 07, 2008
Well, if you want to compare the performance of 2 pieces of code you need to try to make sure that they are equivalent otherwise you may end up comparing apples and oranges. In this particular case there is a subtle difference between InsertAllThenSort and BinaryInsert. BinaryInsert has an additional property that InsertAllThenSort does not have: it keeps the list sorted during adding. If you move list.Sort() inside the For Each loop you may find that BinaryInsert is faster, or maybe not :). I did not tried it.Anonymous
April 07, 2008
The comment has been removedAnonymous
April 07, 2008
Mike, I agree with what you said about making sure you're actually comparing the same values. I should have been more clear here. In this case I was considering the code for the following scenario. There is a long lived search which returns chunks of data at a time in an un-ordered fashion. The display for this data is continually updated and needs to be in a sorted fashion. So I wanted to find the fastest way to insert and sort a set of values into a sorted list. That's why I designed this performance "test" in this way.Anonymous
April 07, 2008
Jeff, Thanks for the detailed analysis there. I love the idea of shimming in a Comparison class which will acurately track the total number of comparisons used.Anonymous
April 08, 2008
jaredpar, OK, so if I understand it correctly the list where you insert stuff lives longer than BinaryInsert/InsertAndSort methods and such an insert method is called with that list and a "chunk" of data. If my understanding is correct then the size of the chunk will make a difference. For example if the total number of values to insert is 10000 and the chunk size is 10 then BinaryInsert wins hands down (0.4ms vs 40ms). For a chunk size of 100 they're equal and for a chunk size of 1000 InsertAndSort wins (0.08ms vs 0.4ms). Thanks for posting this, it's fun to do this kind of stuff. Jeff, "does comparisons and memory swaps but no copies" Not be nitpicking, but what a memory swap is? Is there such a wonderful operation called memory swap that does not involve at least 2 copies? :) For the record: I "stolen" the quicksort implementation from Rotor and modified it to count the swaps. The result: about 30000 swaps for an array of 10000 Int32s. That means 60000 copies which means you can make at least 6 worst case BinaryInserts in the same time.Anonymous
April 08, 2008
Mike: You're right that a memory swap copies contents in a sense. What I was referring to was a "bulk copy" of everything in the array past the point of insertion. If you have the list [2, 3, 5, 7] and insert "4", you'd note that you've exceeded the capacity of 4 elements and then the list would grow to capacity * 2 = 8: (X denotes a reserved but unused slot) [2, 3, 5, 7, X, X, X, X] via an Array.Copy and then another bulk copy (Array.Copy) to give: [2, 3, 5, 5, 7, X, X, X] (Note the repeated 5) and then a "swap" / update to give [2, 3, 4, 5, 7, X, X, X] When I said "swap", I meant the array size isn't touched and no bulk copies are needed. I didn't realize that "Insert" is a bit inefficient in the design and thus does 2 bulk copies. So, the elements copied number is even worse than I reported. What is interesting is if you know you won't repeat elements, you can use a SortedDictionary<T, Boolean>. This under the covers will use a Red Black tree implementation. The real interesting bit is that the red black tree is still slightly slower than the insert then sort approach. This is probably because it's not taking advantage of the speedy L1 and L2 cache for contiguous chunks of memory that Array uses. In addition, the retrieval for an index then becomes O(log n) instead of O(1). It's all about trade offs :) The real question is how exactly will you be using it to determine what's needed.Anonymous
April 08, 2008
"I didn't realize that "Insert" is a bit inefficient in the design and thus does 2 bulk copies. So, the elements copied number is even worse than I reported." Not really. It only does 2 bulk copies when it needs to expand the array but since expanding is done by doubling the size one of copies can be taken out the picture. After all List.Add() does the same bulk copy when it needs to expand the array. In the end is one bulk copy from an Insert without expand compared with the number of swaps/copies generated by quicksort which as I detailed above seems to be quite large. And indeed it's all about trade offs. That's why I find this kind of stuff to be quite fun.