次の方法で共有


Missing API: List(Of T).BinaryInsert

One API that seems to be missing from List(Of T) is a BinaryInsert method.  Especially since there is already a BinarySearch method.

Binary insert is a method for inserting a value into an already sorted list.  Since the list is already sorted we can do a binary search to find the appropriate place to insert.  The insert keeps the list sorted so the cost of a binary insert is just the cost of the search which is O(Log(N)). 

An alternative method for keeping a sorted list sorted is to insert and then resort.  Most sorting algorithms have a cost of O(N*Log(N)).  In other words it's N times more expensive.

Yet this API doesn't exist.  No matter.  We can quickly fix this problem with a couple of extension methods. 

 Public Module Extensions

    <Extension()> _
    Public Sub BinaryInsert(Of T)(ByVal list As List(Of T), ByVal value As T, ByVal comp As IComparer(Of T))
        list.BinaryInsert(value, comp, 0, list.Count)
    End Sub

    <Extension()> _
    Public Sub BinaryInsert(Of T)(ByVal list As List(Of T), _
                                  ByVal value As T, _
                                  ByVal comparer As IComparer(Of T), _
                                  ByVal iStart As Integer, _
                                  ByVal iEnd As Integer)
        While iStart < iEnd
            Dim len = iEnd - iStart
            Dim iMiddle = iStart + (len \ 2)
            Dim comp = comparer.Compare(value, list(iMiddle))
            If 0 = comp Then
                iStart = iMiddle
                Exit While
            ElseIf comp < 0 Then
                iEnd = iMiddle
            Else
                iStart = iMiddle + (len Mod 2)
            End If
        End While
        list.Insert(iStart, value)
    End Sub

End Module

Comments

  • Anonymous
    March 31, 2008
    Interesting method! I have two questions though. The first is why not use the built in "BinarySearch" method that you mentioned since its return value is an indicator of where it should go if it couldn't find it. More precisely, "a negative number that is the bitwise complement of the index of the next element that is larger than item or, if there is no larger element, the bitwise complement of Count." This way, you leverage the framework's built in BinarySearch and get advantages of things like a default comparer. This gives you: <Extension()> _ Public Sub BinaryInsert(Of T)(ByVal list As List(Of T), ByVal value As T)    BinaryInsert(list, value, Nothing) End Sub <Extension()> _ Public Sub BinaryInsert(Of T)(ByVal list As List(Of T), ByVal value As T, ByVal comp As IComparer(Of T))   Dim insertLocation As Integer = list.BinarySearch(0, list.Count, value, comp)   If insertLocation < 0 Then        ' Bitwise negate it to get the actual spot        insertLocation = insertLocation Xor -1    End If    list.Insert(insertLocation, value) End Sub However, since List(Of T) is implemented internally as an array, an insert anywhere besides the end of the list will grow the internal array and then call Array.Copy for the remainder of the array (after the insert index). So, you're going to have O(log n) for the search and O(n) for the copy yielding a total of O(n), which is still better than the O(n log n) for insert/sort though.

  • Anonymous
    March 31, 2008
    Unluckily the .Insert-Statement of a List is O(n) because it has to move on average n/2 elements for the insert. So the BinaryInsert-Function is also only O(n) and not O(log(n)).

  • Anonymous
    March 31, 2008
    Jeff, For the first question yes, it would be more effecient to implement by using the Xor'd result of BinarySearch.  Originally I did some poking into that API but I misread the documentation and thought it returned a non-useful negative number.   Second, yes the Array.Copy does move it above O(LogN).  The rest of this post should be taken with the caveat that I am not a big perf guy :).  AFAIK, Array.Copy eventually maps to memcpy.  I don't believe once it hits the processor that it has O(N) complexity but they key word there is believe.  I did a little poking around and couldn't find a definitive answer one way or the other.  But the end result is yes it will have closer to O(N) rather than O(LogN).