Digging into interface calls in the .NET Framework: Stub-based dispatch.
In my last blog, I demonstrated how to use the .NET SOS.DLL extension DLL in the Visual Studio. In this entry I will use the capabilities of this extension DLL to dig into the way the .NET framework handles interface dispatch. As with all my blog entries, I hope that after reading this you will be inspired to try variations of the experiment I do here yourself.
Caveat: Interface dispatch has changed dramatically between version 1.1 and version 2.0 of the runtime. The example below is showing what happens for version 2.0. The techniques can be used to investigate how 1.1 interface dispatch works, however it is quite different.
Consider the follow program, that simply defines a do-nothing method ‘anIntefaceMethod’ and calls it 10 times in a loop. The question is: exactly what is the overhead of dispatching to an interface method?
interface AnInterface
{
void anInterfaceMethod();
}
class Test : AnInterface
{
public void anInterfaceMethod()
{
}
static public void Main()
{
AnInterface anInterface = new Test();
for(int i=0; i < 10; i++)
anInterface.anInterfaceMethod(); // breakpoint
}
}
Remember as always, you want to make certain that you have compiled this code with optimizations (The ‘Release’ build configuration, and that you have set the debugging options as explained in a previous blog.
By placing a breakpoint on the call to ‘anInterfaceMethod’ in Main, we can quickly switch the disassembly view (Debug->Windows->Disassembly) and see that the code for dispatching to this method is
mov ecx,esi
call dword ptr ds:[00920018h]
Thus the dispatch is an indirect call through the memory location 0x0920018. You can use the Debug->Windows->Memory->Memory1 window to look at what is at 0x0920018 (Right click in the memory window to set the window to display as 4 byte integers, and be sure to 0x in front, as otherwise it is interpreted as decimal). In my case I find that the memory 0x0920018 has the value 0x0926012. (Note that you are likely to get different addresses if you try this yourself.)
Note that if I hit ‘go’ several times to let the ‘for’ loop run a few times, something rather surprising happens. The first time through the loop, value at 0x0920018 does not change, but the second time it changes value (in my case to 0x0092709e). After the first two times, the value at 0x0920018 no longer changes. Thus there is something special going on on the first couple of calls.
Most of the time, you are interested in the ‘steady-state’ performance characteristics, so it is not as interesting to understand the code on first couple of calls (in my case 0x0926012) then it is to understand the code on all other cases (in my case the code at 0x092709e). This is just as well, because as we will see, that case is more complicated. Let's look at what is going on on after the first two calls.
Unfortunately, If you try to do the ‘obvious' thing and look at the code at 0x092709e in Visual Studio’s disassembly window you will get a dialog box saying the code can not be displayed. Visual Studio is trying to ‘protect’ you against seeing internal details of the runtime (I find this rather annoying actually, the code really is there). This is where SOS comes to the rescue.
Following the instructions in my last blog, I set the ‘Enable Unmanged Code Debugging’ box on the project, and restart the debugger (hitting ‘go’ three times). Then I can open the Debug->Windows->Immediate window, and type
.load sos
!u 0x092709e
nmanaged code
0092709E 813970309100 cmp dword ptr [ecx],913070h
009270A4 0F85F32F0000 jne 0092A011
009270AA E98D904B00 jmp 00DE00B0
It turns out that the 'jne' jump is almost never taken, so the interesting target is 0x0DE00B0. I do a !u to find out what code is at that address.
!u 0x0DE00B0
Normal JIT generated code
Test.anInterfaceMethod()
Begin 00de00b0, size 1
>>> 00DE00B0 C3 ret
Which is the telling me that 0x00DE00B0 the start of the method Test.anInterface(), and that this routine consists of a single return instruction.
The other intersting piece of data in the code at 0x092709e is the value used in the 'cmp' (compare) instruction, 0x913070. I happen to know that this is points at an internal data strcuture of the runtime called a 'MethodTable'. This structure represents a type of an object and the first field of every object in the GC heap points as this structure. SOS has a command called '!DumpMT' (Dump Method Table), that can display this. If
!DumpMT 913070
EEClass: 0091126c
Module: 00912c14
Name: Test
…
which is saying that 0x913070 is the internal type representation for the class ‘Test’.
Thus we have found the exact sequence of instructions that gets executed on most interface calls for my sample program. They are
mov ecx,esi // set up 'this'
call dword ptr ds:[00920018h] // call interface
STUB:
cmp dword ptr [ecx],913070h // check if 'this' is an instance of 'Test'
jne CheckFailedCode
jmp Test.anInterfaceMethod
The first two instructions are at the call site, and the last three are in a ‘stub’. Thus it takes 5 instructions to dispatch to a interface method in this case.
What is going on here?
OK, we have seen the code sequence used for interface dispatch for a particular example, now we need to understand it at an architectural level. The dispatch mechanism has the following attributes
- Every call site that dispatches through and interface has a pointer associated with it that points at code that actually performs the dispatch (in our example above this is 00920018). As execution proceeds, this pointer may change (that is the code that performs the dispatch might change).
- In the example above, this pointer quickly stabilizes to a 3 instruction sequence that checks to see if the 'this' object is of a particular type and if so dispatches to a particular address.
We can see that this technique works very well if at any particular dispatch site, you tend to call interfaces on object of the same type. This would happen for example if you are iterating over a collection which happens to have all the same type elements in it (a common case). In that case, the extra overhead is only 3 instructions (and these three instructions dispatch very well in modern processors).
Of course this is not always the case, in which case we care about what happens when the match does not happen. In this case, things get more complicated. Initially, it simply creates a new stub (with a new method table comparison), and makes the call site point at the new stub (hence the need for an indirect call). This works well if you tend to make many calls on the same object type for a while, then switch to another (for a while).... The common example here is library code on homogenous collections. The update logic is 'smart' however in that if it detects that you are bouncing back and forth between multiple objects types for a particular dispatch site, it replaces the stub with a more complicated one that does a hash-table style lookup (which will be slower, but does not need to be replaced on every change).
But I think we have gone far enough for this blog. We can go into details of polymorphic dispatch sites in another blog entry if there is interest.
Recap
Interface dispatch in the version 2.0 .NET Runtime is based on dispatch stubs.
- Each call site might have a different dispatch stub.
- Initially, these stubs point at a generic dispatcher, however quickly get updated to a 'monomorpic' stub that 'guesses' that the call site always dispatches to a particular destination. This works surprisingly well in most cases, and the overhead of dispatch for this case is very low.
- The stubs get updated if the stub's guess is wrong. First it simply tries a different monomorphic stub, however eventually it replaces it with a polymorphic stub that knows how to dispatch reasonably efficiently to one of several targets.
As always, I recommend you try these experiments yourself. For example, you can confirm that if a call site dispatches to two different targets, it will follow the heuristics that I describe above (Exercise for the reader).
Comments
Anonymous
March 13, 2006
Cool post :)Anonymous
March 20, 2006
Very interesting topic and a great post!
"We can go into details of polymorphic dispatch sites in another blog entry if there is interest."
There is! ;)
It would also be interesting to read about:
- How this dynamic code-stub dispatching mechanism differes from the .NET 1.x table-based dispatch (IIRC)
- Why you chose to select this design? Why change away from the 1.x mechanism?
- How interface dispatch performance differs from 1.x and differs from delegate invokation
/Hallvard VassbotnAnonymous
March 21, 2006
Well you'll recall that in the Performance Quiz #9 solution&nbsp;there was a surprising result where...Anonymous
September 23, 2006
PingBack from http://mkseo.pe.kr/blog/?p=1621Anonymous
January 23, 2007
Well you'll recall that in the Performance Quiz #9 solution there was a surprising result where Test7Anonymous
April 12, 2008
The comment has been removedAnonymous
April 14, 2008
For the most part, dispatch performace is independent of the complexity of the inheritance hierarchy. There are differences between interfaces an 'normal' virtual as mentioned in the article, but the fact the complexity of the inheritance does not really matter (or if they are generic). Calling interfaces defined on structs is an interesting case. Because structs don't support inheritance, you typically know its exact type, in which case it is not even a virtual dispatch (just a normal call). If you have cast the struct to an object, you have BOXed it (copied it to the GC heap as an object), in which case the dispatch is much like any other object.Anonymous
May 12, 2008
Článek JIT Optimizations na populárním serveru CodeProject nabízí celkem pěkný a podrobný pohled na to,Anonymous
June 09, 2009
PingBack from http://quickdietsite.info/story.php?id=4901Anonymous
June 23, 2009
The comment has been removedAnonymous
November 30, 2015
The comment has been removedAnonymous
November 30, 2015
.NET IL defines something called a MethodImpl Table which is a table that for a class, maps interface methods to method implementations. Thus In the MethodImpl table there would be entries that say for ImplTestClass ITestInterface.Test1 maps to ImplTestClass.Method1 and ITtestInterface.Test2 maps to ImpTestClass.Meth2.