Creating an IL-rewriting profiler

A frequent topic of discussion between those of us on the CLR Profiling API team at Microsoft and our customers is how to write a profiler that rewrites IL to do cool stuff. Unfortunately, there still is very little documentation on how to do this, and what documentation there is, is rather scattered. I'm not going to say anything new here. But I will try to bring together the scattered info into one place.

Q: Why do I care?

You may want to instrument managed code to insert custom calls into your profiler to measure timings or code coverage, or record execution flow. Maybe you want to perform custom actions, like taking note whenever thread synchronization is invoked (e.g., Monitor.Enter and Monitor.Exit). One way to do this is to take the original IL of the application you're profiling, and rewrite that IL to contain extra code or hooks into your profiler or managed code you ship alongside your profiler.

Q: I can ship managed code alongside my profiler? Are you saying I can write my profiler in managed code?

No, sorry to get your hopes up. Your profiler DLL must be unmanaged. Your ICorProfilerCallback implementations must be unmanaged (and should not call managed code). However, if you rewrite IL, it's perfectly fine for that IL to call into managed code that you've written and shipped alongside your profiler.

Q: In a nutshell, what's involved?

Well, first off, you're making a profiler. That means you create an unmanaged in-proc COM server DLL. If this much is already new to you, you should probably stop reading this, search MSDN for "ICorProfilerCallback", and grope through the table of contents for background info on how to write a profiler in general.

Keep in mind there are many ways to do this. I'll outline one of the more straightforward approaches here, and the adventurous should feel free to substitute their own ingredients:

  • In your ICorProfilerCallback2::ModuleLoadFinished callback, you call ICorProfilerInfo2::GetModuleMetadata to get a pointer to a metadata interface on that module.
  • QI for the metadata interface you want. Search MSDN for "IMetaDataImport", and grope through the table of contents to find topics on the metadata interfaces.
  • Once you're in metadata-land, you have access to all the types in the module, including their fields and function prototypes. You may need to parse metadata signatures and this signature parser may be of use to you.
  • In your ICorProfilerCallback2::JITCompilationStarted callback, you may use ICorProfilerInfo2::GetILFunctionBody to inspect the original IL, and ICorProfilerInfo2::GetILFunctionBodyAllocator and then ICorProfilerInfo2::SetILFunctionBody to replace that IL with your own.
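
To make the last step above a little more concrete, here is a minimal sketch of the byte-level work involved, not a full rewriter. It assumes the method body layout described in the ECMA-335 spec (a 1-byte tiny header or a 12-byte fat header, followed by the IL). The names ILMethodHeader, ParseILHeader, and BuildBodyWithPrologue are made up for illustration. The sketch deliberately refuses methods that have extra sections (exception-handling tables), because the offsets stored there would need adjusting too. In a real profiler the input bytes would come from GetILFunctionBody, and the output would be copied into memory obtained from the allocator returned by GetILFunctionBodyAllocator before being passed to SetILFunctionBody.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Flag bits of the ECMA-335 method header (CorILMethod_* in corhdr.h).
const uint16_t kTinyFormat = 0x2;
const uint16_t kFatFormat  = 0x3;
const uint16_t kMoreSects  = 0x8;
const uint16_t kInitLocals = 0x10;

// Hypothetical helper type: the decoded fields of a tiny or fat header.
struct ILMethodHeader {
    bool     isFat;
    uint16_t flags;           // fat-header flag bits (0 for a tiny header)
    uint32_t headerSize;      // bytes that precede the first IL opcode
    uint16_t maxStack;
    uint32_t codeSize;        // bytes of IL
    uint32_t localVarSigTok;  // 0 when the method has no locals
};

static uint32_t ReadU32(const uint8_t* p)
{
    return uint32_t(p[0]) | (uint32_t(p[1]) << 8) |
           (uint32_t(p[2]) << 16) | (uint32_t(p[3]) << 24);
}

static void AppendU16(std::vector<uint8_t>& out, uint16_t v)
{
    out.push_back(uint8_t(v & 0xFF));
    out.push_back(uint8_t(v >> 8));
}

static void AppendU32(std::vector<uint8_t>& out, uint32_t v)
{
    for (int shift = 0; shift < 32; shift += 8)
        out.push_back(uint8_t((v >> shift) & 0xFF));
}

// Decodes the header at the start of a method body. Returns false if the
// bytes do not look like a complete tiny or fat method body.
bool ParseILHeader(const uint8_t* body, size_t len, ILMethodHeader& h)
{
    assert(body != nullptr);
    if (len == 0) return false;
    if ((body[0] & 0x3) == kTinyFormat) {
        h.isFat = false;
        h.flags = 0;
        h.headerSize = 1;
        h.maxStack = 8;                      // implied by the tiny format
        h.codeSize = uint32_t(body[0] >> 2); // upper six bits
        h.localVarSigTok = 0;
        return len - 1 >= h.codeSize;
    }
    if ((body[0] & 0x3) != kFatFormat || len < 12) return false;
    const uint16_t flagsAndSize = uint16_t(body[0] | (body[1] << 8));
    h.isFat = true;
    h.flags = uint16_t(flagsAndSize & 0x0FFF);
    h.headerSize = uint32_t(flagsAndSize >> 12) * 4; // stored in 4-byte units
    h.maxStack = uint16_t(body[2] | (body[3] << 8));
    h.codeSize = ReadU32(body + 4);
    h.localVarSigTok = ReadU32(body + 8);
    return h.headerSize >= 12 && len >= h.headerSize &&
           len - h.headerSize >= h.codeSize;
}

// Builds a new body with a fat header whose IL is `prologue` followed by the
// original IL. Fails for methods with extra sections (not handled here).
bool BuildBodyWithPrologue(const uint8_t* body, size_t len,
                           const std::vector<uint8_t>& prologue,
                           uint16_t prologueMaxStack,
                           std::vector<uint8_t>& out)
{
    ILMethodHeader h;
    if (!ParseILHeader(body, len, h)) return false;
    if (h.flags & kMoreSects) return false;

    const uint16_t flags = uint16_t(kFatFormat | (h.flags & kInitLocals));
    const uint16_t maxStack =
        h.maxStack > prologueMaxStack ? h.maxStack : prologueMaxStack;

    out.clear();
    AppendU16(out, uint16_t(flags | (3u << 12)));  // header is 3 dwords long
    AppendU16(out, maxStack);
    AppendU32(out, h.codeSize + uint32_t(prologue.size()));
    AppendU32(out, h.localVarSigTok);
    out.insert(out.end(), prologue.begin(), prologue.end());
    out.insert(out.end(), body + h.headerSize, body + h.headerSize + h.codeSize);
    return true;
}
```

Because the prologue goes in front of all the original instructions, the relative branch offsets inside the original IL stay valid. The prologue is expected to leave the evaluation stack empty, and prologueMaxStack is whatever stack depth the prologue itself needs.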

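For the signature-parsing step in the list above, the first few bytes of a method signature blob are straightforward to decode by hand. The following is a sketch that assumes the blob layout from ECMA-335: a calling-convention byte, an optional generic parameter count, the parameter count, and then the return type, with counts stored as compressed unsigned integers. The names MethodSigInfo, ReadCompressedUInt, and ParseMethodSigPrefix are hypothetical. In a real profiler the blob would come from a metadata call such as IMetaDataImport::GetMethodProps.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Decodes one ECMA-335 compressed unsigned integer at p[pos].
// On success stores the value, advances pos, and returns true.
bool ReadCompressedUInt(const uint8_t* p, size_t len, size_t& pos, uint32_t& value)
{
    if (pos >= len) return false;
    const uint8_t b0 = p[pos];
    if ((b0 & 0x80) == 0x00) {                 // 1-byte form: 0xxxxxxx
        value = b0;
        pos += 1;
        return true;
    }
    if ((b0 & 0xC0) == 0x80) {                 // 2-byte form: 10xxxxxx + 1 byte
        if (len - pos < 2) return false;
        value = (uint32_t(b0 & 0x3F) << 8) | p[pos + 1];
        pos += 2;
        return true;
    }
    if ((b0 & 0xE0) == 0xC0) {                 // 4-byte form: 110xxxxx + 3 bytes
        if (len - pos < 4) return false;
        value = (uint32_t(b0 & 0x1F) << 24) | (uint32_t(p[pos + 1]) << 16) |
                (uint32_t(p[pos + 2]) << 8) | p[pos + 3];
        pos += 4;
        return true;
    }
    return false;                              // 111xxxxx is not a valid lead byte
}

// Hypothetical summary of the leading part of a method signature.
struct MethodSigInfo {
    bool     hasThis;            // instance method (HASTHIS bit, 0x20)
    bool     isGeneric;          // generic method (GENERIC bit, 0x10)
    uint32_t genericParamCount;  // 0 unless isGeneric
    uint32_t paramCount;
    size_t   retTypeOffset;      // where the return type starts in the blob
};

// Reads the calling convention and counts from a method signature blob.
// Field (0x6) and local-variable (0x7) signatures are rejected.
bool ParseMethodSigPrefix(const uint8_t* sig, size_t len, MethodSigInfo& info)
{
    assert(sig != nullptr);
    if (len == 0) return false;
    size_t pos = 0;
    const uint8_t callConv = sig[pos++];
    const uint8_t kind = uint8_t(callConv & 0x0F);
    if (kind == 0x6 || kind == 0x7) return false;

    info.hasThis = (callConv & 0x20) != 0;
    info.isGeneric = (callConv & 0x10) != 0;
    info.genericParamCount = 0;
    if (info.isGeneric &&
        !ReadCompressedUInt(sig, len, pos, info.genericParamCount))
        return false;
    if (!ReadCompressedUInt(sig, len, pos, info.paramCount)) return false;
    info.retTypeOffset = pos;
    return true;
}
```

Decoding the return type and parameter types that follow is where a full signature parser earns its keep, since types can nest arbitrarily (arrays, generics, pointers).
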
Q: What about NGEN?

If you want to rewrite IL of NGENd modules, well, it's kind of too late because the original IL has already been compiled into native code. However, you do have some options.  If your profiler sets the COR_PRF_USE_PROFILE_IMAGES monitor event flag, that will force the "NGEN /Profile" version of the modules to load if they're available.  (I've already blogged a little about "NGEN /Profile", including how to generate those modules, here.)  So, at run-time, one of two things will happen for any given module.

1) If you set COR_PRF_USE_PROFILE_IMAGES and the NGEN /Profile version is available, it will load.  You will then have the opportunity to respond to the JITCachedFunctionSearchStarted callback.  When a function from an NGEN /Profile module is about to be executed for the first time, your profiler receives the JITCachedFunctionSearchStarted callback.  You may then set the *pbUseCachedFunction [out] parameter to FALSE, and that will force the CLR to JIT the function instead of using the version that was already compiled into the NGEN /Profile module.  Then, when the CLR goes to JIT the function, your profiler receives the JITCompilationStarted callback and can perform IL rewriting just as it does above for functions that exist in non-NGENd modules.  What's nice about this approach is that, if you only need to instrument a few functions here and there, it can be faster than JITting everything just to get the JITCompilationStarted callback for the few functions you're interested in.  This approach can therefore improve startup performance of the application while it's being profiled.  The disadvantage, though, is that your profiler must ensure the NGEN /Profile versions of all the modules get generated beforehand and get installed onto the user's machine.  Depending on your scenarios and customers, this may be too cumbersome to ensure.

2) If you set COR_PRF_USE_PROFILE_IMAGES and the NGEN /Profile version is not available, the CLR will refuse to load the regular NGENd version of that module, and will instead JIT everything from the module.  Thus, it's ensured that you have the opportunity to intercept JITCompilationStarted, and can replace the IL as described above.
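
The decision made in case 1 can be sketched as follows. This is an illustration under assumptions, not a drop-in callback: FunctionID, BOOL, and HRESULT are local stand-ins for the real typedefs so the snippet compiles without the CLR headers, and NgenRewriteSelector and its shouldRewrite predicate are hypothetical. In a real profiler the predicate would typically resolve the FunctionID to a module and method token (for example via ICorProfilerInfo::GetFunctionInfo) and compare against the methods you care about. Note also that the profiler has to request cache-search events (COR_PRF_MONITOR_CACHE_SEARCHES) in its event mask to receive this callback at all.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>

// Stand-ins for the Windows/CLR typedefs, so the sketch compiles on its own.
typedef uintptr_t FunctionID;
typedef int       BOOL;
typedef long      HRESULT;
const HRESULT kSOk = 0;  // stand-in for S_OK

// Hypothetical helper: decides, per function, whether the precompiled
// NGEN /Profile code may be used or the function must be JIT-compiled.
class NgenRewriteSelector {
public:
    explicit NgenRewriteSelector(std::function<bool(FunctionID)> shouldRewrite)
        : m_shouldRewrite(std::move(shouldRewrite)) {}

    // Same shape as ICorProfilerCallback::JITCachedFunctionSearchStarted.
    HRESULT JITCachedFunctionSearchStarted(FunctionID functionId,
                                           BOOL* pbUseCachedFunction)
    {
        if (pbUseCachedFunction == nullptr) return kSOk;
        if (m_shouldRewrite(functionId)) {
            // FALSE: reject the cached native code so the CLR JITs the
            // function and JITCompilationStarted fires for it.
            *pbUseCachedFunction = 0;
        } else {
            // TRUE: keep the precompiled code and skip the JIT cost.
            *pbUseCachedFunction = 1;
        }
        return kSOk;
    }

private:
    std::function<bool(FunctionID)> m_shouldRewrite;
};
```

The callback can arrive on several threads at once, so whatever the predicate consults should be safe for concurrent reads.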

Q: Any examples?

Here is an MSDN article that talks about making an IL rewriting profiler:
https://msdn.microsoft.com/en-us/magazine/cc188743.aspx

Some blog entries I wrote about using IL rewriting to work around a bug:
https://blogs.msdn.com/davbr/archive/2006/02/27/540280.aspx
https://blogs.msdn.com/davbr/archive/2006/06/07/620925.aspx

Q: Any caveats?

Rewriting IL in mscorlib.dll functions can be dangerous, particularly in functions that are executed during startup initialization of the managed app or any of its AppDomains. The app may not be initialized enough to handle executing some of the managed code that might get called (directly or indirectly) from your rewritten IL.

If you're going to modify the IL to call into some of your own managed code, be careful about which functions you choose to modify. If you're not careful, you might accidentally modify the IL belonging to your own assembly and cause infinite recursion.

And then there's the worst of both worlds: when you need to rewrite IL to call into your own assemblies and you happen to be rewriting IL in mscorlib. Note that it's simply unsupported to force mscorlib.dll to reference any other assembly. The CLR loader treats mscorlib.dll pretty specially. The loader expects that, while everyone in the universe may reference mscorlib.dll, mscorlib.dll had better not reference any other assembly. If you absolutely must instrument mscorlib.dll by modifying IL, and you must have that IL reference some nifty new function of yours, you had better put that function into mscorlib.dll by dynamically modifying mscorlib.dll's metadata when it is loaded. In this case you no longer have the option of creating a separate assembly to house your custom code.
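
One way to keep these caveats straight is to decide a policy per module before touching any IL. The sketch below is a hypothetical illustration: RewritePolicy, PolicyForModule, and the helper assembly name are invented for the example, and it uses std::string paths for portability, whereas a real profiler would get a wide-character module name from something like ICorProfilerInfo::GetModuleInfo.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical policy values for a module about to be instrumented.
enum class RewritePolicy {
    Skip,                // never rewrite (our own helper assembly)
    CallHelperAssembly,  // rewritten IL may call into the helper assembly
    SelfContainedOnly    // rewritten IL must not reference other assemblies
};

// Returns the file-name part of a path, lower-cased for comparison.
static std::string LowerFileName(const std::string& path)
{
    const std::string::size_type slash = path.find_last_of("\\/");
    std::string name =
        (slash == std::string::npos) ? path : path.substr(slash + 1);
    std::transform(name.begin(), name.end(), name.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return name;
}

// Picks a policy from the module's file name.
RewritePolicy PolicyForModule(const std::string& modulePath,
                              const std::string& helperAssemblyFile)
{
    const std::string name = LowerFileName(modulePath);
    if (name == LowerFileName(helperAssemblyFile)) {
        // Instrumenting the code our hooks call into would recurse forever.
        return RewritePolicy::Skip;
    }
    if (name == "mscorlib.dll") {
        // mscorlib.dll must not be made to reference another assembly.
        return RewritePolicy::SelfContainedOnly;
    }
    return RewritePolicy::CallHelperAssembly;
}
```

For example, PolicyForModule("C:\\App\\App.exe", "MyProfilerHelpers.dll") yields CallHelperAssembly, while the same call for a path ending in mscorlib.dll yields SelfContainedOnly. Matching on file name alone is a simplification; comparing assembly identity would be more robust.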

Q: Has anyone else tried making an IL-rewriting profiler?

Sure. If you want to learn from other people's experiences, read through the Building Development and Diagnostic Tools for .Net Forum. Here are some interesting threads:

https://social.msdn.microsoft.com/Forums/en-NZ/netfxtoolsdev/thread/5f30596b-e7b7-4b1f-b8e1-8172aa8dde31
https://social.msdn.microsoft.com/Forums/en-GB/netfxtoolsdev/thread/c352266f-ded3-4ee2-b2f9-fbeb41a70c27

Comments

  • Anonymous
    March 06, 2007
    Good post.  NCover has used IL rewriting to do code coverage analysis since the beginning of the project (about 3-4 years ago).

  • Anonymous
    March 08, 2007
    I'm very interested in rewriting code in mscorlib.dll. I understand that it is very dangerous, but I need it. I recently posted to the forum about my problem with instrumenting the Remoting infrastructure's methods (http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=1324841&SiteID=1&mode=1). My post was moved from "Building development and..." to the CLR forum. The problem there was that some methods failed to JIT compile. I suspect the cause is a problem with context, maybe the security context. In any case, what do you think: why can a method fail to JIT compile? How can I determine the bad context?

  • Anonymous
    March 14, 2007
    Thanks for the blog. This is really helpful. I’m working on an IL rewriting profiler as a sort of side project and have a question. I’ve noticed that when I call ICorProfilerInfo::GetILFunctionBody on any method that has extra sections at the end, the code size that gets returned doesn’t seem to be correct. It seems that when I’ve parsed through the header bytes, the body bytes, and the extra section bytes, I still have anywhere from 1-6 extra bytes left over at the end. What are these bytes? Are they actually part of the method? When I use IMethodMalloc::Alloc to create my rewritten IL, I’m not including space for these extra bytes and it seems to be working correctly, but I’m afraid that this is going to cause errors at some point. I feel a little like a mechanic who finishes rebuilding an engine only to find he has a few parts left over. HELP! Any light you could shed on this would be great, or if you could at least point me in the right direction I’d really appreciate it. Thanks in advance.

  • Anonymous
    March 15, 2007
    There are several different APIs for handling Types in .NET. Criteria : For each category I want to call

  • Anonymous
    March 19, 2007
    Hi, Eric.  I saw your post on the forum as well, and I've posted an answer there (http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=1366051&SiteID=1).  Sorry for the delay. Also, Sergey, I haven't forgotten about you!  I'm still doing some research on your questions, and will post a reply in the forum.  I will probably not have an answer for all your questions, but hopefully I'll have some information that will be of help to you.

  • Anonymous
    March 19, 2007
    Sergey, I've finally posted my response to your forum question, and I've moved the thread back to the "Building Development and Diagnostic Tools for .Net" forum (sorry it was wrongly moved on you!). http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=1367091&SiteID=1&mode=1

  • Anonymous
    February 08, 2008
    Hi David, Thanks for the article. I think it takes me one step closer. I've grepped the internet, and run this by Richter and Robbins, but have not been able to find an acceptable solution. I ran across a similar solution by Aleksandr Mikunov [1] - I am hoping you have a cleaner solution. Richter recommends using DynamicMethod of Reflection [2]. But I'm interested in the existing IL code, and not an additional method built at runtime. I desire to force a ReJIT of a method in C#. It appears the generally accepted solution is to use ICorProfilerInfo::SetFunctionReJIT(). This has two problems:

  1. Microsoft does not recommend use of the API for other than profiling purposes [3]
  2. According to the field and verified by Microsoft, the method does not work [4]

    Any ideas on how to flush a compiled method so that I can get a recompilation?

    Jeff
    Jeffrey Walton
    Pasadena, MD

    [1] Rewrite MSIL Code on the Fly with the .NET Framework Profiling API, http://msdn.microsoft.com/msdnmag/issues/03/09/NETProfilingAPI/
    [2] DynamicMethod Class, http://msdn2.microsoft.com/en-us/library/system.reflection.emit.dynamicmethod.aspx
    [3] ICorProfilerInfo::SetFunctionReJIT Causes Deadlock, http://www.dotnet247.com/247reference/msgs/58/290727.aspx
    [4] ICorProfilerInfo::SetFunctionReJIT Causes Deadlock, http://www.dotnet247.com/247reference/msgs/58/290727.aspx

  • Anonymous
    March 07, 2011
    Hi Dave, A lot of the links to the forum posts are now obsolete (broken). Is it possible to work out how to get the forum post ID and find the original messages?
