How to debug compiler generated code
Edit 2/8/2010: Update for recent F# language changes
James Margetson recently showed me a great tip for debugging compiler-generated code. Typically the problem is that in the Visual Studio debugger all you see is the x86 disassembly of the JIT-ed methods, when what you really need is the IL opcodes that F# compiled down to, with the x86 underneath each one. Fortunately, all the tools you need to do this come with the .NET SDK. Consider the following buggy F# program:
// Simple F# program
open System

type Season =
    | Spring
    | Summer
    | Fall
    | Winter
    override this.ToString() =
        match this with
        | Spring -> "Spring"
        | Summer -> "Summer"
        | Fall -> "Fall"
        | Winter -> "Winter"

type SeasonUtils = class
    // Gets the next season
    static member GetNextSeason season =
        match season with
        | Spring -> Summer
        | Summer -> Fall
        | Fall -> Winter
        | Winter -> Spring

    // Gets the current name of the season by calling .ToString()
    static member GetSeasonName (season : Season) = season.ToString()

    // Returns the current season
    static member CurrentSeason () =
        // Convert a list of months and tupifies them with the target season
        let genSeasonTuples monthRange season =
            monthRange |> List.map (fun monthInd -> (monthInd, season))
        // Map of (months * season) pairs
        let monthSeasonList = (genSeasonTuples [1 .. 2] Spring) // Who likes March anyways?
                              @ (genSeasonTuples [4 .. 6] Summer)
                              @ (genSeasonTuples [7 .. 10] Fall)
                              @ (genSeasonTuples [11 .. 12] Winter)
        let monthSeasonMap = Map.ofList monthSeasonList
        let currentMonthIndex = DateTime.Now.Month
        monthSeasonMap.[currentMonthIndex]
end

// Code in our main module
printfn "(attach debugger and press any key)"
Console.ReadKey(true)
let currentSeason = SeasonUtils.CurrentSeason()
printfn "Current season = %s" (SeasonUtils.GetSeasonName(currentSeason))
Console.WriteLine("(press any key)")
Console.ReadKey(true)
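As a side note, the defect in the listing above appears to be the gap the March comment hints at: no entry is generated for month 3, so the indexer lookup fails when the program runs in March. The following is a minimal sketch of that behavior, not the original program. Plain strings stand in for the Season type, and the names missingMonths and lookupThrows are made up for illustration.

```fsharp
open System.Collections.Generic

// Same helper as in the listing: pair every month in the range with a season.
let genSeasonTuples monthRange season =
    monthRange |> List.map (fun monthInd -> (monthInd, season))

// Same month ranges as the listing; strings replace the Season union
// (an assumption made only to keep this sketch self-contained).
let monthSeasonMap =
    [ genSeasonTuples [1 .. 2] "Spring"
      genSeasonTuples [4 .. 6] "Summer"
      genSeasonTuples [7 .. 10] "Fall"
      genSeasonTuples [11 .. 12] "Winter" ]
    |> List.concat
    |> Map.ofList

// Hypothetical helper: months that have no entry in the map.
let missingMonths =
    [1 .. 12] |> List.filter (fun m -> not (Map.containsKey m monthSeasonMap))

// Hypothetical helper: true when the indexer raises KeyNotFoundException.
let lookupThrows month =
    try
        monthSeasonMap.[month] |> ignore
        false
    with
    | :? KeyNotFoundException -> true

printfn "missing months: %A" missingMonths
printfn "lookup for month 3 throws: %b" (lookupThrows 3)
```

Using Map.tryFind instead of the indexer would return an option and avoid the exception, but the point here is only to show where the original lookup can fail.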
If you run it and break into a debugger, all you see is the x86 disassembly with the F# code on top. (As you can see, quite a bit gets lost in going from F# to x86.)
000001e0 call 7891820C
000001e5 mov esi,eax
000001e7 mov dword ptr ds:[001A3184h],esi
let currentSeason = monthSeasonMap.[currentMonth]
000001ed call dword ptr ds:[001A6898h]
000001f3 mov esi,eax
000001f5 call dword ptr ds:[001A689Ch]
000001fb mov edi,eax
What we would like to do is view the program at the IL level. To do this you can use ILDasm.exe, a managed code disassembler. To get the IL code for an assembly, type:
ildasm foo.exe /OUT=foo.exe.il /SOURCE
The /SOURCE parameter includes the original source code (from foo.exe.pdb) as comments in the output IL. The result is the following code in ‘foo.exe.il’:
.method public static string GetSeasonName(class Codefile/Season season) cil managed
{
  // Code size 9 (0x9)
  .maxstack 3
//000024:
//000025: // Gets the current name of the season by calling .ToString()
//000026: static member GetSeasonName (season : Season) = season.ToString()
  IL_0000: ldarg.0
  IL_0001: tail.
  IL_0003: callvirt instance string [mscorlib]System.Object::ToString()
  IL_0008: ret
} // end of method SeasonUtils::GetSeasonName
So now we have the IL, but that doesn’t help us debug a live instance of foo.exe. Enter ILAsm.exe, which will ‘round trip’ the disassembled IL code and build a new assembly. This seems strange, since the IL code is exactly the same and we appear to gain nothing. But ILAsm.exe has a debug flag that generates a new PDB file containing debugging information not for the F# lines of code, but for the IL instructions themselves. And since we included the /SOURCE flag when we disassembled the assembly, the F# code is included as comments. Type:
ilasm foo.exe.il /DEBUG
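If you do this round trip often, the two commands can be wrapped in a small F# script. What follows is only a sketch under a few assumptions: ildasm and ilasm are on the PATH, a non-zero exit code signals failure, and the assembly path contains no spaces (nothing is quoted). The helper names ildasmArgs, ilasmArgs, runTool and roundTrip are made up for illustration; the command-line flags are the same ones shown above.

```fsharp
open System.Diagnostics

// Builds the ildasm arguments used above: <assembly> /OUT=<assembly>.il /SOURCE
let ildasmArgs (assemblyPath : string) =
    sprintf "%s /OUT=%s.il /SOURCE" assemblyPath assemblyPath

// Builds the ilasm arguments used above: <assembly>.il /DEBUG
let ilasmArgs (assemblyPath : string) =
    sprintf "%s.il /DEBUG" assemblyPath

// Runs a command-line tool, waits for it to finish and returns its exit code.
// Assumes the tool can be found on the PATH.
let runTool (tool : string) (args : string) =
    let psi = ProcessStartInfo(tool, args)
    psi.UseShellExecute <- false
    use proc = Process.Start(psi)
    proc.WaitForExit()
    proc.ExitCode

// Hypothetical helper: disassemble with source comments, then reassemble
// with IL-level debug information.
let roundTrip (assemblyPath : string) =
    if runTool "ildasm" (ildasmArgs assemblyPath) <> 0 then
        failwithf "ildasm failed for %s" assemblyPath
    if runTool "ilasm" (ilasmArgs assemblyPath) <> 0 then
        failwithf "ilasm failed for %s" assemblyPath
```

Calling roundTrip "foo.exe" would run the same two commands as above, one after the other.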
Now, if we run the program and debug it, the debugger uses foo.exe.il as the source file rather than the .fs file, and the disassembly display should show the F# source code along with the generated IL. For example:
//000044:
//000045: let currentMonthIndex = DateTime.Now.Month
IL_010f: call valuetype [mscorlib]System.DateTime [mscorlib]System.DateTime::get_Now()
000004bb lea ecx,[ebp+FFFFFF40h]
000004c1 call 78EDAAE8
IL_0114: stloc.s '$struct-addr@45'
000004c6 lea edi,[ebp-40h]
000004c9 lea esi,[ebp+FFFFFF40h]
000004cf movq xmm0,mmword ptr [esi]
000004d3 movq mmword ptr [edi],xmm0
IL_0116: ldloca.s '$struct-addr@45'
000004d7 lea esi,[ebp-40h]
IL_0118: call instance int32 [mscorlib]System.DateTime::get_Month()
000004da mov ecx,esi
000004dc call 78F080BC
000004e1 mov dword ptr [ebp+FFFFFF3Ch],eax
IL_011d: stloc.s currentMonthIndex
000004e7 mov eax,dword ptr [ebp+FFFFFF3Ch]
000004ed mov dword ptr [ebp-38h],eax
Now while debugging you can get a complete picture of what’s going on under the hood. It gives you an appreciation of how much the CLR abstracts the machine and how much F# abstracts the CLR.
Comments
Anonymous
March 14, 2008
Fantastic! Thanks for the tip Chris.
Anonymous
March 16, 2008
Awesome i didn't knowed anything about this...