The CLR x86 JIT, an overview
I'll be doing a series of articles on how the x86 CLR JIT compiler works and the different optimizations it does for you.
Whenever a method is about to be executed for the first time, the VM realizes it doesn't have native code for it, and invokes the JIT to generate it (if you are curious about how it works and have the Rotor source code, go to CallCompileMethodWithSEHWrapper in jitinterface.cpp and backtrack from there).
Although it is designed to be a fast compiler (compiling is happening at runtime, you can't keep the user waiting) and a lot of trade offs, design decisions an heuristics are in place to make this happen, the JIT really doesn't look that different from a 'normal' compiler (where 'normal' means as per the Dragon Book ;)
Work is roughly divided into the following stages:
1. Importing: In this stage, the input IL is transformed to the JIT's internal representation. Verification (if required) also happens here. A lot of communication happens back and forth between the JIT and the VM, as the JIT has to ask a lot of questions to the VM (for example, if the IL has a LDFLD instruction, it has to make sure the VM loads the class, and ask how it can access the field, directly? need a helper (for eg MarshalByRef classes), what helper?).
2. Morphing: Here, the compiler just applies a transformations to our internal structures in order to simplify or optimize the code. Things that happen here include detecting JIT intrinsics (functions the JIT has special knowledge about, such as Math.Sin(), which will end up being an fsin x86 instruction), constant propagation, inlining, etc...
3. Flowgraph analysis: The JIT performs a traditional flowgraph analysis, to determine the liveness of variables and gen/kill sets, dominator information, loop detection, etc.... This information is used in all subsequent stages.
4. Optimization phase: In this stage, the heavyweight optimizations happen: Common Subexpression and Range Check Elimination, loop hoisting, etc...
5. Register allocation: Registers are a scarce resource on x86. Operations performed on registers are generally faster than those done on memory, hence its important to make variables live in registers. This stage tries to select the most optimal placement for each variable. To do this it takes in account the number of uses of each variable in the method. It also makes a decision about what type of frame the method will use (EBP or ESP based, will the frame be double aligned), etc...
6. Code generation: We finally get to generate the x86 code. Here we take in account what processor we're generating code for, in order to choose the best idioms for it. GC and debugger information is also recorded here.
7. Emit: This is the last stage, at this point we have the x86 code the method will use and information to generate the GC and debugger information. Everything will be packaged together and returned to the VM, which will then redirect control to the newly generated code.
Comments
Anonymous
October 26, 2004
Very interesting. What algorithm did you chose for registry allocation? Why is necessary control flow analisys? I think this phase could be done by the compiler that emits MSILAnonymous
October 26, 2004
How long does all this take?
I imagine that some bits are faster than others, and you're only working with a single method, but I imagine it still has an impact of a kind.
Is there a good guide to .NET IL? I'm (probably far too much) familiar with Java bytecode, and I'm curious as to how .NET differs radically.
Another question, do you process method by method, class by class, or assembly by assembly?
Sorry, I'm just very curious :)Anonymous
October 26, 2004
Andrew: Checkout http://weblogs.asp.net/kennykerr/category/7140.aspx for a quick intro to MSIL.Anonymous
October 26, 2004
Some background before the actual question
Constants (const in C#) are used only by the
compiler, using the value they have at compile
time, so a change in some constants is leaving
us with a version problem (we must recompile
all depended assemblies)
Default parameters having the same version
problem as the caller actually use the compile
time values.
The solution that C# team is providing is
overloads, this is actually a semi-workaround,
since they don't capture the developer's
intention (which is just some default values)
and they are maintenance problem (we must
synchronize the attribute's definitions and xml
documentation of the method)
Now the question :)
How difficult is to add the constant resolution
in the first steps of JITing ?
That will solve problem #1 and if the default
parameters values could be encoded as some
internal constants (so the caller use them
instead of the actual values) will solve #2
also
I think that JITter already does much more
complicated things, than a simple Find and Replace
RegardsAnonymous
October 26, 2004
David, could you comments on the differences in JIT and NGEN in v2Anonymous
October 26, 2004
To Panos Theofanopoulos: Why not just use public read-only fields instead of public constants? With read-only fields, I am pretty sure that the compiler does not just hoist the value out at compile time.Anonymous
October 26, 2004
I'm confused why the JIT does heavy-weight optimizations like common sub-expression elimination. Shouldn't those be done by the compiler? I can understand the range check optimization because the JIT and CLR already have special understanding of these types.Anonymous
October 26, 2004
Lorenzo:
The IL doesn't have information about control flow and liveness. Certainly, some compilers do it to remove dead code, but because IL doesnt have it, we need to recompute it.
Andrew: Jitting is per methodAnonymous
October 26, 2004
The comment has been removedAnonymous
October 26, 2004
Simple, quick question: According to the times it takes for the JIT engine to emit a method, what is better: many small methods (these stages occur more often and the optimizations are scarse) or few large methods (these stages occur fewer times, but optimizations are heavier)?Anonymous
October 26, 2004
Eric : And what's the use of const then ? :)
The answer is that the static field's value (even if it's readonly) cannot be "inlined" by the JITter
by inlining i mean the following
class AsmA {
public const int myConst = 4;
}
the statement
return 4 / AsmA.myConst;
will generate a
MOV EAX,1
and the statement
return a / AsmA.myConst
test eax,eax
jns positive
add eax,3
positive:
sar eax,2Anonymous
October 31, 2004
Great Stuff! I know you have stated upfront the x86 focus, but I would be interested in any info you have with regards to JITing and the Compact Framework.Anonymous
October 25, 2007
Just a reminder: Release builds are not Debug builds. Seems obvious, but it's worth saying again. ReleaseAnonymous
June 12, 2009
PingBack from http://greenteafatburner.info/story.php?id=5587