The CLR x86 JIT, an overview

I'll be doing a series of articles on how the x86 CLR JIT compiler works and the different optimizations it does for you.

 

Whenever a method is about to be executed for the first time, the VM realizes it doesn't have native code for it, and invokes the JIT to generate it (if you are curious about how it works and have the Rotor source code, go to CallCompileMethodWithSEHWrapper in jitinterface.cpp and backtrack from there).

 

Although it is designed to be a fast compiler (compiling is happening at runtime, you can't keep the user waiting) and a lot of trade offs, design decisions an heuristics are in place to make this happen, the JIT really doesn't look that different from a 'normal' compiler (where 'normal' means as per the Dragon Book ;)

 

Work is roughly divided into the following stages:

 

1. Importing: In this stage, the input IL is transformed to the JIT's internal representation. Verification (if required) also happens here. A lot of communication happens back and forth between the JIT and the VM, as the JIT has to ask a lot of questions to the VM (for example, if the IL has a LDFLD instruction, it has to make sure the VM loads the class, and ask how it can access the field, directly? need a helper (for eg MarshalByRef classes), what helper?).

 

2. Morphing: Here, the compiler just applies a transformations to our internal structures in order to simplify or optimize the code. Things that happen here include detecting JIT intrinsics (functions the JIT has special knowledge about, such as Math.Sin(), which will end up being an fsin x86 instruction), constant propagation, inlining, etc...

 

3. Flowgraph analysis: The JIT performs a traditional flowgraph analysis, to determine the liveness of variables and gen/kill sets, dominator information, loop detection, etc.... This information is used in all subsequent stages.

 

4. Optimization phase: In this stage, the heavyweight optimizations happen: Common Subexpression and Range Check Elimination, loop hoisting, etc...

 

5. Register allocation: Registers are a scarce resource on x86. Operations performed on registers are generally faster than those done on memory, hence its important to make variables live in registers. This stage tries to select the most optimal placement for each variable. To do this it takes in account the number of uses of each variable in the method. It also makes a decision about what type of frame the method will use (EBP or ESP based, will the frame be double aligned), etc...

 

6. Code generation: We finally get to generate the x86 code. Here we take in account what processor we're generating code for, in order to choose the best idioms for it. GC and debugger information is also recorded here.

 

7. Emit: This is the last stage, at this point we have the x86 code the method will use and information to generate the GC and debugger information. Everything will be packaged together and returned to the VM, which will then redirect control to the newly generated code.

Comments

  • Anonymous
    October 26, 2004
    Very interesting. What algorithm did you chose for registry allocation? Why is necessary control flow analisys? I think this phase could be done by the compiler that emits MSIL

  • Anonymous
    October 26, 2004
    How long does all this take?

    I imagine that some bits are faster than others, and you're only working with a single method, but I imagine it still has an impact of a kind.

    Is there a good guide to .NET IL? I'm (probably far too much) familiar with Java bytecode, and I'm curious as to how .NET differs radically.

    Another question, do you process method by method, class by class, or assembly by assembly?

    Sorry, I'm just very curious :)

  • Anonymous
    October 26, 2004
    Andrew: Checkout http://weblogs.asp.net/kennykerr/category/7140.aspx for a quick intro to MSIL.

  • Anonymous
    October 26, 2004
    Some background before the actual question

    Constants (const in C#) are used only by the
    compiler, using the value they have at compile
    time, so a change in some constants is leaving
    us with a version problem (we must recompile
    all depended assemblies)

    Default parameters having the same version
    problem as the caller actually use the compile
    time values.
    The solution that C# team is providing is
    overloads, this is actually a semi-workaround,
    since they don't capture the developer's
    intention (which is just some default values)
    and they are maintenance problem (we must
    synchronize the attribute's definitions and xml
    documentation of the method)


    Now the question :)

    How difficult is to add the constant resolution
    in the first steps of JITing ?


    That will solve problem #1 and if the default
    parameters values could be encoded as some
    internal constants (so the caller use them
    instead of the actual values) will solve #2
    also

    I think that JITter already does much more
    complicated things, than a simple Find and Replace

    Regards

  • Anonymous
    October 26, 2004
    David, could you comments on the differences in JIT and NGEN in v2

  • Anonymous
    October 26, 2004
    To Panos Theofanopoulos: Why not just use public read-only fields instead of public constants? With read-only fields, I am pretty sure that the compiler does not just hoist the value out at compile time.

  • Anonymous
    October 26, 2004
    I'm confused why the JIT does heavy-weight optimizations like common sub-expression elimination. Shouldn't those be done by the compiler? I can understand the range check optimization because the JIT and CLR already have special understanding of these types.

  • Anonymous
    October 26, 2004
    Lorenzo:

    The IL doesn't have information about control flow and liveness. Certainly, some compilers do it to remove dead code, but because IL doesnt have it, we need to recompute it.

    Andrew: Jitting is per method

  • Anonymous
    October 26, 2004
    The comment has been removed

  • Anonymous
    October 26, 2004
    Simple, quick question: According to the times it takes for the JIT engine to emit a method, what is better: many small methods (these stages occur more often and the optimizations are scarse) or few large methods (these stages occur fewer times, but optimizations are heavier)?

  • Anonymous
    October 26, 2004
    Eric : And what's the use of const then ? :)

    The answer is that the static field's value (even if it's readonly) cannot be "inlined" by the JITter

    by inlining i mean the following

    class AsmA {
    public const int myConst = 4;
    }

    the statement
    return 4 / AsmA.myConst;

    will generate a
    MOV EAX,1

    and the statement

    return a / AsmA.myConst

    test eax,eax
    jns positive
    add eax,3
    positive:
    sar eax,2

  • Anonymous
    October 31, 2004
    Great Stuff! I know you have stated upfront the x86 focus, but I would be interested in any info you have with regards to JITing and the Compact Framework.

  • Anonymous
    October 25, 2007
    Just a reminder: Release builds are not Debug builds. Seems obvious, but it's worth saying again. Release

  • Anonymous
    June 12, 2009
    PingBack from http://greenteafatburner.info/story.php?id=5587