

Why do Java VMs not have a Pre-JIT feature?

Java and .NET have several things in common: both runtimes can execute code written in a machine-independent "assembly language". As we know, this code is represented in a binary format: bytecode in the Java world, and IL in .NET. The general idea of a "bytecode" is pretty old. In fact, UCSD Pascal had a similar concept called P-Code back in the 1970s, and Smalltalk later built on the same idea.

Usually, these generic instructions are not executed directly; rather, they are translated into machine code on the fly, in a process called JIT. The acronym stands for Just-In-Time, and the process is similar to the back-end phase of a C/C++ compiler. This is not a new idea either and, if I remember correctly, Smalltalk-80 was the first to implement it successfully.

As a side note, in the case of .NET, these instructions were specifically designed to enable a simpler language but also a potentially faster JIT process. For example, IL has a single, type-neutral "add" instruction that adds whatever two numeric operands are on the stack, irrespective of their types. The runtime can still emit the right machine instruction, since the operand types can be deduced from the metadata anyway. This contrasts somewhat with the Java approach, where there are several flavors of "add", one for each numeric type (iadd, ladd, fadd, dadd).
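To make the "single add" idea a bit more concrete, here is a minimal sketch of how a translator might pick a typed instruction for a type-neutral "add" by tracking only the operand types on a simulated stack. The `push <TYPE>` / `add` mini-language and the `GenericAddResolver` class are made up for illustration, and the sketch assumes both operands must have exactly the same type, which is stricter than what the real IL rules allow. The output mnemonics are the JVM's typed add opcodes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Toy illustration only: resolves a type-neutral "add" to a typed opcode. */
final class GenericAddResolver {

    /** Operand types tracked on the simulated evaluation stack. */
    enum Type { INT, LONG, FLOAT, DOUBLE }

    /** Maps an operand type to the matching JVM add mnemonic. */
    static String typedAdd(Type t) {
        switch (t) {
            case INT:    return "iadd";
            case LONG:   return "ladd";
            case FLOAT:  return "fadd";
            case DOUBLE: return "dadd";
            default:     throw new AssertionError(t);
        }
    }

    /**
     * Walks a hypothetical instruction list ("push <TYPE>" or "add"),
     * tracking types only, and returns the typed opcode chosen for each "add".
     */
    static List<String> resolve(List<String> program) {
        Deque<Type> stack = new ArrayDeque<>();
        List<String> chosen = new ArrayList<>();
        for (String insn : program) {
            if (insn.startsWith("push ")) {
                stack.push(Type.valueOf(insn.substring(5)));
            } else if (insn.equals("add")) {
                if (stack.size() < 2) {
                    throw new IllegalStateException("stack underflow");
                }
                Type right = stack.pop();
                Type left = stack.pop();
                // Simplifying assumption: both operands share one type.
                if (left != right) {
                    throw new IllegalStateException("mismatched operand types");
                }
                chosen.add(typedAdd(left));
                stack.push(left); // the result has the operand type
            } else {
                throw new IllegalArgumentException("unknown instruction: " + insn);
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        // prints [iadd]
        System.out.println(resolve(Arrays.asList("push INT", "push INT", "add")));
        // prints [dadd, dadd]
        System.out.println(resolve(Arrays.asList(
                "push DOUBLE", "push DOUBLE", "push DOUBLE", "add", "add")));
    }
}
```

The point of the sketch is only that the choice falls out of the type information the translator tracks anyway; a real JIT works on far richer metadata.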

On the other hand, the fact that the JIT relies heavily on the metadata also implies that it is pretty hard to design a ".NET/IL processor", since it would need to understand both the .NET metadata and IL at the same time. OK, maybe it is not impossible, but it is certainly hard. The Java bytecode, by contrast, was initially designed to be run on a processor, not in a JIT environment.

But with JIT we now have another challenge. First, as soon as you stop a process, you lose all the optimization information gathered during that run, so you have to JIT again and again, each time the process starts. Second, when your process starts, you lose some time compiling the IL/bytecode.

A natural idea is to cache the compiled images on disk, so that at the next start you just load them and continue from there. Even more than that, there is an optimization called Pre-JIT that allows the CLR (starting with 1.0) to pre-compile a .NET assembly ahead of time and persist the generated machine-dependent executable. Pre-JIT helps, for example, to get better load times for GUI-style apps.
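The caching idea can be sketched in a few lines. This is not how the CLR or any JVM actually stores native images; it is a toy under stated assumptions: the "compiler" is just a function from bytes to bytes, the cache key is a SHA-256 hash of the input bytecode, and `CompiledImageCache`, `inTempDir`, and the `.img` suffix are all hypothetical names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.function.Function;

/** Toy illustration only: a disk cache for "compiled" images. */
final class CompiledImageCache {
    private final Path dir;
    int compileCount = 0; // how many times the compiler actually ran

    CompiledImageCache(Path dir) {
        this.dir = dir;
    }

    /** Convenience factory (hypothetical): a cache backed by a fresh temp directory. */
    static CompiledImageCache inTempDir() {
        try {
            return new CompiledImageCache(Files.createTempDirectory("image-cache"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Hex-encoded SHA-256 of the bytecode, used as the cache file name. */
    static String key(byte[] bytecode) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(bytecode);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns the cached image if present; otherwise compiles and stores it. */
    byte[] load(byte[] bytecode, Function<byte[], byte[]> compiler) {
        Path image = dir.resolve(key(bytecode) + ".img");
        try {
            if (Files.exists(image)) {
                return Files.readAllBytes(image); // cache hit: no compilation
            }
            byte[] compiled = compiler.apply(bytecode);
            compileCount++;
            Files.write(image, compiled);
            return compiled;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        CompiledImageCache cache = CompiledImageCache.inTempDir();
        byte[] bytecode = {1, 2, 3};
        cache.load(bytecode, code -> new byte[] {42}); // compiles and stores
        cache.load(bytecode, code -> new byte[] {42}); // served from disk
        System.out.println("compilations: " + cache.compileCount);
    }
}
```

A real cache would also have to key on everything else that affects the generated code, such as the CPU, the runtime version, and the dependencies of the assembly, which is a large part of what makes the problem harder than it looks here.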

I am wondering why no Java virtual machine has something similar these days?

[update: fixing the link about UCSD Pascal]

Comments

  • Anonymous
    March 24, 2005
    The UCSD Pascal link doesn't work - 404.
  • Anonymous
    March 24, 2005
    "I am wondering why no Java virtual machines do not have something similar these days? "

    GCJ (the GNU Java compiler) can compile .class files into executable objects which can then be linked into a .exe file, or even C++ files. Of course, the final exe / library will end up with GC & Java runtime (statically / dynamically linked .LIB) dependencies.

    Of course, GCJ executables run slower (but start faster) than executables run under the JIT. In particular, GCJ uses the Boehm conservative GC, which limits its optimization strategies (for one thing, it can't compact since it's conservative).

    Also, inferring the correct machine operator from the data types that are statically analysed and discovered on the stack at that point in the assembly is trivial, not difficult, especially since the stack type inference needs to be done for verification anyway. (A fundamental limitation of both JVM and MSIL is that the stack must be the same at every instruction, no matter which path led to that instruction.) It's easier than function overload resolution, which is something compiler writers do as a matter of course when implementing a language with that feature.

    And yes, I write compilers.
  • Anonymous
    March 25, 2005

    By pre-JITting, you are probably talking about ngen.exe, right?

    ngen.exe can be very bad (premature optimization) since it will create machine code for the machine it's running on, for instance at build time, not for the end user's machine.

    Since ngen is not provided by the .NET 2.0 runtime (neither with previous runtime versions by the way), there is no way you are going to ngen your app as a post-install step.

    I must have missed something really obvious...

  • Anonymous
    March 25, 2005

    Oops, sorry. ngen.exe is with the runtime dist.

  • Anonymous
    March 25, 2005
    >> GCJ (the GNU Java compiler) can compile .class files into executable objects which can then be linked into a .exe file, or even C++ files.

    Yes, I knew about GCJ (I should have mentioned it). But I don't consider it a runtime. It's just a compiler. So you lose all the advantages that come from a runtime environment (you cannot re-JIT if you detect a certain usage pattern over time, etc).

  • Anonymous
    March 25, 2005
    The UCSD link is working now.
  • Anonymous
    March 25, 2005
    ngen.exe is in the .NET Framework redist. Check your framework directory.

    C:\WINDOWS\Microsoft.NET\Framework\v1.1.4322\ngen.exe

    Java 5.0 introduced the Class Data Sharing concept. It is not the same concept, but it gives a similar result. It is very limited, though.

    Java was never popular for client-side applications. That is probably why Sun does not offer an equivalent of ngen.exe.

    But this is all just speculation.
  • Anonymous
    March 25, 2005
    I'll go out on a limb here and guess that for their particular circumstances they decided that adding support for pre-JITing of user code was more expensive relative to the customer benefit that it would provide than other features under consideration for this release cycle.