次の方法で共有


Cost of array bounds checking -- one experiment

Sometimes people ask me what the overhead of array range checking is. A while ago I did a quickish experiment to measure the cost of the bounds checking introduced by the JIT in a typical application. This is just one part of the managed safety net but it's a fairly easy one to get a handle on compared to some of the other overheads. Notably, it's a lot trickier to get a handle on the marginal cost of explicit range-checking done in helper methods in base classes.

The test case was a Windows Presentation Framework application that basically tried to render the same thing that you might see on the MSN home page. The time being measured is from process startup (t=0) until Main exits (which is almost but not quite everything). This is a wall clock time on a normally configured machine so there is noise in the system which I attempt to remove as shown below.

To faciliate this experiment I used a then recent build of WPF, which was built using PD3 of the CLR (so this is 2004 vintage). I modified mscorjit.dll so that it does not introduce array bounds checks for arrays or strings. This modified jitter was built using a slightly different process than the official build (I did a standard dev build) so to control that difference I also rebuilt the standard jitter, unmodified, in the same way. So in this experiment I'm comparing two jitters both build by me in the same way. To reduce the impact of normal machine noise in the runs I removed the 10 slowest run --but I show the full data as well.

Times were gathered with everything was hot in the disk cache.

Time Results:

Test

N

Average

Std. Dev.

Standard All 36 1.661s 0.275s
Standard Slowest 10 Removed 26 1.506s 0.054s
       
Norange All 36 1.609s 0.574s
Norange Slowest 10 Removed 26 1.497s 0.017s
       
Delta All 36 0.052s  
Delta Slowest 10 Removed 26 0.009s  
       
% Delta All 36 -3.1%  
% Delta Slowest 10 Removed 26 -0.6%  

But notice that in both cases the delta is within one standard deviation of the mean. Based on this experiment we could not reject the null hypothesis that range checking is making no difference in execution time (i.e. noise is making a bigger difference).

Space Results

Working set numbers were very unstable due to the interactive nature of the test case, the variation from run to run is even more pronounced in terms of working set than it is in terms of time and I strongly feel that this variation totally hides any meaningful analysis of workingset. However it's easy enough to see the absolute difference in terms of size of the assemblies on disk which I think accurately accounts for the space tax.

Assembly Checked Unchecked Delta
mscorlib.ni.dll 9,777,152 9,703,424 -0.75%
System.Drawing.ni.dll 1,585,152 1,581,056 -0.26%
System.Security.ni.dll 901,120 897,024 -0.45%
System.ni.dll 6,713,344 6,668,288 -0.67%
System.Windows.Forms.ni.dll 12,132,352 12,099,584 -0.27%
Others (similar, not shown) ... ... ...
Total 73,719,808 73,359,360 -0.49%

So basically, at least in this experiment, the cost was within the noise level of the test. Of course your mileage may vary but this seems to agree with anecdotal experience in larger applications.

Comments

  • Anonymous
    July 12, 2006
    The comment has been removed
  • Anonymous
    July 12, 2006
    The comment has been removed
  • Anonymous
    July 12, 2006
    The comment has been removed
  • Anonymous
    July 22, 2006
    The JIT generally does a pretty good job at removing array bounds checks in simple cases anyways (if its passed in or a local).

    I showed the particular optimization here http://codebetter.com/blogs/gregyoung/archive/2006/06/11/146343.aspx

    So I am not sure how "good" these tests actually are since the JIT removes many anyways.

    My concern is that the JIT does not remove all that it should. I discuss it in detail here http://codebetter.com/blogs/gregyoung/archive/2006/07/08/147230.aspx.

    A bigger concern is that it is impossible to write an efficient bit of safe code that does something similar to the following (note this is a quick example typed up for clarity, not necesarily compiling or well designed code)

    public static int [] CopyWithSquare(int [] in) {
       int [] ret = new int[in.Length];
       int tmp;
       for(int i=0;i<in.Length;i++) {
           ret[i] =  in[i] * in[i];
       }
    }

    One of the two arrays will always have a bounds check because the optimizer will only remove the bounds check for the array that is used in the condition.

    It would seem to me that in this case the JIT should probably atleast hoist the other array bounds check (especially when it is held in a register as you know the operation is safe)

    Of course I can get this to not happen using unsafe code but ...

    1) Most languages don't support unsafe code
    2) Many situations are beginning to require verifiable code which would keep people from calling my routine.

    I agree that in insert application your overall gain from the optimization will not be huge (there are many other items which are more likely at fault). When you are however dealing with optimizing code that you have isolated as a problem point through profiling it seems to be common that you find arrays with bounds checks :)

  • Anonymous
    July 22, 2006
    to happen using unsafe code.
  • Anonymous
    August 01, 2006
    try skipping the array bound checking using Exceptions. In java, (performance tuning manual) they suggested make the loop infinite and use a "try...catch" exception.
    If this works for csharp...let me know.
    Also, I am using the generic exception eg."}catch(Exception e){} ". Will narrowing down to the exact exception prove faster exception handling. eg "(System.IndexOutOfRangeException e){}"


    Code:
    public static int [] CopyWithSquare(int [] in) {
      int [] ret = new int[in.Length];
      int tmp;
      try {
        for(int i=0;;i++) {
           ret[i] =  in[i] * in[i];
        }
      }catch(Exception e){}
    }

    Please let me know what you make of this.
    jeremygwa At hotmail.com
  • Anonymous
    August 03, 2006
    Didn't expect it to work and it doesn't. This is actually slower than the original loop because the original loop removed one of the bounds checks where this has two.

           public static int[] CopyWithSquare(int[] val) {
               int[] ret = new int[val.Length];
    00000000  push        ebp  
    00000001  mov         ebp,esp
    00000003  push        edi  
    00000004  push        esi  
    00000005  push        ebx  
    00000006  sub         esp,18h
    00000009  xor         eax,eax
    0000000b  mov         dword ptr [ebp-18h],eax
    0000000e  mov         edi,ecx
    00000010  mov         edx,dword ptr [edi+4]
    00000013  mov         ecx,7915982Ah
    00000018  call        FFCB0FB0
    0000001d  mov         dword ptr [ebp-24h],eax
               int tmp;
               try {
                   for (int i = 0; ; i++) {
    00000020  xor         esi,esi
    00000022  mov         ebx,dword ptr [edi+4]
    00000025  mov         eax,dword ptr [ebp-24h]
    00000028  mov         ecx,dword ptr [eax+4]
                       ret[i] = val[i] * val[i];
    0000002b  cmp         esi,ebx
    0000002d  jae         0000004A
    0000002f  mov         edx,dword ptr [edi+esi4+8]
    00000033  mov         eax,edx
    00000035  imul        eax,edx
    00000038  mov         edx,eax
    0000003a  mov         eax,dword ptr [ebp-24h]
    0000003d  cmp         esi,ecx
    0000003f  jae         0000004A
    00000041  mov         dword ptr [eax+esi
    4+8],edx
                   for (int i = 0; ; i++) {
    00000045  add         esi,1
    00000048  jmp         0000002B
    0000004a  call        79443501
                   }
               }
               catch (Exception e) { }
    0000004f  call        79345119
               return ret;
    00000054  mov         eax,dword ptr [ebp-24h]
    00000057  lea         esp,[ebp-0Ch]
    0000005a  pop         ebx  
    0000005b  pop         esi  
    0000005c  pop         edi  
    0000005d  pop         ebp  
    0000005e  ret              
  • Anonymous
    August 09, 2006
    Rico,

    Can you comment a little on array performance on .net from a number crunching point of view? I would like to be able to write everything in C#, but worry that it'll be too much of a performance hit (for various reasons I'd like the array code to be managed, not external native code I'm using from a C# app).

    I understand the differences in support for 1D vs Jagged vs MultiD arrays. My concerns include lack of of use of SIMD instructions by the JIT, plus lack of blocking and loop transformations.

    Any experience on perf of fortran style .net code?
  • Anonymous
    August 29, 2006
    Questa &#232; una domanda che mi sono posto anch'io parecchie volte: il controllo&amp;nbsp;dei limiti inferiore...
  • Anonymous
    October 03, 2006
    Questa è una domanda che mi sono posto anch'io parecchie volte: il controllo dei limiti inferiore...