Why can't PowerShell run loops fast?

I recently had to write a PowerShell (PS) script for a customer that loops through a large file and processes every character. It seemed simple enough and worked great on sample files, but on large files the script's performance was really bad. I first assumed the cause was some FileOption parameter I was not specifying, or a file buffer size that was not appropriate. After some debugging I was able to reproduce the problem without any file operation at all. So basically, running a loop in PS is slow. Take a look at this:

Here is what I do: declare a byte array, then go through each item of the array. The difference in performance is dramatic, roughly two orders of magnitude: about 6 seconds in .NET versus about 10 minutes in PS.

PS code (10 minutes)

$num = 0
$buffersize = 4096*1024*200

# declare a buffer
[Byte[]]$ba = new-object Byte[] $buffersize
$linecount = 1
$start = [System.DateTime]::Now
do
{
    # check the current element first, then advance, so that index 0 is
    # included and the loop never reads past the end of the buffer
    if ($ba[$num] -eq 10) { $linecount++ }
    $num++
} while ($num -lt $buffersize)

$end = [System.DateTime]::Now
$time = $end - $start
write-host "Time taken -" $time

.NET code (6 sec)

static void testloop()
{
    System.DateTime start = DateTime.Now;
    int buffer = 4096*1024*200;
    byte[] b = new byte[buffer];
    int count = 1;
    int lastcount = 0;
    do
    {
        if (b[lastcount] == 10) { count++; }
        lastcount++;
    } while (lastcount < buffer);
    Console.WriteLine("Time Taken - " + (DateTime.Now - start));
}

 

After a discussion with the PS product team, I found out the reason. Since PS is interpreted and .NET is compiled, you would expect some performance difference anyway. But internally, if a loop runs for more than 16 iterations, PS dynamically compiles the body of the loop into .NET code at that point, and from then on the loop invokes that dynamic method on every iteration. Great. The only gotcha is that a dynamically invoked method could be a security risk (think of viruses duplicating themselves), so .NET runs a security check on the stack each time the method is invoked, and that check is what slows down long-running loops in PS.

Workaround:
==================
We can declare .NET code inside PS using the Add-Type cmdlet. That way the dynamically invoked method encapsulates the whole loop, iteration included, so the security check is done only once per call instead of once per iteration. Running the code below, I consistently got performance similar to .NET, if not better in a few cases.

# declare a buffer
$buffersize=1024*4
[Byte[]]$ba = new-object Byte[] $buffersize
$linecount = 1
$ba[30] = 10
$ba[300] = 10
$start = [System.DateTime]::Now

$source = @"
public class myArrayIterator
{
    public static int GetCount(byte[] b, byte tofind)
    {
        int count = 1;
        int lastcount = 0;
        int max = b.Length;
        do
        {
            if (b[lastcount] == tofind) { count++; }
            lastcount++;
        } while (lastcount < max);
        return count;
    }
}
"@

Add-Type -TypeDefinition $source

$count = [myArrayIterator]::GetCount($ba, 10)
write-host "Count - " $count

$end = [System.DateTime]::Now
$time = $end - $start
write-host "Time taken -" $time
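
To compare the two approaches side by side on your own machine, something like the following sketch may help. It is not the original benchmark: the LoopSketch class and its CountMatches method are hypothetical names made up for this example, the buffer is deliberately small (1 MB) so the script loop finishes quickly, and timing uses System.Diagnostics.Stopwatch instead of DateTime.Now. Unlike the code above, both counters start at 0, so each should report 2 for the two bytes set to 10.

```powershell
# Hypothetical helper type for this sketch (not the class used above).
$sketchSource = @"
public static class LoopSketch
{
    public static int CountMatches(byte[] data, byte toFind)
    {
        int count = 0;
        for (int i = 0; i < data.Length; i++)
        {
            if (data[i] == toFind) { count++; }
        }
        return count;
    }
}
"@

# Compile only once per session; Add-Type fails if the type already exists.
if (-not ('LoopSketch' -as [type])) { Add-Type -TypeDefinition $sketchSource }

# Small buffer (assumption: 1 MB is enough to see a difference).
[Byte[]]$ba = New-Object Byte[] (1024 * 1024)
$ba[30] = 10
$ba[300] = 10

# Time the loop written directly in PowerShell.
$sw = [System.Diagnostics.Stopwatch]::StartNew()
$scriptCount = 0
for ($i = 0; $i -lt $ba.Length; $i++) {
    if ($ba[$i] -eq 10) { $scriptCount++ }
}
$sw.Stop()
Write-Host "Script loop   - count $scriptCount, time $($sw.Elapsed)"

# Time the same scan inside the compiled helper.
$sw = [System.Diagnostics.Stopwatch]::StartNew()
$compiledCount = [LoopSketch]::CountMatches($ba, 10)
$sw.Stop()
Write-Host "Compiled loop - count $compiledCount, time $($sw.Elapsed)"
```

The absolute times will depend on the PowerShell version and the machine, so treat the output as a relative comparison only.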

 

Hope this helps with your long-running loops in PowerShell.
