VSTO & VBA - Improving Excel's performance when writing data into Sheets by using memory arrays instead of inefficient code loops
In my previous article (VSTO & VBA - How to troubleshoot Excel memory and performance problems caused by inefficient code loops) I wrote about a few common issues which may be encountered when trying to read data from an Excel workbook using COM automation.
But after you finished reading it, maybe you asked yourself: What about writing data into Excel sheets? Can we optimize that operation too?
The answer is: YES, we generally can use the same techniques. Get a handle to the range of cells you wish to write, process and write the data into a memory array structure and finally, write it in one operation.
Of course, when we want to write data, most of the time we need to format it properly (cell font color, cell borders, size ..etc). Sometimes we also need to write formulas, or we need to search data one cell at a time. Not every operation can be implemented with this approach, but when the situation allows for this optimization, the results are very good.
So, let's start by assuming that we have to write a code which generates a report based on some template which can be customized by the end-user. Thus, the end user is in charge with writing and formatting the column headers and our program has to keep the same look-and-feel and it also has to insert the data. The blank report workbook could look like the one from below:
You may notice that the range [A1:G1] contains the header, and the data has to be inserted starting from row #2. I have highlighted with red markers the cells which have custom formatting applied:
> cell C2 has its text written in Bold, a light blue Font color and a light
blue Background;
> cell F2 had an yellow Background;
> cell G2 has a text with Italic effect;
The $varX strings from cell C2 are user-defined variables and our code has to dynamically replace them at run-time with numbers.
First I am going to demonstrate how poor the performance is when we attempt to write the values in sequential mode. To preserve the formatting, I am going to Select our template row [B2:G2], then I am going to execute a Selection.Insert Shift:=xlDown to shift the information one row lower, then the previous level will be filled in with data.
This is the VBA code I used:
Option Explicit Public Declare Function GetTickCount Lib " kernel32.dll " () As Long Sub Test1() On Error Resume Next Dim r As Integer Dim c As Integer Dim sht As Worksheet Dim counter1 As Integer Dim rowInput As Integer Dim colInput As Integer Dim rngInput As Range Dim rngOutput As Range Dim t1 As Long Dim t2 As Long Set sht = ActiveWorkbook.Sheets(1) Application.ScreenUpdating = False Application.EnableEvents = False Application.Calculation = xlCalculationManual Application.DisplayAlerts = False Application.DisplayStatusBar = False counter1 = 0 t1 = GetTickCount For r = 2 To 10002 DoEvents 'select the range used as template sht.Range(Cells(r, 1), Cells(r, 8)).Select 'copy the formatting one row down Selection.Copy Selection.Insert Shift:=xlDown 'save a handle for the row which has to be filled in Set rngInput = sht.Range(sht.Cells(r, 2), sht.Cells(r, 7)) For colInput = 1 To 5 Dim strTmp As String Dim strIn As String strIn = rngInput.Cells(1, colInput).Value2 'try to match $var1 and replace it with a dummy value If InStr(1, strIn, "$var1") > 0 Then strIn = Replace(strIn, "$var1", counter1) counter1 = counter1 + 1 rngInput.Cells(1, colInput).Value2 = strIn End If 'try to match $var2 and replace it with a dummy value If InStr(1, strIn, "$var2") > 0 Then strIn = Replace(strIn, "$var2", (counter1 + 11)) counter1 = counter1 + 1 rngInput.Cells(1, colInput).Value2 = strIn End If 'try to match $var3 and replace it with a dummy value If InStr(1, strIn, "$var3") > 0 Then strIn = Replace(strIn, "$var3", "3") rngInput.Cells(1, colInput).Value2 = strIn End If Next Debug.Print r Next r t2 = GetTickCount Debug.Print t1 & " >> " & t2 Application.ScreenUpdating = True Application.EnableEvents = True Application.Calculation = xlCalculationAutomatic Application.DisplayAlerts = True Application.DisplayStatusBar = True'--------------------------------------------------------------- Debug.Print "Writing cells sequentially" & vbNewLine 'Debug.Print "Range.Cells.Count : " & r * 6 Debug.Print "Range.Columns.Count : " & 6 Debug.Print "Range.Rows.Count : " & r & vbNewLine Debug.Print "Operation started at (CPU tick count): " & t1 Debug.Print "Operation ended at (CPU tick count) : " & t2 Debug.Print "Milliseconds duration : " & t2 - t1 End Sub |
First of all I want to let you know that this code is so badly written that it crashed my Excel 2010 around iteration # 200 !!! Then I had to add the DoEvents instruction and I had to disable ScreenUpdating, Event triggers ...etc.
This is the output:
Output =================================================================== Writing cells sequentially Range.Columns.Count : 6Range.Rows.Count : 10003 Operation started at (CPU tick count): 40921230Operation ended at (CPU tick count) : 41078354Milliseconds duration : 157124 |
OK, so the badly designed code took 157 seconds (157124 milliseconds) to complete.
Now let's take a look at the optimized version. You will notice that I don't need to use any Excel performance booster (DisableEvents, ScreenUpdating) because it takes too little for anyone to notice my code finished its job :) !
Option Explicit Public Declare Function GetTickCount Lib " kernel32.dll " () As Long Sub Test2() On Error Resume Next Dim arrOut(0 To 10000, 0 To 5) As Variant 'if you want to send Formulas to the sheet, use this code ... 'arrFor(0, 0) = "=R1C2 + R1C3" 'arrFor(0, 1) = "=R10C2 + R10C3" 'arrFor(1, 0) = "=R1C4 + R1C4" 'arrFor(1, 1) = "=R1C3 + R1C3" 'ActiveSheet.Range("A1:B2").FormulaR1C1 = arrFor Dim r As Integer Dim c As Integer Dim counter1 As Integer Dim rowInput As Integer Dim colInput As Integer Dim rngInput As Range Dim rngOutput As Range Dim t1 As Long Dim t2 As Long Set rngInput = ActiveWorkbook.Sheets(1).Range("A2:H2") 'obtain a handle on the destination range (10000 cells) Set rngOutput = ActiveWorkbook.Sheets(1).Range("A3:H10002") rowInput = 1 'the input file has all the data on row 2, columns 2,3,4,5,6 counter1 = 0 t1 = GetTickCount '-------------------------------------------------------------- 'format the new cells rngInput.Select Selection.Copy rngOutput.Select Selection.PasteSpecial Paste:=xlPasteFormats, _ Operation:=xlNone, SkipBlanks:=False, Transpose:=False rngInput.Cells(1, 1).Select Application.CutCopyMode = False 'build the Array structure line by line Set rngInput = ActiveWorkbook.Sheets(1).Range("B2:G2") For r = 0 To 10000 For c = 0 To 5 Dim strTmp As String Dim strIn As String strIn = rngInput.Cells(rowInput, c + 1).Value arrOut(r, c) = strIn 'try to match $var1 and replace it with a dummy value If InStr(1, strIn, "$var1") > 0 Then strIn = Replace(strIn, "$var1", counter1) counter1 = counter1 + 1 arrOut(r, c) = strIn End If 'try to match $var2 and replace it with a dummy value If InStr(1, strIn, "$var2") > 0 Then strIn = Replace(strIn, "$var2", (counter1 + 11)) counter1 = counter1 + 1 arrOut(r, c) = strIn End If 'try to match $var3 and replace it with a dummy value If InStr(1, strIn, "$var3") > 0 Then strIn = Replace(strIn, "$var3", "3") arrOut(r, c) = strIn End If Next c Next r '--------------------------------------------------------------- 'finished building memory array '----------------------------------------------------------------'send the raw data to Excel in one operation Set rngOutput = ActiveWorkbook.Sheets(1).Range("B2:G10002") rngOutput.Value2 = arrOut '----------------------------------------------------------------'done! t2 = GetTickCount Debug.Print "Writing memory array to destination range" & _ vbNewLine Debug.Print "Range.Cells.Count : " & rngOutput.Cells.Count Debug.Print "Range.Columns.Count : " & rngOutput.Columns.Count Debug.Print "Range.Rows.Count : " & rngOutput.Rows.Count & _ vbNewLine Debug.Print "Operation started at (CPU tick count): " & t1 Debug.Print "Operation ended at (CPU tick count) : " & t2 Debug.Print "Milliseconds duration : " & t2 - t1 End Sub |
Output =================================================================== Writing memory array to destination range
Range.Cells.Count : 60006Range.Columns.Count : 6Range.Rows.Count : 10001 Operation started at (CPU tick count): 43972391Operation ended at (CPU tick count) : 43972984Milliseconds duration : 593 |
The operation was completed 264% faster.
As I wrote before, this can't be implemented in every scenario, but when it fits into your design, the improvement in speed will make you love Excel!
Thank you for reading my article! Bye :-)