Running parallel Excel on Windows HPC 2008 R2 hosted in Windows Azure IaaS VMs: Part 3
click here to Move to next blog-Part4
click here to Move to previous blog-Part2
1 Introduction
In previous part of my blog (Part2), I have presented a basic Excel workbook, running a multiplication table in parallel on HPC cluster. We saw the usage of HPC macros framework (HPC_GetVersion, HPC_Initialize, HPC_Partition, HPC_Execute, HPC_Merge, HPC_Finalize and HPC_ExecutionError) called at specific point by the HPC Services for Excel client library during a HPC cluster calculation.
In the example examined, the outcome was the execution in parallel of multiplication values in Egress_Table cells. Our code strategy to pass data between HPC macros has been the usage of array of Variant with size equal to three, Dim data(2) As Variant having following elements:
- data(0) : contains the row index of the Egress_Table
- data(1) : contains the column index of the Egress_Table
- data(2) : it is the value of the cell (data(0), data(1)) in Egress_Table which value is the product equal to "row value" times "column value" times "fix number stored in B1 cell".
Figure1: data array to set value in a specific cell of Egress_Table
We saw as HPC_Partition is a VBA macro responsible of split the total computation in individual calculation step; in our previous example we used HPC_Partition to segment the total number of cells in Egress_Table [ 26 rows (D2:D27) and 11 columns (E1:O1)] in single cell. Here I want to use different criteria to slit data in HPC_Partition, using a block of cells instead a single cell.
2 How to partition data in blocks
In our new Excel workbook the structure of “Sheet1” (cells naming, meaning of cells, buttons, size of table, etc.) is kept unchanged compare with previous example (Part2). Logic of execution of Excel workbook on the HPC cluster is still executed through two VBA modules:
- HPCControlMacros.bas: a set of VBA macros contains the “control” functions; this module is unchanged respect to the equivalent module in previous example (https://blogs.msdn.com/b/faber/archive/2013/06/25/running-parallel-excel-on-windows-hpc-2008-r2-hosted-in-windows-azure-iaas-vms-part-2.aspx)
- HPCExcelMacros.bas: it contains all macro framework functions; compare with the equivalent module in previous example (Part2) we have some changes in algorithms to partition data in blocks.
We proceed on how is implemented the new algorithm in HPCExcelMacros.bas; most of changes are done in HPC_Partition.
We use couple of new variables to manage the block of data:
Public Const SingleBlock As Integer= 3
Public Const NumBlocks As Integer = 7
Public Const SizeVector As Integer = NumBlocks * SingleBlock
Dim data(SizeVector) As Variant
where:
- Single block is the size of array of a single block
- NumBlocks is the number of elements of Egress_Table we want to partition in single turn
- SizeVector is the size of array data()
In Figure 2 is shown a case with number of contiguous cells block equal to 3 (NumBlocks=3)
Figure 2: flowchart with pseudo-code
In Figure 3 and Figure 4 is reported the flowchart with pseudo-code to slit up the Egress_Table cells in blocks of cells.
Figure 3: flowchart with pseudo-code
Figure 4: flowchart with VB statements
Short description of variables in Figure 3 and Figure 4 flowcharts:
- IndexBlock is a counter (Integer) that we use to increase number of cell in a single block; this counter is between [0, NumBlock-1] for a total number of NumBlock elements.
- IndexRow is counter (Integer) to identify the number of row in the Egress_Table
- indexCol is a counter (Integer) to identify the number of column in the Egress_Table
- NumRows is the total number of rows in the Egress_Table
- NumCols is the total number of columns in the Egress_Table
Now we can draw a more detailed view with VBA statements (Figure 5-Figure 6)
Figure 5: flowchart with VB statements
In HPC_INIT the content of Egress_table is deleted through the statement: Range("Egress_Table".ClearContents
Figure 6: flowchart for HPC_EXECUTION and HPC_MERGE
The calculation will be completed when we’ve reached the end of the table. We can do that by incrementing the row and column numbers as we move through the table. Whenever we reach the end of a column, we move to the next column. When we’ve passed the last column, the table (Egress_Table) is finished and we want the function to return Null.
Respect to algorithm of multiplication table described in Part2, processing of Egress_Table values is here execute by column.
4 HPCExcelMacros.bas
Attribute VB_Name = "HPCExcelMacros"
'==========================================================
' This is a skeleton macro file for using the HPC/Excel
' VBA macros with an HPC cluster.
'
' Be sure to add references to the required component:
' Microsoft_Hpc_Excel
' using the VBA editor menu Tools -> References.
'==========================================================
Option Explicit
'==========================================================
' Section 1: Variables and constants
'==========================================================
Dim indexRow As Integer
Dim indexCol As Integer
Dim CalculationComplete As Boolean
Dim StartTime As Double
Dim FinishTime As Double
Public NumRows As Integer
Public NumCols As Integer
Public CounterStatus As Integer
Public Const SingleBlock As Integer = 3
Public Const NumBlocks As Integer = 3
Public Const SizeVector As Integer = NumBlocks * SingleBlock
'==========================================================
' Section 2: HPC Calculation Macros
'==========================================================
'----------------------------------------------------------
' HPC_GetVersion returns the version of the macro framework
' implemented in the workbook. That ensures that future
' versions of the HPC Excel components will always be able
' to run this workbook.
'
' We've implemented version 1.0 of the macro framework,
' so we return the string "1.0".
'----------------------------------------------------------
Public Function HPC_GetVersion()
HPC_GetVersion = "1.0"
End Function
'----------------------------------------------------------
' HPC_Initialize will be called when the client starts
' a calculation. Put any pre-calculation steps in this
' function.
'----------------------------------------------------------
Public Function HPC_Initialize()
NumCols = 1
indexRow = 1
indexCol = 1
CounterStatus = 0
CalculationComplete = False
Range("Egress_Table").ClearContents
NumRows = Range("MyRows").Count
NumCols = Range("MyCols").Count
StartTime = Timer
End Function
'----------------------------------------------------------
' HPC_Partition is used to collect required data for a
' single calculation step (or iteration). Whatever data
' is returned from this function will be passed to the
' HPC_Execute macro, running on the HPC compute nodes.
'
' When all calculation steps have been completed, return
' "Null" from this function to end the calculation.
'----------------------------------------------------------
Public Function HPC_Partition() As Variant
Dim data(SizeVector) As Variant
Dim indexBlock As Integer
For indexBlock = 0 To (NumBlocks - 1)
If indexCol > NumCols Then
indexCol = 1
indexRow = indexRow + 1
End If
If indexRow > NumRows Then
' return null to indicate the end of the calculation
HPC_Partition = Null
' and exit the function now, we can skip the rest
Exit Function
End If
data(0 + indexBlock * SingleBlock) = indexRow
data(1 + indexBlock * SingleBlock) = indexCol
If ((indexRow = NumRows) And (indexCol = NumCols)) Then
indexCol = indexCol + 1
Exit For
End If
' move to the next column
indexCol = indexCol + 1
Next
HPC_Partition = data
End Function
'----------------------------------------------------------
' HPC_Execute performs a single calculation step (or iteration).
' The input data will match whatever was returned from the HPC_Partition function, above.
'
' The return value from this function should be the results of the calculation; those results will be
' passed to the HPC_Merge macro, running on the desktop.
'----------------------------------------------------------
Public Function HPC_Execute(data As Variant) As Variant
Dim rws As Integer
Dim cols As Integer
Dim multiplyFactor As Integer
Dim rngCols As Range
Dim rngRows As Range
Dim indexBlock As Integer
multiplyFactor = ActiveSheet.Range("Factor").Value
Set rngCols = ActiveSheet.Range("MyCols")
Set rngRows = ActiveSheet.Range("MyRows")
For indexBlock = 0 To (((UBound(data)) / SingleBlock) - 1)
' (rws, cols): number of row and column of the specific cell in Egress Table range
rws = data(0 + indexBlock * SingleBlock)
cols = data(1 + indexBlock * SingleBlock)
data(2 + indexBlock * SingleBlock) = rngRows(rws, 1).Value * rngCols(1, cols).Value * multiplyFactor
Next
HPC_Execute = data ' return value from the function
End Function
'----------------------------------------------------------
' HPC_Merge is called after a single calculation step (or iteration) is complete;
' the input data will match whatever was returned from the HPC_Execute function, above.
'
' Use this function to store results: insert results into
' the spreadsheet, write to a database, write a text file, or anything else.
'----------------------------------------------------------
Public Function HPC_Merge(data As Variant)
Dim indexBlock As Integer
Dim i As Integer
Dim j As Integer
' MsgBox "data size: " & UBound(data)
Application.ScreenUpdating = False
For indexBlock = 0 To (((UBound(data)) / SingleBlock) - 1)
Range("Egress_Table").Cells(data(0 + indexBlock * SingleBlock), data(1 + indexBlock * SingleBlock)).Value = data(2 + indexBlock * SingleBlock)
i = CInt(data(0 + indexBlock * SingleBlock))
j = CInt(data(1 + indexBlock * SingleBlock))
If (i = NumRows) And (j = NumCols) Then
Call UpdateStatus
Application.ScreenUpdating = True
Exit For
End If
Call UpdateStatus
Next
Application.ScreenUpdating = True
End Function
'----------------------------------------------------------
' HPC_Finalize is called after the last calculation step
' (or iteration) is complete.
' Use this funtion for any post-processing steps you want to run after the calculation.
'
' The function here cleans up the HPC client object, to close the session and end the calculation.
'----------------------------------------------------------
Public Function HPC_Finalize()
FinishTime = Timer
CalculationComplete = True
Application.ScreenUpdating = True
Call UpdateStatus
' Clean up the calculation. It's a good idea to
' leave this here, even if you make changes to this function.
' The function we call here is in the "HPCControlMacros" module.
CleanUpClusterCalculation
End Function
'----------------------------------------------------------
' HPC_ExecutionError is called when there is any error in
' the calculation.
'
' The function here shows a pop-up error message. You
' can modify this to display the error in a different
' way (for example, in the Excel status bar).
'----------------------------------------------------------
Public Function HPC_ExecutionError(errorMessage As String, errorContents As String)
MsgBox errorMessage & vbCrLf & vbCrLf & errorContents
End Function
Sub UpdateStatus()
Dim statusMessage As String
If Not CalculationComplete Then
CounterStatus = CounterStatus + 1
statusMessage = "Calculated " & CounterStatus & "/" & (NumRows * NumCols)
Else
statusMessage = "Calculated " & CounterStatus & "/" & (NumRows * NumCols)
statusMessage = statusMessage & "; completed in " & FormatNumber(FinishTime - StartTime) & "s"
End If
Application.StatusBar = statusMessage
ActiveSheet.Range("percentageCompletion").Value = CounterStatus / (NumRows * NumCols)
End Sub
5 HPCControlMacros.bas
Attribute VB_Name = "HPCControlMacros"
'==========================================================
' This is a skeleton macro file for using the HPC/Excel
' VBA macros with an HPC cluster.
'
' Be sure to add references to the required component:
'
' Microsoft_Hpc_Excel
'
' using the VBA editor menu Tools -> References.
'==========================================================
Option Explicit
'==========================================================
' Section 1: Variables and constants
'==========================================================
'----------------------------------------------------------
' This is the cluster scheduler, or head node. Fill in
' the hostname of your cluster scheduler.
'----------------------------------------------------------
Private Const HPC_ClusterScheduler = "exhn001"
'----------------------------------------------------------
' This is a network share used to store a temporary copy
' of the workbook. Make sure that the directory exists,
' that you have write access to the directory, and that
' the compute nodes in the cluster have read access.
'----------------------------------------------------------
Private Const HPC_NetworkShare = "\\exhn001\share"
'----------------------------------------------------------
' Optionally, use a job template on the cluster. See the
' HPC Server documentation for more about job templates.
' Fill in the name of the template you want to use, or
' leave this empty ("") to use the default job template.
'----------------------------------------------------------
Private Const HPC_JobTemplate = ""
'----------------------------------------------------------
' This object is our client for connecting to the HPC
' cluster and running calculations.
'----------------------------------------------------------
Private HPCExcelClient As IExcelClient
'==========================================================
' Section 2: Control Functions
'==========================================================
'----------------------------------------------------------
' This is the main calculation function, which connects
' to the client object and runs a calculation. The method
' can run both desktop and cluster calculations, controlled
' by the function parameter "CalculateOnDesktop".
'
' You'll see below two functions that run calculations
' by calling this method, with the parameter set for either
' cluster or desktop calculation.
'----------------------------------------------------------
Private Sub CalculateWorkbook(CalculateOnDesktop As Boolean)
Dim HPCWorkbookPath As String
On Error GoTo ErrorHandler
' Create a new excelCient instance per session
Set HPCExcelClient = New ExcelClient
' Initialize the excel client object with the current workbook
HPCExcelClient.Initialize ActiveWorkbook
If CalculateOnDesktop = False Then
' We need a copy of the file on the network, so it's accessible
' by the cluster compute nodes. Save a temporary copy to the
' share directory.
HPCWorkbookPath = HPC_NetworkShare & Application.PathSeparator & ActiveWorkbook.name
ActiveWorkbook.SaveCopyAs HPCWorkbookPath
' Create a cluster session with the desired options. Here, we're
' just using the scheduler name and (optionally) a job template.
If HPC_JobTemplate <> "" Then
HPCExcelClient.OpenSession headNode:=HPC_ClusterScheduler, remoteWorkbookPath:=HPCWorkbookPath, jobTemplate:=HPC_JobTemplate
Else
HPCExcelClient.OpenSession headNode:=HPC_ClusterScheduler, remoteWorkbookPath:=HPCWorkbookPath
End If
End If
' Run on local machine or cluster as chosen in workbook
HPCExcelClient.Run CalculateOnDesktop
Exit Sub
ErrorHandler:
' Notify user of error and clean up any allocated resources
MsgBox Prompt:=Err.Description, Title:="HPC Calculation Error"
If Not HPCExcelClient Is Nothing Then
HPCExcelClient.Dispose
End If
End Sub
'----------------------------------------------------------
' This is a public method for running a calculation on the
' desktop. It uses the "CalculateWorkbook" function, above,
' and sets the "Desktop" parameter to True.
'----------------------------------------------------------
Public Sub CalculateWorkbookOnDesktop()
CalculateWorkbook (True)
End Sub
'----------------------------------------------------------
' This is a public method for running a calculation on the
' cluster. It uses the "CalculateWorkbook" function, above,
' and sets the "Desktop" parameter to False.
'----------------------------------------------------------
Public Sub CalculateWorkbookOnCluster()
CalculateWorkbook (False)
End Sub
'----------------------------------------------------------
' This method is used to clean up a calculation after it's
' finished; here, we're closing the cluster session so we
' don't waste resources.
'----------------------------------------------------------
Public Sub CleanUpClusterCalculation()
On Error Resume Next
HPCExcelClient.CloseSession
HPCExcelClient.Dispose
On Error GoTo 0
End Sub
MultiplicationTable_byGroupColumns.xlsb
Excel workbook enabled to run on Windows HPC