Поделиться через


Running parallel Excel on Windows HPC 2008 R2 hosted in Windows Azure IaaS VMs: Part 3

click here to Move to next blog-Part4

click here to Move to previous blog-Part2

 1 Introduction

In previous part of my blog (Part2), I have presented a basic Excel workbook, running a multiplication table in parallel on HPC cluster. We saw the usage of HPC macros framework (HPC_GetVersion, HPC_Initialize, HPC_Partition, HPC_Execute, HPC_Merge, HPC_Finalize and HPC_ExecutionError) called at specific point by the HPC Services for Excel client library during a HPC cluster calculation.

In the example examined, the outcome was the execution in parallel of multiplication values in Egress_Table cells. Our code strategy to pass data between HPC macros has been the usage of array of Variant with size equal to three, Dim data(2) As Variant having following elements:

  • data(0) : contains the row index of the Egress_Table
  • data(1) : contains the column index of the Egress_Table
  • data(2) : it is the value of the cell (data(0), data(1)) in Egress_Table which value is the product equal to "row value" times "column value" times "fix number stored in B1 cell".

 

Figure1: data array to set value in a specific cell of Egress_Table

 We saw as HPC_Partition is a VBA macro responsible of split the total computation in individual calculation step; in our previous example we used HPC_Partition to segment the total number of cells in Egress_Table [ 26 rows (D2:D27) and 11 columns (E1:O1)] in single cell. Here I want to use different criteria to slit data in HPC_Partition, using a block of cells instead a single cell.

 

2 How to partition data in blocks

In our new Excel workbook the structure of “Sheet1” (cells naming, meaning of cells, buttons, size of table, etc.) is kept unchanged compare with previous example (Part2). Logic of execution of Excel workbook on the HPC cluster is still executed through two VBA modules:

We proceed on how is implemented the new algorithm in HPCExcelMacros.bas; most of changes are done in HPC_Partition.

We use couple of new variables to manage the block of data:

Public Const SingleBlock As Integer= 3

Public Const NumBlocks As Integer = 7

Public Const SizeVector As Integer = NumBlocks * SingleBlock

Dim data(SizeVector) As Variant

where:

  • Single block is the size of array of a single block
  • NumBlocks is the number of elements of Egress_Table we want to partition in single turn
  • SizeVector is the size of array data()

 In Figure 2 is shown a case with number of contiguous cells block equal to 3 (NumBlocks=3)

 

 Figure 2: flowchart with pseudo-code

In Figure 3 and Figure 4 is reported the flowchart with pseudo-code to slit up the Egress_Table cells in blocks of cells.

 

Figure 3: flowchart with pseudo-code

  

 

Figure 4: flowchart with VB statements

 

Short description of variables in Figure 3 and Figure 4 flowcharts:

  • IndexBlock is a counter (Integer) that we use to increase number of cell in a single block; this counter is between [0, NumBlock-1] for a total number of NumBlock elements.
  • IndexRow is counter (Integer) to identify the number of row in the Egress_Table
  • indexCol is a counter (Integer) to identify the number of column in the Egress_Table
  • NumRows is the total number of rows in the Egress_Table
  • NumCols is the total number of columns in the Egress_Table

 Now we can draw a more detailed view with VBA statements (Figure 5-Figure 6)  

 

 

Figure 5: flowchart with VB statements

 In HPC_INIT the content of Egress_table is deleted through the statement: Range("Egress_Table".ClearContents

 

 Figure 6: flowchart for HPC_EXECUTION and HPC_MERGE

The calculation will be completed when we’ve reached the end of the table. We can do that by incrementing the row and column numbers as we move through the table. Whenever we reach the end of a column, we move to the next column. When we’ve passed the last column, the table (Egress_Table) is finished and we want the function to return Null.

Respect to algorithm of multiplication table described in Part2, processing of Egress_Table values is here execute by column.

 

 4 HPCExcelMacros.bas

 Attribute VB_Name = "HPCExcelMacros"  
 
  '==========================================================  
  ' This is a skeleton macro file for using the HPC/Excel 
  ' VBA macros with an HPC cluster.  
  '  
  ' Be sure to add references to the required component:  
  ' Microsoft_Hpc_Excel 
  ' using the VBA editor menu Tools -> References.  
  '==========================================================  
 
 Option Explicit 
 
  '==========================================================  
  ' Section 1: Variables and constants 
  '==========================================================  
 Dim indexRow As Integer 
 Dim indexCol As Integer 
 
 Dim CalculationComplete As Boolean 
 Dim StartTime As Double 
 Dim FinishTime As Double 
 
 Public NumRows As Integer 
 Public NumCols As Integer 
 Public CounterStatus As Integer 
 
 
 Public Const SingleBlock As Integer = 3 
 Public Const NumBlocks As Integer = 3 
 Public Const SizeVector As Integer = NumBlocks * SingleBlock 
 
  '==========================================================  
  ' Section 2: HPC Calculation Macros 
  '==========================================================  
 
  '----------------------------------------------------------  
  ' HPC_GetVersion returns the version of the macro framework 
  ' implemented in the workbook. That ensures that future 
  ' versions of the HPC Excel components will always be able 
  ' to run this workbook.  
  '  
  ' We've implemented version 1.0 of the macro framework,  
  ' so we return the string "1.0".  
  '----------------------------------------------------------  
 Public Function HPC_GetVersion()  
 
 HPC_GetVersion = "1.0"  
 
 End Function 
 
  '----------------------------------------------------------  
  ' HPC_Initialize will be called when the client starts 
  ' a calculation. Put any pre-calculation steps in this 
  ' function.  
  '----------------------------------------------------------  
 Public Function HPC_Initialize()  
 NumCols = 1 
 indexRow = 1 
 indexCol = 1 
 
 CounterStatus = 0 
 CalculationComplete = False 
 
 Range("Egress_Table").ClearContents 
 
 NumRows = Range("MyRows").Count 
 NumCols = Range("MyCols").Count 
 StartTime = Timer 
 
 End Function 
 
  '----------------------------------------------------------  
  ' HPC_Partition is used to collect required data for a 
  ' single calculation step (or iteration). Whatever data 
  ' is returned from this function will be passed to the 
  ' HPC_Execute macro, running on the HPC compute nodes.  
  '  
  ' When all calculation steps have been completed, return 
  ' "Null" from this function to end the calculation.  
  '----------------------------------------------------------  
 Public Function HPC_Partition() As Variant 
 Dim data(SizeVector) As Variant 
 Dim indexBlock As Integer 
 
 For indexBlock = 0 To (NumBlocks - 1)  
 If indexCol > NumCols Then 
 indexCol = 1 
 indexRow = indexRow + 1 
 End If 
 
 If indexRow > NumRows Then 
  ' return null to indicate the end of the calculation 
 HPC_Partition = Null 
  ' and exit the function now, we can skip the rest 
 Exit Function 
 End If 
 
 data(0 + indexBlock * SingleBlock) = indexRow 
 data(1 + indexBlock * SingleBlock) = indexCol 
 
 If ((indexRow = NumRows) And (indexCol = NumCols)) Then 
 indexCol = indexCol + 1 
 Exit For 
 End If 
 
  ' move to the next column 
 indexCol = indexCol + 1 
 Next 
 HPC_Partition = data 
 End Function 
 
  '----------------------------------------------------------  
  ' HPC_Execute performs a single calculation step (or iteration).  
  ' The input data will match whatever was returned from the HPC_Partition function, above.  
  '  
  ' The return value from this function should be the results of the calculation; those results will be 
  ' passed to the HPC_Merge macro, running on the desktop.  
  '----------------------------------------------------------  
 Public Function HPC_Execute(data As Variant) As Variant 
 Dim rws As Integer 
 Dim cols As Integer 
 Dim multiplyFactor As Integer 
 Dim rngCols As Range 
 Dim rngRows As Range 
 Dim indexBlock As Integer 
 
 multiplyFactor = ActiveSheet.Range("Factor").Value 
 
 Set rngCols = ActiveSheet.Range("MyCols")  
 Set rngRows = ActiveSheet.Range("MyRows")  
 
 For indexBlock = 0 To (((UBound(data)) / SingleBlock) - 1)  
  ' (rws, cols): number of row and column of the specific cell in Egress Table range 
 rws = data(0 + indexBlock * SingleBlock)  
 cols = data(1 + indexBlock * SingleBlock)  
 data(2 + indexBlock * SingleBlock) = rngRows(rws, 1).Value * rngCols(1, cols).Value * multiplyFactor 
 Next 
 
 HPC_Execute = data ' return value from the function 
 End Function 
 
  '----------------------------------------------------------  
  ' HPC_Merge is called after a single calculation step (or iteration) is complete;  
  ' the input data will match whatever was returned from the HPC_Execute function, above.  
  '  
  ' Use this function to store results: insert results into 
  ' the spreadsheet, write to a database, write a text file, or anything else.  
  '----------------------------------------------------------  
 Public Function HPC_Merge(data As Variant)  
 Dim indexBlock As Integer 
 Dim i As Integer 
 Dim j As Integer 
 
  ' MsgBox "data size: " & UBound(data)  
 
 Application.ScreenUpdating = False 
 For indexBlock = 0 To (((UBound(data)) / SingleBlock) - 1)  
 
 Range("Egress_Table").Cells(data(0 + indexBlock * SingleBlock), data(1 + indexBlock * SingleBlock)).Value = data(2 + indexBlock * SingleBlock)  
 i = CInt(data(0 + indexBlock * SingleBlock))  
 j = CInt(data(1 + indexBlock * SingleBlock))  
 If (i = NumRows) And (j = NumCols) Then 
 Call UpdateStatus 
 Application.ScreenUpdating = True 
 
 Exit For 
 End If 
 Call UpdateStatus 
 Next 
 
 Application.ScreenUpdating = True 
 
 End Function 
 
  '----------------------------------------------------------  
  ' HPC_Finalize is called after the last calculation step 
  ' (or iteration) is complete.  
  ' Use this funtion for any post-processing steps you want to run after the calculation.  
  '  
  ' The function here cleans up the HPC client object, to close the session and end the calculation.  
  '----------------------------------------------------------  
 Public Function HPC_Finalize()  
 
 FinishTime = Timer 
 CalculationComplete = True 
 Application.ScreenUpdating = True 
 
 Call UpdateStatus 
 
  ' Clean up the calculation. It's a good idea to 
  ' leave this here, even if you make changes to this function.  
  ' The function we call here is in the "HPCControlMacros" module.  
 
 CleanUpClusterCalculation 
 
 End Function 
 
  '----------------------------------------------------------  
  ' HPC_ExecutionError is called when there is any error in 
  ' the calculation.  
  '  
  ' The function here shows a pop-up error message. You 
  ' can modify this to display the error in a different 
  ' way (for example, in the Excel status bar).  
  '----------------------------------------------------------  
 Public Function HPC_ExecutionError(errorMessage As String, errorContents As String)  
 
 MsgBox errorMessage & vbCrLf & vbCrLf & errorContents 
 
 End Function 
 
 Sub UpdateStatus()  
 
 Dim statusMessage As String 
 If Not CalculationComplete Then 
 CounterStatus = CounterStatus + 1 
 statusMessage = "Calculated " & CounterStatus & "/" & (NumRows * NumCols)  
 Else 
 statusMessage = "Calculated " & CounterStatus & "/" & (NumRows * NumCols)  
 statusMessage = statusMessage & "; completed in " & FormatNumber(FinishTime - StartTime) & "s"  
 End If 
 Application.StatusBar = statusMessage 
 ActiveSheet.Range("percentageCompletion").Value = CounterStatus / (NumRows * NumCols)  
 End Sub 
 
 

 

 5 HPCControlMacros.bas

 

 Attribute VB_Name = "HPCControlMacros"  
 
  '==========================================================  
  ' This is a skeleton macro file for using the HPC/Excel 
  ' VBA macros with an HPC cluster.  
  '  
  ' Be sure to add references to the required component:  
  '  
  ' Microsoft_Hpc_Excel 
  '  
  ' using the VBA editor menu Tools -> References.  
  '==========================================================  
 
 Option Explicit 
 
  '==========================================================  
  ' Section 1: Variables and constants 
  '==========================================================  
 
  '----------------------------------------------------------  
  ' This is the cluster scheduler, or head node. Fill in 
  ' the hostname of your cluster scheduler.  
  '----------------------------------------------------------  
 Private Const HPC_ClusterScheduler = "exhn001"  
 
  '----------------------------------------------------------  
  ' This is a network share used to store a temporary copy 
  ' of the workbook. Make sure that the directory exists,  
  ' that you have write access to the directory, and that 
  ' the compute nodes in the cluster have read access.  
  '----------------------------------------------------------  
 Private Const HPC_NetworkShare = "\\exhn001\share"  
 
  '----------------------------------------------------------  
  ' Optionally, use a job template on the cluster. See the 
  ' HPC Server documentation for more about job templates.  
  ' Fill in the name of the template you want to use, or 
  ' leave this empty ("") to use the default job template.  
  '----------------------------------------------------------  
 Private Const HPC_JobTemplate = ""  
 
  '----------------------------------------------------------  
  ' This object is our client for connecting to the HPC 
  ' cluster and running calculations.  
  '----------------------------------------------------------  
 Private HPCExcelClient As IExcelClient 
 
 
  '==========================================================  
  ' Section 2: Control Functions 
  '==========================================================  
 
  '----------------------------------------------------------  
  ' This is the main calculation function, which connects 
  ' to the client object and runs a calculation. The method 
  ' can run both desktop and cluster calculations, controlled 
  ' by the function parameter "CalculateOnDesktop".  
  '  
  ' You'll see below two functions that run calculations 
  ' by calling this method, with the parameter set for either 
  ' cluster or desktop calculation.  
  '----------------------------------------------------------  
 Private Sub CalculateWorkbook(CalculateOnDesktop As Boolean)  
 
 Dim HPCWorkbookPath As String 
 
 On Error GoTo ErrorHandler 
 
  ' Create a new excelCient instance per session 
 Set HPCExcelClient = New ExcelClient 
 
  ' Initialize the excel client object with the current workbook 
 HPCExcelClient.Initialize ActiveWorkbook 
 
 If CalculateOnDesktop = False Then 
 
  ' We need a copy of the file on the network, so it's accessible 
  ' by the cluster compute nodes. Save a temporary copy to the 
  ' share directory.  
 
 HPCWorkbookPath = HPC_NetworkShare & Application.PathSeparator & ActiveWorkbook.name 
 
 ActiveWorkbook.SaveCopyAs HPCWorkbookPath 
 
  ' Create a cluster session with the desired options. Here, we're 
  ' just using the scheduler name and (optionally) a job template.  
 
 If HPC_JobTemplate <> "" Then 
 HPCExcelClient.OpenSession headNode:=HPC_ClusterScheduler, remoteWorkbookPath:=HPCWorkbookPath, jobTemplate:=HPC_JobTemplate 
 Else 
 HPCExcelClient.OpenSession headNode:=HPC_ClusterScheduler, remoteWorkbookPath:=HPCWorkbookPath 
 End If 
 
 End If 
 
  ' Run on local machine or cluster as chosen in workbook 
 HPCExcelClient.Run CalculateOnDesktop 
 Exit Sub 
 
 ErrorHandler:  
  ' Notify user of error and clean up any allocated resources 
 MsgBox Prompt:=Err.Description, Title:="HPC Calculation Error"  
 If Not HPCExcelClient Is Nothing Then 
 HPCExcelClient.Dispose 
 End If 
 End Sub 
 
  '----------------------------------------------------------  
  ' This is a public method for running a calculation on the 
  ' desktop. It uses the "CalculateWorkbook" function, above,  
  ' and sets the "Desktop" parameter to True.  
  '----------------------------------------------------------  
 Public Sub CalculateWorkbookOnDesktop()  
 CalculateWorkbook (True)  
 End Sub 
 
  '----------------------------------------------------------  
  ' This is a public method for running a calculation on the 
  ' cluster. It uses the "CalculateWorkbook" function, above,  
  ' and sets the "Desktop" parameter to False.  
  '----------------------------------------------------------  
 Public Sub CalculateWorkbookOnCluster()  
 CalculateWorkbook (False)  
 End Sub 
 
  '----------------------------------------------------------  
  ' This method is used to clean up a calculation after it's 
  ' finished; here, we're closing the cluster session so we 
  ' don't waste resources.  
  '----------------------------------------------------------  
 Public Sub CleanUpClusterCalculation()  
 
 On Error Resume Next 
 HPCExcelClient.CloseSession 
 HPCExcelClient.Dispose 
 On Error GoTo 0 
 
 End Sub 
 
 

 

 

  MultiplicationTable_byGroupColumns.xlsb 
Excel workbook enabled to run on Windows HPC

 

click here to Move to next blog-Part4

click here to Move to previous blog-Part2