

Converting an iterative sequential application to data parallel using CMD scripts

I have an application that is driven by sequential, iterative calls from a command script. It is a Monte Carlo simulation, so it is easy to recast in a data-parallel form and deploy on a CCS cluster.

 

Description of the sequential iterative version:

 

A main.exe program generates the n inputs for the variation runs. The program takes three arguments: the directory for results output, the project name, and the number of variations.
For example:
main Results2008 test_ccs 100

A DOS script (run_sub_DOS.bat), called by main, executes the 'distributed' program n times in sequence; that is, it calls the application "simulation" once per variation.

This is how the script looks:

@echo off

FOR /L %%i IN (1,1,%3) DO simulation %1 %2.%%i
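
To see what the loop expands to before running any simulations, a dry-run variant can print the commands instead of executing them. This is only a sketch; the file name run_sub_dry.bat is hypothetical, and the script takes the same three arguments as run_sub_DOS.bat.

```bat
@echo off
rem run_sub_dry.bat (hypothetical name): print each command instead of running it.
rem %1 = results directory, %2 = project name, %3 = number of variations
FOR /L %%i IN (1,1,%3) DO echo simulation %1 %2.%%i
```

Called as "run_sub_dry.bat Results2008 test_ccs 3", it prints one line per variation: "simulation Results2008 test_ccs.1" through "simulation Results2008 test_ccs.3".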

To turn this into a data parallel version, I first changed this script to the following (call it run_sub_CCS.bat). Like the sequential script above, it is called by "main.exe".

 

set workdir=\\RRS-HN\lhuang\RRS-Example\Test1
FOR /F "usebackq tokens=4" %%j in (`job new /numprocessors:4-4 /jobname:RRS_TESTjob`) do set JobID=%%j
FOR /L %%i IN (1,1,%3) DO job add %JobID% /workdir:%workdir% /stdout:frf.out.%%i simulation %1 %2.%%i
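
The script above relies on the fourth token of the "job new" output being the job ID. A slightly more defensive sketch, under that same assumption, clears JobID first and stops if nothing was parsed, so that no tasks are added to a stale or empty ID. The error message wording is my own.

```bat
@echo off
set workdir=\\RRS-HN\lhuang\RRS-Example\Test1

rem Clear any value left over from an earlier run.
set JobID=

rem Assumption: the job ID is the fourth token of the "job new" output,
rem as in the original script.
FOR /F "usebackq tokens=4" %%j IN (`job new /numprocessors:4-4 /jobname:RRS_TESTjob`) DO set JobID=%%j

if not defined JobID (
    echo Could not parse a job ID from the output of "job new".
    exit /b 1
)

rem Add one task per variation; each task writes its own stdout file.
FOR /L %%i IN (1,1,%3) DO job add %JobID% /workdir:%workdir% /stdout:frf.out.%%i simulation %1 %2.%%i
```

Because every task goes into one job, the scheduler is free to run the variations side by side on the four processors requested.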

Then I used the following two wrapper scripts to drive this. That is, I call the scripts CCS_Begin (which calls main) and CCS_Finalize, in that order. They do some extra work of copying data to the CCS nodes. I decided to leave them as they are to show the real environment.

@Rem CCS_Begin: script to run main

@call clusrun /readynodes rmdir \\localhost\C$\ranjan\RRS /Q /S

@call clusrun /readynodes mkdir \\localhost\C$\ranjan\RRS\Sub

@call clusrun /readynodes xcopy \\RRS-HN\ranjan\RRS\Sub \\localhost\C$\ranjan\RRS /Y /S

main Results2008 test_ccs 4

@FOR /F "usebackq tokens=1" %%k in (`job list /status:NotSubmitted`) do set JobID=%%k

job submit /ID:%JobID%

@Rem CCS_Finalize

FOR /F "usebackq tokens=1" %%k in (`job list /status:finished`) do set JobID=%%k
if %JobID%==%1 call clusrun /readynodes rmdir \\localhost\C$\ranjan\RRS /Q /S

(Adapted from a real application scenario.)
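
The CCS_Finalize script compares %JobID% with %1 unquoted, so the IF line fails with a syntax error when either side is empty. A minimal sketch of a guarded variant follows. It assumes, as the original does, that the job ID is the first token of each line printed by "job list /status:finished", and like the original it only matches the last job listed. The usage message is my own wording.

```bat
@echo off
rem CCS_Finalize (guarded sketch): %1 = ID of the job to clean up after.
if "%~1"=="" (
    echo Usage: CCS_Finalize JobID
    exit /b 1
)

rem Clear any value left over from CCS_Begin.
set JobID=

rem Each line overwrites JobID, so the last finished job listed wins.
FOR /F "usebackq tokens=1" %%k IN (`job list /status:finished`) DO set JobID=%%k

rem Quoting both sides keeps the comparison valid even if JobID is empty.
if "%JobID%"=="%~1" call clusrun /readynodes rmdir \\localhost\C$\ranjan\RRS /Q /S
```

It would be run after the job completes, for example "CCS_Finalize 42", where 42 stands in for the ID reported when the job was created.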