Understanding Application and Data Files

 

Applies To: Microsoft HPC Pack 2012, Microsoft HPC Pack 2012 R2

HPC Pack is flexible regarding the organization of task input, output, and error files. You can use the Working Directory, Standard Input, Standard Output, and Standard Error task properties to specify a local or Universal Naming Convention (UNC) file path to any shared location. Tasks can also operate on files stored in the default working directory. In either case, using a central file store on a shared folder, preferably on a file server, is recommended.

If you do not specify Standard Output and Standard Error files for your task, the results are directed to the HPCScheduler database and appear as the task’s output fields in the View Job dialog box. The database stores up to 4 KB of output per task; anything beyond 4 KB is truncated. If you specify the Standard Output and Standard Error files, the task output is directed to those locations.

Note

When specifying file paths, remember that these files are accessed from the compute node. For example, “C:\Temp” refers to the Temp directory on the compute node that is running the application, not the Temp directory on the head node or on the client computer.

Program files

Use the following guidelines when you specify the program file in the command line for your task:

  • If the application exists on all compute nodes and has been added to the Path environment variable, type only the executable name. For example, type myapp.exe.

  • If the application exists on all compute nodes and has not been added to the Path environment variable, type the full local path to the application on each compute node. For example, C:\Program Files\myapp.exe.

  • If the application is installed on a file share, specify the UNC path to the executable file. For example, type \\server_name\Program Files\myapp.exe.

Data files

By default, the standard input, output, and error files are relative to the working directory on the compute node that is running the application. The default value for the Working Directory task property is the submitting user's home directory on the node (%userprofile%, which typically points to C:\Users\user_name).

You can use the Working Directory task property to simplify task access to data files on a shared folder. For example, if you set a working directory of \\fileserver\fileshare\ and a Standard Input of somefile.txt, the Standard Input will be read from \\fileserver\fileshare\somefile.txt.
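The way a relative file name combines with the working directory can be sketched in Python. This is only an illustration of the path arithmetic, not HPC Pack code: the function name resolve_task_file is hypothetical, and the treatment of absolute paths (used as-is, ignoring the working directory) is an assumption based on standard Windows path rules.

```python
from pathlib import PureWindowsPath


def resolve_task_file(working_directory: str, file_property: str) -> str:
    """Return the path that a task file property would refer to (illustrative only)."""
    # A relative name is appended to the working directory.
    # Assumption: an absolute local or UNC path replaces the working directory,
    # which is how Windows path joining behaves.
    return str(PureWindowsPath(working_directory) / file_property)


# Working directory on a share, Standard Input given as a relative name.
print(resolve_task_file(r"\\fileserver\fileshare", "somefile.txt"))
# prints \\fileserver\fileshare\somefile.txt
```

Because the path is resolved on the compute node, a shared working directory lets every task use short, relative file names.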

If you do not specify Standard Output and Standard Error files for your task, the results are directed to the Job Scheduler service database and appear as the task’s output fields in the Task Properties dialog box. The database stores up to 4 KB of data per task. Any additional data beyond 4 KB is truncated. If you specify the Standard Output and Standard Error files, the task output is directed to those locations and is not stored in the Job Scheduler service database.
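The effect of the 4 KB limit can be illustrated with a short Python sketch. It is not the Job Scheduler's implementation: the names below are hypothetical, "4 KB" is assumed to mean 4,096 bytes, and the sketch assumes that the beginning of the output is the part that is kept.

```python
OUTPUT_LIMIT_BYTES = 4 * 1024  # assumption: "4 KB" means 4,096 bytes


def stored_output(task_output: bytes, limit: int = OUTPUT_LIMIT_BYTES) -> bytes:
    """Return the part of the output that fits within the limit.

    Assumption: the beginning is kept and everything after the limit is dropped.
    """
    return task_output[:limit]


def needs_output_file(task_output: bytes, limit: int = OUTPUT_LIMIT_BYTES) -> bool:
    """Return True when the output would be truncated unless it is sent to a file."""
    return len(task_output) > limit


sample = b"x" * 5000
print(len(stored_output(sample)), needs_output_file(sample))
```

A task that is expected to produce more than a few kilobytes of output is therefore a candidate for explicit Standard Output and Standard Error files.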

Tasks with sub-tasks and the asterisk (*)

In tasks that include sub-tasks, you can use the asterisk (*) character as a placeholder for the parametric sweep index (in Parametric Sweep tasks) or for the sub-task ID (in Service, Node Preparation, and Node Release tasks). For example, in the first sub-task, \\datashare\userName\file*.txt is interpreted as \\datashare\userName\file1.txt. In a Parametric Sweep task, the asterisk is replaced with the sweep index, so the first sub-task uses the first index value.

You can include more than one asterisk (*) to indicate the minimum number of positions to use when expressing the number of the index or sub-task. This does not limit numbers that require more positions. For example, \\datashare\userName\file****.txt is interpreted as \\datashare\userName\file0001.txt on the first sub-task. 
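The substitution rule described above can be sketched in Python. This is a minimal model, not the scheduler's code: the function name expand_asterisks is hypothetical, and replacing every run of asterisks in the string independently is an assumption.

```python
import re


def expand_asterisks(template: str, number: int) -> str:
    """Replace each run of asterisks with the index or sub-task ID, zero-padded."""

    def pad(match: re.Match) -> str:
        # The number of asterisks is the minimum width; longer numbers are not cut.
        return str(number).zfill(len(match.group()))

    return re.sub(r"\*+", pad, template)


print(expand_asterisks(r"\\datashare\userName\file*.txt", 1))
# prints \\datashare\userName\file1.txt
print(expand_asterisks(r"\\datashare\userName\file****.txt", 1))
# prints \\datashare\userName\file0001.txt
print(expand_asterisks(r"\\datashare\userName\file****.txt", 12345))
# prints \\datashare\userName\file12345.txt
```

Padding with several asterisks keeps file names sorted in numeric order when they are listed alphabetically.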

The job scheduler interprets commands before sending them to the compute nodes. To run a command that uses an asterisk (*), include the caret (^) as an escape character. For example, to create a Node Release task that deletes all files from a folder, you can type the command like this:

delete c:\temp\^*

Note

Commands that are submitted from a Command Prompt window are interpreted by the command prompt before they are passed to the job scheduler. To submit a task that runs the same command from a Command Prompt window, you need to add an extra escape character. For example: delete c:\temp\^^*

The job scheduler receives the command as delete c:\temp\^*, and the compute node receives the command as delete c:\temp\*
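The two layers of escaping can be traced with a simplified Python model. It treats each interpreter as removing one caret in front of the character it protects. That is an assumption that captures only the behavior described here, not the full parsing rules of the command prompt or the job scheduler, and the function name is hypothetical.

```python
import re


def strip_one_escape_level(command: str) -> str:
    """Remove one level of caret escaping: '^' followed by a character becomes that character."""
    return re.sub(r"\^(.)", lambda match: match.group(1), command)


typed_at_prompt = r"delete c:\temp\^^*"
received_by_scheduler = strip_one_escape_level(typed_at_prompt)
received_by_node = strip_one_escape_level(received_by_scheduler)

print(received_by_scheduler)  # prints delete c:\temp\^*
print(received_by_node)       # prints delete c:\temp\*
```

Each interpreter that handles the command before it runs consumes one caret, so the number of carets you type must match the number of interpreters in between.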

Additional considerations

  • When accessing a network share, use the full UNC path instead of a mapped drive letter, because drive letter mappings do not persist between logon sessions.

  • Creating a file store for input, output, and error files is usually a coordinated effort between the cluster administrator and the user. It requires the administrator's permissions and oversight of shared resources, and the user's specific knowledge of the projects, jobs, and files involved.

Additional references