

Understanding Job and Task Properties

 

Applies To: Microsoft HPC Pack 2008 R2, Microsoft HPC Pack 2012, Microsoft HPC Pack 2012 R2

The tables below contain a list of all the job and task properties that you can set in HPC Cluster Manager. These properties define how jobs and tasks run.

Note

Job templates are created by the cluster administrator for different types of jobs. Job templates define default values and constraints for job properties. Depending on the job template that you select for your job, you might see differences in the available values for job properties. For example, one template might allow the full range of Priority values, and another template might only allow values of Normal or below.

In this topic:

  • Job properties

  • Task properties

Job properties

Job Property

Description

Job ID

The numeric ID of the job. The Job Scheduler assigns this number when a job is created.

Job name

The user-assigned name of the job. The maximum length for this property is 128 characters.

Job template

The name of the job template used to submit the job. When you create a job, the Job template drop-down list displays the available templates. Job templates are created by the cluster administrator for different types of jobs. Job templates define default values and constraints for job properties. After you select a job template for a new job, the available values for job properties in the New Job dialog box change accordingly. For example, the Priority drop-down menu only shows the priority levels that are valid under the selected template, and the run time setting cannot be raised above the defined maximum. For more information, see Job Templates.

Note

Cluster administrators can specify permissions regarding which users can use a particular job template. For information about the job templates that you have access to and should use, contact your cluster administrator.

Project

The name of the project to which the job belongs. The maximum length for this property is 128 characters.

In some cases, the cluster administrator might define a list of project names for a specific job template. If the job template that you selected includes a list of project names, the names will appear in the Project drop-down list.

Priority

The priority of the job. Priority and submit time help determine when the job will run, and how many resources the job will get. You can specify priority in terms of a priority band, a priority number, or a combination of the two. The priority bands and their corresponding numerical values are as follows:

  • Lowest (0)

  • BelowNormal (1000)

  • Normal (2000)

  • AboveNormal (3000)

  • Highest (4000)

The numerical priority can have a value between 0 (Lowest) and 4000 (Highest). If you enter a value numerically, it will be displayed as the corresponding priority band, or as a combination. For example, if you specify a value of 2500, the priority is displayed as Normal+500.
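The mapping from a numeric priority to its displayed form can be sketched as follows. This is a minimal illustration, not the scheduler's own code: the function name `format_priority` is hypothetical, and the sketch assumes the offset is always expressed relative to the highest band that does not exceed the value, as in the Normal+500 example.

```python
# Priority bands and their numeric values, as listed above.
BANDS = [
    ("Lowest", 0),
    ("BelowNormal", 1000),
    ("Normal", 2000),
    ("AboveNormal", 3000),
    ("Highest", 4000),
]


def format_priority(value: int) -> str:
    """Render a numeric priority as a band name, or as band+offset."""
    if not 0 <= value <= 4000:
        raise ValueError("priority must be between 0 and 4000")
    # Assumption: use the highest band whose value does not exceed the priority.
    name, base = max((b for b in BANDS if b[1] <= value), key=lambda b: b[1])
    offset = value - base
    return name if offset == 0 else f"{name}+{offset}"


print(format_priority(2500))  # Normal+500
print(format_priority(3000))  # AboveNormal
```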

Note

When you filter jobs by priority band, the numerical value of the band is treated as the middle of the range. The filter returns jobs with priorities between the band value minus 499 and the band value plus 500. For example, if you filter the job list to see Normal priority jobs, the filter returns jobs that have a numerical priority between 1501 and 2500.

This behavior changes when you use the HPC APIs to filter the job list. When you filter jobs by priority band using the HPC APIs, the band value is treated as the beginning of the range. The filter returns jobs with priorities between the band value and the band value plus 999. For example, if you define the filter as filter.Add(FilterOperator.Equal, PropId.Job_Priority, JobPriority.Normal);, the filter returns jobs with a numerical priority between 2000 and 2999.
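The two filter ranges described in this note can be compared with a short sketch. The function names are hypothetical, and the ranges are not clamped to the 0 to 4000 scale because this topic does not state how the ends of the scale are handled.

```python
def ui_filter_range(band_value: int) -> tuple[int, int]:
    """Range matched when filtering by band in the job list (band is the middle)."""
    return band_value - 499, band_value + 500


def api_filter_range(band_value: int) -> tuple[int, int]:
    """Range matched when filtering by band through the HPC APIs (band is the start)."""
    return band_value, band_value + 999


NORMAL = 2000
print(ui_filter_range(NORMAL))   # (1501, 2500)
print(api_filter_range(NORMAL))  # (2000, 2999)
```

A job with priority 2750 (Normal+750) would therefore match a Normal filter defined through the APIs, but not a Normal filter applied in the job list.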

Run time

The amount of time (dd:hh:mm) the job is allowed to run. If the job is still running after the specified run time is reached, it is automatically canceled by the job scheduler.

The total runtime for the job includes Node Preparation, Node Release, and primary tasks. For more information, see Understanding Task Types.

If a job has a maximum run time and a Node Release task, the job scheduler cancels the other tasks in the job before the run time of the job expires (job run time minus Node Release task run time). This allows the Node Release task to run within the allocated time for the job.
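The arithmetic behind that cutoff (job run time minus Node Release task run time) can be illustrated as follows. This is a sketch only: the helper names are hypothetical, and it assumes both run times are given in the dd:hh:mm format used by these properties.

```python
from datetime import timedelta


def parse_run_time(text: str) -> timedelta:
    """Parse a run time written as dd:hh:mm."""
    days, hours, minutes = (int(part) for part in text.split(":"))
    return timedelta(days=days, hours=hours, minutes=minutes)


def other_tasks_cutoff(job_run_time: str, node_release_run_time: str) -> timedelta:
    """Elapsed time after which the remaining tasks are canceled."""
    return parse_run_time(job_run_time) - parse_run_time(node_release_run_time)


# A 2-hour job with a 15-minute Node Release task.
print(other_tasks_cutoff("00:02:00", "00:00:15"))  # 1:45:00
```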

Run until canceled

If True, the job runs until it is canceled or until its run time expires. It does not stop when there are no tasks remaining.

Fail on task failure

If True, the failure of any task in the job causes the entire job to fail immediately.

Notify on start

If True, you can receive an email or other notification when the job starts. Notification must be enabled on the cluster by the cluster administrator.

Notify on completion

If True, you can receive an email or other notification when the job completes. Notification must be enabled on the cluster by the cluster administrator.

Number of cores

The number of cores required by the job. You can set minimum and maximum values, or select Auto calculate to have the job scheduler automatically calculate the minimum and maximum number of required cores based on the job’s tasks.

Number of sockets

The number of sockets required by the job. You can set minimum and maximum values, or select Auto calculate to have the job scheduler automatically calculate the minimum and maximum number of required sockets based on the job’s tasks.

Number of nodes

The number of nodes required by the job. You can set minimum and maximum values, or select Auto calculate to have the job scheduler automatically calculate the minimum and maximum number of required nodes based on the job’s tasks.

Exclusive

If True, no other jobs can run on a compute node at the same time as this job.

Node preferences (node groups operator)

The way that the job scheduler uses node groups to allocate resources to a job. The following preferences are available:

Node preference

Description

Run only on nodes that are members of all the following groups (Intersection)

The job should run only on the nodes that belong to all of the node groups in the list.

For example, if you have a node group for nodes that have at least 4 gigabytes (GB) of memory, and another node group for nodes that have at least 8 cores, you specify those node groups and this preference to run an application on nodes that have at least 4 GB of memory and at least 8 cores.

Run on nodes that are members of any one of the following node groups (Uniform)

The job should run only on nodes that all belong to any one node group in the list.

For example, this preference is useful for hybrid clusters that contain on-premises compute nodes and Windows Azure nodes. You may want to run an application in either environment, but not allow the application to span both on-premises and Windows Azure nodes concurrently.

Run on nodes that are members of any of the following groups (Union)

The job can run on nodes that belong to any node group in the list.

Note

This property was introduced in HPC Pack 2012. It is not available in previous versions.
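The three node preferences can be thought of as set operations over node group membership. The sketch below is an illustration under that assumption, not the scheduler's allocation logic; the function name, group names, and node names are hypothetical. It returns a list of alternative candidate sets, so that the Uniform preference can be expressed as "one set per group".

```python
def candidate_nodes(groups: dict[str, set[str]], preference: str) -> list[set[str]]:
    """Return the alternative node sets a job may be placed on."""
    members = list(groups.values())
    if not members:
        return []
    if preference == "Intersection":
        # Only nodes that belong to every listed group.
        return [set.intersection(*members)]
    if preference == "Union":
        # Nodes that belong to at least one listed group.
        return [set.union(*members)]
    if preference == "Uniform":
        # The job must stay inside a single group, so each group is an alternative.
        return [set(m) for m in members]
    raise ValueError(f"unknown node preference: {preference}")


groups = {
    "BigMemory": {"node1", "node2", "node3"},
    "ManyCores": {"node2", "node3", "node4"},
}
print(candidate_nodes(groups, "Intersection"))
print(candidate_nodes(groups, "Union"))
print(candidate_nodes(groups, "Uniform"))
```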

Run on a single node

If True, run the job on a single node without reserving all the cores of the node. For example, you can specify that this job should run on a minimum of 2 cores and a maximum of 4 cores, but still must run on a single node.

Note

This property was introduced in HPC Pack 2012. It is not available in previous versions.

Node groups

A list of node groups that helps define the candidate resources for this job. In HPC Pack 2008 R2, the job can only run on nodes that are members of all listed groups. For example, if you list the groups “Have Application X” and “Have Big Memory”, the node must belong to both groups. In the New Job dialog box, selecting one or more node groups filters the nodes that are available in the node selection list. If no nodes appear in the list, it means that there are no nodes that belong to all of the specified groups.

In HPC Pack 2012, the node preferences setting determines whether all or a subset of the nodes in the node groups are candidate resources for the job.

The following are default node groups that you can use to run jobs:

  • Compute Nodes

  • Workstation Nodes

  • AzureNodes (introduced in HPC Pack 2008 R2 with Service Pack 1 (SP1))

  • UnmanagedServerNodes (introduced in HPC Pack 2008 R2 with Service Pack 3 (SP3))

Cluster administrators can create additional custom node groups and assign nodes to one or more groups. Cluster administrators can change node group membership at any time, which might affect your available resources. If a task is running on a node that no longer belongs to the specified node group, the task is canceled. If you no longer have the minimum required resources to run your job, your job is requeued.

Requested nodes

A list of nodes. The job can only run on nodes that are in this list.

Memory

The minimum amount of memory (in MB) that must be present on any node that the job is run on.

Cores per node

The minimum number of cores that must be present on any node that the job is run on.

Node ordering

The order to use when selecting nodes for the job. This property gives preference to nodes based on their available memory or core resources. The value options are:

  • More memory

  • Less Memory

  • More Cores

  • Fewer Cores
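As a rough illustration of what each ordering option means, the sketch below sorts a list of nodes by the corresponding resource. The `Node` type and its fields are hypothetical stand-ins for whatever the scheduler tracks, and tie-breaking is not described in this topic.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    memory_mb: int  # hypothetical field: available memory
    cores: int      # hypothetical field: available cores


# Each option maps to a sort key and a flag for descending order.
ORDERINGS = {
    "More memory": (lambda n: n.memory_mb, True),
    "Less Memory": (lambda n: n.memory_mb, False),
    "More Cores": (lambda n: n.cores, True),
    "Fewer Cores": (lambda n: n.cores, False),
}


def order_nodes(nodes: list[Node], ordering: str) -> list[Node]:
    """Return nodes in the order in which they would be preferred."""
    key, descending = ORDERINGS[ordering]
    return sorted(nodes, key=key, reverse=descending)


nodes = [Node("node1", 8192, 16), Node("node2", 32768, 8), Node("node3", 16384, 4)]
print([n.name for n in order_nodes(nodes, "More memory")])  # ['node2', 'node3', 'node1']
print([n.name for n in order_nodes(nodes, "Fewer Cores")])  # ['node3', 'node2', 'node1']
```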

Licenses

A list of licenses that are required for the job. Values in this list can be validated by a job activation filter that is defined by the cluster administrator.

Environment Variables

A list of environment variable name and value pairs that are set in the context of all tasks for the job. The maximum length for the name is 128 characters. There is no maximum length for the value.

If different values are set for the same environment variable, the environment variable hierarchy determines which value is used in the context of your task. For example, if %TMP% is set as a job and as a task variable, the value of the task variable takes precedence in the context of that specific task.

The hierarchy used for tasks running on the cluster is as follows:

  1. Task

  2. Job

  3. Cluster wide

  4. User

  5. System
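The precedence order above can be modeled as a chained lookup in which the first level that defines a variable wins. This is a sketch of the rule only; the function name and the sample values are hypothetical.

```python
from collections import ChainMap


def effective_environment(task, job, cluster, user, system):
    """Merge environment levels; earlier arguments take precedence."""
    # ChainMap returns the value from the first mapping that defines a key.
    return dict(ChainMap(task, job, cluster, user, system))


env = effective_environment(
    task={"TMP": r"C:\scratch\task"},
    job={"TMP": r"C:\scratch\job", "APP_MODE": "batch"},
    cluster={},
    user={},
    system={"TMP": r"C:\Windows\Temp"},
)
print(env["TMP"])       # C:\scratch\task (task value wins)
print(env["APP_MODE"])  # batch (inherited from the job)
```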

Exit codes

A list of one or more numerical codes that indicate tasks completed successfully. If no list is specified, then 0 is the only task exit code that indicates success. If specified, the list of success exit codes applies to all tasks within the job, unless you override that list by specifying a different value for the task itself.

Note

The default job success exit code is 0. If this field is cleared, the exit code is set to 0 (the default value).

Note

This property was introduced in HPC Pack 2012. It is not available in previous versions.
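The relationship between job-level and task-level success exit codes can be sketched as follows. The function name is hypothetical, and the sketch assumes that a specified list replaces the default of 0 rather than extending it.

```python
def is_success(exit_code, job_codes=None, task_codes=None):
    """Decide whether a task exit code counts as success."""
    if task_codes:
        codes = task_codes   # a task-level list overrides the job-level list
    elif job_codes:
        codes = job_codes    # the job-level list applies to all tasks
    else:
        codes = [0]          # default: only 0 indicates success
    return exit_code in codes


print(is_success(0))                                   # True
print(is_success(3, job_codes=[0, 3]))                 # True
print(is_success(3, job_codes=[0, 3], task_codes=[0])) # False
```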

Depends on jobs

A list of jobs, by ID, that need to finish before the job starts running.

Note

This property was introduced in HPC Pack 2012. It is not available in previous versions.

Hold job until

The date and time at which the job is queued. Any user can set this property, and it can be changed any time before the job starts running.

Important

Once a job has run, the Hold job until property is cleared and there is no way to determine if the job had been held at any point.

Note

This property was introduced in HPC Pack 2012. It is not available in previous versions.

Estimated memory per process

An estimate of the maximum amount of memory (in MB) that a process in a job will consume. The job scheduler only considers running the job on nodes that have at least the amount of memory specified.

You can select a value that is in the range of values specified for the template for the job. A value of 0, if valid, indicates that the job scheduler will not allocate jobs to nodes based on the memory requirements of the job.

For more information, see Set up Memory-Aware Scheduling.

Note

This property was introduced in HPC Pack 2012. It is not available in previous versions.

You can set a few additional job properties by using HPC PowerShell or a Command Prompt window. For example, you can specify nodes to exclude from the job, or manually set job progress or a progress message. You cannot set these properties in HPC Job Manager, but you can see their values in the job list by displaying the corresponding columns. For more information, see Define Excluded Nodes for a Job and Set the Progress and Progress Message Job Properties from a Script File.

Task properties

Task Property

Description

Task ID

The numeric ID of the task. The job scheduler assigns this number when a task is created.

Task name

The user-assigned name of the task. The maximum length for this property is 128 characters.

Type

Helps define how to run a command. The default value for task Type is Basic. A Basic task runs a command once. The other task types create sub-tasks that each run an instance of the command. A task can include up to 1,000,000 sub-tasks. For more information, see Understanding Task Types.

Type can have the following values:

  • Basic

  • Parametric Sweep

  • Node Preparation

  • Node Release

  • Service

Command line

The command that runs for the task. The path to the executable file is relative to the working directory for the task. For more information, see Understanding Application and Data Files.

Jobs that work with parallel tasks through Microsoft® Message Passing Interface (MS-MPI) require the use of the mpiexec command, so commands for parallel tasks must be in the following format: mpiexec [mpi_options] <myapp.exe> [arguments], where myapp.exe is the name of the application to run.

In tasks that include sub-tasks, you can use the asterisk (*) character as a placeholder for the parametric sweep index (in Parametric Sweep tasks) or for the sub-task ID (in Service, Node Preparation, and Node Release tasks). For example, in the first sub-task, echo * is interpreted as echo 1 (or in a Parametric Sweep task, as the first index value).

You can include more than one asterisk (*) to indicate the minimum number of positions to use when expressing the number of the index or sub-task. This does not limit numbers that require more positions. For example, echo **** is interpreted as echo 0001 on the first sub-task. 

To run a command that uses an asterisk (*), include the caret (^) as an escape character. For example, to create a Node Release task that deletes all files from a folder, you can use a command like this:

delete c:\temp\^*
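The placeholder rules described above (a single asterisk, a run of asterisks as a minimum width, and the caret escape) can be sketched with a small substitution function. This only illustrates the rules as stated; the function name is hypothetical, and the scheduler's actual parsing may differ in edge cases not covered here.

```python
import re

# Matches an escaped asterisk (^*) or a run of one or more asterisks.
_PLACEHOLDER = re.compile(r"\^\*|\*+")


def expand_placeholders(text: str, index: int) -> str:
    """Replace asterisk placeholders in text with the given index or sub-task ID."""

    def replace(match: re.Match) -> str:
        token = match.group(0)
        if token == "^*":
            return "*"  # escaped: keep a literal asterisk
        # The run length is a minimum width; longer numbers are not truncated.
        return str(index).zfill(len(token))

    return _PLACEHOLDER.sub(replace, text)


print(expand_placeholders("echo *", 1))                # echo 1
print(expand_placeholders("echo ****", 1))             # echo 0001
print(expand_placeholders("echo **", 12345))           # echo 12345
print(expand_placeholders(r"delete c:\temp\^*", 1))    # delete c:\temp\*
```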

Working directory

The working directory to be used while the task runs. In tasks that include sub-tasks, you can use the asterisk (*) character as a placeholder for the parametric sweep index (in Parametric Sweep tasks) or for the sub-task ID (in Service, Node Preparation, and Node Release tasks). For more information, see Understanding Application and Data Files.

Standard input

The path (relative to the working directory for the task) to the file from which the input of the task should be read. The maximum length for this property is 160 characters.

In tasks that include sub-tasks, you can use the asterisk (*) character as a placeholder for the parametric sweep index (in Parametric Sweep tasks) or for the sub-task ID (in Service, Node Preparation, and Node Release tasks). For more information, see Understanding Application and Data Files.

Standard output

The path (relative to the working directory for the task) to the file to which the output of the task should be written. The maximum length for this property is 160 characters.

In tasks that include sub-tasks, you can use the asterisk (*) character as a placeholder for the parametric sweep index (in Parametric Sweep tasks) or for the sub-task ID (in Service, Node Preparation, and Node Release tasks). For more information, see Understanding Application and Data Files.

If Standard Output and Standard Error are not specified, the results are directed to the HPC Job Scheduler Service database and appear as the task’s output and error fields. The database stores up to 4000 characters of data per task. In HPC Pack 2012, the most recent 4000 characters of data are stored. In HPC Pack 2008 R2, any additional data beyond the first 4000 characters is truncated.
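The difference between the two versions is which end of the output survives. A minimal sketch, with a hypothetical function name and a boolean flag standing in for the HPC Pack version:

```python
LIMIT = 4000  # characters stored per task, as stated above


def stored_output(text: str, hpc_pack_2012: bool = True) -> str:
    """Return the part of the task output that the database would retain."""
    if hpc_pack_2012:
        return text[-LIMIT:]  # HPC Pack 2012: keep the most recent characters
    return text[:LIMIT]       # HPC Pack 2008 R2: keep the first characters


output = "A" * 3000 + "B" * 3000
print(stored_output(output, hpc_pack_2012=True).count("B"))   # 3000
print(stored_output(output, hpc_pack_2012=False).count("B"))  # 1000
```

In practice this means that an error message printed at the end of a long run is visible in HPC Pack 2012 but can be cut off in HPC Pack 2008 R2, so redirecting output to a file is the safer choice for verbose tasks.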

Standard error

The path (relative to the working directory for the task) to the file to which the errors of the task should be written. The maximum length for this property is 160 characters.

In tasks that include sub-tasks, you can use the asterisk (*) character as a placeholder for the parametric sweep index (in Parametric Sweep tasks) or for the sub-task ID (in Service, Node Preparation, and Node Release tasks). For more information, see Understanding Application and Data Files.

If Standard Output and Standard Error are not specified, the results are directed to the HPC Job Scheduler Service database and appear as the task’s output and error fields. The database stores up to 4000 characters of data per task. In HPC Pack 2012, the most recent 4000 characters of data are stored. In HPC Pack 2008 R2, any additional data beyond the first 4000 characters is truncated.

Number of cores

The number of cores required by the task. You can set minimum and maximum values for this property.

Exclusive

If True, no other tasks can be run on a compute node at the same time as the task.

Rerunnable

If True, the job scheduler can attempt to rerun the task if the task is preempted or if it fails due to a cluster issue, such as a node becoming unreachable. If Rerunnable is False, the task fails after the first run attempt fails.

Note

The job scheduler does not attempt to rerun tasks that run to completion and return an exit code that indicates failure (by default, any non-zero exit code). In HPC Pack 2012, success exit codes can be defined for individual tasks or for all tasks in the job.
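The rerun rule can be summarized as a small decision function. This is a simplified model of the behavior described for this property, with hypothetical names; it does not cover retry limits or other scheduler settings.

```python
from enum import Enum


class Outcome(Enum):
    PREEMPTED = "preempted"
    CLUSTER_FAILURE = "cluster failure"      # for example, node became unreachable
    FAILURE_EXIT_CODE = "failure exit code"  # task ran to completion but failed


def scheduler_may_rerun(rerunnable: bool, outcome: Outcome) -> bool:
    """Return True if the scheduler can attempt to rerun the task."""
    if outcome is Outcome.FAILURE_EXIT_CODE:
        return False  # never rerun a task that completed with a failure exit code
    return rerunnable


print(scheduler_may_rerun(True, Outcome.PREEMPTED))          # True
print(scheduler_may_rerun(True, Outcome.FAILURE_EXIT_CODE))  # False
print(scheduler_may_rerun(False, Outcome.CLUSTER_FAILURE))   # False
```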

Run time

The amount of time (dd:hh:mm) the task is allowed to run. If the task is still running after the specified run time is reached, it is automatically canceled by the job scheduler.

Environment variables

Specifies the environment variables to set in the task's run-time environment. Environment variables must be separated by commas in the format: name1=value1. The maximum length for the name is 128 characters. There is no maximum length for the value.

You can also set environment variables at the job level. Job level environment variables are set in the context of all tasks for the job.

If different values are set for the same environment variable, the environment variable hierarchy determines which value is used in the context of your task. For example, if %TMP% is set as a job and as a task variable, the value of the task variable takes precedence in the context of that specific task.

Required nodes

Lists the nodes that must be assigned to the task and its job in order for the task to run.

Sweep start index*

The starting index for a parametric sweep task. The index can apply to the instances of your application, your working directory, and to your input, output, and error files, if specified. For the index to be applied, you must include the asterisk (*) in the command line and in the file names. For example, myTask.exe *, and myInput*.dat.

Sweep end index*

The ending index for a parametric sweep task. The index can apply to the instances of your application, your working directory, and to your input, output, and error files, if specified. For the index to be applied, you must include the asterisk (*) in the command line and in the file names. For example, myTask.exe *, and myInput*.dat.

Sweep increment

The amount to increment the parametric sweep index at each step of the sweep. The index can apply to the instances of your application, your working directory, and to your input, output, and error files, if specified. For the index to be applied, you must include the asterisk (*) in the command line and in the file names. For example, myTask.exe *, and myInput*.dat.
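Taken together, the start index, end index, and increment define the sequence of index values that replace the asterisk. The sketch below assumes the end index is inclusive and uses hypothetical function names; it substitutes a single asterisk only and ignores multi-asterisk padding.

```python
def sweep_indices(start: int, end: int, increment: int = 1) -> list[int]:
    """Index values produced by a parametric sweep (end assumed inclusive)."""
    if increment < 1:
        raise ValueError("increment must be at least 1")
    return list(range(start, end + 1, increment))


def sweep_commands(command: str, start: int, end: int, increment: int = 1) -> list[str]:
    """Command line for each sub-task, with * replaced by the index."""
    return [command.replace("*", str(i)) for i in sweep_indices(start, end, increment)]


print(sweep_indices(1, 10, 3))                    # [1, 4, 7, 10]
print(sweep_commands("myTask.exe *", 1, 3))       # ['myTask.exe 1', 'myTask.exe 2', 'myTask.exe 3']
```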

Depends on tasks

A list of tasks, by ID, assigned to groups that define the order in which tasks should run. For more information, see Define Task Dependencies.

Task exit codes

A list of one or more numerical codes that indicate that the task completed successfully. If no list is specified, then 0 is the only task exit code that indicates success.

Note

This property was introduced in HPC Pack 2012. It is not available in previous versions.
