Understanding Job Scheduling Policies
Applies To: Windows HPC Server 2008
The HPC Job Scheduler Service queues jobs and tasks, allocates resources, and dispatches tasks to compute nodes. Job scheduling policies determine the order in which to run jobs from the queue, and determine how cluster resources are allocated to these jobs.
As a cluster administrator, you can adjust how resources are allocated to jobs, and how jobs are handled, by configuring job scheduling policy options and by creating job templates that leverage the job scheduling policies.
Job scheduling policy options
You can configure options for the following job scheduling policies:
Policy | Default setting |
---|---|
Preemption | Graceful preemption |
Adaptive resource allocation (grow/shrink) | Automatic growth and shrink both enabled |
Backfilling | Enabled, with the backfill look-ahead set at 1000 jobs |
You can configure these job scheduling policies in HPC Cluster Manager, or by using the cluscfg setparams command-line tool or the Set-HpcClusterProperty cmdlet. For more information about how to configure job scheduling policies using HPC Cluster Manager, see Configure the HPC Job Scheduler Service.
Job templates
You can create job templates to define a set of job submission policies. Each job template consists of a list of job properties and associated value settings, and a list of users with permission to submit jobs using that job template. You can optimize cluster usage by creating job templates that work with the job scheduling policies. For more information about using and creating job templates, see Job Templates.
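As an illustration of the idea, the following Python sketch models a job template as a list of job properties (each with a default and a set of valid values) plus a list of permitted users. It is a conceptual model only, not the HPC Job Scheduler Service API: the class names, the property names `priority` and `max_run_time_minutes`, and the user names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PropertyRule:
    """Default and valid values for one job property (hypothetical model)."""
    default: object
    allowed: tuple


@dataclass
class JobTemplate:
    """A named set of property rules plus the users allowed to use it."""
    name: str
    rules: dict                      # property name -> PropertyRule
    users: set = field(default_factory=set)

    def apply(self, user, requested):
        """Return the effective job properties, or raise if not allowed."""
        if user not in self.users:
            raise PermissionError(
                f"{user} may not submit jobs with template {self.name}")
        effective = {}
        for prop, rule in self.rules.items():
            # Fall back to the template default when the user gives no value.
            value = requested.get(prop, rule.default)
            if value not in rule.allowed:
                raise ValueError(
                    f"{prop}={value!r} is not valid for template {self.name}")
            effective[prop] = value
        return effective


# Hypothetical template: short jobs, capped at Normal priority.
small_job = JobTemplate(
    name="SmallJob",
    rules={
        "priority": PropertyRule("Normal", ("Lowest", "BelowNormal", "Normal")),
        "max_run_time_minutes": PropertyRule(1, (1,)),
    },
    users={"alice", "bob"},
)

print(small_job.apply("alice", {"priority": "BelowNormal"}))
```

In this model, a submission by a user who is not on the template's list, or a request for a priority outside the valid values, is rejected before the job reaches the queue.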
Note
In some cases, you may want to provide additional checks and controls on jobs that are submitted to your cluster, or even change job property values. You can enforce site-specific job submission policies and job activation policies by creating custom filters. For more information, see Creating and Installing Job Submission and Activation Filters in Windows HPC Server 2008 Step-by-Step Guide.
Job scheduling policies
The following table describes Windows® HPC Server 2008 job scheduling policies, and how you can set job scheduling policy options and use job templates to manage cluster usage.
Policy | Description | Management Options |
---|---|---|
Priority-based first come, first served (FCFS) | Combines priority sorting and FCFS to determine the order of the job queue. The priority level is taken from the priority property of the job. All Highest priority jobs are queued ahead of AboveNormal priority jobs, and so on. The job submit time determines the order within each priority level. | Job templates: Use job templates to define the default priority level and the valid priority values that different sets of users can assign to their jobs. |
Preemption | Allows higher priority jobs to take resources away from lower priority, preemptable jobs that are already running. Graceful preemption shrinks preempted jobs, allowing running tasks to finish so that work is not lost. Immediate preemption cancels all running tasks of the preempted jobs so that resources can be allocated to the high priority job immediately. The | Policy options: Set scheduler configuration to one of the following: Job templates: Use job templates to define the types of jobs that can or cannot be preempted, or the sets of users who can submit preemptable or nonpreemptable jobs. |
Adaptive resource allocation (grow/shrink) | Dynamically adjusts the resources allocated to a job based on its tasks. Enabling resource adjustments can significantly improve cluster utilization and reduce job queue times, especially for clusters that run jobs composed of multiple tasks, such as parametric sweep computations. Only jobs that contain more than one task (including jobs with parametric sweeps) can benefit from automatic resource adjustment. With automatic growth enabled, the HPC Job Scheduler Service can allocate free resources to running jobs that have additional tasks to run. The service will not allocate more resources than the maximum requested for the job. This results in jobs spending more time in the queue waiting for resources, but they finish more quickly after they are started. Available resources are allocated first to the highest-priority job in the system, whether this job is running or queued. With automatic shrink enabled, the HPC Job Scheduler Service can release unused resources from running jobs that have no additional tasks to run. The service will not shrink resources below the minimum requested for the job. Automatic shrink results in better overall cluster utilization, but it may cause problems if you add tasks to jobs that are already in progress. | Policy options: Automatic grow and shrink are both enabled by default. Use scheduler configuration settings to enable or disable either option. Job templates: In the default job template, the job properties |
Backfilling | Maximizes cluster utilization and throughput by allowing smaller jobs lower in the queue to run ahead of a job waiting at the top of the queue, as long as the job at the top is not delayed as a result. When a job reaches the top of the queue, a sufficient number of nodes may not be available to meet its minimum core requirement. When this happens, the job reserves any nodes that are immediately available and waits for the job that is currently running to complete. Backfilling then utilizes the reserved idle nodes as follows: | Job templates: Backfilling is only effective when jobs submitted to the cluster have a maximum run time specified. Use job templates to define a maximum run time on all jobs. For example, you can create a series of job templates named BigJob, MediumJob, and SmallJob with maximum run times of one day, one hour, and one minute, respectively. Note that you can also write a job submission filter that checks that the job run time is not set to infinite. For more information, see Creating and Installing Job Submission and Activation Filters in Windows HPC Server 2008 Step-by-Step Guide. Policy options: Backfilling is enabled by default. Use scheduler configuration settings to modify or disable backfilling. The |
Nonexclusive scheduling | By default, a job or a task has nonexclusive use of the nodes reserved by it. For example, when requesting two cores for a task on a cluster with four-core nodes, the task will be assigned to two cores on a node, and other tasks may run on the other two cores on that node. If such a task were exclusive, the task would be assigned the entire node. When a job is Note that you cannot have an exclusive task in a nonexclusive job. The | Job templates: Use job templates to define the types of jobs or the sets of users that can enable job exclusivity. |
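
The queue ordering and backfilling rules described in the table can be sketched in a few lines of Python. This is a simplified model for illustration, not the scheduler's actual implementation. The `Job` class, its field names, and the helper functions are hypothetical, and the sketch assumes that the look-ahead value limits how many queued jobs behind the top job are examined as backfill candidates.

```python
from dataclasses import dataclass
from typing import Optional

# Lower rank sorts first, so Highest priority jobs lead the queue.
PRIORITY_RANK = {"Highest": 0, "AboveNormal": 1, "Normal": 2,
                 "BelowNormal": 3, "Lowest": 4}


@dataclass
class Job:
    name: str
    priority: str
    submit_time: int                 # any increasing timestamp
    min_cores: int
    max_run_time: Optional[float]    # minutes; None means no limit was set


def queue_order(jobs):
    """Priority-based FCFS: sort by priority level, then by submit time."""
    return sorted(jobs, key=lambda j: (PRIORITY_RANK[j.priority], j.submit_time))


def can_backfill(job, idle_cores, minutes_until_top_job_starts):
    """A job may run ahead only if it fits and cannot delay the top job."""
    if job.max_run_time is None:
        return False                 # without a run time limit, no guarantee
    return (job.min_cores <= idle_cores
            and job.max_run_time <= minutes_until_top_job_starts)


def pick_backfill_jobs(queue, idle_cores, minutes_until_top_job_starts,
                       look_ahead=1000):
    """Scan up to look_ahead jobs behind the top job for backfill candidates."""
    chosen = []
    for job in queue[1:1 + look_ahead]:
        if can_backfill(job, idle_cores, minutes_until_top_job_starts):
            chosen.append(job)
            idle_cores -= job.min_cores   # those cores are now in use
    return chosen


jobs = [
    Job("sweep", "Normal", 1, 4, 30),
    Job("big", "Highest", 3, 16, 600),
    Job("tiny", "Normal", 2, 2, 5),
    Job("open", "AboveNormal", 4, 2, None),
]
queue = queue_order(jobs)
print([j.name for j in queue])
print([j.name for j in pick_backfill_jobs(queue, idle_cores=4,
                                          minutes_until_top_job_starts=20)])
```

In this example the job named open is never a backfill candidate because it has no maximum run time, which is why the table recommends using job templates to set a maximum run time on all jobs.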