Understanding Activation and Submission Filters

 

Applies To: Microsoft HPC Pack 2008 R2, Microsoft HPC Pack 2012, Microsoft HPC Pack 2012 R2

This topic provides an overview of the types of custom job filters that an administrator can add to the cluster, and how the HPC Job Scheduler Service processes jobs based on return values from these filters.

Job submission and job activation filters are custom applications that provide additional checks and controls for jobs on your cluster. For example, submission filters can check job properties against information of your choosing or can change job property values. Activation filters can check for factors such as license availability before resources are allocated to a job. Depending on the return value from your filter, the HPC Job Scheduler Service takes the appropriate action on the job. To help determine if the scheduling policies that you want to enforce require custom filters, see When to Use Job Submission or Job Activation Filters in Windows HPC Server.

This topic describes activation and submission filters, and how the return values from the filters are interpreted by the HPC Job Scheduler Service. For information about installing and configuring the filters, see Install Submission and Activation Filters in Microsoft HPC Pack.

In this topic:

  • When custom filters run

  • Cluster-wide and job template specific filters

  • Job submission filters and their return values

  • Job activation filters and their return values

When custom filters run

The HPC Job Scheduler Service can run custom filters when jobs are submitted to the cluster (submission filters) or when jobs are about to get cluster resources (activation filters).

Submission filters run as soon as a job is submitted, before the job is checked against the job template. Because they run first, submission filters can change job properties, including the assigned job template. If the job passes the submission filter, the user credentials are verified, and then the job template defaults and value constraints are applied. For more information, see Understanding Activation and Submission Filters.

Activation filters run when candidate resources are allocated to a queued or running job (candidate resources for a job are based on the job and task properties and on the scheduling policies). The activation filter can determine whether or not the job should be started on those resources, or whether the resources should be held for the job or released. Because activation filters run every time that resources are allocated to a job, the activation filter might run multiple times for the same job. For example, the activation filter can run when the job is about to be started, and then run again as new resources are about to be added to the job (dynamic growth).

Cluster-wide and job template specific filters

Custom filters can be defined at the cluster-wide level, and will run on every job. Cluster-wide filters are implemented as executable applications or as scripts. Starting with Service Pack 2 of HPC Pack 2008 R2, custom filters can also be defined at the job template level. These filters will only run on jobs that are submitted with the associated job template. Template level filters allow you to run specific filters on specific types of jobs, and also allow you to run a series of filters, if desired.

Note

Job template level filters must be defined as DLLs and implement the IActivationFilter or the ISubmissionFilter interface. Additionally, the DLLs must be built using the Microsoft .NET Framework 3.5. If the filter DLL is built using .NET Framework 4, jobs will fail to pass the filter.

You can add both cluster-wide and job template level filters to the cluster. When a job is submitted or ready for activation, any job template filters will run before the cluster-wide filter.

To walk through a sample scenario and steps for building a simple submission filter and adding the filter to a job template, see Specify a custom job filter at the job template level.

Job submission filters and their return values

The HPC Job Scheduler Service can run a job submission filter every time a job is submitted. The filter can check the job properties to determine if the job should be added to the queue.

The submission filter parses the job description XML, which specifies the terms of the job, to check for options that you want to disallow or limit, or for required options that are missing. A submission filter can also change job property values by modifying the job XML file. Task property values cannot be changed.

Based on the return value from the job submission filter, the HPC Job Scheduler Service will process the job as described in the following table.

| Exit code | Job scheduler action |
|---|---|
| 0 | The job is added to the queue as-is. |
| 1 | The filter modified one or more job properties, and the job is added to the queue. |
| Any other exit code | The job is marked as Failed with an error message that the submission filter failed the job. |
| Filter timeout | The job is marked as Failed with an error message that the submission filter timed out. The default timeout is 15 seconds. The setting can be modified in the Job Scheduler Configuration dialog box. |
| Filter not found | The job is marked as Failed with an error message that the filter could not be found. |
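
As a rough illustration of the exit-code contract above, a cluster-wide submission filter written as a script might look like the following. This is a minimal sketch under stated assumptions, not a supported sample: it assumes the path to the job XML file arrives as the first command-line argument, the `Name` and `MaxCores` attribute names and the 64-core limit are placeholders chosen for illustration, and real job XML may use an XML namespace that this sketch does not handle.

```python
import xml.etree.ElementTree as ET

PASS_UNCHANGED = 0   # job is added to the queue as-is
PASS_MODIFIED = 1    # filter changed job properties; job is still queued
REJECT = 2           # any other exit code fails the job (the value 2 is arbitrary)

CORE_LIMIT = 64      # hypothetical site policy, not an HPC Pack default


def check_job(root):
    """Return an exit code for the job element, editing it in place if needed."""
    # "Name" and "MaxCores" are assumed attribute names, used for illustration.
    if not root.get("Name"):
        return REJECT                           # a required option is missing
    max_cores = root.get("MaxCores")
    if max_cores is not None and int(max_cores) > CORE_LIMIT:
        root.set("MaxCores", str(CORE_LIMIT))   # clamp a job-level property
        return PASS_MODIFIED
    return PASS_UNCHANGED


def main(argv):
    """Entry point; argv[1] is assumed to be the path to the job XML file."""
    try:
        path = argv[1]
        tree = ET.parse(path)
        code = check_job(tree.getroot())
        if code == PASS_MODIFIED:
            tree.write(path)                    # changes go into the job XML file
        return code
    except Exception:
        # An unhandled Python exception would exit with code 1, which the
        # scheduler would read as "modified"; reject explicitly instead.
        return REJECT
```

To run this as a script, the file would end with `import sys` and `sys.exit(main(sys.argv))`. Catching every exception matters here because exit code 1 has a specific meaning for submission filters.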

Note

If you have specified a chain of submission filters, a job will run through each filter in the order listed until it has successfully passed through all the filters. With an exit code of 0, the job is passed to the next filter. With an exit code of 1, the modified job is passed to the next filter. If the job fails at any point in the chain, all submission filters that already ran on the job are called again in reverse order to allow the filters to revert actions, if necessary.
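
The chain behavior described in this note can be modeled in a few lines. The sketch below is a simulation for reasoning about ordering, not HPC Pack code: the `run` and `revert` method names and the `RecordingFilter` class are hypothetical, and it assumes that only filters that already passed are reverted (the filter that failed is not).

```python
def run_submission_chain(filters, job):
    """Run filters in order; on failure, revert earlier filters in reverse order.

    Each filter is a hypothetical object with run(job) -> (exit_code, job)
    and revert(job). Returns (accepted, job).
    """
    completed = []
    for f in filters:
        code, new_job = f.run(job)
        if code == 1:
            job = new_job              # pass the modified job to the next filter
        elif code != 0:
            for done in reversed(completed):
                done.revert(job)       # give earlier filters a chance to undo work
            return False, job
        completed.append(f)
    return True, job


class RecordingFilter:
    """Toy filter that returns a fixed exit code and logs each call."""

    def __init__(self, name, code, log):
        self.name, self.code, self.log = name, code, log

    def run(self, job):
        self.log.append(f"run {self.name}")
        if self.code == 1:
            job = dict(job, touched_by=self.name)   # simulate a property change
        return self.code, job

    def revert(self, job):
        self.log.append(f"revert {self.name}")


log = []
chain = [RecordingFilter("A", 0, log),
         RecordingFilter("B", 1, log),
         RecordingFilter("C", 5, log)]
accepted, final_job = run_submission_chain(chain, {"name": "demo"})
```

In this run, filter C fails the job, so B and then A are asked to revert, in that order.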

Job activation filters and their return values

The HPC Job Scheduler Service can run an activation filter when candidate resources are about to be allocated to a queued or running job. The job activation filter can check the job for factors that would cause the job to fail if activated, such as unavailability of licenses or exceeded usage time for the submitting user.

The activation filter parses the job description XML, which specifies the terms of the job, and can check the job properties and other data sources to determine whether the job will be able to use the resources. Additional parameters are passed to the filter to provide information such as the number of candidate resources available during the current scheduling pass, the job’s position in the queue, and whether backfilling is enabled on the cluster. The developer who creates the filter can use these parameters to fine-tune filter behavior.

Based on the return value from the activation filter, the HPC Job Scheduler Service will start the job, block the queue until the job can start, reserve resources for the job without blocking the queue, or put the job on hold. The length of time to hold a specific job can be set with the Hold Until job property. If a job is put on hold and no Hold Until value is specified for that job, the job is held for the number of seconds specified by the Default Hold Duration cluster setting. The valid values for Default Hold Duration are 60 to 604800 seconds (between one minute and one week). The default is 900 seconds (15 minutes).
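
The hold rule above comes down to a simple fallback. The following sketch shows one way to express it; the function and constant names are invented for illustration, and treating the 60 to 604800 range as inclusive is an assumption.

```python
from datetime import datetime, timedelta

MIN_HOLD_SECONDS = 60           # one minute
MAX_HOLD_SECONDS = 604800       # one week (7 * 24 * 3600)
DEFAULT_HOLD_SECONDS = 900      # 15 minutes


def validate_default_hold(seconds):
    """Check a Default Hold Duration value (range assumed to be inclusive)."""
    if not MIN_HOLD_SECONDS <= seconds <= MAX_HOLD_SECONDS:
        raise ValueError(f"Default Hold Duration out of range: {seconds}")
    return seconds


def hold_release_time(now, hold_until=None,
                      default_hold_seconds=DEFAULT_HOLD_SECONDS):
    """Return when a held job is next reevaluated.

    A per-job Hold Until value wins; otherwise the cluster default applies.
    """
    if hold_until is not None:
        return hold_until
    return now + timedelta(seconds=validate_default_hold(default_hold_seconds))
```

With the default setting, a job held at 12:00:00 with no Hold Until value would be reevaluated at 12:15:00.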

Important

Once a job has run, the Hold Until property is cleared and there is no way to determine if the job had been held at any point.

Note

Activation filters and backfilling: A job can only run in a backfill window with an activation filter return value of 0.

Based on the return value (exit code) from the job activation filter, the HPC Job Scheduler Service will process the job as described in the following table.

| Exit code | Queued Jobs | Running Jobs |
|---|---|---|
| 0 | Start job. The job is started on the candidate resources. | Grow job. The candidate resources are added to the running job. |
| 1 | Do not start job, block queue. The job is not started and remains in the queue. No other jobs of equal or lower priority are started until the job passes or is canceled. The filter reevaluates the job periodically until the job either passes or is canceled. | Do not grow job. The candidate resources are not added to the running job. The queue is not blocked, and the resources can be used for other jobs. |
| 2 | Do not start job, hold resources, and continue scheduling other jobs. The job is not started, but candidate resources are reserved for it depending on the Scheduling Mode: in Queued, up to the job’s maximum resources are reserved; in Balanced, the minimum resources are reserved. Other jobs can be started on other resources. The filter reevaluates the job periodically until the job passes. | Undefined. The filter should not return this exit code for Running jobs. |
| 3 | Hold job, release resources, and continue scheduling other jobs. The job is put on hold until the date and time specified by the Hold Until job property. After the hold period, the job is reevaluated by the filter program. If the filter returns exit code 3 and no Hold Until value is specified for that job, the job is held for the amount of time specified by the Default Hold Duration cluster setting. | Undefined. The filter should not return this exit code for Running jobs. |
| 4 | Fail job. The job is marked as Failed with an error message that the job was failed by the activation filter. | Undefined. The filter should not return this exit code for Running jobs. |
| Any other exit code | Undefined, but treated the same as exit code 2. | Undefined. The filter should not return this exit code for Running jobs. |
| Filter timeout | Same as exit code 2. The default timeout is 15 seconds. The setting can be modified in the Job Scheduler Configuration dialog box. | Undefined. |
| Filter not found | Same as exit code 2. | Undefined. |
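
For filter developers, it can help to see the table above as a lookup. The sketch below encodes it in Python purely as a reading aid; the function name, the `"timeout"` and `"not found"` result strings, and the action labels are invented here and are not part of any HPC Pack API.

```python
QUEUED_ACTIONS = {
    0: "start job",
    1: "do not start job, block queue",
    2: "do not start job, hold resources, continue scheduling",
    3: "hold job, release resources, continue scheduling",
    4: "fail job",
}

RUNNING_ACTIONS = {
    0: "grow job",
    1: "do not grow job",
}


def scheduler_action(result, job_is_running):
    """Map an activation filter result to the action described in the table.

    result is an integer exit code, or the string "timeout" or "not found".
    """
    if job_is_running:
        # Only exit codes 0 and 1 are defined for running jobs.
        return RUNNING_ACTIONS.get(result, "undefined")
    # For queued jobs, timeouts, a missing filter, and unlisted exit codes
    # are all treated the same as exit code 2.
    return QUEUED_ACTIONS.get(result, QUEUED_ACTIONS[2])
```

Because only 0 and 1 are defined for running jobs, a filter that may be called during dynamic growth should avoid returning 2, 3, or 4 in that situation.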

Note

If you have specified a chain of activation filters, a job will be evaluated by each filter in the order listed as long as it passes with an exit code of 0. If a filter returns a non-zero exit code, that value is passed to the HPC Job Scheduler, and any activation filters that already ran on the job are called again in reverse order to allow the filters to revert actions, if necessary. For example, an activation filter that checks for available licenses might include code to release the licenses if the revert function is called.
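
The license example in this note can be simulated as follows. This is a toy model rather than an implementation of the IActivationFilter interface: the `run` and `revert` method names, the `LicenseFilter` and `QuotaFilter` classes, and the in-memory license pool are all hypothetical, and it assumes that only filters that returned 0 are reverted.

```python
class LicenseFilter:
    """Hypothetical filter that reserves one license per job."""

    def __init__(self, pool):
        self.pool = pool                 # dict holding a "free" license count

    def run(self, job):
        if self.pool["free"] < 1:
            return 2                     # hold resources until a license frees up
        self.pool["free"] -= 1           # side effect that may need reverting
        return 0

    def revert(self, job):
        self.pool["free"] += 1           # release the license reserved in run()


class QuotaFilter:
    """Hypothetical filter that holds jobs from users over their usage time."""

    def __init__(self, over_quota_users):
        self.over_quota_users = over_quota_users

    def run(self, job):
        return 3 if job["user"] in self.over_quota_users else 0

    def revert(self, job):
        pass                             # nothing to undo


def run_activation_chain(filters, job):
    """Return the first non-zero exit code, reverting filters that passed."""
    passed = []
    for f in filters:
        code = f.run(job)
        if code != 0:
            for done in reversed(passed):
                done.revert(job)
            return code                  # this value goes to the scheduler
        passed.append(f)
    return 0


pool = {"free": 1}
chain = [LicenseFilter(pool), QuotaFilter({"bob"})]
bob_result = run_activation_chain(chain, {"user": "bob"})
free_after_bob = pool["free"]
alice_result = run_activation_chain(chain, {"user": "alice"})
```

Here the job from `bob` reserves a license, is then held by the quota filter, and the license is returned to the pool by the revert call, so the job from `alice` can still start.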

Additional references