Jobs CLI (legacy)

12/13/2024

Important

This documentation has been retired and might not be updated.

This information applies to legacy Databricks CLI versions 0.18 and below. Databricks recommends that you use newer Databricks CLI version 0.205 or above instead. See What is the Databricks CLI?. To find your version of the Databricks CLI, run databricks -v.

To migrate from Databricks CLI version 0.18 or below to Databricks CLI version 0.205 or above, see Databricks CLI migration.

You run Databricks jobs CLI subcommands by appending them to databricks jobs, and Databricks job runs CLI subcommands by appending them to databricks runs. For Databricks job runs CLI subcommands, see Runs CLI (legacy). Together, these subcommands call the Jobs API 2.1 and Jobs API 2.0.

Important

The Databricks jobs CLI supports calls to two versions of the Databricks Jobs REST API: versions 2.1 and 2.0. Version 2.1 adds support for orchestration of jobs with multiple tasks; see Overview of orchestration on Databricks and Updating from Jobs API 2.0 to 2.1. Databricks recommends that you call version 2.1, unless you have legacy scripts that rely on version 2.0 and that cannot be migrated.

Unless otherwise specified, the programmatic behaviors that are described in this article apply equally to versions 2.1 and 2.0.

Requirements to call the Jobs REST API 2.1

To set up and use the Databricks jobs CLI (and job runs CLI) to call the Jobs REST API 2.1, do the following:

Update the CLI to version 0.16.0 or above.
Do one of the following:
- Run the command databricks jobs configure --version=2.1. This adds the setting jobs-api-version = 2.1 to the file ~/.databrickscfg on Unix, Linux, or macOS, or %USERPROFILE%\.databrickscfg on Windows. All jobs CLI (and job runs CLI) subcommands will call the Jobs REST API 2.1 by default.
- Manually add the setting jobs-api-version = 2.1 to the file ~/.databrickscfg on Unix, Linux, or macOS, or %USERPROFILE%\.databrickscfg on Windows. All jobs CLI (and job runs CLI) subcommands will call the Jobs REST API 2.1 by default.
- Append the option --version=2.1 (for example, databricks jobs list --version=2.1) to instruct the jobs CLI to call the Jobs REST API 2.1 for that call only.
If you take none of the preceding actions, the jobs CLI (and job runs CLI) will call the Jobs REST API 2.0 by default.
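
The manual option above can be sketched in shell as follows. This is only an illustration under stated assumptions: set_jobs_api_version is a hypothetical helper (not part of the CLI), the host and token values are placeholders, and the sketch assumes the config file holds a single profile section such as [DEFAULT]. It edits a scratch file rather than the real ~/.databrickscfg.

```shell
# Hypothetical helper: set jobs-api-version in a Databricks CLI config file.
# Assumes the file has a single profile section such as [DEFAULT].
set_jobs_api_version() {
  cfg_file="$1"
  version="$2"
  if grep -q '^jobs-api-version' "$cfg_file" 2>/dev/null; then
    # Replace the existing setting; write via a temp file for portability.
    sed "s/^jobs-api-version.*/jobs-api-version = $version/" "$cfg_file" > "$cfg_file.tmp" &&
      mv "$cfg_file.tmp" "$cfg_file"
  else
    # No setting yet: append it to the end of the file.
    printf 'jobs-api-version = %s\n' "$version" >> "$cfg_file"
  fi
}

# Demonstrate on a scratch file with placeholder values.
demo_cfg="$(mktemp)"
printf '[DEFAULT]\nhost = https://example.cloud.databricks.com\ntoken = REDACTED\n' > "$demo_cfg"
set_jobs_api_version "$demo_cfg" 2.1
cat "$demo_cfg"
```

Calling the helper again with 2.0 rewrites the existing line instead of appending a second one. With more than one profile in the file, appending to the end would attach the setting to the last profile only, so edit such a file by hand.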

Requirements to call the Jobs REST API 2.0

To set up and use the Databricks jobs CLI (and job runs CLI) to call the Jobs REST API 2.0, do one of the following:

Use a version of the Databricks CLI below 0.16.0, or
Update the CLI to version X.Y.Z or above, and then do one of the following:
- Run the command databricks jobs configure --version=2.0. This adds the setting jobs-api-version = 2.0 to the file ~/.databrickscfg on Unix, Linux, or macOS, or %USERPROFILE%\.databrickscfg on Windows. All jobs CLI (and job runs CLI) subcommands will call the Jobs REST API 2.0 by default.
- Manually add the setting jobs-api-version = 2.0 to the file ~/.databrickscfg on Unix, Linux, or macOS, or %USERPROFILE%\.databrickscfg on Windows. All jobs CLI (and job runs CLI) subcommands will call the Jobs REST API 2.0 by default.
- Append the option --version=2.0 (for example, databricks jobs list --version=2.0) to instruct the jobs CLI to call the Jobs REST API 2.0 for that call only.

If you take none of the preceding actions, the jobs CLI (and job runs CLI) will call the Jobs REST API 2.0 by default.

Subcommands and general usage

databricks jobs -h

Usage: databricks jobs [OPTIONS] COMMAND [ARGS]...

  Utility to interact with jobs.

  Job runs are handled by ``databricks runs``.

Options:
  -v, --version  [VERSION]
  -h, --help     Show this message and exit.

Commands:
  create   Creates a job.
    Options:
      --json-file PATH            File containing JSON request to POST to /api/2.0/jobs/create.
      --json JSON                 JSON string to POST to /api/2.0/jobs/create.
  delete   Deletes a job.
    Options:
      --job-id JOB_ID             Can be found in the URL at https://<databricks-instance>/?o=<16-digit-number>#job/$JOB_ID. [required]
  get      Describes the metadata for a job.
    Options:
      --job-id JOB_ID             Can be found in the URL at https://<databricks-instance>/?o=<16-digit-number>#job/$JOB_ID. [required]
  list     Lists the jobs in the Databricks Job Service.
  reset    Resets (edits) the definition of a job.
    Options:
      --job-id JOB_ID             Can be found in the URL at https://<databricks-instance>/?o=<16-digit-number>#job/$JOB_ID. [required]
      --json-file PATH            File containing JSON request to POST to /api/2.0/jobs/reset.
      --json JSON                 JSON string to POST to /api/2.0/jobs/reset.
  run-now  Runs a job with optional per-run parameters.
    Options:
      --job-id JOB_ID             Can be found in the URL at https://<databricks-instance>/#job/$JOB_ID. [required]
      --jar-params JSON           JSON string specifying an array of parameters. i.e. '["param1", "param2"]'
      --notebook-params JSON      JSON string specifying a map of key-value pairs. i.e. '{"name": "john doe", "age": 35}'
      --python-params JSON        JSON string specifying an array of parameters. i.e. '["param1", "param2"]'
      --spark-submit-params JSON  JSON string specifying an array of parameters. i.e. '["--class", "org.apache.spark.examples.SparkPi"]'

Create a job

To display usage documentation, run databricks jobs create --help.

General usage

databricks jobs create --json-file create-job.json

Jobs CLI 2.1 usage notes and request example

See Create in Updating from Jobs API 2.0 to 2.1.

Jobs CLI 2.0 request payload and response example

create-job.json:

{
  "name": "my-job",
  "existing_cluster_id": "1234-567890-reef123",
  "notebook_task": {
    "notebook_path": "/Users/someone@example.com/My Notebook"
  },
  "email_notifications": {
    "on_success": ["someone@example.com"],
    "on_failure": ["someone@example.com"]
  }
}

{ "job_id": 246 }
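
Scripts often need the new job ID for later commands. The following is a minimal sketch that pulls it out of a response shaped like the one above, using only sed. The response variable holds the sample response as a stand-in for real CLI output, and the pattern assumes job_id is a plain integer. A JSON-aware tool such as jq would be more robust.

```shell
# Sample response copied from the example above. In practice it might come from:
#   response="$(databricks jobs create --json-file create-job.json)"
response='{ "job_id": 246 }'

# Extract the digits that follow "job_id": (assumes an integer ID).
job_id="$(printf '%s\n' "$response" |
  sed -n 's/.*"job_id"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p')"

echo "Created job $job_id"
```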

Tip

To copy a job, run the create command and pass a JSON object with the settings of the job to copy. This example copies the settings of the job with the ID of 246 into a new job. It requires the jq utility.

SETTINGS_JSON=$(databricks jobs get --job-id 246 | jq .settings)

databricks jobs create --json "$SETTINGS_JSON"

{ "job_id": 247 }

Delete a job

To display usage documentation, run databricks jobs delete --help.

databricks jobs delete --job-id 246

If successful, no output is displayed.

Tip

To delete multiple jobs having the same setting, get the list of job IDs that match that setting, and then run the delete command for each matching job ID. This example deletes all jobs with the job name of Untitled. It requires the jq utility.

databricks jobs list --output json | jq '.jobs[] | select(.settings.name == "Untitled") | .job_id' | xargs -n 1 databricks jobs delete --job-id
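
Because bulk deletion cannot be undone, it can help to preview the commands first. The sketch below is one possible approach, not a CLI feature: delete_jobs and the DRY_RUN variable are hypothetical names, and the sample IDs stand in for the output of the jobs list and jq pipeline above.

```shell
# Hypothetical helper: read job IDs from stdin, one per line, and delete each one.
# With DRY_RUN=1 (the default here) it only prints the commands it would run.
delete_jobs() {
  while IFS= read -r job_id; do
    [ -n "$job_id" ] || continue   # skip blank lines
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "databricks jobs delete --job-id $job_id"
    else
      databricks jobs delete --job-id "$job_id"
    fi
  done
}

# Sample IDs stand in for the output of the list-and-filter pipeline.
planned="$(printf '246\n247\n' | delete_jobs)"
echo "$planned"
```

After reviewing the printed commands, setting DRY_RUN=0 and piping the real list of IDs into delete_jobs would perform the deletions.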

List information about a job

To display usage documentation, run databricks jobs get --help.

General usage

databricks jobs get --job-id 246

Jobs CLI 2.1 usage notes and response example

See Get in Updating from Jobs API 2.0 to 2.1.

Jobs CLI 2.0 response example

{
  "job_id": 246,
  "settings": {
    "name": "my-job",
    "existing_cluster_id": "1234-567890-reef123",
    "email_notifications": {
      "on_success": [
        "someone@example.com"
      ],
      "on_failure": [
        "someone@example.com"
      ]
    },
    "timeout_seconds": 0,
    "notebook_task": {
      "notebook_path": "/Users/someone@example.com/My Notebook"
    },
    "max_concurrent_runs": 1
  },
  "created_time": 1620163107742,
  "creator_user_name": "someone@example.com"
}
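
The created_time value in the response is not directly readable. As a small sketch, assuming the value is milliseconds since the Unix epoch (as its 13 digits suggest), it can be converted to seconds with shell arithmetic and then printed as a UTC date. The date flags differ between GNU and BSD systems, so the sketch tries both.

```shell
# Value copied from the sample response above (assumed to be epoch milliseconds).
created_time_ms=1620163107742

# Integer division drops the millisecond part.
created_secs=$((created_time_ms / 1000))
echo "$created_secs"

# GNU date accepts -d "@SECONDS"; BSD/macOS date accepts -r SECONDS.
date -u -d "@$created_secs" 2>/dev/null || date -u -r "$created_secs"
```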

List information about available jobs

To display usage documentation, run databricks jobs list --help.

General usage

databricks jobs list

Jobs CLI 2.1 usage notes and response example

See List in Updating from Jobs API 2.0 to 2.1.

Jobs CLI 2.0 response example

{
  "jobs": [
    {
      "job_id": 246,
      "settings": {
        "name": "my-job",
        "existing_cluster_id": "1234-567890-reef123",
        "email_notifications": {
          "on_success": [
            "someone@example.com"
          ],
          "on_failure": [
            "someone@example.com"
          ]
        },
        "timeout_seconds": 0,
        "notebook_task": {
          "notebook_path": "/Users/someone@example.com/My Notebook"
        },
        "max_concurrent_runs": 1
      },
      "created_time": 1620163107742,
      "creator_user_name": "someone@example.com"
    },
    ...
  ]
}

List all jobs (API 2.1 only)

To instruct the CLI to return all jobs by making sequential calls to the API, use the --all option. To use the --all option, you must set the API version to 2.1.

databricks jobs list --all

Page the jobs list (API 2.1 only)

To return a paginated jobs list, use the --limit and --offset arguments. By default, the jobs list is returned as a table containing the job ID and job name. To optionally return a JSON document containing job information, use the --output JSON argument.

To use the --limit and --offset arguments, you must set the API version to 2.1.

When using --output JSON, the list is returned in descending order by job creation date. When using --output TABLE, the list is returned in descending order by job creation date and then sorted alphabetically by job name.

The following example pages through the jobs list 10 jobs at a time and returns the results in JSON format:

databricks jobs list --output JSON --limit 10
databricks jobs list --output JSON --limit 10 --offset 10
databricks jobs list --output JSON --limit 10 --offset 20
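
The offsets in the example above follow a simple rule: each page starts at the page index multiplied by the page size. The sketch below generates the same sequence of commands for any page size. print_page_commands is a hypothetical helper that only prints the commands, and it assumes that passing --offset 0 for the first page is equivalent to omitting the argument.

```shell
# Hypothetical helper: print the list command for each page.
# $1 = page size, $2 = number of pages to fetch.
print_page_commands() {
  limit="$1"
  pages="$2"
  page=0
  while [ "$page" -lt "$pages" ]; do
    offset=$((page * limit))   # page 0 -> 0, page 1 -> limit, and so on
    echo "databricks jobs list --output JSON --limit $limit --offset $offset"
    page=$((page + 1))
  done
}

commands="$(print_page_commands 10 3)"
echo "$commands"
```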

Change settings for a job

To display usage documentation, run databricks jobs reset --help.

General usage

databricks jobs reset --job-id 246 --json-file reset-job.json

Jobs CLI 2.1 usage notes and request example

See Update and Reset in Updating from Jobs API 2.0 to 2.1.

Jobs CLI 2.0 request example

reset-job.json:

{
  "job_id": 246,
  "existing_cluster_id": "2345-678901-batch234",
  "name": "my-changed-job",
  "notebook_task": {
    "notebook_path": "/Users/someone@example.com/My Other Notebook"
  },
  "email_notifications": {
    "on_success": ["someone-else@example.com"],
    "on_failure": ["someone-else@example.com"]
  }
}

If successful, no output is displayed.

Run a job

To display usage documentation, run databricks jobs run-now --help.

databricks jobs run-now --job-id 246

{
  "run_id": 122,
  "number_in_job": 1
}
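
The run-now help text shown earlier lists a --notebook-params option that takes a JSON map. Quoting that JSON by hand is error-prone, so the following sketch builds the string from shell variables with printf and prints the resulting command instead of running it. The variable names and values are illustrative, taken from the help text example, and this naive formatting assumes the values contain no double quotes or backslashes.

```shell
# Illustrative values, matching the example in the run-now help text.
name="john doe"
age=35

# Build the JSON map; safe only for values without quotes or backslashes.
notebook_params="$(printf '{"name": "%s", "age": %s}' "$name" "$age")"

# Print the command rather than running it.
echo "databricks jobs run-now --job-id 246 --notebook-params '$notebook_params'"
```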
