Authorizing access to Azure Databricks resources
This article describes the basic approaches for making secure Azure Databricks CLI or REST API calls using Azure Databricks account credentials, such as user accounts or service principals.
Authorization for the Azure Databricks CLI and APIs
To access an Azure Databricks resource with the Databricks CLI or REST APIs, clients must authorize using an Azure Databricks account. This account must have permissions to access the resource, which can be configured by your Azure Databricks administrator or a user account with administrator privileges.
There are two types of accounts that you can use, depending on how you intend to access your Azure Databricks resources:
- User account: Use this to interactively enter Azure Databricks CLI commands or REST API calls.
- Service principal: Use this to automate Azure Databricks CLI commands or REST API calls without human interaction.
Once you have decided on the Azure Databricks account type, you must acquire an access token that represents the account’s credentials. You will provide this access token when accessing the account’s resources in your scripts or code, or in interactive sessions.
- You can also use a Microsoft Entra ID service principal to authorize access to your Azure Databricks account or workspace. However, Databricks recommends that you use a Databricks service principal with Databricks OAuth authorization instead of Microsoft Entra ID service principal authorization, because Databricks OAuth uses access tokens that are more robust when you authorize only with Azure Databricks.
For more details on using an MS Entra service principal to access Databricks resources, see MS Entra service principal authentication.
Acquire an access token
Your account’s credentials are represented by a secure access token, which you provide either directly or indirectly to the CLI command or API call.
To securely run a Databricks CLI command or API request that requires authorized access to an account or workspace, you must provide an access token based on valid Azure Databricks account credentials.
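For example, here is a minimal sketch of providing a token directly on a workspace-level REST API request, assuming a placeholder workspace URL and a token already exported in the DATABRICKS_TOKEN environment variable:

```python
import os

import requests

# Placeholder workspace URL; replace it with your own Azure Databricks workspace URL.
host = "https://adb-1234567890123456.7.azuredatabricks.net"

# Read the access token from an environment variable instead of hard-coding it.
token = os.environ["DATABRICKS_TOKEN"]

# Call a workspace-level REST API (the Clusters API) with the token as a bearer credential.
response = requests.get(
    f"{host}/api/2.0/clusters/list",
    headers={"Authorization": f"Bearer {token}"},
)
response.raise_for_status()
print(response.json())
```

The rest of this article describes how to acquire such a token and how to provide it indirectly, through environment variables or a configuration profile, instead of pasting it into each request.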
The following table shows the authorization methods available to your Azure Databricks account.
Azure Databricks authorization methods
Because Azure Databricks tools and SDKs work with one or more supported Azure Databricks authorization methods, you can select the best authorization method for your use case. For details, see the tool or SDK documentation in Local development tools.
Azure Databricks users may also require access to Azure-specific resources that are not directly managed under your Databricks account. The methods for accessing those resources are included in this table as well. For Azure resource access, you use an Azure managed service identity (MSI) or a Microsoft Entra ID identity (depending on your scenario), not your Azure Databricks account credentials.
Method | Description | Use case |
---|---|---|
Databricks OAuth for service principals | Short-lived OAuth tokens for service principals. | Unattended authorization scenarios, such as fully automated and CI/CD workflows. |
Databricks OAuth for users | Short-lived OAuth tokens for users. | Attended authorization scenarios, where you use your web browser or other interactive method to authorize with Databricks, when prompted. |
Databricks personal access tokens (PATs) | Short-lived or long-lived tokens for users or service principals. | Only use this in cases where your target tool does not support OAuth. |
Azure managed service identity authorization | Microsoft Entra ID tokens for Azure managed identities. | Use only with Azure resources that support managed identities, such as Azure virtual machines. |
Microsoft Entra ID service principal authorization | Microsoft Entra ID tokens for Microsoft Entra ID service principals. | Use only with Azure resources that support Microsoft Entra ID tokens and do not support managed identities, such as Azure DevOps. |
Azure CLI authorization | Microsoft Entra ID tokens for users or Microsoft Entra ID service principals. | Use to authorize access to Azure resources and Azure Databricks using the Azure CLI. |
Microsoft Entra ID user authorization | Microsoft Entra ID tokens for users. | Use only with Azure resources that only support Microsoft Entra ID tokens. Databricks does not recommend that you create Microsoft Entra ID tokens for Azure Databricks users manually. |
What authorization option should I choose?
Azure Databricks provides two options for authorization or authentication with an access token:
- OAuth 2.0-based access tokens.
- Personal access tokens (PATs).
Note
Azure Databricks strongly recommends that you use OAuth rather than PATs for authorization, because OAuth tokens are refreshed automatically by default and do not require you to manage the access token directly. This improves your security against token hijacking and unwanted access.
Because OAuth creates and manages the access token for you, you provide an OAuth token endpoint URL, a client ID, and a secret you generate from your Azure Databricks workspace, rather than providing a token string directly. Choose PATs only when you are integrating a third-party tool or service that is not supported by Azure Databricks unified client authentication or that has no OAuth support.
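As a sketch of the difference, the following uses the Databricks SDK for Python, which implements unified client authentication. The host, client ID, client secret, and token values are placeholders:

```python
from databricks.sdk import WorkspaceClient

# OAuth for a service principal: the SDK exchanges the client ID and client secret
# for short-lived OAuth access tokens and refreshes them automatically.
w_oauth = WorkspaceClient(
    host="https://adb-1234567890123456.7.azuredatabricks.net",  # placeholder
    client_id="<service-principal-client-id>",
    client_secret="<service-principal-client-secret>",
)

# Personal access token (PAT): you create, store, and rotate the token yourself.
# Choose this only when the tool or service does not support OAuth.
w_pat = WorkspaceClient(
    host="https://adb-1234567890123456.7.azuredatabricks.net",  # placeholder
    token="<personal-access-token>",
)

# Both clients expose the same APIs once authorized.
print(w_oauth.current_user.me().user_name)
```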
How do I use OAuth to authorize access to Azure Databricks resources?
Azure Databricks provides unified client authentication to assist you with authorization by using a default set of environment variables you can set to specific credential values. This helps you work more easily and securely since these environment variables are specific to the environment that will be running the Azure Databricks CLI commands or calling Azure Databricks APIs.
- For user account authorization, the authentication part of Azure Databricks OAuth (the creation and management of access tokens) is handled for you with Databricks unified client authentication, as long as the tools and SDKs implement its standard. If they don’t, you can manually generate an OAuth code verifier and challenge pair to use directly in your Azure Databricks CLI commands and API requests, as shown in the sketch after this list. See Step 1: Generate an OAuth code verifier and code challenge pair.
- For service principal authorization, Azure Databricks OAuth requires that the caller provide client credentials along with a token endpoint URL where the request can be authorized. (This is handled for you if you use Azure Databricks tools and SDKs that support Databricks unified client authentication.) The credentials include a unique client ID and client secret. The client, which is the Databricks service principal that will run your code, must be assigned to Databricks workspaces. After you assign the service principal to the workspaces it will access, you are provided with a client ID and a client secret that you will set with specific environment variables.
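If your tools and SDKs do not implement unified client authentication, the following is a minimal sketch of generating an OAuth code verifier and code challenge pair (the S256 PKCE method) for the user authorization flow described in the first item above:

```python
import base64
import hashlib
import secrets

# Generate a high-entropy, URL-safe code verifier.
code_verifier = secrets.token_urlsafe(64)

# The code challenge is the unpadded, URL-safe Base64 encoding of the
# SHA-256 hash of the verifier.
digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
code_challenge = base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")

print(code_verifier)
print(code_challenge)
```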
The environment variables for OAuth authorization are as follows:
Environment variable | Description |
---|---|
DATABRICKS_HOST | Set this to the URL of either your Azure Databricks account console (https://accounts.azuredatabricks.net) or your Azure Databricks workspace (https://adb-{workspace-id}.{random-number}.azuredatabricks.net). Choose the host URL type based on the operations your code performs: for Azure Databricks account-level CLI commands or REST API requests, set this variable to your account console URL; for workspace-level CLI commands or REST API requests, use your workspace URL. |
DATABRICKS_ACCOUNT_ID | Used for Azure Databricks account operations. This is your Azure Databricks account ID. To get it, see Locate your account ID. |
DATABRICKS_CLIENT_ID | (Service principal OAuth only) The client ID you were assigned when creating your service principal. |
DATABRICKS_CLIENT_SECRET | (Service principal OAuth only) The client secret you generated when creating your service principal. |
You can set these directly, or through the use of a Databricks configuration profile (.databrickscfg) on your client machine.
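For example, once these environment variables are set for a service principal, a tool that supports unified client authentication needs no credentials in your code. A minimal sketch using the Databricks SDK for Python:

```python
from databricks.sdk import WorkspaceClient

# With DATABRICKS_HOST, DATABRICKS_CLIENT_ID, and DATABRICKS_CLIENT_SECRET set,
# unified client authentication discovers the credentials automatically, and the
# SDK creates and refreshes short-lived OAuth tokens for you.
w = WorkspaceClient()

# Any authorized workspace-level call now uses those OAuth tokens.
print(w.current_user.me().user_name)
```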
To use an OAuth access token, your Azure Databricks workspace or account administrator must have granted your user account or service principal the CAN USE privilege for the account and workspace features your code will access.
For more details on configuring OAuth authorization for your client and to review cloud provider-specific authorization options, see Unified client authentication.
Authentication for third-party services and tools
If you are writing code that accesses third-party services, tools, or SDKs, you must use the authentication and authorization mechanisms provided by that third party. However, if you must grant a third-party tool, SDK, or service access to your Azure Databricks account or workspace resources, Databricks provides the following support:
- Databricks Terraform Provider: This tool can access Azure Databricks APIs from Terraform on your behalf, using your Azure Databricks user account. For more details, see Provision a service principal by using Terraform.
- Git providers such as GitHub, GitLab, and Bitbucket can access Azure Databricks APIs using a Databricks service principal. For more details, see Service principals for CI/CD.
- Jenkins can access Azure Databricks APIs using a Databricks service principal. For more details, see CI/CD with Jenkins on Azure Databricks.
- Azure DevOps can access Azure Databricks APIs using a Microsoft Entra ID-based service principal. For more details, see Authenticate with Azure DevOps on Databricks.
Azure Databricks configuration profiles
An Azure Databricks configuration profile contains settings and other information that Azure Databricks needs to authorize access. Azure Databricks configuration profiles are stored in local client files for your tools, SDKs, scripts, and apps to use. The standard configuration profile file is named .databrickscfg.
For more information, see Azure Databricks configuration profiles.
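For example, a tool or SDK that supports unified client authentication can select a named profile from .databrickscfg rather than reading environment variables. A minimal sketch using the Databricks SDK for Python, assuming a hypothetical profile named MY-PROFILE:

```python
from databricks.sdk import WorkspaceClient

# "MY-PROFILE" is a hypothetical profile name; the SDK reads the host and
# credentials for that profile from your local .databrickscfg file.
w = WorkspaceClient(profile="MY-PROFILE")

print(w.current_user.me().user_name)
```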