How to set up secrets in Azure Databricks Cluster configs?

Swathi 0 Reputation points
2025-02-19T16:41:50.09+00:00

Hi there,

I am currently working on setting up Spark configurations within our Databricks cluster to access Azure Data Lake Storage (ADLS) using OAuth and a service principal. Our goal is to configure these settings at the cluster level, so that we can directly access ADLS without the need for mounting it in Databricks.

Below I have shared the configs used to set up cluster,

fs.azure.account.auth.type.<storage-account>.dfs.core.windows.net OAuth
fs.azure.account.oauth.provider.type.<storage-account>.dfs.core.windows.net org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider
fs.azure.account.oauth2.client.id.<storage-account>.dfs.core.windows.net {{secrets/<secret-scope>/<client-id-key>}}
fs.azure.account.oauth2.client.secret.<storage-account>.dfs.core.windows.net {{secrets/<secret-scope>/<client-secret-key>}}
fs.azure.account.oauth2.client.endpoint.<storage-account>.dfs.core.windows.net https://login.microsoftonline.com/{{secrets/<secret-scope>/<tenant-id-key>}}/oauth2/token

When we pass the hard-coded values into the cluster config, I am able to access the files in ADLS perfectly. However, if I pass them as secret references (client-id, client-secret, and tenant-id), I get an error when I access ADLS.

Error message:

(our actual secret scope and key names are replaced below with the placeholder syntax secrets/<secret-scope>/<tenant-id-key>)

shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.oauth2.AzureADAuthenticator$HttpException: HTTP Error -1; url='https://login.microsoftonline.com/{{secrets/<secret-scope>/<tenant-id-key>}}/oauth2/token' AzureADAuthenticator.getTokenCall threw java.io.FileNotFoundException : https://login.microsoftonline.com/{{secrets/<secret-scope>/<tenant-id-key>}}/oauth2/token

  1. How to set up secrets in cluster configs?
  2. Is this related to enabling the dbutils.secrets.get functionality in our Databricks cluster to securely access the secrets? If so, will it solve the issue?

1 answer

  1. Chandra Boorla 9,120 Reputation points Microsoft Vendor
    2025-02-20T02:38:08.2166667+00:00

    Hi @Swathi

    Thank you for posting your query!

    The issue you are facing occurs because Databricks expands a {{secrets/<scope>/<key>}} reference in a cluster Spark configuration only when the reference is the entire property value. Your client-id and client-secret entries satisfy that rule, but the endpoint entry embeds the reference in the middle of a URL, so it is passed through literally - which is exactly why the error message shows the unexpanded placeholder inside the login URL.

    Here's how to properly set up your ADLS access with secrets:

    How to set up secrets in cluster configs?

    Option 1: Reference secrets directly in the cluster Spark config - Secret references do work at the cluster level, provided each reference is the entire property value. Your client-id and client-secret entries already meet that rule; only the endpoint entry breaks it, because the tenant-ID reference sits inside a longer URL. Since a tenant ID is an identifier rather than a credential, the usual fix is to write it into the endpoint URL directly, as shown in the sketch below.
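
    A minimal sketch of the corrected cluster Spark config, assuming the tenant ID itself does not need to be kept secret (all angle-bracket values are placeholders for your own):

    fs.azure.account.auth.type.<storage-account>.dfs.core.windows.net OAuth
    fs.azure.account.oauth.provider.type.<storage-account>.dfs.core.windows.net org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider
    fs.azure.account.oauth2.client.id.<storage-account>.dfs.core.windows.net {{secrets/<secret-scope>/<client-id-key>}}
    fs.azure.account.oauth2.client.secret.<storage-account>.dfs.core.windows.net {{secrets/<secret-scope>/<client-secret-key>}}
    fs.azure.account.oauth2.client.endpoint.<storage-account>.dfs.core.windows.net https://login.microsoftonline.com/<tenant-id>/oauth2/token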

    Option 2: Set the configurations at session level with dbutils.secrets.get() - If you prefer to keep the tenant ID in the secret scope, retrieve all three values at runtime and set the configurations at the start of each notebook or job. Note that dbutils.secrets is not available inside cluster init scripts, so the snippet below belongs in the notebook or job itself, not in an init script:

    from pyspark.sql import SparkSession
    
    # Obtain dbutils whether this runs in a notebook or as a Python job
    def get_dbutils():
        try:
            from pyspark.dbutils import DBUtils
            return DBUtils(SparkSession.builder.getOrCreate())
        except ImportError:
            import IPython
            return IPython.get_ipython().user_ns["dbutils"]
    
    dbutils = get_dbutils()
    
    # Retrieve secrets from the Databricks secret scope at runtime
    client_id = dbutils.secrets.get(scope="<secret-scope>", key="<client-id-key>")
    client_secret = dbutils.secrets.get(scope="<secret-scope>", key="<client-secret-key>")
    tenant_id = dbutils.secrets.get(scope="<secret-scope>", key="<tenant-id-key>")
    
    # Get the Spark session
    spark = SparkSession.builder.getOrCreate()
    
    # Non-secret settings (these can also stay in the cluster's Spark config)
    spark.conf.set("fs.azure.account.auth.type.<storage-account>.dfs.core.windows.net", "OAuth")
    spark.conf.set("fs.azure.account.oauth.provider.type.<storage-account>.dfs.core.windows.net",
                   "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
    
    # Secret-dependent settings, resolved at runtime instead of hardcoded
    spark.conf.set("fs.azure.account.oauth2.client.id.<storage-account>.dfs.core.windows.net", client_id)
    spark.conf.set("fs.azure.account.oauth2.client.secret.<storage-account>.dfs.core.windows.net", client_secret)
    spark.conf.set("fs.azure.account.oauth2.client.endpoint.<storage-account>.dfs.core.windows.net",
                   f"https://login.microsoftonline.com/{tenant_id}/oauth2/token")

    Important - Replace <secret-scope>, <client-id-key>, <client-secret-key>, <tenant-id-key>, and <storage-account> with your actual values.

    Configure Your Cluster - If you go with Option 1, enter the settings under "Advanced Options" -> "Spark" in the cluster configuration and restart the cluster so they take effect.

    Verify the Secret Scope - Double-check that your Secret Scope is correctly set up in Databricks and contains the correct client ID, client secret, and tenant ID.
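
    To sanity-check the scope from a notebook, dbutils provides listing helpers; a quick sketch (scope and key names are placeholders):

    # List the secret scopes visible to you
    print(dbutils.secrets.listScopes())
    
    # List the keys stored inside your scope
    print(dbutils.secrets.list("<secret-scope>"))
    
    # Secret values are redacted in notebook output if you try to print them
    print(dbutils.secrets.get(scope="<secret-scope>", key="<client-id-key>"))  # prints [REDACTED]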

    Is this related to enabling the dbutils.secrets.get functionality in our Databricks cluster to securely access the secrets? If so, will it solve the issue?

    Why This Works

    No - dbutils.secrets.get() does not need to be enabled separately; it is available on any cluster as long as you have read permission on the secret scope, so enabling it is not the fix by itself. What matters is where the secrets are resolved: with Option 1, Databricks expands the whole-value secret references when the cluster starts; with Option 2, dbutils.secrets.get() retrieves the values at runtime, and they are redacted in notebook output. Either way, your cluster can access ADLS securely without hardcoding credentials.
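
    Once the configurations are in place via either option, a quick way to confirm end-to-end access is to list or read a path directly (the container and folder below are hypothetical placeholders):

    # Direct access without mounting; replace the placeholders with real values
    path = "abfss://<container>@<storage-account>.dfs.core.windows.net/<folder>"
    display(dbutils.fs.ls(path))  # list the directory
    spark.read.text(path + "/<file>.txt").show(5)  # or read a sample file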

    I hope this information helps. Please do let us know if you have any further queries.


    If this answers your query, do click "Accept Answer" and "Yes" for "Was this answer helpful". And if you have any further queries, do let us know.

