Hi @Bhargav R
Greetings & Welcome to Microsoft Q&A forum! Thanks for posting your query!
Databricks doesn't offer a simple out-of-the-box way to connect to another workspace's metastore tables with a service principal (there is no dedicated "cross-workspace Databricks connector" for this). You generally connect to a Databricks workspace using either a Personal Access Token (PAT) or an Azure AD token. Connecting to another workspace's metastore via a service principal isn't explicitly documented, because the tooling typically relies on PATs or on the databricks-sdk (with a service principal in Azure). You can still use a service principal through Azure Active Directory-based authentication, but you'll need to acquire an Azure AD token for it, as I mentioned in my previous response.
To read a table from another Databricks workspace (for example, one in a different Databricks account) using a service principal from your own Databricks workspace, follow the steps below. You authenticate via Azure Active Directory (Azure AD) as the service principal and then read the data with an appropriate method.
- Create and Configure a Service Principal
- Grant Permissions in the Target Workspace
- Obtain an Azure AD Token Using the Service Principal
- Use the Databricks REST API or JDBC to Access the Table in the Other Workspace
Create and Configure the Service Principal:
- Create a Service Principal in Azure Active Directory:
- Go to Azure Portal > Azure Active Directory > App registrations > New registration.
- Under Certificates & secrets, create a client secret, then record the Client ID, Tenant ID, and Client Secret for the service principal, as you'll use these to request the Azure AD token.
- Assign the Service Principal Permissions in the Target Databricks Workspace:
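- In the target Databricks workspace, open the admin settings (Admin Console) and add the service principal on the Service principals tab rather than as a regular user.
- Add the Service Principal by its Client ID (application ID) and assign only the entitlements it needs (for example, Databricks SQL access or Workspace access). Avoid granting workspace admin rights unless they are really required.
- The service principal must be allowed to use the SQL warehouse (or cluster) that runs the query, and it needs read access to the Delta tables and any underlying resources.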
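If you prefer to script this step instead of using the UI, the workspace SCIM API can register the service principal. The snippet below is only a minimal sketch: the function names are my own, the workspace URL and admin token are placeholders you supply, and the entitlement value shown in the usage note is an assumption you should check against the SCIM API documentation for your workspace.

```python
import json
import urllib.request

# SCIM schema identifier used for service principal resources.
SCIM_SP_SCHEMA = "urn:ietf:params:scim:schemas:core:2.0:ServicePrincipal"


def build_service_principal_payload(application_id, display_name, entitlements=()):
    """Build the SCIM body that registers an Azure AD app as a service principal."""
    return {
        "schemas": [SCIM_SP_SCHEMA],
        "applicationId": application_id,  # the Client ID from the app registration
        "displayName": display_name,
        "entitlements": [{"value": name} for name in entitlements],
    }


def add_service_principal(workspace_url, admin_token, payload):
    """POST the payload to the target workspace (the caller must be a workspace admin)."""
    request = urllib.request.Request(
        f"{workspace_url.rstrip('/')}/api/2.0/preview/scim/v2/ServicePrincipals",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {admin_token}",
            "Content-Type": "application/scim+json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

For example, `build_service_principal_payload("<client-id>", "cross-workspace-reader", ["databricks-sql-access"])` produces a body that requests Databricks SQL access only. Table-level permissions still have to be granted separately in the target workspace.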
Grant Required Permissions for Azure Resources (if using storage accounts):
- If the Delta tables in the target workspace rely on Azure resources like Azure Data Lake Storage (ADLS) or Azure Blob Storage, you also need to ensure that the service principal has the appropriate Azure Storage permissions (e.g., Storage Blob Data Reader).
Obtain an Azure AD Token Using the Service Principal:
- To authenticate the service principal, you need to acquire an Azure Active Directory (AAD) token. This is done by making a request to the Microsoft Identity Platform token endpoint using the service principal’s client ID, client secret, and the tenant ID.
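As a rough illustration of the token request, here is a minimal sketch using the OAuth 2.0 client credentials flow with only the Python standard library. The function names are my own. The scope uses the application ID of the Azure Databricks resource (`2ff814a6-3304-4ab8-85cd-0ef3c5d3d7b8`). In practice you would load the client secret from a secret scope or Key Vault rather than hard-coding it.

```python
import json
import urllib.parse
import urllib.request

# Application ID of the Azure Databricks resource in Azure AD.
AZURE_DATABRICKS_RESOURCE_ID = "2ff814a6-3304-4ab8-85cd-0ef3c5d3d7b8"


def build_token_request(tenant_id, client_id, client_secret):
    """Return (url, form_body) for a client-credentials token request."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    form = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": f"{AZURE_DATABRICKS_RESOURCE_ID}/.default",
    }
    return url, urllib.parse.urlencode(form).encode("utf-8")


def get_aad_token(tenant_id, client_id, client_secret):
    """Call the token endpoint and return the bearer token string."""
    url, body = build_token_request(tenant_id, client_id, client_secret)
    # Passing bytes as data makes urllib send a form-encoded POST.
    request = urllib.request.Request(url, data=body, method="POST")
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    return payload["access_token"]
```

The returned token is short-lived, so long-running jobs should request a fresh one instead of caching it indefinitely.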
Use the Databricks REST API or JDBC to Access the Table:
- Once you have the access token, you can use it to authenticate against the Databricks REST API or JDBC connector to read from or write to the Delta table in the target workspace.
- Use the Databricks JDBC Connector - If you're working with Delta tables directly in Spark (in your Databricks notebook), you can use the JDBC connector to access the tables in the other workspace.
- Use Databricks REST API - You can also interact with Databricks REST APIs to access Delta tables in the target workspace, typically using the SQL Queries API to execute queries or interact with tables.
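To make the two options a bit more concrete, here is a minimal sketch of both. The helper names are my own, and the hostname, HTTP path, and warehouse ID are placeholders for values from the target workspace. The JDBC URL assumes the Databricks JDBC driver's OAuth token pass-through settings (`AuthMech=11;Auth_Flow=0`), and the REST helper assumes the SQL Statement Execution API (`/api/2.0/sql/statements`) with its default inline JSON result format. Please verify both against the driver and API versions you use.

```python
import json
import urllib.request


def build_jdbc_url(server_hostname, http_path, access_token):
    """JDBC URL for the Databricks driver, passing the Azure AD token through."""
    return (
        f"jdbc:databricks://{server_hostname}:443;"
        f"httpPath={http_path};"
        "AuthMech=11;Auth_Flow=0;"
        f"Auth_AccessToken={access_token}"
    )


def build_statement_request(workspace_url, warehouse_id, statement, access_token):
    """Build the POST request that runs one SQL statement on a SQL warehouse."""
    payload = {
        "warehouse_id": warehouse_id,
        "statement": statement,
        "wait_timeout": "30s",  # wait synchronously for up to 30 seconds
    }
    return urllib.request.Request(
        f"{workspace_url.rstrip('/')}/api/2.0/sql/statements",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def run_statement(workspace_url, warehouse_id, statement, access_token):
    """Execute the statement and return its rows as a list of lists."""
    request = build_statement_request(workspace_url, warehouse_id, statement, access_token)
    with urllib.request.urlopen(request, timeout=60) as response:
        body = json.load(response)
    state = body["status"]["state"]
    if state != "SUCCEEDED":
        # PENDING or RUNNING here means the query outlived wait_timeout.
        raise RuntimeError(f"Statement finished in state {state}")
    return body.get("result", {}).get("data_array", [])
```

In a notebook, the JDBC URL would typically be passed to `spark.read.format("jdbc")` together with the table name and the driver class, while `run_statement` suits small result sets where you just need the rows back in Python.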
Alternatives:
- Create External Tables in Your Workspace - If you want to reference the Delta table in the other workspace without copying the data, you can create an external table in your Databricks workspace that points to the storage location of that Delta table. This works only if the Delta table is in a shared storage location (e.g., Azure Data Lake or Blob Storage) that your workspace is allowed to read.
Reference link: Use the Databricks connector to connect to another Databricks workspace
- Azure Databricks supports sharing feature tables across multiple workspaces. For example, from your own workspace, you can create, write to, or read from a feature table in a centralized feature store. This is useful when multiple teams share access to feature tables or when your organization has multiple workspaces to handle different stages of development.
Reference link: Share feature tables across workspaces (legacy)
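For the external table alternative mentioned above, the statement could look roughly like the sketch below. The helper name, table name, and storage path are hypothetical placeholders, and it assumes your workspace already has read access to the storage location that holds the Delta table.

```python
def build_external_table_sql(table_name, storage_location):
    """Return a CREATE TABLE statement that points at an existing Delta location."""
    if "'" in storage_location:
        # Keep the sketch simple: refuse paths that would break the SQL literal.
        raise ValueError("storage_location must not contain single quotes")
    return (
        f"CREATE TABLE IF NOT EXISTS {table_name} "
        f"USING DELTA LOCATION '{storage_location}'"
    )


# Hypothetical usage in a notebook (replace the placeholders with real values):
# spark.sql(build_external_table_sql(
#     "<catalog>.<schema>.<table>",
#     "abfss://<container>@<storage-account>.dfs.core.windows.net/<path>",
# ))
```

Because the table only references the files in place, changes written by the other workspace become visible without copying any data.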
For details, please refer to the below links:
I hope this information helps. Please do let us know if you have any further queries.
Thank you.