Run notebooks in clean rooms
Important
This feature is in Public Preview.
This article describes how to run notebooks in clean rooms. Notebooks are the interface that collaborators use to run data analyses on clean room data.
To learn how to add a notebook to a clean room, see Create clean rooms.
Before you begin
To run a notebook in a clean room, you must be:
- The owner of the clean room, or have the EXECUTE CLEAN ROOM TASK privilege on the clean room.
- A collaborator who did not create the notebook. The notebook creator cannot run the notebook, which enforces implicit approval of the notebook by both parties.
Note
The creator is automatically assigned as the owner of the clean room in their Databricks account. The collaborator organization’s metastore admin is automatically assigned ownership of the clean room in their Databricks account. You can transfer ownership. See Manage Unity Catalog object ownership.
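If the runner is not the clean room owner, the owner can grant the required privilege. A grant along these lines is a sketch: `my_clean_room` and the principal are placeholder names, so verify the exact syntax against the Unity Catalog privileges reference.

```sql
-- Grant a user the ability to run notebooks in the clean room.
-- `my_clean_room` and the principal below are placeholders.
GRANT EXECUTE CLEAN ROOM TASK ON CLEAN ROOM my_clean_room TO `analyst@example.com`;
```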
Run a notebook in a clean room
To run a notebook in a clean room, you must use Catalog Explorer.

1. In your Azure Databricks workspace, click Catalog.
2. At the top of the Catalog pane, click the gear icon and select Clean Rooms.

   Alternatively, from the Quick access page, click the Clean Rooms > button.
3. Select the clean room from the list.
4. Under Notebooks, click the notebook to open it in preview mode.
5. Click the Run button.

   You can only run notebooks that the other collaborator has shared.
6. (Optional) On the Run notebook with parameters dialog, click + Add to pass parameter values to the notebook job task.

   For more information about parameters for job tasks, see Parameterize jobs.
7. Click the confirmation checkbox.
8. Click Run.
9. Click See details to view the progress of the run.

   Alternatively, you can view run progress by going to Runs on this page or by clicking Workflows in the workspace sidebar and going to the Job runs tab.
10. View the results of the notebook run.

    The notebook results appear after the run is complete. To view past runs, go to Runs and click the link in the Start time column.
Share notebook output using output tables
Output tables are temporary read-only tables generated by a notebook run and shared to the notebook runner’s metastore. If the notebook creates an output table, the notebook runner can access it in an output catalog and share it with other users in their workspace. See Create and work with output tables in Databricks Clean Rooms.
Use Azure Databricks Workflows to run clean room notebooks
You can use Azure Databricks jobs to run notebooks and perform tasks on output tables, enabling you to build complex workflows that involve your clean room assets. These features in particular make such workflows possible:
- The Clean Room notebook task type lets you select and run a clean room notebook as a Workflows task. See Clean Room notebook task for jobs.
- Workflow-initiated notebook runs can generate output tables that can be referenced by other workflow tasks. See Create and work with output tables in Databricks Clean Rooms.
- Workflows can use task values to pass job parameter values to clean room notebooks, or to capture clean room notebook output and pass it to other workflow tasks. See Use task values to pass information between tasks.
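For reference, a Clean Rooms notebook task in a job definition might look roughly like the following JSON fragment. This is a sketch: the `clean_rooms_notebook_task` field and its contents are based on the public Jobs API, and names such as `my_clean_room` and `audience_overlap` are placeholders, so verify field names against the current Jobs API reference.

```json
{
  "name": "clean-room-analysis",
  "tasks": [
    {
      "task_key": "run_clean_room_notebook",
      "clean_rooms_notebook_task": {
        "clean_room_name": "my_clean_room",
        "notebook_name": "audience_overlap",
        "notebook_base_parameters": {
          "cr_output_schema": "output_run_001"
        }
      }
    }
  ]
}
```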
For example, you can create a workflow that propagates the dynamically generated output schema name across tasks:

1. Create a task of type Clean Rooms notebook that runs a notebook containing the following task value setting:

   ```python
   dbutils.jobs.taskValues.set(key="output_schema", value=dbutils.widgets.get("cr_output_schema"))
   ```

2. Create a subsequent task that references the `output_schema` value to process the output.
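The handoff between the two tasks can be sketched end to end with a small in-memory stand-in for `dbutils.jobs.taskValues`, so the pattern can be run outside Databricks. The stub class, task key, and schema name below are illustrative assumptions.

```python
# Sketch of the task-values handoff described above, using an in-memory
# stand-in for dbutils.jobs.taskValues (illustration only; on Databricks
# the real object is dbutils.jobs.taskValues).

class _TaskValuesStub:
    """Hypothetical stand-in for dbutils.jobs.taskValues."""

    def __init__(self):
        self._store = {}

    def set(self, key, value):
        self._store[key] = value

    def get(self, taskKey, key, default=None):
        return self._store.get(key, default)

task_values = _TaskValuesStub()

# Task 1: the clean room notebook publishes its output schema name.
task_values.set(key="output_schema", value="clean_room_output_abc123")

# Task 2: a downstream task reads the value to locate the output tables.
schema = task_values.get(taskKey="run_clean_room_notebook", key="output_schema")
print(f"Reading output tables from schema: {schema}")
```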