Run notebooks in clean rooms

Important

This feature is in Public Preview.

This article describes how to run notebooks in clean rooms. Notebooks are the interface that collaborators use to analyze data together in a clean room.

To learn how to add a notebook to a clean room, see Create clean rooms.

Before you begin

To run a notebook in a clean room, you must meet both of the following requirements:

  • You are the owner of the clean room or have the EXECUTE CLEAN ROOM TASK privilege on the clean room.
  • You are a collaborator who did not create the notebook. The notebook creator cannot run the notebook. This restriction ensures that both parties implicitly approve the notebook.

Note

The clean room creator is automatically assigned ownership of the clean room in their Databricks account. The collaborator organization’s metastore admin is automatically assigned ownership of the clean room in their Databricks account. You can transfer ownership. See Manage Unity Catalog object ownership.

Run a notebook in a clean room

To run a notebook in a clean room, you must use Catalog Explorer.

  1. In your Azure Databricks workspace, click Catalog.

  2. At the top of the Catalog pane, click the gear icon and select Clean Rooms.

    Alternatively, from the Quick access page, click the Clean Rooms > button.

  3. Select the clean room from the list.

  4. Under Notebooks, click the notebook to open it in preview mode.

  5. Click the Run button.

    You can only run notebooks that the other collaborator has shared.

  6. (Optional) On the Run notebook with parameters dialog, click + Add to pass parameter values to the notebook job task.

    For more information about parameters for job tasks, see Parameterize jobs.

  7. Click the confirmation checkbox.

  8. Click Run.

  9. Click See details to view the progress of the run.

    Alternatively, you can view run progress by going to Runs on this page or by clicking Workflows in the workspace sidebar and going to the Job runs tab.

  10. View the results of the notebook run.

    The notebook results appear after the run is complete. To view past runs, go to Runs and click the link in the Start time column.
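Parameter values added in the Run notebook with parameters dialog are passed to the notebook job task. The following is a minimal sketch of how a notebook might read such a value, assuming parameters are exposed through widgets as they are for other notebook job tasks. The helper name read_notebook_parameter and the fallback behavior are illustrative assumptions, not part of the product.

```python
def read_notebook_parameter(dbutils, name, default=None):
    """Return the value of a job parameter, or `default` if it was not passed.

    `dbutils` is the object that Databricks notebooks provide automatically.
    Assumption: the parameter is exposed as a widget with the same name.
    """
    try:
        return dbutils.widgets.get(name)
    except Exception:
        # dbutils.widgets.get raises an error when no widget with this
        # name is defined, for example when the parameter was not added
        # in the dialog. Fall back to the caller's default in that case.
        return default
```

In a notebook, a call such as read_notebook_parameter(dbutils, "region", default="all") would return the value entered in the dialog, where "region" is a hypothetical parameter name.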

Share notebook output using output tables

Output tables are temporary read-only tables generated by a notebook run and shared to the notebook runner’s metastore. If the notebook creates an output table, the notebook runner can access it in an output catalog and share it with other users in their workspace. See Create and work with output tables in Databricks Clean Rooms.

Use Azure Databricks Workflows to run clean room notebooks

You can use Azure Databricks jobs to run notebooks and perform tasks on output tables, which lets you build complex workflows that involve your clean room assets. The Clean Rooms notebook task type and task values, both used in the example that follows, make such workflows possible.

For example, you can create a workflow that propagates the dynamically generated output schema name across tasks by doing the following:

  1. Create a task of type Clean Rooms notebook. The notebook that this task runs must include the following task value setting:

    dbutils.jobs.taskValues.set(key="output_schema", value=dbutils.widgets.get("cr_output_schema"))
    
  2. Create a subsequent task that references the output_schema value to process the output.
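
As a rough sketch of step 2, a notebook in the subsequent task could read the task value with dbutils.jobs.taskValues.get and use it to build a table name. The helper names, the upstream task key, and the catalog and table names shown here are hypothetical placeholders. Substitute the values from your own job and output catalog.

```python
def read_output_schema(dbutils, upstream_task_key):
    """Return the output schema name published by the upstream task.

    `upstream_task_key` is the name of the Clean Rooms notebook task in
    the job (a placeholder here). `debugValue` is returned instead when
    the notebook runs interactively, outside of a job.
    """
    return dbutils.jobs.taskValues.get(
        taskKey=upstream_task_key,
        key="output_schema",
        debugValue="debug_output_schema",
    )


def qualified_table_name(catalog, schema, table):
    """Join catalog, schema, and table into a three-level name."""
    return f"{catalog}.{schema}.{table}"


# Example use inside a Databricks notebook (all names are hypothetical):
# schema = read_output_schema(dbutils, "run_clean_room_notebook")
# df = spark.table(qualified_table_name("my_output_catalog", schema, "my_output_table"))
```

Passing dbutils in as an argument keeps the helpers easy to exercise outside a Databricks notebook with a stand-in object.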