Test Databricks notebooks

This page briefly describes some techniques that are useful when testing code directly in Databricks notebooks. You can use these methods separately or together.

For a detailed walkthrough of how to set up and organize functions and unit tests in Databricks notebooks, see Unit testing for notebooks.

Many unit testing libraries work directly within the notebook. For example, you can use the built-in Python unittest package to test notebook code.

def reverse(s):
    return s[::-1]

import unittest

class TestHelpers(unittest.TestCase):
    def test_reverse(self):
        self.assertEqual(reverse('abc'), 'cba')

r = unittest.main(argv=[''], verbosity=2, exit=False)
assert r.result.wasSuccessful(), 'Test failed; see logs above'

Test failures appear in the output area of the cell.

[Screenshot: Unit test failure]
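When a notebook defines several test classes, unittest.main runs all of them. The following is a minimal sketch, assuming you want to run only one class: it loads the class with unittest.TestLoader and runs it with unittest.TextTestRunner, both from the standard library. The helper name run_test_class and the extra test_reverse_empty test are hypothetical and added for illustration only.

```python
import unittest


def reverse(s):
    return s[::-1]


class TestHelpers(unittest.TestCase):
    def test_reverse(self):
        self.assertEqual(reverse('abc'), 'cba')

    def test_reverse_empty(self):
        # Extra case added for illustration: the empty string is its own reverse.
        self.assertEqual(reverse(''), '')


def run_test_class(test_case_class):
    """Run a single TestCase class and return the unittest result object.

    Hypothetical helper, not part of unittest or Databricks.
    """
    suite = unittest.TestLoader().loadTestsFromTestCase(test_case_class)
    runner = unittest.TextTestRunner(verbosity=2)
    return runner.run(suite)


result = run_test_class(TestHelpers)
# Fail the cell if any test failed or raised an error.
assert result.wasSuccessful(), 'Test failed; see logs above'
```

The returned result object also exposes testsRun, failures, and errors, which can be useful if you want to report a summary in a later cell.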

Use Databricks widgets to select notebook mode

You can use widgets to distinguish test invocations from normal invocations in a single notebook. The following code produces the example shown in the screenshot:

dbutils.widgets.dropdown("Mode", "Test", ["Test", "Normal"])

def reverse(s):
    return s[::-1]

if dbutils.widgets.get('Mode') == 'Test':
    assert reverse('abc') == 'cba'
    print('Tests passed')
else:
    print(reverse('desrever'))

The first line generates the Mode dropdown menu:

[Screenshot: Widget customize execution]
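One possible variation is to move the branch into a plain function that takes the mode as a string, so the same logic can be exercised outside Databricks where dbutils is not defined. This is a sketch under that assumption, not the pattern from the original example. The name run_notebook is hypothetical; in a notebook you would call it as run_notebook(dbutils.widgets.get('Mode')).

```python
def reverse(s):
    return s[::-1]


def run_notebook(mode):
    """Run self-checks in 'Test' mode, otherwise do the normal work.

    Hypothetical helper: `mode` is expected to be the value of the
    Mode widget, passed in by the caller.
    """
    if mode == 'Test':
        assert reverse('abc') == 'cba'
        return 'Tests passed'
    if mode == 'Normal':
        return reverse('desrever')
    # Fail loudly on values the dropdown does not offer.
    raise ValueError(f'Unknown mode: {mode!r}')


print(run_notebook('Test'))
print(run_notebook('Normal'))
```

Raising on an unknown mode is a design choice for this sketch: it keeps a mistyped widget value from silently running the normal path.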

Hide test code and results

To hide test code and results, select Hide Code or Hide Result from the cell actions menu. Errors are displayed even if results are hidden.

Schedule tests to run automatically

To run tests periodically and automatically, you can use scheduled notebooks. You can configure the scheduled job to send notification emails to an email address that you specify.

[Screenshot: Scheduled notebook test]

Separate test code from the notebook

You can keep your test code separate from your notebook using either %run or Databricks Git folders. When you use %run, test code is included in a separate notebook that you call from another notebook. When you use Databricks Git folders, you can keep test code in non-notebook source code files.

This section shows some examples of using %run and Databricks Git folders to separate your test code from the notebook.

Use %run

The screenshot below shows how to use %run to run a notebook from another notebook. For more information about using %run, see Use %run to import a notebook.

[Screenshot: Separating test code]

Here is the code used in the example. This code assumes that the notebooks shared-code-notebook and shared-code-notebook-test are in the same workspace folder.

shared-code-notebook:

def reverse(s):
    return s[::-1]

shared-code-notebook-test:

In one cell:

%run ./shared-code-notebook

In a subsequent cell:

import unittest

class TestHelpers(unittest.TestCase):
    def test_reverse(self):
        self.assertEqual(reverse('abc'), 'cba')

r = unittest.main(argv=[''], verbosity=2, exit=False)
assert r.result.wasSuccessful(), 'Test failed; see logs above'

Use Databricks Git folders

For code stored in a Databricks Git folder, you can invoke the tests directly from a notebook.

[Screenshot: Notebook testing invocation]
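The screenshot is not reproduced here, and the original example may use a different test framework. As a minimal sketch that relies only on the standard library, unittest discovery can collect test files from a folder and run them from a notebook cell. The helper run_tests_in_folder and the file names reverse_helpers.py and test_reverse_helpers.py are hypothetical, and the temporary directory is a stand-in for a Git folder.

```python
import sys
import tempfile
import unittest
from pathlib import Path


def run_tests_in_folder(folder, pattern='test_*.py'):
    """Discover and run unittest test files under `folder`.

    Hypothetical helper; returns the unittest result object.
    """
    # Skip writing .pyc files next to the source files.
    sys.dont_write_bytecode = True
    suite = unittest.defaultTestLoader.discover(str(folder), pattern=pattern)
    return unittest.TextTestRunner(verbosity=2).run(suite)


# Stand-in for a Git folder: a temporary directory holding one source
# file and one test file (both file names are made up for this example).
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / 'reverse_helpers.py').write_text(
        "def reverse(s):\n"
        "    return s[::-1]\n"
    )
    (folder / 'test_reverse_helpers.py').write_text(
        "import unittest\n"
        "from reverse_helpers import reverse\n"
        "\n"
        "class TestHelpers(unittest.TestCase):\n"
        "    def test_reverse(self):\n"
        "        self.assertEqual(reverse('abc'), 'cba')\n"
    )
    result = run_tests_in_folder(folder)

# Fail the cell if any discovered test failed.
assert result.wasSuccessful(), 'Test failed; see logs above'
```

In a real Git folder you would pass the path of the folder that contains your test files instead of creating a temporary one.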

You can also use the web terminal to run tests in source code files, just as you would on your local machine.

[Screenshot: Git folders testing invocation]

Set up a CI/CD-style workflow

For notebooks in a Databricks Git folder, you can set up a CI/CD-style workflow by configuring notebook tests to run for each commit. See Databricks GitHub Actions.