Feature Engineering and Workspace Feature Store Python API

This page links to the Python API documentation for Databricks Feature Engineering and the legacy Databricks Workspace Feature Store, and describes the client packages databricks-feature-engineering and databricks-feature-store.

Note

As of version 0.17.0, databricks-feature-store has been deprecated. All existing modules from this package are now available in databricks-feature-engineering version 0.2.0 and later. For information about migrating to databricks-feature-engineering, see Migrate to databricks-feature-engineering.

Compatibility matrix

The package and client you should use depend on where your feature tables are located and what Databricks Runtime ML version you are running, as shown in the following table.

To identify the package version that is built in to your Databricks Runtime ML version, see the Feature Engineering compatibility matrix.

Databricks Runtime version | For feature tables in | Use package | Use Python client
Databricks Runtime 14.3 ML and above | Unity Catalog | databricks-feature-engineering | FeatureEngineeringClient
Databricks Runtime 14.3 ML and above | Workspace | databricks-feature-engineering | FeatureStoreClient
Databricks Runtime 14.2 ML and below | Unity Catalog | databricks-feature-engineering | FeatureEngineeringClient
Databricks Runtime 14.2 ML and below | Workspace | databricks-feature-store | FeatureStoreClient
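
As a rough illustration, the compatibility matrix above can be written as a small lookup function. This is only a sketch: the function name recommend_client, the Recommendation type, and the location labels "unity_catalog" and "workspace" are assumptions made for this example and are not part of either package.

```python
from typing import NamedTuple, Tuple


class Recommendation(NamedTuple):
    package: str  # name of the pip package to install
    client: str   # name of the Python client class to use


def recommend_client(runtime_ml_version: Tuple[int, int], table_location: str) -> Recommendation:
    """Return the package and client listed in the compatibility matrix.

    runtime_ml_version: (major, minor) of Databricks Runtime ML, for example (14, 3).
    table_location: "unity_catalog" or "workspace" (labels invented for this sketch).
    """
    if table_location == "unity_catalog":
        # Both Unity Catalog rows of the matrix give the same answer.
        return Recommendation("databricks-feature-engineering", "FeatureEngineeringClient")
    if table_location == "workspace":
        if runtime_ml_version >= (14, 3):
            return Recommendation("databricks-feature-engineering", "FeatureStoreClient")
        return Recommendation("databricks-feature-store", "FeatureStoreClient")
    raise ValueError(f"unknown table location: {table_location!r}")


print(recommend_client((14, 2), "workspace"))
```

The function only restates the four rows of the table. It does not inspect the cluster, so the caller has to supply the runtime version and table location.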

Note

  • databricks-feature-engineering<=0.7.0 is not compatible with mlflow>=2.18.0. To use databricks-feature-engineering with MLflow 2.18.0 and above, upgrade to databricks-feature-engineering version 0.8.0 or above.
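
The version rule in the note above could be checked before installation with a helper like the following. It is a minimal sketch that assumes plain X.Y.Z version strings with no pre-release suffixes. The helper names parse_version and is_compatible are hypothetical, and the check encodes only the one rule stated in the note.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a plain 'X.Y.Z' string into a tuple of ints (no pre-release handling)."""
    return tuple(int(part) for part in text.split("."))


def is_compatible(feature_engineering_version: str, mlflow_version: str) -> bool:
    """Apply the rule from the note: <=0.7.0 does not work with mlflow>=2.18.0."""
    fe = parse_version(feature_engineering_version)
    mlflow = parse_version(mlflow_version)
    return not (fe <= (0, 7, 0) and mlflow >= (2, 18, 0))


print(is_compatible("0.7.0", "2.18.0"))
print(is_compatible("0.8.0", "2.18.0"))
```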

Release notes

See Release notes for Databricks feature engineering and legacy Workspace Feature Store.

Feature Engineering Python API reference

See the Feature Engineering Python API reference.

Workspace Feature Store Python API reference (deprecated)

Note

  • As of version 0.17.0, databricks-feature-store has been deprecated. All existing modules from this package are now available in databricks-feature-engineering version 0.2.0 and later.

For databricks-feature-store v0.17.0, the latest Workspace Feature Store API reference is the Databricks FeatureStoreClient section of the Feature Engineering Python API reference.

For v0.16.3 and below, use the links in the table to download or display the Feature Store Python API reference. To determine the pre-installed version for your Databricks Runtime ML version, see the compatibility matrix.

Version | Download PDF | Online API reference
v0.3.5 to v0.16.3 | Feature Store Python API 0.16.3 reference PDF | Online API reference
v0.3.5 and below | Feature Store Python API 0.3.5 reference PDF | Online API reference not available

Python package

This section describes how to install the Python packages for Databricks Feature Engineering and Databricks Workspace Feature Store.

Feature Engineering

Note

  • As of version 0.2.0, databricks-feature-engineering contains modules for working with feature tables in both Unity Catalog and Workspace Feature Store. Versions below 0.2.0 work only with feature tables in Unity Catalog.

The Databricks Feature Engineering APIs are available through the Python client package databricks-feature-engineering. The client is available on PyPI and is pre-installed in Databricks Runtime 13.3 LTS ML and above.

To see which client version corresponds to each runtime version, see the compatibility matrix.

To install the client in Databricks Runtime:

%pip install databricks-feature-engineering

To install the client in a local Python environment:

pip install databricks-feature-engineering

Workspace Feature Store (deprecated)

Note

  • As of version 0.17.0, databricks-feature-store has been deprecated. All existing modules from this package are now available in databricks-feature-engineering version 0.2.0 and later.
  • See Migrate to databricks-feature-engineering for more information.

The Databricks Feature Store APIs are available through the Python client package databricks-feature-store. The client is available on PyPI and is pre-installed in Databricks Runtime for Machine Learning. To see which client version each runtime includes, see the compatibility matrix.

To install the client in Databricks Runtime:

%pip install databricks-feature-store

To install the client in a local Python environment:

pip install databricks-feature-store

Migrate to databricks-feature-engineering

To install the databricks-feature-engineering package, use pip install databricks-feature-engineering instead of pip install databricks-feature-store. All of the modules in databricks-feature-store have been moved to databricks-feature-engineering, so you do not have to change any code. Import statements such as from databricks.feature_store import FeatureStoreClient will continue to work after you install databricks-feature-engineering.

To work with feature tables in Unity Catalog, use FeatureEngineeringClient. To use Workspace Feature Store, you must use FeatureStoreClient.
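
When migrating, it can help to confirm which of the two packages is installed in the current environment. The sketch below uses only the standard library's importlib.metadata. The helper names installed_version and migration_status are hypothetical and are not provided by either package.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a pip distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


def migration_status() -> str:
    """Summarize which feature client package is present in this environment."""
    new = installed_version("databricks-feature-engineering")
    old = installed_version("databricks-feature-store")
    if new is not None:
        return f"databricks-feature-engineering {new} is installed"
    if old is not None:
        return f"only the deprecated databricks-feature-store {old} is installed"
    return "neither package is installed"


print(migration_status())
```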

Supported scenarios

On Databricks, including Databricks Runtime and Databricks Runtime for Machine Learning, you can:

  • Create, read, and write feature tables.
  • Train and score models on feature data.
  • Publish feature tables to online stores for real-time serving.

From a local environment or an environment external to Databricks, you can:

  • Develop code with local IDE support.
  • Unit test using mock frameworks.
  • Write integration tests to be run on Databricks.

Limitations

The client library can only be run on Databricks, including Databricks Runtime and Databricks Runtime for Machine Learning. It does not support calling Feature Engineering in Unity Catalog or Feature Store APIs from a local environment, or from an environment other than Databricks.

Use the clients for unit testing

You can install the Feature Engineering in Unity Catalog client or the Feature Store client locally to aid in running unit tests.

For example, to validate that a method update_customer_features correctly calls FeatureEngineeringClient.write_table (or for Workspace Feature Store, FeatureStoreClient.write_table), you could write:

from unittest.mock import MagicMock, patch

from my_feature_update_module import update_customer_features
from databricks.feature_engineering import FeatureEngineeringClient

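# Decorators apply bottom-up, so the innermost patch supplies the first mock argument.
@patch.object(FeatureEngineeringClient, "write_table")
@patch("my_feature_update_module.compute_customer_features")
def test_update_customer_features_writes_table(compute_customer_features, mock_write_table):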
  customer_features_df = MagicMock()
  compute_customer_features.return_value = customer_features_df

  update_customer_features()  # Function being tested

  mock_write_table.assert_called_once_with(
    name='ml.recommender_system.customer_features',
    df=customer_features_df,
    mode='merge'
  )
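
A variant of this test can run without either client package installed if the function under test receives the client as a parameter. The sketch below assumes such a design. The update_customer_features shown here is a hypothetical stand-in defined inside the example, not the module from the test above, and the mock client accepts any call, so it checks only how write_table is invoked.

```python
from unittest.mock import MagicMock

FEATURE_TABLE = "ml.recommender_system.customer_features"


def update_customer_features(client, compute_features):
    """Hypothetical function under test: compute features, then write them."""
    features_df = compute_features()
    client.write_table(name=FEATURE_TABLE, df=features_df, mode="merge")


def test_update_customer_features_writes_table():
    client = MagicMock()       # stands in for FeatureEngineeringClient or FeatureStoreClient
    features_df = MagicMock()  # stands in for a Spark DataFrame
    compute_features = MagicMock(return_value=features_df)

    update_customer_features(client, compute_features)

    # The feature computation runs once, and its result is written in merge mode.
    compute_features.assert_called_once_with()
    client.write_table.assert_called_once_with(
        name=FEATURE_TABLE, df=features_df, mode="merge"
    )


test_update_customer_features_writes_table()
```

Passing the client in keeps the test independent of how the real client is constructed. The trade-off is that a plain MagicMock will not catch a misspelled method or argument name.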

Use the clients for integration testing

You can run integration tests with the Feature Engineering in Unity Catalog client or the Feature Store client on Databricks. For details, see Developer Tools and Guidance: Use CI/CD.