Feature Engineering and Workspace Feature Store Python API
This page provides links to the Python API documentation for Databricks Feature Engineering and the legacy Databricks Workspace Feature Store, and information about the client packages databricks-feature-engineering and databricks-feature-store.
Note
As of version 0.17.0, databricks-feature-store has been deprecated. All existing modules from this package are now available in databricks-feature-engineering version 0.2.0 and later. For information about migrating to databricks-feature-engineering, see Migrate to databricks-feature-engineering.
Compatibility matrix
The package and client you should use depend on where your feature tables are located and what Databricks Runtime ML version you are running, as shown in the following table.
To identify the package version that is built in to your Databricks Runtime ML version, see the Feature Engineering compatibility matrix.
Databricks Runtime version | For feature tables in | Use package | Use Python client |
---|---|---|---|
Databricks Runtime 14.3 ML and above | Unity Catalog | databricks-feature-engineering | FeatureEngineeringClient |
Databricks Runtime 14.3 ML and above | Workspace | databricks-feature-engineering | FeatureStoreClient |
Databricks Runtime 14.2 ML and below | Unity Catalog | databricks-feature-engineering | FeatureEngineeringClient |
Databricks Runtime 14.2 ML and below | Workspace | databricks-feature-store | FeatureStoreClient |
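The table above can be read as a small lookup. The following is a minimal sketch of that lookup, not part of either package: the function name choose_client and its string arguments are hypothetical, and it assumes the runtime version is given as a plain number such as "14.3".

```python
from typing import Tuple


def choose_client(runtime_ml_version: str, table_location: str) -> Tuple[str, str]:
    """Return (package, client) per the compatibility table. Hypothetical helper."""
    parts = runtime_ml_version.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0

    location = table_location.strip().lower()
    if location == "unity catalog":
        # Unity Catalog tables use the same package and client on every listed runtime.
        return ("databricks-feature-engineering", "FeatureEngineeringClient")
    if location == "workspace":
        # Workspace tables always use FeatureStoreClient; only the package differs.
        if (major, minor) >= (14, 3):
            return ("databricks-feature-engineering", "FeatureStoreClient")
        return ("databricks-feature-store", "FeatureStoreClient")
    raise ValueError(f"Unknown feature table location: {table_location!r}")


print(choose_client("14.2", "Workspace"))
```

Only the package changes across runtime versions, and only for Workspace feature tables, so the runtime version matters in a single branch.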
Note
databricks-feature-engineering<=0.7.0 is not compatible with mlflow>=2.18.0. To use databricks-feature-engineering with MLflow 2.18.0 and above, upgrade to databricks-feature-engineering version 0.8.0 or above.
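To flag this combination before running a job, a check along the following lines may help. It is a sketch only: the helper names are hypothetical, it handles plain numeric versions such as "2.18.0" (not pre-release tags), and it encodes only the incompatibility stated in the note above.

```python
def parse_version(version: str) -> tuple:
    """Parse 'X.Y.Z' into a 3-tuple of ints, padding missing parts with 0."""
    parts = [int(part) for part in version.split(".")[:3]]
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def is_known_incompatible(fe_version: str, mlflow_version: str) -> bool:
    """True if databricks-feature-engineering<=0.7.0 is paired with mlflow>=2.18.0."""
    return (
        parse_version(fe_version) <= (0, 7, 0)
        and parse_version(mlflow_version) >= (2, 18, 0)
    )


print(is_known_incompatible("0.7.0", "2.18.0"))
print(is_known_incompatible("0.8.0", "2.18.0"))
```

In a real environment, the installed versions could be read with importlib.metadata.version("databricks-feature-engineering") and importlib.metadata.version("mlflow") and passed to the check.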
Release notes
See Release notes for Databricks feature engineering and legacy Workspace Feature Store.
Feature Engineering Python API reference
See the Feature Engineering Python API reference.
Workspace Feature Store Python API reference (deprecated)
Note
As of version 0.17.0, databricks-feature-store has been deprecated. All existing modules from this package are now available in databricks-feature-engineering version 0.2.0 and later.
For databricks-feature-store v0.17.0, see Databricks FeatureStoreClient in the Feature Engineering Python API reference for the latest Workspace Feature Store API reference.
For v0.16.3 and below, use the links in the table to download or display the Feature Store Python API reference. To determine the pre-installed version for your Databricks Runtime ML version, see the compatibility matrix.
Version | Download PDF | Online API reference |
---|---|---|
v0.3.6 to v0.16.3 | Feature Store Python API 0.16.3 reference PDF | Online API reference |
v0.3.5 and below | Feature Store Python API 0.3.5 reference PDF | Online API reference not available |
Python package
This section describes how to install the Python packages to use Databricks Feature Engineering and Databricks Workspace Feature Store.
Feature Engineering
Note
As of version 0.2.0, databricks-feature-engineering contains modules for working with feature tables in both Unity Catalog and Workspace Feature Store. Versions of databricks-feature-engineering below 0.2.0 work only with feature tables in Unity Catalog.
The Databricks Feature Engineering APIs are available through the Python client package databricks-feature-engineering. The client is available on PyPI and is pre-installed in Databricks Runtime 13.3 LTS ML and above.
For a reference of which client version corresponds to which runtime version, see the compatibility matrix.
To install the client in Databricks Runtime:
%pip install databricks-feature-engineering
To install the client in a local Python environment:
pip install databricks-feature-engineering
Workspace Feature Store (deprecated)
Note
- As of version 0.17.0, databricks-feature-store has been deprecated. All existing modules from this package are now available in databricks-feature-engineering version 0.2.0 and later.
- See Migrate to databricks-feature-engineering for more information.
The Databricks Feature Store APIs are available through the Python client package databricks-feature-store. The client is available on PyPI and is pre-installed in Databricks Runtime for Machine Learning. For a reference of which runtime includes which client version, see the compatibility matrix.
To install the client in Databricks Runtime:
%pip install databricks-feature-store
To install the client in a local Python environment:
pip install databricks-feature-store
Migrate to databricks-feature-engineering
To install the databricks-feature-engineering package, use pip install databricks-feature-engineering instead of pip install databricks-feature-store. All of the modules in databricks-feature-store have been moved to databricks-feature-engineering, so you do not have to change any code. Import statements such as from databricks.feature_store import FeatureStoreClient will continue to work after you install databricks-feature-engineering.
To work with feature tables in Unity Catalog, use FeatureEngineeringClient. To use Workspace Feature Store, you must use FeatureStoreClient.
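As a rough check that both import paths described above resolve after migration, something like the following could be run in an environment where databricks-feature-engineering is installed. The helper resolve_client is hypothetical and exists only so the snippet degrades gracefully when the package is absent; the module and class names are the ones given on this page.

```python
import importlib


def resolve_client(module_name: str, class_name: str):
    """Return the named class if its module can be imported, else None."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, class_name, None)


# Legacy import path, which should keep working after the migration.
FeatureStoreClient = resolve_client("databricks.feature_store", "FeatureStoreClient")
# Client for feature tables in Unity Catalog.
FeatureEngineeringClient = resolve_client(
    "databricks.feature_engineering", "FeatureEngineeringClient"
)

for name, cls in [
    ("FeatureStoreClient", FeatureStoreClient),
    ("FeatureEngineeringClient", FeatureEngineeringClient),
]:
    print(name, "available" if cls is not None else "not installed")
```

In application code, the plain import statements are sufficient; the guarded lookup is only useful for diagnosing an environment.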
Supported scenarios
On Databricks, including Databricks Runtime and Databricks Runtime for Machine Learning, you can:
- Create, read, and write feature tables.
- Train and score models on feature data.
- Publish feature tables to online stores for real-time serving.
From a local environment or an environment external to Databricks, you can:
- Develop code with local IDE support.
- Unit test using mock frameworks.
- Write integration tests to be run on Databricks.
Limitations
The client library can only be run on Databricks, including Databricks Runtime and Databricks Runtime for Machine Learning. It does not support calling Feature Engineering in Unity Catalog or Feature Store APIs from a local environment, or from an environment other than Databricks.
Use the clients for unit testing
You can install the Feature Engineering in Unity Catalog client or the Feature Store client locally to aid in running unit tests.
For example, to validate that a method update_customer_features correctly calls FeatureEngineeringClient.write_table (or for Workspace Feature Store, FeatureStoreClient.write_table), you could write:
from unittest.mock import MagicMock, patch

from my_feature_update_module import update_customer_features
from databricks.feature_engineering import FeatureEngineeringClient


# Decorators apply bottom-up, so the innermost patch is the first argument.
@patch.object(FeatureEngineeringClient, "write_table")
@patch("my_feature_update_module.compute_customer_features")
def test_something(compute_customer_features, mock_write_table):
    # Stand-in for the DataFrame that the feature computation would return.
    customer_features_df = MagicMock()
    compute_customer_features.return_value = customer_features_df

    update_customer_features()  # Function being tested

    # The computed features should be written exactly once, in merge mode.
    mock_write_table.assert_called_once_with(
        name='ml.recommender_system.customer_features',
        df=customer_features_df,
        mode='merge'
    )
Use the clients for integration testing
You can run integration tests with the Feature Engineering in Unity Catalog client or the Feature Store client on Databricks. For details, see Developer Tools and Guidance: Use CI/CD.