Limitations with Databricks Connect for Python
Note
This article covers Databricks Connect for Databricks Runtime 13.3 LTS and above.
This article lists limitations with Databricks Connect for Python. Databricks Connect enables you to connect popular IDEs, notebook servers, and custom applications to Azure Databricks clusters. See What is Databricks Connect?. For the Scala version of this article, see Limitations with Databricks Connect for Scala.
Important
Depending on the version of Python, Databricks Runtime, and Databricks Connect that you are using, there may be version requirements for some features. See Requirements.
Feature availability
Not available on Databricks Connect for Databricks Runtime 13.3 LTS and below:
- Streaming foreachBatch
- Creating DataFrames larger than 128 MB
- Long queries over 3600 seconds
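As a rough illustration of the thresholds in the list above, the following sketch encodes them as constants and checks a version string against them. All helper names (parse_major_minor, has_13_3_limits, local_dataframe_too_large) are hypothetical and not part of Databricks Connect. The sketch also assumes that "128 MB" means 128 * 1024 * 1024 bytes, which the article does not specify.

```python
from typing import Tuple

# Thresholds taken from the list above; how Databricks Connect measures
# DataFrame size is not stated there, so binary megabytes are an assumption.
LIMITED_RUNTIME = (13, 3)  # "13.3 LTS and below"
MAX_LOCAL_DATAFRAME_BYTES = 128 * 1024 * 1024
MAX_QUERY_SECONDS = 3600


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Return (major, minor) from a string such as '13.3.2' or '14.1'."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def has_13_3_limits(version: str) -> bool:
    """True if the version is 13.3 or lower, where the listed limits apply."""
    return parse_major_minor(version) <= LIMITED_RUNTIME


def local_dataframe_too_large(version: str, size_bytes: int) -> bool:
    """True if a local dataset of size_bytes would exceed the 128 MB limit."""
    return has_13_3_limits(version) and size_bytes > MAX_LOCAL_DATAFRAME_BYTES
```

A check like this could run before sending a large local dataset to the cluster, so the code can write the data to storage first instead of creating the DataFrame directly.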
Not available:
- dataframe.display() API
- Databricks Utilities: credentials, library, notebook workflow, widgets
- Spark Context
- RDDs
- Libraries that use RDDs, Spark Context, or access the underlying Spark JVM, such as Mosaic geospatial, GraphFrames, or Great Expectations
- CREATE TABLE <table-name> AS SELECT (instead, use spark.sql("SELECT ...").write.saveAsTable("table"))
- applyInPandas() and cogroup() with shared clusters
- Changing the log4j log level through SparkContext
- Distributed ML training
- Synchronizing the local development environment with the remote cluster
- On serverless compute, UDFs cannot include custom libraries.
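The CREATE TABLE ... AS SELECT workaround from the list above can be wrapped in a small helper, sketched below. The function name create_table_from_select and its parameters are hypothetical. The session is passed in rather than created here, on the assumption that the caller already has one (for example from DatabricksSession.builder.getOrCreate() in the databricks.connect package).

```python
def create_table_from_select(spark, select_sql: str, table_name: str) -> None:
    """Save the result of a SELECT as a table, instead of using CTAS.

    `spark` is assumed to be an existing Spark session obtained through
    Databricks Connect; this helper does not create one.
    """
    # Run the query through the session, then persist the resulting
    # DataFrame with the DataFrameWriter, as the article suggests.
    spark.sql(select_sql).write.saveAsTable(table_name)
```

With a live session, a call might look like create_table_from_select(spark, "SELECT ...", "table"). The helper relies on the writer's default save mode, so whether an existing table of the same name causes an error depends on that default.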