January 2018
Releases are staged. Your Databricks account may not be updated until a week after the initial release date.
Mount points for Azure Blob storage containers and Data Lake Stores
Jan 16-23, 2018: Version 2.63
We have provided instructions for mounting Azure Blob storage containers and Data Lake Stores through the Databricks File System (DBFS). Mounting gives all users in the same workspace access to the Blob storage container or Data Lake Store (or a folder inside the container or store) through the mount point. DBFS manages the credentials used to access a mounted Blob storage container or Data Lake Store and automatically handles authentication with Azure Blob storage or Data Lake Store in the background.
Mounting Blob storage containers and Data Lake Stores requires Databricks Runtime 4.0 or above. Once a container or store is mounted, you can use Databricks Runtime 3.4 or above to access the mount point.
See Connect to Azure Data Lake Storage Gen2 and Blob Storage and Accessing Azure Data Lake Storage Gen1 from Azure Databricks for more information.
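As a rough illustration of the mount workflow, the sketch below assembles the arguments for a Blob storage mount. The `wasbs://` source URL and the `fs.azure.account.key.<account>.blob.core.windows.net` configuration key follow the usual Azure Blob storage conventions. The helper name `blob_mount_args` and the container, account, and mount point names are hypothetical. Because `dbutils` exists only inside a Databricks notebook, the mount call itself is shown as a comment.

```python
def blob_mount_args(container, account, mount_point, access_key):
    """Build keyword arguments for dbutils.fs.mount (hypothetical helper)."""
    host = f"{account}.blob.core.windows.net"
    return {
        # Source URL of the container to mount.
        "source": f"wasbs://{container}@{host}/",
        # DBFS path under which the container becomes visible.
        "mount_point": mount_point,
        # Credential that DBFS stores and uses on behalf of workspace users.
        "extra_configs": {f"fs.azure.account.key.{host}": access_key},
    }


# Placeholder values; replace them with your own container, account, and key.
args = blob_mount_args("logs", "examplestorage", "/mnt/logs", "<access-key>")
print(args["source"])

# Assumed usage in a notebook on Databricks Runtime 4.0 or above:
# dbutils.fs.mount(**args)
# display(dbutils.fs.ls("/mnt/logs"))
```

Once the mount exists, any user in the workspace can read `/mnt/logs` without handling the storage key directly.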
Cluster tags
Jan 4-11, 2018: Version 2.62
You can now specify cluster tags that are propagated to all Azure resources (VMs, disks, NICs, etc.) associated with a cluster. In addition to user-provided tags, resources are automatically tagged with the cluster name, cluster ID, and cluster creator username.
See Tags for more information.
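A minimal sketch of how user-provided tags might appear in a cluster specification, assuming the `custom_tags` field of the Clusters API request body. The helper name, cluster name, and tag values are hypothetical, and a real request needs further fields (Spark version, node type, and so on) that are omitted here.

```python
import json


def cluster_spec_with_tags(name, tags):
    """Return a partial cluster spec carrying user-provided tags (hypothetical helper)."""
    return {
        "cluster_name": name,
        # Assumed field name; these tags are applied in addition to the
        # automatic cluster name, cluster ID, and creator tags.
        "custom_tags": dict(tags),
    }


spec = cluster_spec_with_tags("etl-nightly", {"team": "data-eng", "cost-center": "1234"})
print(json.dumps(spec, indent=2, sort_keys=True))
```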
Table Access Control for SQL and Python (Private Preview)
Jan 4-11, 2018: Version 2.62
Note
This feature is in private preview. Please contact your account manager to request access. This feature also requires Databricks Runtime 3.5+.
Last year, we introduced data object access control for SQL users. Today we are excited to announce the private preview of Table Access Control (ACL) for both SQL and Python users. With Table Access Control, you can restrict access to securable objects like tables, databases, views, or functions. You can also provide fine-grained access control (to rows and columns matching specific conditions, for example) by setting permissions on derived views containing arbitrary queries.
See Hive metastore privileges and securable objects (legacy) for more information.
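The fine-grained pattern described above (a derived view plus a grant on that view) could look roughly like the following. This is a sketch, not the documented procedure: the database, table, view, and user names are hypothetical, and the exact `GRANT` syntax should be checked against the privileges documentation. The code only builds the SQL strings; running them requires a cluster with Table Access Control enabled.

```python
def grant_select(securable_type, name, principal):
    """Build a GRANT SELECT statement for one securable object (hypothetical helper)."""
    return f"GRANT SELECT ON {securable_type} {name} TO `{principal}`"


statements = [
    # A derived view that exposes only selected columns and rows.
    "CREATE VIEW sales.emea_orders AS "
    "SELECT order_id, amount FROM sales.orders WHERE region = 'EMEA'",
    # Grant access to the view rather than to the underlying table.
    grant_select("VIEW", "sales.emea_orders", "analyst@example.com"),
]

for statement in statements:
    print(statement)

# Assumed usage in a Python notebook:
# for statement in statements:
#     spark.sql(statement)
```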
Exporting notebook job run results via API
Jan 4-11, 2018: Version 2.62
To improve your ability to share and collaborate on the results of jobs, we now have a new Jobs API endpoint, jobs/runs/export, that lets you retrieve the static HTML representation of a notebook job’s run results in both code and dashboard view.
See Runs export for more information.
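A minimal sketch of how a request to this endpoint might be assembled, assuming the `/api/2.0/` path prefix and the `run_id` and `views_to_export` query parameters (with values such as `CODE`, `DASHBOARDS`, or `ALL`). The workspace URL, run ID, and helper name are hypothetical, and no request is actually sent.

```python
from urllib.parse import urlencode


def runs_export_url(workspace_url, run_id, views="ALL"):
    """Build the URL for a jobs/runs/export request (hypothetical helper)."""
    query = urlencode({"run_id": run_id, "views_to_export": views})
    return f"{workspace_url.rstrip('/')}/api/2.0/jobs/runs/export?{query}"


url = runs_export_url("https://example.azuredatabricks.net", 42, views="CODE")
print(url)

# Assumed usage: send a GET request to this URL with an
# "Authorization: Bearer <token>" header. The JSON response is expected to
# hold a list of views, each carrying the exported HTML content.
```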
Apache Airflow 1.9.0 includes Databricks integration
Jan 2, 2018
Last year, we released a preview feature in Airflow—a popular solution for managing ETL scheduling—that allows customers to natively create tasks that trigger Databricks runs in an Airflow DAG. We’re pleased to announce that these integrations have been released publicly in the 1.9.0 release of Airflow.
See Orchestrate Azure Databricks jobs with Apache Airflow for more information.
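As a hedged illustration, the sketch below builds the kind of run specification an Airflow task could submit to Databricks. The notebook path, cluster settings, and helper name are hypothetical placeholders. The operator import and call are shown as comments because they require an Airflow 1.9.0 installation with a configured Databricks connection, and the import path is given as it is understood to exist in that release.

```python
def notebook_run_payload(notebook_path, spark_version, node_type_id, num_workers):
    """Build a run specification for a one-off notebook run (hypothetical helper)."""
    return {
        "new_cluster": {
            "spark_version": spark_version,
            "node_type_id": node_type_id,
            "num_workers": num_workers,
        },
        "notebook_task": {"notebook_path": notebook_path},
    }


# Placeholder values; fill in a real runtime version and node type.
payload = notebook_run_payload(
    "/Users/someone@example.com/etl", "<spark-version>", "<node-type>", 2
)
print(payload["notebook_task"]["notebook_path"])

# Assumed usage inside a DAG file:
# from airflow.contrib.operators.databricks_operator import DatabricksSubmitRunOperator
# run_etl = DatabricksSubmitRunOperator(task_id="run_etl", json=payload, dag=dag)
```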