Need for volumes in Databricks

Dhruv Singla 105 Reputation points
2024-09-25T06:47:42.8166667+00:00

Why do we need Volumes when we can access the location using external locations? The doc says it is to add governance, but we can already govern using external locations. So, why add another layer of governance? I am guessing that instead of giving access to the entire external location, we can provide access to a specific subfolder of the location using volumes.

Azure Databricks

1 answer

  1. Paras Vora 0 Reputation points
    2025-03-03T13:54:55.6+00:00

    Hello,

    At the risk of oversimplifying, volumes in Unity Catalog exist primarily to give access to files and unstructured (non-tabular) data that cannot be handled as Delta Lake tables. If your data is tabular and already registered as external tables, you don't really need volumes. Depending on its access mode, a compute cluster gives you either the Unity Catalog features or the traditional (classic) features, such as DBUtils and direct data access through mount points. With volumes you get both: you can use the Unity Catalog features and still read files from external storage.

    For example, a cluster in 'shared' access mode can use Unity Catalog features but cannot read files from mount points such as /mnt/filestore/; you simply get an access denied error.
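
To make this concrete, here is a minimal sketch of how a volume is addressed and governed. It assumes the documented `/Volumes/<catalog>/<schema>/<volume>/<path>` path form and the `CREATE EXTERNAL VOLUME` and `GRANT READ VOLUME` SQL statements. The catalog, schema, volume, group name and storage URL are all hypothetical, and `VolumeRef` and the two helper functions are illustration-only names, not part of any Databricks library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VolumeRef:
    """Three-level Unity Catalog name of a volume (hypothetical helper)."""
    catalog: str
    schema: str
    volume: str

    @property
    def full_name(self) -> str:
        # Name used in SQL statements: catalog.schema.volume
        return f"{self.catalog}.{self.schema}.{self.volume}"

    def path(self, relative: str = "") -> str:
        # File path form: /Volumes/<catalog>/<schema>/<volume>/<path>
        base = f"/Volumes/{self.catalog}/{self.schema}/{self.volume}"
        relative = relative.strip("/")
        return f"{base}/{relative}" if relative else base


def create_external_volume_sql(ref: VolumeRef, location: str) -> str:
    # The location is assumed to be a directory inside an existing external location.
    return f"CREATE EXTERNAL VOLUME IF NOT EXISTS {ref.full_name} LOCATION '{location}'"


def grant_read_sql(ref: VolumeRef, principal: str) -> str:
    # Grants read access on this one volume only.
    return f"GRANT READ VOLUME ON VOLUME {ref.full_name} TO `{principal}`"


# Hypothetical names, for illustration only.
raw = VolumeRef("main", "landing", "raw_files")
print(raw.path("invoices/2024/"))

# In a Databricks notebook on a Unity Catalog-enabled cluster (where `spark` and
# `dbutils` are predefined), the strings above would be used roughly like this:
#   spark.sql(create_external_volume_sql(raw, "abfss://<container>@<account>.dfs.core.windows.net/raw"))
#   spark.sql(grant_read_sql(raw, "analysts"))
#   dbutils.fs.ls(raw.path("invoices"))
```

Because an external volume points at a directory inside an external location, the grant applies only to that directory. This fits the guess in the question that volumes allow access to a specific subfolder rather than the whole external location.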

    Hope this helps.
