Is there a feature in Azure Synapse Analytics similar to Databricks' DBFS FileStore system?

Gabriel-2005 445 Reputation points
2025-02-03T11:06:11.5233333+00:00


Specifically, I'm looking for a way to upload CSV files and read them directly into pandas dataframes within Azure Synapse notebooks without having to load the data into a database. In Databricks, I can easily do this using:

 

pd.read_csv('/dbfs/FileStore/name_of_file.csv')

 

However, in Azure Synapse, I don't see an equivalent option to upload CSV files directly into a filesystem-like structure (e.g., DBFS' FileStore). Is there a way to achieve this in Synapse, or is there a recommended alternative approach?

Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.

Accepted answer
  1. Smaran Thoomu 19,880 Reputation points Microsoft Vendor
    2025-02-03T11:57:57.9266667+00:00

    Hi @Gabriel-2005
    Welcome to Microsoft Q&A platform and thanks for posting your query here.

    The Azure Synapse equivalent of using FileStore in Databricks is leveraging the data lake storage account linked to your Synapse workspace. Here's how you can achieve this:

    Accessing the Data Lake Storage:

    • Navigate to Synapse Studio.
    • Go to the Data tab and select Linked. Here, you’ll find the storage account linked to your Synapse workspace. This storage account is created or assigned during the workspace setup and serves as the primary data lake.
    • You can use the Synapse Studio UI to upload your CSV files directly to the data lake. Simply navigate to the desired folder in the storage account and upload your files.


    • Once the file is uploaded, you can right-click on it and select New Notebook -> Load to DataFrame. Synapse will automatically generate code to load the file into a Spark DataFrame.


    For example:

    df = spark.read.load('abfss://******@datalk1506.dfs.core.windows.net/sample_1.csv',
        format='csv',
        # If the file has a header row, uncomment the line below
        # header=True
    )
    display(df.limit(10))
    

    If you prefer to work with pandas, you can modify the code to load the file directly into a pandas DataFrame. This works because the Synapse Python runtime includes fsspec and adlfs, which let pandas resolve abfss:// URLs against the linked storage account:

    import pandas as pd
    df = pd.read_csv('abfss://******@datalk1506.dfs.core.windows.net/sample_1.csv')
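    As a quick, locally runnable illustration of the mechanism pandas relies on here: pandas delegates URL schemes such as abfss:// to fsspec, so the same round trip can be sketched with fsspec's in-memory filesystem (the folder and file names below are made up for the demo):

    ```python
    import fsspec
    import pandas as pd

    # pandas hands any URL with a registered scheme (abfss://, s3://, memory://, ...)
    # to fsspec; in Synapse, the preinstalled adlfs package resolves abfss://.
    # Here the in-memory filesystem stands in for the data lake.
    fs = fsspec.filesystem("memory")
    with fs.open("/FileStore/sample_1.csv", "w") as f:
        f.write("id,name\n1,alice\n2,bob\n")

    df = pd.read_csv("memory://FileStore/sample_1.csv")
    print(df.shape)  # (2, 2)
    ```

    The same pd.read_csv call shape applies to the abfss:// URL in a Synapse notebook, where authentication is handled by the workspace's linked service.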
    
    • The data lake storage is connected to your Synapse workspace via a linked service. You can view this under Manage -> Linked Services. This linked service is automatically created during workspace setup using the data lake and file system details you provide.


    This approach provides a seamless way to work with files in Azure Synapse, similar to how you would use FileStore in Databricks.

    Hope this helps. Do let us know if you have any further queries.


    If this answers your query, please click Accept Answer and select Yes for "Was this answer helpful?".

    1 person found this answer helpful.

0 additional answers

