Is there a feature in Azure Synapse Analytics similar to Databricks' DBFS FileStore system?

Gabriel-2005 445 Reputation points
2025-02-03T11:06:11.5233333+00:00


Specifically, I'm looking for a way to upload CSV files and read them directly into pandas dataframes within Azure Synapse notebooks without having to load the data into a database. In Databricks, I can easily do this using:

 

pd.read_csv('/dbfs/FileStore/name_of_file.csv')

 

However, in Azure Synapse, I don't see an equivalent option to upload CSV files directly into a filesystem-like structure (e.g., DBFS' FileStore). Is there a way to achieve this in Synapse, or is there a recommended alternative approach?

Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.

Accepted answer
  1. Smaran Thoomu 19,880 Reputation points Microsoft Vendor
    2025-02-03T11:57:57.9266667+00:00

    Hi @Gabriel-2005
    Welcome to Microsoft Q&A platform and thanks for posting your query here.

    The Azure Synapse equivalent of using FileStore in Databricks is leveraging the data lake storage account linked to your Synapse workspace. Here's how you can achieve this:

    Accessing the Data Lake Storage:

    • Navigate to Synapse Studio.
    • Go to the Data tab and select Linked. Here, you’ll find the storage account linked to your Synapse workspace. This storage account is created or assigned during the workspace setup and serves as the primary data lake.
    • You can use the Synapse Studio UI to upload your CSV files directly to the data lake. Simply navigate to the desired folder in the storage account and upload your files.


    • Once the file is uploaded, you can right-click on it and select New Notebook -> Load to DataFrame. Synapse will automatically generate code to load the file into a Spark DataFrame.


    For example:

    df = spark.read.load('abfss://******@datalk1506.dfs.core.windows.net/sample_1.csv',
        format='csv',
        # If the file has a header row, uncomment the line below
        # header=True
    )
    display(df.limit(10))
    

    If you prefer to work with pandas, you can modify the code to load the file directly into a pandas DataFrame. This works because the Synapse Python runtime includes fsspec and adlfs, which let pandas resolve abfss:// URLs against the linked storage account:

    import pandas as pd
    df = pd.read_csv('abfss://******@datalk1506.dfs.core.windows.net/sample_1.csv')
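    As a quick, locally runnable illustration of the mechanism pandas relies on here: pandas delegates URL schemes such as abfss:// to fsspec, so the same round trip can be sketched with fsspec's in-memory filesystem (the folder and file names below are made up for the demo):

    ```python
    import fsspec
    import pandas as pd

    # pandas hands any URL with a registered scheme (abfss://, s3://, memory://, ...)
    # to fsspec; in Synapse, the preinstalled adlfs package resolves abfss://.
    # Here the in-memory filesystem stands in for the data lake.
    fs = fsspec.filesystem("memory")
    with fs.open("/FileStore/sample_1.csv", "w") as f:
        f.write("id,name\n1,alice\n2,bob\n")

    df = pd.read_csv("memory://FileStore/sample_1.csv")
    print(df.shape)  # (2, 2)
    ```

    The same pd.read_csv call shape applies to the abfss:// URL in a Synapse notebook, where authentication is handled by the workspace's linked service.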
    
    • The data lake storage is connected to your Synapse workspace via a linked service. You can view this under Manage -> Linked Services. This linked service is automatically created during workspace setup using the data lake and file system details you provide.


    This approach provides a seamless way to work with files in Azure Synapse, similar to how you would use FileStore in Databricks.

    Hope this helps. Do let us know if you have any further queries.


    If this answers your query, please click Accept Answer and select Yes for "Was this answer helpful?".

    1 person found this answer helpful.

0 additional answers

