You should be able to implement this by using Spark in your Synapse notebook to write the intermediate transformation results as a CSV file to Azure Data Lake Storage Gen2 (ADLS Gen2).
1. Set up the storage account configuration
First, ensure that your Synapse workspace has access to the ADLS Gen2 container, either through a Linked Service or via an Account Key, SAS Token, or Managed Identity.
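A quick way to confirm that the workspace can actually reach the container is to list it from the notebook. This is a minimal sketch: the account and container names are placeholders, and it assumes access has already been granted (for example via Managed Identity).
# Available in Synapse notebooks; the explicit import keeps the snippet self-contained
from notebookutils import mssparkutils
# List the container root; an authorization error here means access is not set up yet
files = mssparkutils.fs.ls("abfss://yourcontainer@yourstorageaccount.dfs.core.windows.net/")
for f in files:
    print(f.name)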
2. Use the following code in the Synapse notebook
If you're using Apache Spark (PySpark), you can write your DataFrame (df) as a CSV file.
# No SparkSession import is needed; Synapse notebooks provide a ready-made spark session
# Define your storage account name and container
storage_account_name = "yourstorageaccount"
container_name = "yourcontainer"
folder_path = "intermediate-results/"  # Folder where the CSV will be stored
# Build the ADLS Gen2 path using the abfss:// scheme (Azure Blob File System driver)
adls_path = f"abfss://{container_name}@{storage_account_name}.dfs.core.windows.net/{folder_path}"
# Write the DataFrame as CSV with a header row
df.write.mode("overwrite").option("header", "true").csv(adls_path)
print(f"Data saved to {adls_path}")
3. Authentication methods
Ensure that you have access to the storage account via one of the following:
- Managed Identity (recommended)
  - Grant the Storage Blob Data Contributor role to the Synapse Managed Identity.
  - No need to specify credentials in the notebook.
- Account Key (if not using Managed Identity): set it in the notebook as shown below.
spark.conf.set(
    f"fs.azure.account.key.{storage_account_name}.dfs.core.windows.net",
    "your-storage-account-key"
)
If the above response helps answer your question, remember to "Accept Answer" so that others in the community facing similar issues can easily find the solution. Your contribution is highly appreciated.
hth
Marcin