Need help to copy Data from Snowflake to Azure Blob Storage via ADF Pipeline

Nalini Bhavaraju 20 Reputation points
2025-02-14T17:26:39.2166667+00:00

I want to create an Azure Data Factory pipeline to perform a bulk data copy from Snowflake tables to Azure Blob Storage. Since I have large tables to handle, I aim to organize the data into folders named after each table. Inside each table’s folder, the corresponding data will be copied into multiple part files, as shown below:

data/
    table1/
        part-0000.parquet
        part-0001.parquet
    table2/
        part-0000.parquet
        part-0001.parquet

The issue is: I am using dataset parameters to specify the file path, with item().Table_Name as the folder, but the copy activity tries to write to a single file and the pipeline then fails with "Max Limit Exceeded trying to unload to single File". I tried setting the SingleFile property to false, but that did not work.

Can you help with this?

Azure Data Factory

Accepted answer
  1. Vinodh247 28,211 Reputation points MVP
    2025-02-15T12:03:33.0466667+00:00

    Hi,

    Thanks for reaching out to Microsoft Q&A.

    Solution Steps

    1. Enable Snowflake Query Partitioning:
      • Snowflake's COPY INTO <location> command unloads in parallel on its own; you can also add a PARTITION BY <expr> clause to split the unloaded files into sub-folders.
      • In the ADF pipeline, specify a SQL query on the copy activity source that selects one partition of the table. For example, to take every other row: SELECT * FROM table1 QUALIFY MOD(ROW_NUMBER() OVER (ORDER BY <key_column>), 2) = 0 (window-function filters go in QUALIFY rather than WHERE, and ROW_NUMBER requires an ORDER BY).
      • Or partition based on data logic (ex: date range, region, etc.).
    2. Set ADF Dataset Parameters:
      • Create a dataset parameter in ADF for the folder path, ex: @dataset().folderPath.
      • Bind the parameter dynamically to the sink path in the Copy activity.
    3. Disable Single File Setting:
      • Ensure the copy behavior on the copy activity sink is set to Preserve Hierarchy or Flatten Hierarchy so that multiple files can be written.
      • In the sink settings of the copy activity, uncheck the "Write to single file" option if it is enabled, and confirm that the SingleFile property is explicitly set to false.
    4. Dynamic Folder Creation for Each Table:
      • Use a ForEach activity to loop through the list of Snowflake tables.
      • Inside the loop, add a Copy Data activity with the following configuration:
        • Source: the Snowflake table.
        • Sink: Azure Blob Storage with a dynamic folder path, for example:
          • container-name/@{item().TableName}/
    5. Use Partition Count in Snowflake Copy Command:
      • Snowflake's external unload can generate multiple files via the MAX_FILE_SIZE and SINGLE = FALSE copy options; ensure the unload is configured to produce multiple files (a sketch of such a COPY INTO statement follows this list).
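
    For reference, here is a minimal sketch of the kind of COPY INTO statement that performs such a multi-file unload (ADF's direct copy from Snowflake issues a statement of roughly this shape on your behalf). The stage name @my_unload_stage and the database/schema/table names are placeholders for illustration, not values from this thread:

        -- Sketch only: the stage, database, schema and table names are placeholders.
        COPY INTO @my_unload_stage/data/table1/
          FROM my_db.my_schema.table1
          FILE_FORMAT = (TYPE = PARQUET)
          SINGLE = FALSE               -- unload multiple part files instead of one
          MAX_FILE_SIZE = 268435456    -- ~256 MB per file (value is in bytes)
          HEADER = TRUE;               -- keep column names with the unloaded data

    In ADF, the Snowflake source's export settings also allow additional copy options (for example MAX_FILE_SIZE) to be passed through to the generated unload.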

    Please feel free to click the 'Upvote' (Thumbs-up) button and 'Accept as Answer'. This helps the community by allowing others with similar queries to easily find the solution.


1 additional answer

  1. Nalini Bhavaraju 20 Reputation points
    2025-02-19T17:06:56.1766667+00:00

    Even though setting SingleFile = false didn't resolve the issue, I was able to fix it by reviewing the Parquet sink dataset. Since the output format is Parquet, I left the file name in the dataset empty and moved the path expression into the directory box. With no file name set, the copy wrote the data as multiple files, overcoming the single-file limitation.
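
    To make the "why" behind this fix concrete, here is a hedged sketch in Snowflake SQL. The stage, database and table names are placeholders, and the exact statements the copy activity generates are not visible here; this only illustrates the single-file versus multi-file unload behaviour:

        -- Illustration only: placeholder stage/database/table names.
        -- A fixed file name forces a single output file, which is what runs into
        -- the "unload to single file" size limit on large tables:
        COPY INTO @my_unload_stage/data/table1/output.parquet
          FROM my_db.my_schema.table1
          FILE_FORMAT = (TYPE = PARQUET)
          SINGLE = TRUE;

        -- A folder prefix with no file name lets the unload write multiple part
        -- files under that folder instead:
        COPY INTO @my_unload_stage/data/table1/
          FROM my_db.my_schema.table1
          FILE_FORMAT = (TYPE = PARQUET)
          SINGLE = FALSE;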

