Hi,
Thanks for reaching out to Microsoft Q&A.
Solution Steps
- Enable Snowflake Query Partitioning:
  - Use Snowflake's SQL `COPY INTO <location>` command and parallelize the unload by specifying a `PARTITION BY` clause (a sketch follows this step).
  - In the ADF pipeline, specify a SQL query in the Source that generates partitions, for example: `SELECT * FROM (SELECT t.*, ROW_NUMBER() OVER (ORDER BY <key_column>) AS rn FROM table1 t) WHERE MOD(rn, 2) = 0`. Note that the window function has to sit in a subquery, since it cannot be referenced directly in a `WHERE` clause.
  - Or partition based on data logic (e.g. date range, region, etc.).
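  For reference, a minimal sketch of a `PARTITION BY` unload on the Snowflake side, assuming a hypothetical internal stage `@my_stage` and a `REGION` column on `table1` (adjust the stage, table, and partition expression to your schema):

  ```sql
  -- Sketch only: unload table1 into one sub-folder per REGION value.
  -- @my_stage, table1, and REGION are placeholders for illustration.
  COPY INTO @my_stage/table1/
  FROM (SELECT * FROM table1)
  PARTITION BY ('region=' || region)
  FILE_FORMAT = (TYPE = PARQUET);
  ```

  Each distinct partition value is written to its own sub-folder under `@my_stage/table1/`.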
- Set ADF Dataset Parameters:
  - Create a dataset parameter in ADF for the folder path (e.g. `folderPath`) and reference it in the dataset's folder path as `@dataset().folderPath`.
  - Bind the parameter dynamically to the sink path in the Copy Activity (see the sketch after this step).
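  As a rough illustration of the binding, with `folderPath` from the step above and `outputFolder` as a hypothetical pipeline parameter that supplies the value at run time:

  ```
  Dataset parameter:             folderPath (String)
  Folder path field in dataset:  @dataset().folderPath
  Value set on the Copy sink:    @{pipeline().parameters.outputFolder}
  ```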
- Disable Single File Setting:
  - Ensure that the `Copy Behavior` in the sink dataset is set to Preserve Hierarchy or Flatten Hierarchy so that multiple files can be written.
  - In the Sink Settings of the copy activity, uncheck the "Write to single file" option (if enabled), and confirm that the `SingleFile` property is explicitly set to `false`.
- Dynamic Folder Creation for Each Table:
  - Use the ForEach activity to loop through the list of Snowflake tables.
  - Inside the loop, add a Copy Data activity with the following configuration:
    - Source: the Snowflake table for the current item.
    - Sink: Azure Blob Storage with a dynamic path, e.g. `container-name/@{item().TableName}/` (see the expression sketch after this step).
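  A sketch of the dynamic content for this loop, assuming the table list comes from a hypothetical Lookup activity named `GetTableList` whose output rows expose a `TableName` column:

  ```
  ForEach Items:                 @activity('GetTableList').output.value
  Copy sink folderPath value:    @{item().TableName}
  Resulting blob path:           container-name/<TableName>/
  ```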
- Use Partition Count in Snowflake Copy Command:
  - Snowflake's external unload can generate multiple files; control this with the `MAX_FILE_SIZE` copy option and keep `SINGLE = FALSE` (the default) so that the unload is not forced into a single file. A sketch follows this step.
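  A minimal sketch of such an unload, again assuming the hypothetical stage `@my_stage` (`MAX_FILE_SIZE` is in bytes; `SINGLE = FALSE` is the default):

  ```sql
  -- Sketch only: unload table1 as multiple compressed CSV files of roughly 50 MB each.
  COPY INTO @my_stage/table1/
  FROM table1
  FILE_FORMAT = (TYPE = CSV COMPRESSION = GZIP)
  MAX_FILE_SIZE = 52428800   -- ~50 MB per output file
  SINGLE = FALSE;
  ```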
Please feel free to click the 'Upvote' (Thumbs-up) button and 'Accept as Answer'. This helps the community by allowing others with similar queries to easily find the solution.