Hi Team,
I'm building an Azure Data Factory pipeline to extract data from Snowflake and load it into Azure Blob Storage. However, I'm facing some challenges with the data copy process and could use some assistance.
I have several tables in my Snowflake schema and need to perform a bulk copy.
Some of the tables have columns with the data type TIMESTAMP_NTZ or TIMESTAMP_TZ, and I need to copy all of the data into Azure Blob Storage.
The pipeline has -->
Lookup --> finds all tables in the Snowflake schema
ForEach --> loops through all the tables returned by Lookup1
Copy Activity --> uses a source query that casts the timestamp-typed columns. My Copy activity is failing.
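Roughly, the setup looks like this (a trimmed sketch, not my exact pipeline; the dataset, schema, and column names are placeholders, some connector-specific settings such as exportSettings are omitted, and the TO_VARCHAR call is just the kind of per-table cast I am attempting):

```json
{
  "name": "BulkCopySnowflakeToBlob",
  "properties": {
    "activities": [
      {
        "name": "Lookup1",
        "type": "Lookup",
        "typeProperties": {
          "source": {
            "type": "SnowflakeSource",
            "query": "SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'MY_SCHEMA'"
          },
          "dataset": { "referenceName": "SnowflakeGenericDataset", "type": "DatasetReference" },
          "firstRowOnly": false
        }
      },
      {
        "name": "ForEachTable",
        "type": "ForEach",
        "dependsOn": [ { "activity": "Lookup1", "dependencyConditions": [ "Succeeded" ] } ],
        "typeProperties": {
          "items": { "value": "@activity('Lookup1').output.value", "type": "Expression" },
          "activities": [
            {
              "name": "CopyTableToBlob",
              "type": "Copy",
              "inputs": [ { "referenceName": "SnowflakeGenericDataset", "type": "DatasetReference" } ],
              "outputs": [ { "referenceName": "BlobParquetDataset", "type": "DatasetReference" } ],
              "typeProperties": {
                "source": {
                  "type": "SnowflakeSource",
                  "query": {
                    "value": "SELECT ID, NAME, TO_VARCHAR(CREATED_AT) AS CREATED_AT FROM MY_SCHEMA.@{item().TABLE_NAME}",
                    "type": "Expression"
                  }
                },
                "sink": { "type": "ParquetSink" }
              }
            }
          ]
        }
      }
    ]
  }
}
```

The problem with this approach is visible in the source query: the cast has to name specific columns, and every table has different timestamp columns.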
So, is there a way I can perform the Copy activity successfully without casting, since different tables may have different columns with these data types? (FYI, I tried that route but ran into a different error.) Can I implement staging instead? I'm not sure how it works; is there any documentation or instructions?
If I have to cast, how should I approach it?
I appreciate your suggestions and help.
Thank you,
Nalini.
I was able to fix the issue and copy the data successfully using the JSON format instead of Parquet. However, I see a small drop in the number of rows copied for very large tables. I need help understanding the reason for the dropped rows.
Thanks.
I'm glad you were able to resolve your issue, and thank you for posting your solution so that others experiencing the same thing can easily reference it! Since the Microsoft Q&A community has a policy that "The question author cannot accept their own answer. They can only accept answers by others," I'll repost your solution in case you'd like to accept the answer.
Ask: I'm building an Azure Data Factory pipeline to extract data from Snowflake and load it into Azure Blob Storage. However, I'm facing some challenges with the data copy process and could use some assistance.
I have several tables in my Snowflake schema and need to perform a bulk copy.
Some of the tables have columns with the data type TIMESTAMP_NTZ or TIMESTAMP_TZ, and I need to copy all of the data into Azure Blob Storage.
The pipeline has -->
Lookup --> finds all tables in the Snowflake schema
ForEach --> loops through all the tables returned by Lookup1
Copy Activity --> uses a source query that casts the timestamp-typed columns. My Copy activity is failing.
So, is there a way I can perform the Copy activity successfully without casting, since different tables may have different columns with these data types? (FYI, I tried that route but ran into a different error.) Can I implement staging instead? I'm not sure how it works; is there any documentation or instructions?
If I have to cast, how should I approach it?
Solution: I was able to fix the issue and successfully copy data from Snowflake to Azure Blob Storage via the Azure Data Factory pipeline by changing the sink dataset format to JSON instead of Parquet.
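For illustration, a JSON-format sink dataset pointing at Blob Storage would look roughly like this (a minimal sketch; the dataset name, linked service name, container, and folder path are placeholders):

```json
{
  "name": "BlobJsonSinkDataset",
  "properties": {
    "type": "Json",
    "linkedServiceName": {
      "referenceName": "AzureBlobStorageLinkedService",
      "type": "LinkedServiceReference"
    },
    "typeProperties": {
      "location": {
        "type": "AzureBlobStorageLocation",
        "container": "snowflake-export",
        "folderPath": "tables"
      }
    }
  }
}
```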
Regarding the drop in rows for large tables, there are a couple of potential reasons this might be happening:
- enableSkipIncompatibleRow: if this is set to True, rows that don't match the schema or have data issues will be skipped (see the sketch after this list).
- High concurrency and parallelism settings can sometimes cause issues with large datasets. Try adjusting these settings to see if it improves the data transfer.
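As a rough illustration (not your exact pipeline; source and sink details are trimmed, and the linked service name and log path are placeholders), these settings live on the Copy activity, and redirecting incompatible rows to a log file lets you see exactly which rows were dropped and why:

```json
{
  "name": "CopyTableToBlob",
  "type": "Copy",
  "typeProperties": {
    "source": { "type": "SnowflakeSource" },
    "sink": { "type": "JsonSink" },
    "enableSkipIncompatibleRow": true,
    "redirectIncompatibleRowSettings": {
      "linkedServiceName": {
        "referenceName": "AzureBlobStorageLinkedService",
        "type": "LinkedServiceReference"
      },
      "path": "copy-activity-logs/incompatible-rows"
    },
    "parallelCopies": 4,
    "dataIntegrationUnits": 8
  }
}
```

If the incompatible-row log stays empty while the counts still differ, a reasonable next check is to compare the rowsRead and rowsCopied values in the Copy activity's monitoring output against a SELECT COUNT(*) on the source table in Snowflake.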
If I missed anything, please let me know and I'd be happy to add it to my answer, or feel free to comment below with any additional information.
If you have any other questions, please let me know. Thank you again for your time and patience throughout this issue.
Please don't forget to Accept Answer and click Yes for "Was this answer helpful?" wherever the information provided helps you, as this can be beneficial to other community members.