Azure Data Factory ForEach/If - Force Next Iteration

Corey Livermore 20 Reputation points
2025-03-06T16:40:32.72+00:00

Our organization receives and loads files into an on-prem SQL Server using SSIS packages written in Visual Studio. As part of the file loads, we run a series of validations on the files to insure that certain things are correct - file name, is the file empty, is it a duplicate, and so on. We run these validations one at a time and, if the validation fails, we fail the file and move on to the next file in the batch. Only if a validation passes will we move on to the next validation.

We are in the process of migrating from on-prem SQL to cloud Snowflake. We cannot use SSIS for our file loads into Snowflake as the ODBC connector has really bad performance when writing records, and some of our files are in the hundreds of MB in size. So in order to migrate to Snowflake, we have to use a cloud option for file loading, and our organization has decided upon Azure Data Factory. Unfortunately, we are running into issues with rebuilding our code in ADF.

Currently, ADF does not support multiple nested IF statements. When we set up the validations, we check that the file name is properly coded, and if true we simply move on to the next validation. But when the file name is improperly coded, we fail the file...and the pipeline moves on to the next validation. This should not happen. If the file fails, the current iteration should be stopped and the For Each loop that the IF statement is inside of should be forced to start over with the next file (assuming there is a next file; if there is no next file, stop the For Each loop and move onto the next component outside the For Each loop).

As mentioned, we cannot nest multiple IF statements. We also cannot nest an Until inside a For Each, and it does not appear that there are any components that allow for Break/End/Exit/Next. I know that the stock answer is to run child pipelines for processing, but that doesn't prevent additional pipelines from being run that should not run if the IF statement returns FALSE.

So how do we go about doing this? Is it possible that, upon an IF statement executing its FALSE branch, we can force the parent level FOR EACH to move to the next iteration without processing any components after the IF statement? Basically:

  • For Each file in the list
    • Check the file name
      • If the file name is proper, move to next validation
      • If the file name is improper, fail the file and move to the next file in the list

Is this possible in ADF? Is there a way to stop processing of the current iteration in a FOR EACH and move to the next (if available) iteration? And no, I do NOT want to fail the pipeline or activity; this would then generate false positives in the error logs, which would then require someone to investigate every single time we see one of these. Failing the file is a valid outcome.

Any help here would be keen.

Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
11,323 questions
{count} votes

1 answer

Sort by: Most helpful
  1. Nandan Hegde 34,351 Reputation points MVP
    2025-03-07T06:10:40.91+00:00

    You can use the below flow :

    within foreach loop (for diff file iterations) have the below sequential flow :

    use getmetadata activity to check get the file details from the path (filename, filetype,header etc)

    use IF activity to check whether the filename is as expected or not

    if expected, no activities within True

      if not, then add a fail activity within False to throw error and fail the iteration and move to the next one
      
      Use If activity to check whether the header is as expected or not
      
         1. True ,false would be same as above
         
         Add the necessary additional steps
         
    
    1. Depending on the file type use Switch activity to redirect for the necessary executions Now in the above scenario, you would have to explicitly fail the iteration to proceed to the next one to overcome the limitation of nested if. If you want to avoid the failure and want the nested IF scenario, you can call an execute pipeline activity from within the IF activity and call the other IF activity in the new pipeline note : you can fail an activity in ADF still ensuring the pipeline status is success
      Refer below doc:
      https://datasharkx.wordpress.com/2021/08/19/error-logging-and-the-art-of-avoiding-redundant-activities-in-azure-data-factory/

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.