Data flow gives different output in debug and trigger.

Question

Bansal, Nimish 60

I have a few data flows that use Sort and Aggregate transformations. I sort my data on a column, then drop duplicates using an Aggregate transformation, with last($$) to select the last occurrence of each primary key. In data preview, or when I execute the pipeline in debug mode, I get the expected output. However, when I trigger the pipeline for the same dataset, I do not: the data flow does not return the last row and appears to return a random one. I am not using any joins or anything else. My data is read from ADLS, sorted, and deduplicated, and some columns are modified using date functions. Nothing else is being done.

Any suggestions on what I might be doing wrong?

Chandra Boorla 9,985 Reputation points Microsoft External Staff

2025-03-12T19:24:44.36+00:00

@Bansal, Nimish

Just checking in to see whether the answer below from @Alex Burlachenko helped.

If it answers your query, please click Accept Answer and select Yes for "Was this answer helpful". If you have any further questions, do let us know.

1 answer

Answer 1

Alex Burlachenko 1,755

Dear Nimish,

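The issue you’re facing is likely due to inconsistent row ordering in distributed processing: a triggered run can spread the data across several partitions, and last($$) then returns whichever row happens to arrive last for each key.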

Add a Sort transformation and explicitly sort your data by the primary key before using last($$) in the Aggregate transformation. Also use single partitioning: in the Optimize tab, set the partitioning to Single Partition to ensure consistent results.
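To illustrate why a "last row" aggregate depends on arrival order, here is a minimal sketch in plain Python. It is not data flow script and does not reproduce what the service does internally. The column names `id`, `updated_at`, and `value` are hypothetical, and the shuffled list only stands in for rows arriving in a different order after being distributed. It contrasts an order-dependent "last seen" pick with a pick based on an explicit ordering column.

```python
def last_seen_per_key(rows, key):
    """Order-dependent: keeps whichever row arrives last for each key."""
    result = {}
    for row in rows:
        result[row[key]] = row
    return result


def latest_per_key(rows, key, order_by):
    """Order-independent (when order_by has no ties within a key):
    keeps the row with the largest order_by value for each key."""
    result = {}
    for row in rows:
        current = result.get(row[key])
        if current is None or row[order_by] > current[order_by]:
            result[row[key]] = row
    return result


# Hypothetical sample data: three versions of id 1, one row for id 2.
rows = [
    {"id": 1, "updated_at": "2025-03-01", "value": "a"},
    {"id": 1, "updated_at": "2025-03-05", "value": "b"},
    {"id": 2, "updated_at": "2025-03-02", "value": "c"},
    {"id": 1, "updated_at": "2025-03-03", "value": "d"},
]

# Rows in fully sorted order, as in a single-partition run.
sorted_rows = sorted(rows, key=lambda r: (r["id"], r["updated_at"]))

# The same rows in a different arrival order, standing in for a distributed run.
shuffled_rows = [sorted_rows[2], sorted_rows[0], sorted_rows[3], sorted_rows[1]]

print(last_seen_per_key(sorted_rows, "id")[1]["value"])
print(last_seen_per_key(shuffled_rows, "id")[1]["value"])
print(latest_per_key(shuffled_rows, "id", "updated_at")[1]["value"])
```

With the sorted input, the "last seen" pick returns the newest row for id 1. With the reordered input it returns an older one, while the pick based on `updated_at` gives the same result for both orderings. If your data has a column that identifies the latest row, selecting on it explicitly may be more robust than relying on row order.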

Just in case, see the Azure Data Factory documentation.

Best regards,
Alex

p.s. If you found the answer helpful, please click on Upvote and Accept Answer. This will help other community members.
