Azure Data Factory event trigger not working when the files have been written from Spark

Sathish Surendran 31 Reputation points
2022-01-04T23:29:22.737+00:00

The Azure Data Factory event trigger is not working when the file (Parquet) has been generated from Spark.

When I upload the file (Parquet) manually, the event trigger works as expected and the ADF pipeline is triggered.

But when the same file (Parquet) is written to the blob location through Spark, the event-based trigger does not fire the ADF pipeline.

What could be the reason? I need your suggestions here.
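
For context, the file is written from Spark roughly like this (a minimal sketch, assuming PySpark with the ABFS driver; the account, container, and path are hypothetical):

```python
from pyspark.sql import SparkSession

# Hypothetical sample data standing in for the real job's output.
spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])

# Write Parquet to the monitored blob location via the ABFS driver.
df.write.mode("overwrite").parquet(
    "abfss://mycontainer@myaccount.dfs.core.windows.net/input/"
)
```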


3 answers

  1. Sathish Surendran 31 Reputation points
    2022-01-07T01:15:35.893+00:00

    Hi Kranthi, thanks for your time and reply.

    I verified both the Flush API and the Azure Blob Storage Event Grid source, and everything looked good.

    After a couple of rounds of testing, I found the fix: set "Ignore empty blobs" to false on the event trigger. The pipeline then triggered automatically as soon as the file was uploaded by Spark.
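
    For reference, the same setting expressed in a trigger definition (a minimal sketch, assuming the azure-mgmt-datafactory Python SDK; the subscription, resource group, and resource names are hypothetical):

    ```python
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient
    from azure.mgmt.datafactory.models import (
        BlobEventsTrigger,
        PipelineReference,
        TriggerPipelineReference,
        TriggerResource,
    )

    client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

    trigger = BlobEventsTrigger(
        events=["Microsoft.Storage.BlobCreated"],
        scope=("/subscriptions/<subscription-id>/resourceGroups/<rg>"
               "/providers/Microsoft.Storage/storageAccounts/<account>"),
        blob_path_begins_with="/mycontainer/blobs/input/",  # hypothetical path
        blob_path_ends_with=".parquet",
        ignore_empty_blobs=False,  # the setting that fixed the issue
        pipelines=[TriggerPipelineReference(
            pipeline_reference=PipelineReference(reference_name="MyPipeline"))],
    )

    client.triggers.create_or_update(
        "<rg>", "<factory-name>", "MyTrigger", TriggerResource(properties=trigger))
    ```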

    Regards
    Sathish

    5 people found this answer helpful.

  2. KranthiPakala-MSFT 46,607 Reputation points Microsoft Employee
    2022-01-06T01:12:32.357+00:00

    Hello @Sathish Surendran ,

    Thanks for the question and using MS Q&A platform.

    As per this statement, 'When I upload the file (Parquet) manually, the event trigger works as expected and the ADF pipeline is triggered,' my understanding is that you have configured your event trigger correctly, since the pipeline fires when a file is uploaded manually. The issue occurs only when the file is uploaded by your Spark application/service. This suggests that your Spark application/service is not invoking the correct storage APIs while generating/dropping the file in the storage location.

    Usually, when a blob event trigger is created, ADF creates an Event Grid subscription on the storage account. Once a blob is created, the storage service raises an event that reaches ADF and triggers the pipeline run. The storage service raises events based on the API calls invoked while the blob is created; in particular, the Flush API (with close=true) is needed to complete the blob creation. If your Spark application/service does not invoke the correct APIs, the storage service will not raise the event, and there is nothing that can be done on the ADF side.

    Hence I would recommend investigating your Spark application/service to ensure that it calls the Flush API, so that an event is sent to Event Grid, as sketched below.
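
    To illustrate the sequence, here is what an ADLS Gen2 blob creation looks like when the APIs are invoked explicitly (a minimal sketch, assuming the azure-storage-file-datalake Python SDK; the connection string, container, and path are hypothetical):

    ```python
    from azure.storage.filedatalake import DataLakeServiceClient

    service = DataLakeServiceClient.from_connection_string("<connection-string>")
    file_client = (service
                   .get_file_system_client("mycontainer")      # hypothetical container
                   .get_file_client("input/data.parquet"))     # hypothetical path

    data = b"<parquet bytes>"

    file_client.create_file()                                   # Path - Create
    file_client.append_data(data, offset=0, length=len(data))   # Path - Update (append)
    file_client.flush_data(len(data), close=True)               # Path - Update (flush)
    # Only the flush with close=True makes the storage service raise the
    # Microsoft.Storage.BlobCreated event that the ADF trigger listens for.
    ```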

    To get a better understanding of the list of events raised by Blob REST APIs, please refer to this doc: Azure Blob Storage as an Event Grid source

    Hope this helps. Please let us know if you have any further queries.

    1 person found this answer helpful.

  3. NATASHA TAHARIYA 0 Reputation points
    2025-02-25T22:09:17.21+00:00

    Hi,

    I am facing a similar issue.

    I want to trigger an ADF pipeline on blob creation. A manual upload to blob storage triggers the ADF pipeline correctly. But when I upload the same file via the Azure Data Lake .NET libraries from a web app, specifically the package <package id="Azure.Storage.Files.DataLake" version="12.21.0" targetFramework="net472" />,

    using the method await datalakeFileClient.UploadAsync(content: stream, overwrite: true),

    the ADF trigger does not fire.

    Any idea how to resolve this issue?

    Regards,

    Natasha

