Extraction and loading of OData to PostgreSQL with an ADF pipeline failing with SOOM (System Out Of Memory)

2024-11-20T17:41:10.1633333+00:00

Our requirement is to extract OData and load it into PostgreSQL using an ADF pipeline. The pipeline randomly fails with a System Out Of Memory exception, and sometimes with a maximum-buffer-size error. A pipeline run that previously completed successfully fails on subsequent runs while copying 200,000 records; at the same time, a few entity tables with the same record count copy without any issues every time. Are there any pointers on what should be considered to tune this pipeline, apart from the infrastructure or memory settings of the server behind the ADF integration runtime?

Operation on target Detail failed: Failure happened on 'Source' side. 'Type=System.Net.Http.HttpRequestException,Message=Cannot write more bytes to the buffer than the configured maximum buffer size: 2147483647.,Source=System.Net.Http,'

Failure happened on 'Source' side. ErrorCode=SystemErrorOutOfMemory,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=A task failed with out of memory.,Source=Microsoft.DataTransfer.TransferTask,''Type=System.OutOfMemoryException,Message=Exception of type 'System.OutOfMemoryException' was thrown.,Source=mscorlib,'
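
For reference, 2,147,483,647 bytes is int.MaxValue, the largest single HTTP response the copy source will buffer, so an OData response larger than 2 GB trips this error regardless of how much memory the integration runtime machine has. The usual mitigation is to page the extraction so each request returns a bounded slice. The snippet below is a minimal sketch only, assuming the OData service honors $top/$skip; the dataset references, the page size of 5,000, and the page count of 40 (to cover roughly 200,000 records) are illustrative placeholders, not part of the actual pipeline.

```json
{
    "name": "ForEachODataPage",
    "type": "ForEach",
    "typeProperties": {
        "isSequential": true,
        "items": { "value": "@range(0, 40)", "type": "Expression" },
        "activities": [
            {
                "name": "CopyODataPage",
                "type": "Copy",
                "inputs": [ { "referenceName": "ODataEntityDataset", "type": "DatasetReference" } ],
                "outputs": [ { "referenceName": "PostgreSqlTableDataset", "type": "DatasetReference" } ],
                "typeProperties": {
                    "source": {
                        "type": "ODataSource",
                        "query": {
                            "value": "$top=5000&$skip=@{mul(item(), 5000)}",
                            "type": "Expression"
                        }
                    },
                    "sink": {
                        "type": "AzurePostgreSqlSink",
                        "writeBatchSize": 10000
                    }
                }
            }
        ]
    }
}
```

Because isSequential is true, only one page is in flight at a time, which keeps the integration runtime's memory footprint flat at the cost of a longer run; for deterministic pages the query should also carry an $orderby on a stable key.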

2024-11-20T19:07:30.5466667+00:00

    Thanks for your inputs! Please find my inline comments.

    1. Memory Management
    • Increase Integration Runtime Capacity: We don't have control over this setting and may not be given a chance to change it in the current situation.
    • Optimize Data Volume: Yes, this has already been tried and it works, but the pipeline then takes many hours to complete (a paged-query sketch is included under the error messages in the question above).
    2. Buffer Size Configuration
    • Adjust Buffer Settings: This was tried too, but none of the changes helped resolve the issue.
    • Use Staging: Yes, this is being considered, but there are some open issues with using it this way that we are still evaluating. We will keep progressing on this as well (a staged-copy sketch follows this list).
    3. Entity-Specific Issues
    • Analyze Entity Tables: Not all entities have columns with large text, but a few do. Even without those columns, one run completed and the subsequent runs failed, so there is no clear pattern as to why the first run succeeded and the next ones did not.
    • Data Transformation: No transformations are in scope for the current design. Also, using a stored procedure, Data Flow, or Databricks has limitations when OData is the source.
    4. Error Handling and Logging
    • Implement Retry Logic: Even with loop conditions and retry settings, the copy does not succeed (a retry-policy sketch follows this list).
    • Enable Detailed Logging: I am still trying to enable this setting, as the activity log entries do not show any specific information (a session-log sketch follows this list).
    5. Review ADF Limits
    • Check ADF Limits: To some extent we were able to verify the limits and reduce the load accordingly, but please share any further information on this.
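
On the staging option under point 2: enabling a staged copy only adds two properties to the Copy activity plus a Blob storage linked service for the interim store. A minimal sketch, assuming a hypothetical linked service named StagingBlobStorage:

```json
{
    "name": "CopyODataStaged",
    "type": "Copy",
    "typeProperties": {
        "source": { "type": "ODataSource" },
        "sink": { "type": "AzurePostgreSqlSink", "writeBatchSize": 10000 },
        "enableStaging": true,
        "stagingSettings": {
            "linkedServiceName": {
                "referenceName": "StagingBlobStorage",
                "type": "LinkedServiceReference"
            },
            "path": "adf-staging/odata"
        }
    }
}
```

Note that staging changes where interim data is written, but the source still has to download each OData response, so it is most useful alongside the paged query shown under the error messages rather than instead of it.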
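
On retry logic under point 4: in addition to loop-based retries, each activity carries its own policy block with retry and retryIntervalInSeconds. The values below are illustrative only; retries help with transient failures and will not rescue a response that is simply too large to buffer.

```json
{
    "name": "CopyODataPage",
    "type": "Copy",
    "policy": {
        "timeout": "0.02:00:00",
        "retry": 3,
        "retryIntervalInSeconds": 120,
        "secureInput": false,
        "secureOutput": false
    },
    "typeProperties": {
        "source": { "type": "ODataSource" },
        "sink": { "type": "AzurePostgreSqlSink" }
    }
}
```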
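
On detailed logging under point 4: the Copy activity can write a session log to a storage account through logSettings, which captures more per-run detail than the monitoring view. A sketch assuming a hypothetical LoggingBlobStorage linked service; please verify the property names against the Settings tab of the Copy activity in your factory, since this schema has changed across versions.

```json
{
    "name": "CopyODataPage",
    "type": "Copy",
    "typeProperties": {
        "source": { "type": "ODataSource" },
        "sink": { "type": "AzurePostgreSqlSink" },
        "logSettings": {
            "enableCopyActivityLog": true,
            "copyActivityLogSettings": {
                "logLevel": "Warning",
                "enableReliableLogging": true
            },
            "logLocationSettings": {
                "linkedServiceName": {
                    "referenceName": "LoggingBlobStorage",
                    "type": "LinkedServiceReference"
                },
                "path": "adf-logs/odata-copy"
            }
        }
    }
}
```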