Extraction and loading of OData to PostgreSQL with an ADF pipeline failing with SOOM (System Out Of Memory)

2024-11-20T17:41:10.1633333+00:00

Our requirement is to extract OData and load it into PostgreSQL using an ADF pipeline. The pipeline randomly fails with a System Out Of Memory exception, and sometimes with a maximum-buffer-size error. A pipeline run that previously completed successfully fails on subsequent runs while copying 200,000 records; at the same time, a few entity tables with the same record count copy without any issues every time. Are there any pointers on what should be considered to tune this pipeline, apart from the infrastructure or memory settings of the server behind the ADF integration runtime?

Operation on target Detail failed: Failure happened on 'Source' side. 'Type=System.Net.Http.HttpRequestException,Message=Cannot write more bytes to the buffer than the configured maximum buffer size: 2147483647.,Source=System.Net.Http,'

Failure happened on 'Source' side. ErrorCode=SystemErrorOutOfMemory,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=A task failed with out of memory.,Source=Microsoft.DataTransfer.TransferTask,''Type=System.OutOfMemoryException,Message=Exception of type 'System.OutOfMemoryException' was thrown.,Source=mscorlib,'
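
For reference, 2,147,483,647 bytes is int.MaxValue, the largest single HTTP response the copy source will buffer, so an OData response larger than 2 GB trips this error regardless of how much memory the integration runtime machine has. The usual mitigation is to page the extraction so each request returns a bounded slice. The snippet below is a minimal sketch only, assuming the OData service honors $top/$skip; the dataset references, the page size of 5,000, and the page count of 40 (to cover roughly 200,000 records) are illustrative placeholders, not part of the actual pipeline.

```json
{
    "name": "ForEachODataPage",
    "type": "ForEach",
    "typeProperties": {
        "isSequential": true,
        "items": { "value": "@range(0, 40)", "type": "Expression" },
        "activities": [
            {
                "name": "CopyODataPage",
                "type": "Copy",
                "inputs": [ { "referenceName": "ODataEntityDataset", "type": "DatasetReference" } ],
                "outputs": [ { "referenceName": "PostgreSqlTableDataset", "type": "DatasetReference" } ],
                "typeProperties": {
                    "source": {
                        "type": "ODataSource",
                        "query": {
                            "value": "$top=5000&$skip=@{mul(item(), 5000)}",
                            "type": "Expression"
                        }
                    },
                    "sink": {
                        "type": "AzurePostgreSqlSink",
                        "writeBatchSize": 10000
                    }
                }
            }
        ]
    }
}
```

Because isSequential is true, only one page is in flight at a time, which keeps the integration runtime's memory footprint flat at the cost of a longer run; for deterministic pages the query should also carry an $orderby on a stable key.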

2024-11-20T19:07:30.5466667+00:00

    Thanks for your inputs! Please find my inline comments.

    1. Memory Management
    • Increase Integration Runtime Capacity: We don't have control over this setting and may not be given a chance to change it in the current situation.
    • Optimize Data Volume: Yes, this has already been tried and it works, but the pipeline then takes many hours to complete (a paged-query sketch is included under the error messages in the question above).
    2. Buffer Size Configuration
    • Adjust Buffer Settings: This was tried too, but none of the changes helped resolve the issue.
    • Use Staging: Yes, this is being considered, but there are some open issues with using it this way that we are still evaluating. We will keep progressing on this as well (a staged-copy sketch follows this list).
    3. Entity-Specific Issues
    • Analyze Entity Tables: Not all entities have columns with large text, but a few do. Even without those columns, one run completed and the subsequent runs failed, so there is no clear pattern as to why the first run succeeded and the next ones did not.
    • Data Transformation: No transformations are in scope for the current design. Also, using a stored procedure, Data Flow, or Databricks has limitations when OData is the source.
    4. Error Handling and Logging
    • Implement Retry Logic: Even with loop conditions and retry settings, the copy does not succeed (a retry-policy sketch follows this list).
    • Enable Detailed Logging: I am still trying to enable this setting, as the activity log entries do not show any specific information (a session-log sketch follows this list).
    5. Review ADF Limits
    • Check ADF Limits: To some extent we were able to verify the limits and reduce the load accordingly, but please share any further information on this.
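
On the staging option under point 2: enabling a staged copy only adds two properties to the Copy activity plus a Blob storage linked service for the interim store. A minimal sketch, assuming a hypothetical linked service named StagingBlobStorage:

```json
{
    "name": "CopyODataStaged",
    "type": "Copy",
    "typeProperties": {
        "source": { "type": "ODataSource" },
        "sink": { "type": "AzurePostgreSqlSink", "writeBatchSize": 10000 },
        "enableStaging": true,
        "stagingSettings": {
            "linkedServiceName": {
                "referenceName": "StagingBlobStorage",
                "type": "LinkedServiceReference"
            },
            "path": "adf-staging/odata"
        }
    }
}
```

Note that staging changes where interim data is written, but the source still has to download each OData response, so it is most useful alongside the paged query shown under the error messages rather than instead of it.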
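
On retry logic under point 4: in addition to loop-based retries, each activity carries its own policy block with retry and retryIntervalInSeconds. The values below are illustrative only; retries help with transient failures and will not rescue a response that is simply too large to buffer.

```json
{
    "name": "CopyODataPage",
    "type": "Copy",
    "policy": {
        "timeout": "0.02:00:00",
        "retry": 3,
        "retryIntervalInSeconds": 120,
        "secureInput": false,
        "secureOutput": false
    },
    "typeProperties": {
        "source": { "type": "ODataSource" },
        "sink": { "type": "AzurePostgreSqlSink" }
    }
}
```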
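
On detailed logging under point 4: the Copy activity can write a session log to a storage account through logSettings, which captures more per-run detail than the monitoring view. A sketch assuming a hypothetical LoggingBlobStorage linked service; please verify the property names against the Settings tab of the Copy activity in your factory, since this schema has changed across versions.

```json
{
    "name": "CopyODataPage",
    "type": "Copy",
    "typeProperties": {
        "source": { "type": "ODataSource" },
        "sink": { "type": "AzurePostgreSqlSink" },
        "logSettings": {
            "enableCopyActivityLog": true,
            "copyActivityLogSettings": {
                "logLevel": "Warning",
                "enableReliableLogging": true
            },
            "logLocationSettings": {
                "linkedServiceName": {
                    "referenceName": "LoggingBlobStorage",
                    "type": "LinkedServiceReference"
                },
                "path": "adf-logs/odata-copy"
            }
        }
    }
}
```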