How to scale out the size of data movement using Azure Data Factory - error: The length of execution output is over limit (around 4MB currently)

Anonymous
2022-01-18T18:53:31.103+00:00

I am trying to find a solution for moving larger volumes of data with an Azure Durable Function called from Azure Data Factory.

Currently, I am using the Durable Functions template (Starter, Orchestrator, and Activity).
I was hoping that the durable function would solve the problem of moving large data, but I am still stuck at ADF.

While it is running, it hits the size limit error in ADF (Azure Data Factory): "The length of execution output is over limit (around 4MB currently)". To pass the data through, I have to adjust the paging size by hand each time, which is a pretty manual process.


I am trying to stay within the Consumption plan for Azure Functions because I think it is the most economical.
But even when I try a higher tier (like Premium), it only gets past the Azure Functions side (by increasing memory); the Web activity still fails because of the 4 MB limit.

What is the best strategy to move larger data from an Azure Function to ADF?
I am using Azure Blob files to store the data (input and output as text files).

Is there any setting I could configure in Blob storage or within ADF itself that would make a difference?
Given that ADF handles much bigger volumes of data in other scenarios, where does the limitation come from (Blob storage, the Azure Function, or something else)?

Thanks!


Accepted answer
  1. KranthiPakala-MSFT 46,517 Reputation points Microsoft Employee
    2022-01-20T08:05:14.177+00:00

    Hello @KingJava,

    Thanks for the question and for using the MS Q&A platform.

    As we understand it, the ask here is whether there is a way to break the 4MB limit for the Web activity in ADF. Please let me know if that's not accurate.

    Unfortunately, this is a hard limit on the Web activity: the response size must be less than or equal to 4MB. This is called out in the public documentation.


    It also looks like Azure Resource Manager has limits for API calls: you can only make API calls at a rate within the Azure Resource Manager API limits. Similarly, the maximum size of the API schema that can be used by a validation policy is 4 MB; if the schema exceeds this limit, validation policies will return errors at runtime. For more info, please refer to this doc: API Management policies to validate requests and responses

    In the past I had a conversation with the product team about this, and they confirmed that it is a hard limit and that there is no roadmap to increase these capacity limits in the near future.

    As a workaround, you may try using the ForEach activity. You would need to use a paging query for your REST API/endpoint and return a limited amount of data each time, then keep querying in a loop until the number of returned records falls below a threshold value, as sketched below. Please refer to this source: Web activity throws overlimit error when calling rest api
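
    To make that concrete, here is a minimal sketch of that paging loop in Python (the client-side equivalent of the loop pattern; the endpoint URL and the page/pageSize parameter names are hypothetical placeholders, not part of ADF):

    import requests

    # Hypothetical paged endpoint -- substitute your own Azure Function or
    # REST API and its actual paging parameters.
    BASE_URL = "https://example.azurewebsites.net/api/getdata"
    PAGE_SIZE = 1000  # tune so each page's response stays well under 4MB

    def fetch_all_pages():
        page = 0
        results = []
        while True:
            # Each call returns one bounded page instead of the full payload.
            resp = requests.get(BASE_URL, params={"page": page, "pageSize": PAGE_SIZE})
            resp.raise_for_status()
            batch = resp.json()
            results.extend(batch)
            # Stop once fewer records than a full page come back -- this is
            # the threshold check the loop performs in ADF.
            if len(batch) < PAGE_SIZE:
                break
            page += 1
        return results

    In ADF itself, the same shape is typically an Until (or ForEach) loop around the Web activity, so that each individual response stays under the 4MB limit.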

    Here is existing feedback on the Web activity response size limitation submitted by another user; please feel free to upvote and comment on it, as that helps increase the priority of the feature request. Feedback link: Azure Data Factory - Web Activity - maximize supported output response payload

    Hope this info helps. Please let us know if you have any further queries.



2 additional answers

  1. Anonymous
    2022-02-05T23:56:24.597+00:00

    @KranthiPakala-MSFT

    Thank you for your answer.
    I am trying to figure out how the ForEach activity would work in my case.
    OK, I understand the concept of ForEach, but how do I apply it here?
    Isn't the ForEach activity more for looping over several files?

    How do I apply the Web activity inside a ForEach (is there a way to break the data down by 5 MB each time)?
    I guess I am trying to understand what we iterate over (what the inner activity is) in the "ForEach".
    Thanks.


  2. Moore, James 1 Reputation point
    2022-09-13T03:26:48.89+00:00

    If you have SQL Server 2017 (14.x) or later (or SQL Managed Instance), you can use Python to do this instead. I recommend using Postman to generate the Python script. As an additional extra, you can use the OPENJSON function to turn the response into tabular format too (see the example after the script).

    DECLARE @response NVARCHAR(MAX);

    EXECUTE sp_execute_external_script
        @language = N'Python',
        @script = N'
import requests
import json

url = "https://..................................."

payload = json.dumps({"api_key": "............................"})
headers = {
    ''Subscription-Key'': ''.........................'',
    ''Content-Type'': ''application/json'',
    ''Ocp-Apim-Trace'': ''false''
}

# The response body is returned to T-SQL via the @response OUTPUT parameter.
response = requests.request("POST", url, headers=headers, data=payload)
response = response.text
',
        @params = N'@response NVARCHAR(MAX) OUTPUT',
        @response = @response OUTPUT;

    INSERT INTO dbo.mytable ([response])
    SELECT @response;
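
    And a minimal sketch of the OPENJSON step mentioned above, assuming (hypothetically) that the stored response is a JSON array of objects with id and name properties -- adjust the WITH clause to your actual payload:

    -- Hypothetical payload shape: [{"id": 1, "name": "a"}, ...]
    SELECT j.[id], j.[name]
    FROM dbo.mytable AS t
    CROSS APPLY OPENJSON(t.[response])
        WITH (
            [id]   INT           '$.id',
            [name] NVARCHAR(100) '$.name'
        ) AS j;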

