Hi Iwan,
Thanks for reaching out to Microsoft Q&A.
In Azure Synapse, when you run multiple notebooks in parallel within a pipeline, the capacity of the Spark pool can become a bottleneck. If the pool reaches its capacity (for example, due to limited available cores or memory), Synapse typically queues the remaining jobs until resources become available. It will not fail the jobs outright unless there is an issue such as an out-of-memory error or a configuration problem.
To handle this gracefully, you can try the following:
- Monitor the Spark pool's CPU, memory, and vCore usage (for example, in the Monitor hub of Synapse Studio) so you know when you are approaching capacity limits.
- Limit how many notebooks run in parallel by adjusting the concurrency of the "ForEach" activity in your pipeline via its "Batch Count" property (see the sketch after this list).
- If the Spark pool has auto-scaling enabled, it will attempt to add nodes to accommodate the additional workload, as long as it has not reached its configured maximum node count.
- Synapse typically queues notebook executions that exceed the currently available resources and processes them as resources free up. If a notebook still fails due to resource constraints, you can add a retry policy to the Notebook activity in your pipeline (see the retry sketch below).
- For Spark jobs, if you consistently see long queue times, consider increasing the size of your Spark pool (more nodes or larger node sizes).
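To illustrate the "Batch Count" setting, here is a minimal sketch of the concurrency-related part of a ForEach activity, written as a Python dict that mirrors the shape of the pipeline JSON. The activity and parameter names ("RunNotebooksInParallel", "NotebookList") are hypothetical and the values are only examples; in practice you would set "Batch Count" on the ForEach activity's Settings tab or edit the pipeline JSON directly.

```python
# Hypothetical fragment of a Synapse pipeline definition, written as a Python
# dict so the shape of the JSON is easy to see. "NotebookList" and the
# activity names are made-up; adjust batchCount to what your pool can absorb.
foreach_activity = {
    "name": "RunNotebooksInParallel",
    "type": "ForEach",
    "typeProperties": {
        # isSequential=False allows parallel execution; batchCount caps how
        # many notebook runs the pipeline starts at the same time.
        "isSequential": False,
        "batchCount": 4,
        "items": {
            "value": "@pipeline().parameters.NotebookList",
            "type": "Expression",
        },
        "activities": [
            # The Notebook activity that runs each item goes here; see the
            # retry-policy sketch below for its "policy" block.
        ],
    },
}
```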
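And here is a similar sketch of an activity-level retry policy on the Notebook activity itself. The field names ("retry", "retryIntervalInSeconds", "timeout") follow what you would see in exported pipeline JSON, but the notebook name and the concrete values are illustrative only; tune the retry count and interval to your workload, keeping in mind that retries also consume pool capacity.

```python
# Hypothetical Notebook activity fragment showing a retry policy. The
# notebook "MyNotebook" is a placeholder; the policy values are examples.
notebook_activity = {
    "name": "RunOneNotebook",
    "type": "SynapseNotebook",
    "policy": {
        "timeout": "0.02:00:00",       # give up on a single attempt after 2 hours
        "retry": 2,                    # re-run up to 2 more times on failure
        "retryIntervalInSeconds": 300, # wait 5 minutes between attempts
    },
    "typeProperties": {
        "notebook": {
            "referenceName": "MyNotebook",
            "type": "NotebookReference",
        }
    },
}
```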
Implementing these strategies can help manage resource allocation and prevent failures due to exceeding cluster capacity.
Please 'Upvote' (thumbs up) and 'Accept' the answer if the reply was helpful. This will benefit other community members who face the same issue.