Welcome to the Microsoft Q&A forum.
In Synapse Pipelines, you can achieve similar partitioning and control over file sizes using the Copy Activity with parallelism and dynamic partitioning. Here are some steps to help you:
Parallel Copy: Use the Degree of copy parallelism setting to control the number of parallel threads for copying data. This can help distribute the load and improve performance.
- Go to the Settings tab of the Copy Activity.
- Set the Degree of copy parallelism to a value that suits your needs. If you leave it empty, the service chooses a value based on your source-sink pair and data pattern.
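For reference, the Degree of copy parallelism setting corresponds to the parallelCopies property in the Copy Activity JSON. Below is a minimal Python sketch that builds such a fragment. The activity name, the source and sink types, and the value 8 are assumptions for illustration only, and a real activity also needs its input and output dataset references.

```python
import json


def build_copy_activity(name: str, parallel_copies: int) -> dict:
    """Return a partial Copy activity definition with an explicit parallelCopies value."""
    if parallel_copies < 1:
        raise ValueError("parallel_copies must be at least 1")
    return {
        "name": name,  # hypothetical activity name
        "type": "Copy",
        "typeProperties": {
            "source": {"type": "AzureSqlSource"},  # assumed source type
            "sink": {"type": "ParquetSink"},       # assumed sink type
            "parallelCopies": parallel_copies,
        },
    }


activity = build_copy_activity("CopySalesToLake", parallel_copies=8)
print(json.dumps(activity, indent=2))
```

You can compare the printed fragment with the JSON shown in the code view of your own pipeline.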
Dynamic Partitioning: If your source table is partitioned, you can leverage these partitions for parallel copying.
- In the Source settings, set Partition option to Physical partitions of table.
- If your source table is not partitioned, set Partition option to Dynamic range instead and choose a partition column (e.g., a date or ID column) so the service can split the data into ranges.
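As a rough illustration, the two partition choices above map to the partitionOption and partitionSettings properties of a SQL source in the Copy Activity JSON. The sketch below assumes an Azure SQL source; the column name OrderId and the bounds are made-up values.

```python
from typing import Optional


def sql_source(partition_option: str,
               column: Optional[str] = None,
               lower: Optional[str] = None,
               upper: Optional[str] = None) -> dict:
    """Build the source section of a Copy activity for a SQL source."""
    allowed = {"None", "PhysicalPartitionsOfTable", "DynamicRange"}
    if partition_option not in allowed:
        raise ValueError(f"partition_option must be one of {sorted(allowed)}")

    source = {"type": "AzureSqlSource", "partitionOption": partition_option}

    # Partition settings are only added when the caller supplies them.
    settings = {}
    if column is not None:
        settings["partitionColumnName"] = column
    if lower is not None:
        settings["partitionLowerBound"] = lower
    if upper is not None:
        settings["partitionUpperBound"] = upper
    if settings:
        source["partitionSettings"] = settings
    return source


# Table already has physical partitions: no column or bounds needed.
physical = sql_source("PhysicalPartitionsOfTable")

# Unpartitioned table: split on a hypothetical integer column.
dynamic = sql_source("DynamicRange", column="OrderId", lower="1", upper="1000000")
```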
Custom Partition Ranges: Define custom partition ranges using parameters and dynamic content expressions.
- Create pipeline parameters for the partition column and ranges.
- Use these parameters in the Source settings to define the partition ranges dynamically.
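To make the parameter approach more concrete, here is a small sketch of how pipeline parameters can be referenced through dynamic content expressions in the source settings. The parameter names partitionColumn, lowerBound and upperBound are hypothetical, and declaring them as String is an assumption; adjust both to your pipeline.

```python
import json


def param_expr(name: str) -> dict:
    """Wrap a pipeline parameter reference as a dynamic content expression."""
    return {"value": f"@pipeline().parameters.{name}", "type": "Expression"}


# Hypothetical pipeline parameters for the partition column and range.
pipeline_parameters = {
    "partitionColumn": {"type": "String"},
    "lowerBound": {"type": "String"},
    "upperBound": {"type": "String"},
}

# Source section that reads its partition settings from those parameters.
source = {
    "type": "AzureSqlSource",  # assumed source type
    "partitionOption": "DynamicRange",
    "partitionSettings": {
        "partitionColumnName": param_expr("partitionColumn"),
        "partitionLowerBound": param_expr("lowerBound"),
        "partitionUpperBound": param_expr("upperBound"),
    },
}

print(json.dumps(source, indent=2))
```

With this layout, the same pipeline can be run against different tables or ranges just by passing different parameter values at trigger time.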
Optimize Performance: Ensure the compute behind the copy is sized appropriately for the data size and partition count.
- Monitor the copy run and adjust the Data Integration Units (DIUs) on an Azure integration runtime, or the number of nodes on a self-hosted integration runtime, to handle the load efficiently.
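If you manage the pipeline as JSON, the DIU value is set through the dataIntegrationUnits property of the Copy Activity. The sketch below adds it to an existing activity definition without modifying the original. The helper name with_dius, the activity name and the value 16 are illustrative assumptions; please check the documentation for the values your integration runtime allows.

```python
import copy


def with_dius(activity: dict, dius: int) -> dict:
    """Return a copy of the activity with dataIntegrationUnits set."""
    if dius < 1:
        raise ValueError("dius must be a positive integer")
    updated = copy.deepcopy(activity)  # leave the caller's definition untouched
    updated.setdefault("typeProperties", {})["dataIntegrationUnits"] = dius
    return updated


base = {
    "name": "CopySalesToLake",  # hypothetical activity name
    "type": "Copy",
    "typeProperties": {"parallelCopies": 8},
}
tuned = with_dius(base, 16)
```

Returning a modified copy makes it easy to keep several variants side by side while you compare run durations in the monitoring view.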
For more detailed guidance, please refer to the Copy Activity performance and scalability guide.
I hope the above steps resolve the issue. Please let us know if the issue persists. Thank you.