How to Automate Large .SAV File to Parquet Conversion in Azure?
We use Azure Data Lake Storage (ADLS) as our primary storage, Azure Data Factory (ADF) for data transformations, and Power BI for reporting and visualization.
I have a large .SAV file (200-300 MB, containing 2-4 million rows) stored in ADLS. To load the data into a SQL table, I first need to convert the .SAV file to Parquet, since ADF cannot process .SAV files directly.
I previously attempted the conversion with an Azure Function, but ran into its 10-minute execution timeout, which is not enough for files of this size.
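For context, the conversion logic in my function looked roughly like this (a simplified sketch; the connection string, container, and blob names are placeholders, not our real values):

```python
import tempfile

import pyreadstat
from azure.storage.blob import BlobServiceClient

# Placeholders for illustration only.
CONN_STR = "<adls-connection-string>"
CONTAINER = "raw"
SOURCE_BLOB = "surveys/export.sav"
TARGET_BLOB = "surveys/export.parquet"


def convert_sav_to_parquet():
    service = BlobServiceClient.from_connection_string(CONN_STR)
    source = service.get_blob_client(container=CONTAINER, blob=SOURCE_BLOB)
    target = service.get_blob_client(container=CONTAINER, blob=TARGET_BLOB)

    with tempfile.NamedTemporaryFile(suffix=".sav") as sav_tmp, \
         tempfile.NamedTemporaryFile(suffix=".parquet") as pq_tmp:
        # Download the .SAV blob to local temp storage.
        source.download_blob().readinto(sav_tmp)
        sav_tmp.flush()

        # pyreadstat loads the entire file into a DataFrame here --
        # this is where time (and memory) balloons for 2-4M rows.
        df, meta = pyreadstat.read_sav(sav_tmp.name)
        df.to_parquet(pq_tmp.name, engine="pyarrow", index=False)

        # Upload the resulting Parquet file back to ADLS.
        pq_tmp.seek(0)
        target.upload_blob(pq_tmp, overwrite=True)
```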
I'm looking for an optimized and scalable solution to automate this conversion process within the Azure ecosystem.
Key Considerations:
- The solution must handle large files efficiently, ideally via chunked or streaming reads rather than loading everything into memory (see the sketch after this list).
- It should be compatible with Azure services and integrate seamlessly into a data pipeline.
- Preferably, it should avoid execution timeout or size limitations like those in Azure Functions.
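To illustrate the first point: since pyreadstat can read .SAV files in chunks, I imagine the conversion could stream row groups into a single Parquet file instead of materializing the full DataFrame. Something along these lines is what I have in mind (an untested sketch; file paths and the chunk size are arbitrary):

```python
import pyarrow as pa
import pyarrow.parquet as pq
import pyreadstat

writer = None
# Yields (DataFrame, metadata) pairs of up to `chunksize` rows each,
# so peak memory stays bounded regardless of total row count.
for df, meta in pyreadstat.read_file_in_chunks(
    pyreadstat.read_sav, "export.sav", chunksize=500_000
):
    table = pa.Table.from_pandas(df, preserve_index=False)
    if writer is None:
        # Initialize the Parquet writer from the first chunk's schema.
        writer = pq.ParquetWriter("export.parquet", table.schema)
    writer.write_table(table)

if writer is not None:
    writer.close()
```

But even with chunking, I'm not sure an Azure Function is the right home for this workload, which is why I'm asking what service fits best.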
Any guidance on how to approach this or examples of similar implementations would be highly appreciated.