Performing SQL Bulk Copy with Azure Automation/PowerShell while modifying source data
In this blog, I will be going over how to use AzureAutomation/PowerShell to perform bulk operations using bulk copy in SQL Server instances and as we perform the bulk copy, I will also be editing the source data and adding some logging information to the rows in the database.
The script can be downloaded from my technet blog post here
For simplicity and more usability, consider I have a CSV file with source data that I want to insert in a database table in SQL Server. The CSV in this example has the following headers:
Group,PrimarySMTPAddress
Now, while performing the bulk insert, I want to add additional columns like Id and GUID. Normally all of these operations is super time consuming in Azure Automation and a lot of times this results into the script host running out of memory and the script fails to complete. Normally you would get an error like:
“The job action 'Activate' cannot be run, because the process stopped unexpectedly. The job action was attempted 3 times”
Normally, you would write a basic bulk copy script like this where "BulkCopyTo-SQLServer" is a placeholder logic for your Bulk Insert/Update:
PowerShell
$CSVData = Import-Csv -Path $CSVPath -Delimiter ',' $dtable = $CSVData | Out-DataTable BulkCopyTo-SQLServer -Data $dtable
But this consumes tremendous amount of memory in all the three steps. Instead we should write the above as below to do everything without creating any intermediate variables.
PowerShell
Import-Csv -Path $CSVPath -Delimiter ',' | Out-DataTable | BulkCopyTo-SQLServer
With the attached script, the way that its designed is that it wont be memory intensive and yet bulk process and copy the rows into SQL - That's the beauty of PowerShell.
I was able to insert about a 1,00,000 rows within couple of seconds with this script, let me know if you guys have comments or feedback!
Cheers!