Step by Step approach to Copy data from Azure Data Lake to Azure SQL data warehouse using Azure Data Factory
Azure Data Factory is the data orchestration tool which helps to transfer data to & from Azure Data Lake, HDInsight, Azure SQL Database, Azure ML(Cognitive Services) , Azure Blob Storage etc. Here in this demo, we'll be demonstrating how to transfer data from Azure Data Lake to Azure SQL Data Warehouse database.
First, create an azure data factory from azure preview portal providing resource group name & location.
Once it's created, under Settings , select 'Copy Data(Preview) ' to start transferring data from Azure Data Lake to Azure SQL data warehouse.
Next, Start creation of the ADF pipeline to transfer data from Azure Data Lake 'Weblog' data sources to Azure SQL Data warehouse database. You can create a SQL data warehouse database in Azure in Azure Preview Portal from 'New' -> 'Databases' -> Azure SQL Data Warehouse as per as the screenshot.
Click on 'Copy Data' activity of Azure Data Factory start transferring data & create pipeline. provide the Task schedule of the transfer activities (min 15 minutes).
You can transfer from Azure Blob Storage, Azure Data Lake, Table Storage, On-premise hadoop cluster, Azure SQL/on-prem SQL database, MySQL, Cassandra, Salesforce, DB2, Amazon S3 etc.
Select Azure Data Lake to start transferring data.
Provide the source data copy connection details & elect data sources from Azure Data Lake. Check the file format settings & click on 'Next' to choose destination.
Provide Azure Data Lake Database credentials & database name to start transferring data to Azure SQL data warehouse.
Map the table where you would want to transfer data with columns /Schema mapping.
Finally in the Performance setting page, you may provide Polybase settings, azure storage account details & final parallel copy settings as 'Auto'.
Then on 'Summary' page, click on 'Authorize' to authorize the transaction of data from ADL to Azure SQL data warehouse database & finally click on 'Finish' to start Copy activity of data.
Once the deployment is completed, you can see the whole pipeline to copy data activity in ADF pipeline.
Click on 'Click here to monitor copy pipeline' to check the Activities details.
Once the Copy Activity is in progress, you can check the SQL data warehouse table with 'SELECT' command of transferred data from Azure Data Lake data source.