Loading Data to SQL Data Warehouse with the .NET SqlBulkCopy class
The Azure SQL Data Warehouse service is built on SQL Server and extends the capabilities of SQL Server for large scale relational data warehouse workloads. By leveraging SQL Server as the core relational component, customers are able to quickly take advantage of the large ecosystem of solutions and partner products to build their data story. Having a strong set of integrations with Azure Services such as Azure Data Factory, Azure Data Catalog, and Azure Stream Analytics as well as a growing number of Data Management and Data Integration partners, the SQL Data Warehouse service offers a vast solution set to migrate to the cloud.
For customers that need to build custom pipelines, the SQL Data Warehouse service offers support for bulk data loading through the .NET SqlBulkCopy class. This is a .NET class that "lets you efficiently bulk load a SQL Server table with data from another source." Practically, this means you can get data from whatever source you'd like and then insert it into SQL Data Warehouse using a standard .NET framework library. For example, you could create an HDInsight Bolt to pull data into SQL Data Warehouse from Hadoop.
Let's build out a sample to try this out.
Database Design
In this example, let's imagine we have a set of data that describes our products managed in a file (I know it simple but just illustrative).
1,Cables & Power,
2,Camera & Photo,
3,Cell Phones, That latest mobile devices.
4,Computers,
5,GPS & Navigation,
6,Headphones,
7,Home Audio,
8,Office Electronics,
9,Security & Survelliance,
10,Software,
11,Tablets, All of your tablet needs.
12,Televisions, The latest in LCD/LED and 4K.
13,Wearable Technology,
We can model this file in our database by creating a simple Products table:
CREATE TABLE DimProducts
(
[ProductID] INT NOT NULL,
[Name] VARCHAR(50) NOT NULL,
[Description] NVARCHAR(256) NULL
)
WITH
(
HEAP,
DISTRIBUTION=ROUND_ROBIN
);
Code Sample
Now that we have our table, let's start loading it from .NET. I've created a simple Windows console application that reads a file, creates a System.Data.DataTable, loads the DataTable with data, connects to SQL Data Warehouse and then bulk inserts the data. The main body is encapsulated in the Load method:
static void Load(string path)
{
// Load the file
var reader = new StreamReader(path);
// Get the Data Table to hold the rows
var datatable = GetDataTable();
// Setup our SQL Connection to our SQL Data Warehouse
var connection = GetConnection();
// Iterate through the file
string row;
while ((row = reader.ReadLine()) != null)
{
// Split the row by comma
var values = row.Split(',');
// Add the row values to the Data Table
datatable.Rows.Add(values[0], values[1], values[2]);
};
// Open the connection to SQL Data Warehouse
connection.Open();
// Create a Bulk Copy class
var bulkCopy = new SqlBulkCopy(connection);
// Define the target table
bulkCopy.DestinationTableName = "dbo.DimProducts";
// Write the rows to the table
bulkCopy.WriteToServer(datatable);
...
}
The key steps are the last four - simply connecting to SQL Data Warehouse just like SQL Server, creating a new SqlBulkCopy class, defining the target table, and then inserting the data via the WriteToServer method.
To verify your results, simply Connect to SQL Data Warehouse and run the following query:
SELECT * FROM DimProducts;
You'll see our 13 rows added:
Download
You can download the SQL Bulk Load sample from the SQL Server Samples GitHub repo.