Sdílet prostřednictvím


Loading Data to SQL Data Warehouse with the .NET SqlBulkCopy class

The Azure SQL Data Warehouse service is built on SQL Server and extends the capabilities of SQL Server for large scale relational data warehouse workloads. By leveraging SQL Server as the core relational component, customers are able to quickly take advantage of the large ecosystem of solutions and partner products to build their data story. Having a strong set of integrations with Azure Services such as Azure Data FactoryAzure Data Catalog, and Azure Stream Analytics as well as a growing number of Data Management and Data Integration partners, the SQL  Data Warehouse service offers a vast solution set to migrate to the cloud. 

For customers that need to build custom pipelines, the SQL Data Warehouse service offers support for bulk data loading through the .NET SqlBulkCopy class. This is a .NET class that "lets you efficiently bulk load a SQL Server table with data from another source." Practically, this means you can get data from whatever source you'd like and then insert it into SQL Data Warehouse using a standard .NET framework library. For example, you could create an HDInsight Bolt to pull data into SQL Data Warehouse from Hadoop. 

Let's build out a sample to try this out.

Database Design

In this example, let's imagine we have a set of data that describes our products managed in a file (I know it simple but just illustrative).

 1,Cables & Power,
2,Camera & Photo,
3,Cell Phones, That latest mobile devices.
4,Computers,
5,GPS & Navigation,
6,Headphones,
7,Home Audio,
8,Office Electronics,
9,Security & Survelliance,
10,Software,
11,Tablets, All of your tablet needs.
12,Televisions, The latest in LCD/LED and 4K.
13,Wearable Technology,

We can model this file in our database by creating a simple Products table:

 CREATE TABLE DimProducts
(
 [ProductID] INT NOT NULL,
 [Name] VARCHAR(50) NOT NULL,
 [Description] NVARCHAR(256) NULL
)
WITH
(
 HEAP,
 DISTRIBUTION=ROUND_ROBIN
);

Code Sample

Now that we have our table, let's start loading it from .NET. I've created a simple Windows console application that reads a file, creates a System.Data.DataTable, loads the DataTable with data,  connects to SQL Data Warehouse and then bulk inserts the data. The main body is encapsulated in the Load method:

 static void Load(string path)
{
    // Load the file
    var reader = new StreamReader(path);

    // Get the Data Table to hold the rows
    var datatable = GetDataTable();

    // Setup our SQL Connection to our SQL Data Warehouse
    var connection = GetConnection();

    // Iterate through the file
    string row;
    while ((row = reader.ReadLine()) != null)
    {
        // Split the row by comma
        var values = row.Split(',');

        // Add the row values to the Data Table
        datatable.Rows.Add(values[0], values[1], values[2]);
    };

    // Open the connection to SQL Data Warehouse
    connection.Open();

    // Create a Bulk Copy class
    var bulkCopy = new SqlBulkCopy(connection);

    // Define the target table
    bulkCopy.DestinationTableName = "dbo.DimProducts";

    // Write the rows to the table
    bulkCopy.WriteToServer(datatable);

    ...
 }

The key steps are the last four - simply connecting to SQL Data Warehouse just like SQL Server, creating a new SqlBulkCopy class, defining the target table, and then inserting the data via the WriteToServer method.

To verify your results, simply Connect to SQL Data Warehouse and run the following query:

 SELECT * FROM DimProducts;

You'll see our 13 rows added:

Results

Download

You can download the SQL Bulk Load sample from the SQL Server Samples GitHub repo.