Upravit

Sdílet prostřednictvím


Lesson 1: Create a project and basic package with SQL Server Integration Services (SSIS)

Applies to: SQL Server SSIS Integration Runtime in Azure Data Factory

In this lesson, you create a simple extract, transform, and load (ETL) package. The package extracts data from a single flat file source, transforms the data using two lookup transformations, and writes the transformed data to a copy of the FactCurrencyRate fact table in the AdventureWorksDW2022 sample database. As part of this lesson, you create new packages, add and configure data source and destination connections, and work with new control flow and data flow components.

Before creating a package, you need to understand the formatting used in both the source data and the destination. Then, you're ready to define the transformations necessary to map the source data to the destination.

Prerequisites

This tutorial relies on Microsoft SQL Server Data Tools, a set of example packages, and a sample database.

Note

The code samples in this article use the AdventureWorks2022 or AdventureWorksDW2022 sample database, which you can download from the Microsoft SQL Server Samples and Community Projects home page.

To install the SQL Server Data Tools, see Install SQL Server Data Tools (SSDT) for Visual Studio.

To download all of the lesson packages for this tutorial:

  1. Navigate to Integration Services tutorial files.
  2. Select the Download button.
  3. Select the Creating a Simple ETL Package.zip file, then select Next.
  4. After the file downloads, unzip its contents to a local directory.

Look at the source data

For this tutorial, the source data is a set of historical currency data in a flat file named SampleCurrencyData.txt. The source data has the following four columns: the average rate of the currency, a currency key, a date key, and the end-of-day rate.

Here's an example of the source data in the SampleCurrencyData.txt file:

1.00070049USD9/3/05 0:001.001201442
1.00020004USD9/4/05 0:001
1.00020004USD9/5/05 0:001.001201442
1.00020004USD9/6/05 0:001
1.00020004USD9/7/05 0:001.00070049
1.00070049USD9/8/05 0:000.99980004
1.00070049USD9/9/05 0:001.001502253
1.00070049USD9/10/05 0:000.99990001
1.00020004USD9/11/05 0:001.001101211
1.00020004USD9/12/05 0:000.99970009

When working with flat file source data, it's important to understand how the Flat File connection manager interprets the flat file data. If the flat file source is Unicode, the Flat File connection manager defines all columns as [DT_WSTR] with a default column width of 50. If the flat file source is ANSI-encoded, the columns are defined as [DT_STR] with a default column width of 50. You probably have to change these defaults to make the string column types more applicable for your data. You need to look at the data type of the destination, and then choose that type within the Flat File connection manager.

Look at the destination data

The destination for the source data is a copy of the FactCurrencyRate fact table in AdventureWorksDW2022. The FactCurrencyRate fact table has four columns, and has relationships to two dimension tables, as shown in the following table.

Column name Data type Lookup table Lookup column
AverageRate float None None
CurrencyKey int (FK) 1 DimCurrency CurrencyKey (PK) 2
DateKey int (FK) 1 DimDate DateKey (PK) 2
EndOfDayRate float None None

1 FK: Foreign key

2 PK: Primary key

Map the source data to the destination

Our analysis of the source and destination data formats indicates that lookups are necessary for the CurrencyKey and DateKey values. The transformations that perform these lookups get those values by using the alternate keys from the DimCurrency and DimDate dimension tables.

Flat file column Table name Column name Data type
0 FactCurrencyRate AverageRate float
1 DimCurrency CurrencyAlternateKey nchar(3)
2 DimDate FullDateAlternateKey date
3 FactCurrencyRate EndOfDayRate float

Lesson tasks

Next step