Lesson 1: Create a project and basic package with SQL Server Integration Services (SSIS)
Applies to: SQL Server SSIS Integration Runtime in Azure Data Factory
In this lesson, you create a simple extract, transform, and load (ETL) package. The package extracts data from a single flat file source, transforms the data using two lookup transformations, and writes the transformed data to a copy of the FactCurrencyRate
fact table in the AdventureWorksDW2022
sample database. As part of this lesson, you create new packages, add and configure data source and destination connections, and work with new control flow and data flow components.
Before creating a package, you need to understand the formatting used in both the source data and the destination. Then, you're ready to define the transformations necessary to map the source data to the destination.
Prerequisites
This tutorial relies on Microsoft SQL Server Data Tools, a set of example packages, and a sample database.
Note
The code samples in this article use the AdventureWorks2022
or AdventureWorksDW2022
sample database, which you can download from the Microsoft SQL Server Samples and Community Projects home page.
To install the SQL Server Data Tools, see Install SQL Server Data Tools (SSDT) for Visual Studio.
To download all of the lesson packages for this tutorial:
- Navigate to Integration Services tutorial files.
- Select the Download button.
- Select the
Creating a Simple ETL Package.zip
file, then select Next. - After the file downloads, unzip its contents to a local directory.
Look at the source data
For this tutorial, the source data is a set of historical currency data in a flat file named SampleCurrencyData.txt
. The source data has the following four columns: the average rate of the currency, a currency key, a date key, and the end-of-day rate.
Here's an example of the source data in the SampleCurrencyData.txt
file:
1.00070049USD9/3/05 0:001.001201442
1.00020004USD9/4/05 0:001
1.00020004USD9/5/05 0:001.001201442
1.00020004USD9/6/05 0:001
1.00020004USD9/7/05 0:001.00070049
1.00070049USD9/8/05 0:000.99980004
1.00070049USD9/9/05 0:001.001502253
1.00070049USD9/10/05 0:000.99990001
1.00020004USD9/11/05 0:001.001101211
1.00020004USD9/12/05 0:000.99970009
When working with flat file source data, it's important to understand how the Flat File connection manager interprets the flat file data. If the flat file source is Unicode, the Flat File connection manager defines all columns as [DT_WSTR]
with a default column width of 50
. If the flat file source is ANSI-encoded, the columns are defined as [DT_STR]
with a default column width of 50
. You probably have to change these defaults to make the string column types more applicable for your data. You need to look at the data type of the destination, and then choose that type within the Flat File connection manager.
Look at the destination data
The destination for the source data is a copy of the FactCurrencyRate
fact table in AdventureWorksDW2022
. The FactCurrencyRate
fact table has four columns, and has relationships to two dimension tables, as shown in the following table.
Column name | Data type | Lookup table | Lookup column |
---|---|---|---|
AverageRate |
float | None | None |
CurrencyKey |
int (FK) 1 | DimCurrency |
CurrencyKey (PK) 2 |
DateKey |
int (FK) 1 | DimDate |
DateKey (PK) 2 |
EndOfDayRate |
float | None | None |
1 FK: Foreign key
2 PK: Primary key
Map the source data to the destination
Our analysis of the source and destination data formats indicates that lookups are necessary for the CurrencyKey
and DateKey
values. The transformations that perform these lookups get those values by using the alternate keys from the DimCurrency
and DimDate
dimension tables.
Flat file column | Table name | Column name | Data type |
---|---|---|---|
0 |
FactCurrencyRate |
AverageRate |
float |
1 |
DimCurrency |
CurrencyAlternateKey |
nchar(3) |
2 |
DimDate |
FullDateAlternateKey |
date |
3 |
FactCurrencyRate |
EndOfDayRate |
float |
Lesson tasks
- Lesson 1-1: Create a new Integration Services project
- Lesson 1-2: Add and configure a Flat File connection manager
- Lesson 1-3: Add and configure an OLE DB connection manager
- Lesson 1-4: Add a Data Flow task to the package
- Lesson 1-5: Add and configure the Flat File source
- Lesson 1-6: Add and configure the Lookup transformations
- Lesson 1-7: Add and configure the OLE DB destination
- Lesson 1-8: Annotate and format the Lesson 1 package
- Lesson 1-9: Test the Lesson 1 package