Share via


Extract Transform Load (ETL)

ETL stands for Extract, Transform and Load, which are phases of data processing in a data warehouse.

Extract
In this phase, data is read from the source systems, and copied to the staging area of the data warehouse - in a Microsoft environment, it typically means copying data from various sources with SQL Server Integration Services ([[articles:SQL Server Integration Services|SSIS]]) to one or more staging databases in SQL Server. The most common data source formats are relational databases and flat files.

Transform
In the Transform phase, the data in the staging area is transformed: the data is validated and cleansed, then data from different sources are joined and consolidated following business rules. A massive amount of calculations and aggregations are executed on the data, to prepare the next phase. The values of derived measures are calculated here. For the list of the built-in transformations in SSIS, please check the article: [[articles:SSIS: List of transformations|SSIS: List of transformations]].

Load
In the Load phase, the data warehouse processes move data in its end target, usually the presentation area of the data warehouse. Feedback about the results is sent to other applications or to end users. The presentation area is where end users can access the data directly, through web reports, OLAP interfaces or other BI applications.