Performing an Incremental Load of Multiple Tables
In the topic, Improving Incremental Loads with Change Data Capture, the diagram illustrates a basic package that performs an incremental load on just one table. However, loading one table is not as common as having to perform an incremental load of multiple tables.
When you perform an incremental load of multiple tables, some steps have to be performed once for all the tables, and other steps have to be repeated for each source table. You have more than one option for implementing these steps in Integration Services:
Use a parent package and child packages.
Use multiple Data Flow tasks in a single package.
Note
To see complete, end-to-end samples that demonstrate the use of change data capture in packages, see the Change Data Capture for Specified Interval Package sample and the Change Data Capture since Last Request Package sample on Codeplex.
Loading Multiple Tables by Using a Parent Package and Multiple Child Packages
You can use a parent package to perform those steps that only have to be done once. The child packages will perform those steps that have to be done for each source table.
To create a parent package that performs those steps that only have to be done once
Create a parent package.
In the control flow, use an Execute SQL task or Integration Services expressions to calculate the endpoints.
For an example of how to calculate endpoints, see Specifying an Interval of Change Data.
If needed, use a For Loop container to delay execution until change data for the selected period is ready.
For an example of such a For Loop container, see Determining Whether the Change Data Is Ready.
Use multiple Execute Package tasks to execute child packages for each table to be loaded. Pass the endpoints calculated in the parent package to each child package by using Parent Package Variable configurations.
For more information, see Execute Package Task and How to: Use the Values of Parent Variables in a Child Package.
To create child packages to perform those steps that have to be done for each source table
For each source table, create a child package.
In the control flow, use a Script task or an Execute SQL task to assemble the SQL statement that will be used to query for changes.
For an example of how to assemble the query, see Preparing to Query for the Change Data.
Use a single Data Flow task in each child package to load the change data and apply it to the destination. Configure the Data Flow as described in the following steps:
In the data flow, use a source component to query the change tables for the changes that fall within the selected endpoints.
For an example of how to query the change tables, see Retrieving and Understanding the Change Data.
Use a Conditional Split transformation to direct inserts, updates, and deletes to different outputs for appropriate processing.
For an example of how to configure this transformation to direct output, see Processing Inserts, Updates, and Deletes.
Use a destination component to apply the inserts to the destination. Use OLE DB Command transformations with parameterized UPDATE and DELETE statements to apply updates and deletes to the destination.
For an example of how to use this transformation to apply updates and deletes, see Applying the Changes to the Destination.
Loading Multiple Tables by Using Multiple Data Flow Tasks in a Single Package
Alternatively, you can use a single package that contains a separate Data Flow task for each source table to be loaded.
To load multiple tables by using multiple Data Flow tasks in a single package
Create a single package.
In the control flow, use an Execute SQL Task or Integration Services expressions to calculate the endpoints.
For an example of how to calculate endpoints, see Specifying an Interval of Change Data.
If needed, use a For Loop container to delay execution until the change data for the selected interval is ready.
For an example of such a For Loop container, see Determining Whether the Change Data Is Ready.
Use a Script task or an Execute SQL task to assemble the SQL statement that will be used to query for changes.
For an example of how to assemble the query, see Preparing to Query for the Change Data.
Use multiple Data Flow tasks to load the change data from each source table and apply it to the destination. Configure each Data Flow task as described in the following steps.
In each data flow, use a source component to query the change tables for the changes that fall within the selected endpoints.
For an example of how to query the change tables, see Retrieving and Understanding the Change Data.
Use a Conditional Split transformation to direct inserts, updates, and deletes to different outputs for appropriate processing.
For an example of how to configure this transformation to direct output, see Processing Inserts, Updates, and Deletes.
Use a destination component to apply the inserts to the destination. Use OLE DB Command transformations with parameterized UPDATE and DELETE statements to apply updates and deletes to the destination.
For an example of how to use this transformation to apply updates and deletes, see Applying the Changes to the Destination.
|