SSIS Step-by-step guide: Generate XML files for each row in SQL table without custom code
Problem Statement
In SSIS we don’t have a direct Data flow destination option for generating the XML file as output. But we can generate XML data from a relational set / query. In this document let’s see the step by step procedure for generating an xml file for each row of a SQL table.
How to do it
Let’s say we need to generate an xml file for each row of the HumanResources.Department table in AdventureWorks2012 Database.
Open SQL Server Data Tools.
Click on File ->New->Project and select Integration Services Project under Business Intelligence Projects and give the name as XMLDemo.
Drag and drop an Execute SQL Task in the control flow of the SSIS package.
If you look at the above picture you can see that there is an error symbol, so we need to configure the Execute SQL task to get rid of this error.
Double Click on the Execute SQL Task to open Execute SQL Task editor.
Choose the connection type as OLEDB
Click on the connection -> New Connection to Open the OLEDB connection manage. Click on New.
Give the server name where the AdventureWorks2012 Database resides. In my scenario it resides in the localhost. select the Database as AdventureWorks2012.
Click on Test Connection. It should display the message as “Test Connection succeeded”. Click on OK twice to come out of the OLEDB connection manager wizard.
In the SQLSourceType choose the option as direct input and click on the eclipse button near the sql statement and enter the following query
select count(1) as CountDept from humanresources.department , set the Result Set as “Single Row” and click OK.
Set the BypassPrepare property to False and click on parse query. The query parsed correctly message should be displayed. Set the BypassPrepare property to True again and click on OK to come out of the ExecuteSQLTaskEditor.Now you can notice that the error symbol disappeared from the Execute SQL Task.
Right click on the Execute SQL Task and rename this Task as Get Count.
You can also notice that the connection manager whichever we have newly created is added in the connection managers window.
Now click on the variables window by clicking SSIS-> Variables
In the Variables window, click on Add Variable and create a new variable named count in the package scope.(for creating the variable in the package scope make sure that you are in the Control flow tab and doesn’t click any of the task/container in it, otherwise the variable will be created within the scope of task/container)
Now Double click on the “Get Count” Execute SQL Task and click on the Result Set and click Add.
In the Result Name enter CountDept and in the Variable Name choose the variable User::Count. Click on ok. The result of the query which gets the no. of rows in the HumanResources.Department table will be stored in the user variable Count.
We need to create a separate xml file for each row in the sql table. Hence we need to loop through the no. of rows in the table. Drag a For Loop Container to the control flow and connect the Get Count Execute SQL Task to For Loop Container by dragging the green arrow from Execute SQL Task to For Loop Container.
Click on the variables window and add a new variable named i in the package scope.
Double click on the for loop container and set the InitExpression as @i=1, EvalExpression as @i\<= @Count and assignExpression as @i=@i+1 and click OK.
Since the source has to be extracted from a sql table to an xml file, we need a data flow task for achieving the same. Drag a data flow task inside the for loop container under control flow and double click on it.
In the Data flow tab, drag an OLEDB source for extracting the data from SQL Source.
In the variables window click on Add variable and create a new string variable named SourceSql and press F4 which opens the properties of SourceSql variable.
In the SourceSql variable properties window set EvaluateAsExpression as true and click on the eclipse button in the Expression and type the following query "DECLARE @ssql varchar(max)
SET``@ssql =``''
SET``@ssql = @ssql +(``SELECT
department.*
FROM``(``SELECT
ROW_NUMBER() OVER (``ORDER``BY``departmentID)``AS``'RowNumber'``, *
FROM``HumanResources.Department) department
WHERE``department.RowNumber =``"+(DT_WSTR,10)@[User::i]+"
FOR``XML PATH(``'department'``), ROOT(``'root'``))
SELECT
@ssql``AS``XMLData"
and click on Evaluate Expression, there should not be any errors. Click on OK.
In the above sql query we are using XML Path option to generate an xml output from the sql query.
Double click on the OLEDB source which will open the OLEDB Source Editor. We can use the same connection manager which we have created for Execute SQL Task. In the data access mode, select SQL command from variable and choose the user variable SourceSql.
The reason for not using the Data access mode as sql command is “OLEDB source gives error when trying to use parameters in the complex queries”. To get rid of such errors we are using SQLCommand from Variable option.
In the Variables window, click on Add Variable and create a variable named FileName in the package scope.
Click on F4 to open the FileName variable properties window.
Set EvaluateAsExpression to true.
Click on the Eclipse button and copy paste the following expression
“Test_``"+ (DT_WSTR,10)@[User::i] +(DT_WSTR,4)YEAR(GETDATE()) + "``-"
``+ RIGHT(``"0"``+ (DT_WSTR,2)MONTH(GETDATE()), 2) +``"-"
``+ RIGHT(``"0"``+ (DT_WSTR,2)DAY( GETDATE()), 2) +``" "
``+ RIGHT(``"0"``+ (DT_WSTR,2)DATEPART(``"hh"``, GETDATE()), 2)
``+ RIGHT(``"0"``+ (DT_WSTR,2)DATEPART(``"mi"``, GETDATE()), 2)
``+ RIGHT(``"0"``+ (DT_WSTR,2)DATEPART(``"ss"``, GETDATE()), 2) +``".xml"
Click on evaluate expression and it will give the value as “Test_02014-01-20 141133.xml”
Where 0 is the row number and 2014-01-20 is the file created date and 141133 is the hh:mi:ss format for time.
Please note that in this scenario, we need the file name with created date and time, and the user variable @i is used to let us know the row number or no. of iteration. Hence we are using this expression.
Finally we need a Data Flow Destination for saving the SQL table row as xml. Drag a Flat File Destination and double click on it to open the Flat File Destination Editor.
In the connection manager click on new and select “delimited” format and give the connection manager name as “XMLfileconnection” and in the filename give “C:\Temp\Test_02010-09-30 103248.xml”, it can be some file name with .xml extension. Click on OK.
Right Click on XMLfileconnection and click on properties and set “DelayValidation” property to True.
In the expression, click on the eclipse button and in the property select connection string and give the expression as "C:\Temp\"+@[User::FileName]
In the above expression C:\Temp denotes the output path where the xml file will be saved. Please refer the below screenshot.
The above step helps us to create a new file dynamically for each iteration of the for loop container.
Please find below the screenshot of the entire package once all the above steps are done:
Click on F5 or Debug -> Start Debugging.
There will be multiple xml files generated in C:\Temp for each row of the sql server table.
Note: The output folder for XML files(in my scenario it is C:\Temp) should exist in the system, otherwise SSIS will throw error.
See Also
- [[SQL Server Integration Services Portal]]