Play with SSIS Flat File Source Error Flow Output
In general as we know that SSIS tasks error flow redirection of data through error pipe sends erroneous data in form of rows with multiple columns. For an example, if we use a lookup task in our data flow, any failing lookups can be redirected to a table or file, the data that comes through error flow pipe is always in form of rows with individual column values, an example could be EmpID, Ename, Designation, MGRID. However, there is a slight difference in this behavior if we talk about flat file sources. When flat file source finds a erroneous row, it is indeed transferred to error pipe but this row comes as one column, this means all [EmpID, Ename, Designation and MGRID] will be visible as a single column as opposed to 4 columns. So if you have a requirement where you need to pick individual column value for future error data analysis purpose, you can’t straight forward do this.
There are two ways to pick up individual columns from flat file source error flow
1. By using String functions over erroneous row (i.e. FindString and SubString).
2. By using a Script task to pick our individual columns from row.
Taking about option 1, yes you can find a column in a row using FindString and SubString but if it comes to taking multiple columns out from an error row, it’s definitely a tedious job. I indeed advise to use Script task in that case. See in below screenshot that how only ‘Flat File Source Error Output Column’ is displayed as opposed to multiple columns like EmpID, Ename, Designation and MGRID
As you see in screenshot we have to play with ‘Flat File Source Error Output Column’, however this column has DT_Text as default datatype, you need to first convert datatype of this column to DT_STR to use any string functions, in the code sample provided in this blog, you will find how conversion task is used in conjunction with derived column task to derive required columns.
As said in Option 2, we can use Script task for taking out individual columns out from a row. This approach is pretty handy, as it gives you power of .NET functions to play with your data. The point to note here is that data from ‘Flat File Source Error Output Column’ is passed to script task in the form of BLOB data, which means before using any string functions to pick your columns, it is essential to convert error output column to text data and then do the next step.
I have attached a sample demo to show how you can take individual columns out using any of two approaches.
Sample Code: 'FlatFileErrorFlow.zip'
Code Guidelines:
1. Download ‘FlatFileErrorFlow.zip’ file, unzip it in a folder.
2. Run ‘CreatTableScripts.sql’ in any of your test database, if you don’t have any test database, just create one.
3. Open the solution file under \FlatFileErrorFlowDemo\FlatFileErrorFlow\FlatFileErrorFlow.sln
4. Open package FlatFileErrorFlow_StringFunction.dtsx and FlatFileErrorFlow_Script.dtsx and change the ‘Test’ OLEDB connection manager to point to your database used in step2, also change the ‘InputFileConnection’ flat file connection manger to point to the folder where ‘Inputfile.txt’ is present. This text file is present in your \FlatFileErrorFlowDemo folder.
5. Before running the package, analyze the table schema and Inputfile.txt, we are passing 4 column values in each row i.e. EmpID, EmpName, Designation and MGRID but in Inputfile.txt there are two rows with more than 4 columns i.e. they are erroneous rows and should be re-directed to a table called SuspectData.
6. We have used data viewers to help you see that error row by default come as a single column and further that has been broken into relevant columns to pass desired column data to underlying tables.
Hope this will help….!
Kuldeepc
MSFT
Comments
Anonymous
June 07, 2009
PingBack from http://weakbladder.info/story.php?id=5132Anonymous
May 26, 2014
Hi, why -3 is there in your package name FlatFileErrorFlow_StringFunctions.dtsx Actually i didnt understood how you build spilt function for Emp Name.