Compartilhar via


Flat File source cannot handle file with uneven number of columns in each row

I hope not many people are using flat file with such kind of structure, but if you are then no worries now you don’t have to do any weird adjustments. J

I took some time to figure out what is going wrong here and why it is not working, and it caused some extra time to figure out all and I didn’t get paid for that extra time so you can understand my frustration!!!

But anyways here it goes,

Consider you have file which has data like

01406E98|01406E98|01|20111027|5112|E|9|V|0.220000|0.000000|0.000000|0.000000|0.000000|0.078100~

01406E98|01406E98|02|20111031|5112|E|9|V|0.220000|0.000000|0.000000|0.000000~

01406E98|01406E98|05|20111017|5112|E|9|V|0.220000|0.000000|0.000000~

01406E98|01406E98|38|20080430|5018|E|9|U|0.220000|0.000000|0.000000|0.000000|0.000000|0.073480~

01406E98|01406E98|40|20091028|5110|E|9|V|0.220000|0.000000|0.000000|0.000000|0.000000|0.077220~

So if we see some of rows does not have entries for all columns (Row 2 and 3)

Now when you create Flat File Connection manager in SSIS and use this file as source

Column Delimiter: ~

Row Delimiter: |

Go columns and click on refresh so that changes take place you might get error:

TITLE: Microsoft Visual Studio

------------------------------

The last row in the sampled data is incomplete. The column or the row delimiter may be missing or the text is qualified incorrectly.

------------------------------

BUTTONS:

OK

------------------------------

Also if you see preview will show only two rows and not all four rows

Or

You just don’t see all result when you preview.

What went wrong?

Reason behind issue:

So we have Row delimiter and column delimiter.

Column Delimiter gets first preference and then Row delimiter.

Consider our file,

01406E98|01406E98|01|20111027|5112|E|9|V|0.220000|0.000000|0.000000|0.000000|0.000000|0.078100~

--14 Columns in all, so SSIS flat file connection manager expects 13 “|”

01406E98|01406E98|02|20111031|5112|E|9|V|0.220000|0.000000|0.000000|0.000000~

--In this row it finds only 11 “|” so it thinks row is not yet over and continues till it finds 2 more “|”, even if it encounters row delimiter while before it reaches two more “|” it will ignore that row delimiter

01406E98|01406E98|05|20111017|5112|E|9|V|0.220000|0.000000|0.000000~

--Here it will append this row to previous row as it thinks previous row is incomplete and after two “|” once it reaches 13 “|” it will append everything else on this row till it finds row delimiter

01406E98|01406E98|38|20080430|5018|E|9|U|0.220000|0.000000|0.000000|0.000000|0.000000|0.073480~

01406E98|01406E98|40|20091028|5110|E|9|V|0.220000|0.000000|0.000000|0.000000|0.000000|0.077220~

--For this two rows everything is fine so it will be shown normally

So if I do preview with this file I will see

01406E98|01406E98|01|20111027|5112|E|9|V|0.220000|0.000000|0.000000|0.000000|0.000000|0.078100~

01406E98|01406E98|02|20111031|5112|E|9|V|0.220000|0.000000|0.000000|0.000000~01406E98|01406E98|05|20111017|5112|E|9|V|0.220000|0.000000|0.000000~

01406E98|01406E98|38|20080430|5018|E|9|U|0.220000|0.000000|0.000000|0.000000|0.000000|0.073480~

01406E98|01406E98|40|20091028|5110|E|9|V|0.220000|0.000000|0.000000|0.000000|0.000000|0.077220~

So this is not right…

What can be done?

Unfortunately nothing (other than using codeplex tool for flat file source) if you are on SQL 2005, SQl 2008 or SQL 2008 R2

We also have connect article which talks about this bug,

https://connect.microsoft.com/SQLServer/feedback/details/293193/ssis-import-of-flat-file-with-uneven-number-of-columns

And solution is to use codeplex utility

https://ssisdfs.codeplex.com/

However if you are on SQL 2012 be happy, as this bug is fixed in SQL Server data Tools with comes with SQl 2012

If you look at documentation for Flat File connection manager in SQL Server data Tools

https://msdn.microsoft.com/en-us/library/ms140266%28v=sql.110%29.aspx
https://blogs.msdn.com/b/mattm/archive/2011/07/17/flat-file-source-changes-in-denali.aspx

BOL says,

By default, the Flat File connection manager always checks for a row delimiter in unquoted data, and starts a new row when a row delimiter is found. This enables the connection manager to correctly parse files with rows that are missing column fields.

In some cases, disabling this feature may improve package performance. You can disable this feature by setting the Flat File connection manager property, AlwaysCheckForRowDelimiters, to False.

This is not present in documentation till SQL 2008 R2 SSIS

I end this here!!!

Cheers…

Author : Dilkush(MSFT) SQL Developer Engineer, Microsoft

Reviewed by : Debarchan(MSFT) SQL Developer Engineer, Microsoft

Comments

  • Anonymous
    May 01, 2013
    I use a flat file connection manager. I configure the  FFCM to see the different columns as 1 column in a row. Then I input this into a C# Script task. I use the split command in C# and then I access the upper and lower bound methods to determine the count of columns. Based on the count of columns I know exactly what to do. Normally the first couple of columns is static..your first 8 columns and then the variable columns.  

  • Anonymous
    May 12, 2015
    Good