SQL Server 2017: Understanding the Internals of Import Flat File Wizard
Introduction
One of the main enhancements that got introduced in SQL Server 2017 was a brand new import wizard which can be utilized for quick import of data from flat files. This was one of the most sought-after features as flat file data import is one of the most common use cases that we come across in most of our projects. This article explains how the new import data wizard can be used for quick importing data from simple flat files. It also gives an overview of its internal working and draws a comparison with the conventional Export Import wizard on the various aspects.
Illustration
Consider the case of a requirement where we need to transfer data from a file as given below:
Now let's see how we can utilize the new import flat file wizard for the above scenario.
The import flat file wizard is available as an option from latest version of SQL Server Management Studio i.e. 17.3 (build 14.0.17199.0)
You can launch it from within a database by right-clicking on the database icon and choosing Tasks-> Import Flat File Data as shown in the figure below:
Once you launch the wizard, you would be presented with the below screen.
Click Next and you will be navigated to the tab to select the input file and the table information. At the time of writing this article, there's a small issue with the table being created as the schema is always prefixed as dbo to the table name that is being entered by the user regardless of whether you included the schema or not.
Once the table is mapped and navigated we get a preview screen where a preview of the file data will be provided.
We can verify the data and navigate to the Modify columns tab which will allow modifying the data type and nullability of columns in case it's different from what the wizard interprets by default. A primary key for the table can also be chosen based on the applicable business rules from this tab.
Once the data type, nullability and primary key properties are set further navigation will take you to a summary screen where all information regarding the source and the destination is given.
Finally, you'll have the results screen where execution happens and results are displayed.
Now we can check and ensure that data has been transferred to a table created in the database successfully by the wizard.
Background
Let's now take a look at behind the scenes of this wizard.
When executing the wizard if you capture the statements executed in the background using a profiler, you will see statements like below:
CREATE TABLE [Test]
(
[ID] int NOT NULL,
[Val] nvarchar(50) NOT NULL,
CONSTRAINT [PK_TestNewImp] PRIMARY KEY ([ID])
);
select @@trancount; SET FMTONLY ON select * from [Test] SET FMTONLY OFF exec ..sp_tablecollations_100 N'.[Test]';
insert bulk [Test] ([ID] Int, [Val] NVarChar(50) COLLATE SQL_Latin1_General_CP1_CI_AS) with (KEEP_NULLS)
..
These are the statements related to the creation of the destination table object by the wizard as well as the insertion of the bulk data to it.
On analyzing the last statement it can be seen that the wizard uses INSERT BULK syntax which utilizes Tabular Data Stream (TDS) packets under the hood for performing the data transfer. This is analogous to the bulk insert APIs used by programs like SqlBulkCopy in .Net and BULK INSERT syntax in T-SQL. So data will be transferred as tabular streams from the client to the database table in this wizard.
Comparison with Conventional Export Import Wizard
The conventional Export Import wizard uses SSIS under the hood for the data transfer.The source and destination connections are managed through the connection managers initialized through the wizard. The wizard utilizes buffers for transferring the data from the source to the destinations mapped. Since it involves all these stages the number of steps required for creating import package using export import wizard would be more.
In comparison since new flat file wizard utilizes the TDS packets using bulk API as discussed previously it would transfer data much faster. Also, the development effort would be less as a number of steps required is less in this case.
Regarding flexibility, conventional Export Import wizard provides much better options compared to the new Import Flat File wizard. Since it uses separate editors for table creation, mapping etc Export Import wizard gives us the flexibility to modify the mappings if required. The wizard also provides the option to include additional columns in the destination table like for example an IDENTITY column or a computed column based on an expression. Another good option is the ability to store the package within the file system or SQL database so that it can be reused or modified at a later time if required.
All these aspects can be summarized in a tabular format as below
Import Flat File Wizard |
Export Import Wizard |
Specifically designed for flat file data transfers |
More generic. |
Uses TDS stream-based APIs for data transfer internally through PROSE SDK |
Uses data flow engine of Integration Services utilizing data buffers in the background |
Development effort is less as number of steps involved is less |
Development effort more due to more number of steps and screens involved |
Very little flexibility as table structure is fixed based on file metadata |
More flexibility as table creation step has an editor to edit the underlying SQL statement |
Reusability is not possible as it provides an ad-hoc transfer capability with no save option |
Reusability is possible as package can be saved and retrieved later for use as is or with modification using client tools like SQL Server Data Tools |
Conclusion
As seen from the illustration and description above, we can infer that new import flat file wizard provides a quick way to do ad-hoc data imports from flat files to a SQL server database. For any data transfer operations which require flexibility and periodic recurrence, the conventional export import wizard will still remain the first option. We shall hope that the current limitations on the Import Flat File wizard will be taken care of in the coming releases making it a robust tool.