Importing an Excel Spreadsheet into a SQL Server Database
Introduction
We often have to perform data integration in SQL Server, with different data sources such as ".txt" files (tabular text or with separator character), ".csv" files or ".xls" (Excel) files.
It is always not possible to create a SSIS package to do this data import, a useful alternative is to use OPENROWSET method for importing data.
In this article, we will use data import from Excel files (.xls e .xlsx).
Building the Environment for Testing
So that we see the data import process steps from an Excel file to a table from database, we need:
- Create an Excel file to import sample;
- Configure Windows Server, installing the necessary components;
- Configure the necessary permissions to the SQL instance that we need to obtain data files.
Let's prepare environment for data import!
Creating an Excel File to test
In this step, we will create an Excel file sample with just a few rows to demo.
Add a header row, to explicitly define the data: "ID", "Item Name" and "Date Created".
The data sequences is only to facilitate the visualization of the content that is being manipulated.
See this Excel file in the image below (click to enlarge)
Installing the necessary components in Windows Server
To get the data through a query inside SQL Server, use an OLE DB Data Provider.
Most files can now use the Microsoft.ACE.OLEDB.12.0 Data Provider that can be obtained free through Data Connectivity Components.
This package will provide all ODBC and OLEDB drivers for data manipulation, as follow below:
File Type (extension) | Extended Properties |
Excel 97-2003 Workbook (.xls) | Excel 8.0 |
Excel 2007-2010 Workbook (.xlsx) | Excel 12.0 XML |
Excel 2007-2010 Macro-enabled workbook (.xlsm) | Excel 12.0 Macro |
Excel 2007-2010 Non-XML binary workbook (.xlsb) | Excel 12.0 |
There are two versions of this package: "AccessDatabaseEngine.exe" for x86 platform and other "AccessDatabaseEngine_x64.exe" for x64 platform.
The minimum system requirements for this installation can be obtained in the same download package page.
If you are installing the x86 package you must ensure that your user is allowed access to the Temporary directory of your Windows OS.
To know what your Temporary directory open the "Control Panel", click "Advanced System Settings" option. A window will open, select the "Advanced" tab and click the "Environment Variables" button.
A new window will open with your environment variables, including "TEMP" and "TMP" variables, indicating your Temporary directory.
See this windows in the image below (click to enlarge)
So if your operating system is Windows 32-bit (x86) is necessary to include read and write access to the user of your SQL Server instance.
It's important to remember that the user of your SQL Server instance must be a local user or the default "Local System" account to grant this access.
See this window Service Properties in the image below
Enabling SQL Server Instance to Read File
The settings and permissions to execute a query external data has some details that should be performed to be able to get the data from an Excel files (.xls ou .xlsx) and also other formats.
The execution of distributed queries as OPENROWSET is only possible when the SQL Server instance has the Ad Hoc Distributed Queries configuration enabled. By default, every SQL Server instance maintains this permission denied.
Note
The Advanced Settings should only be changed by an experienced professional or a certified professional in SQL Server. It's important to note not use these commands in Production Databases without previous analysis. We recommend you run all tests in an isolated environment, at your own risk.
To enable this feature just use the sp_configure system stored procedure in your SQL instance to display its Advanced Settings in show advanced options parameter and soon to follow, enable the Ad Hoc Distributed Queries setting to enabling the use of distributed queries.
USE [master]
GO
--CONFIGURING SQL INSTANCE TO ACCEPT ADVANCED OPTIONS
EXEC sp_configure 'show advanced options', 1
RECONFIGURE
GO
--ENABLING USE OF DISTRIBUTED QUERIES
EXEC sp_configure 'Ad Hoc Distributed Queries', 1
RECONFIGURE
GO
These changes in the Advanced settings only take effect after the execution of the RECONFIGURE command.
To get permission granted to use the Data Provider through sp_MSset_oledb_prop system stored procedure to link Microsoft.ACE.OLEDB.12.0** **in SQL Server using AllowInProcess parameter so we can use the resources of the Data Provider and also allow the use of dynamic parameters in queries through of DynamicParameters parameter for our queries can use T-SQL clauses.
USE [master]
GO
--ADD DRIVERS IN SQL INSTANCE
EXEC master.dbo.sp_MSset_oledb_prop N'Microsoft.ACE.OLEDB.12.0', N'AllowInProcess', 1
GO
EXEC master.dbo.sp_MSset_oledb_prop N'Microsoft.ACE.OLEDB.12.0', N'DynamicParameters', 1
GO
See this output SQL script in the image below
After setting up your SQL instance to use the Microsoft.ACE.OLEDB.12.0 Data Provider and make the appropriate access permissions, we can implement the distributed queries of other data sources, in this case to Excel files.
Querying and Importing the Spreadsheet
As this demo is for Excel files (.xls) we will perform a query using an OPENROWSET method with the Excel test file that was created earlier in this article.
We use some parameters for this method to be able to data query:
- Data Provider - In this case, using Microsoft.ACE.OLEDB.12.0
- BULK Options - File Version;Where it's stored; Header (HDR); Import Mode (IMEX)
- Query - T-SQL statement with or without clauses to data filter and process.
--CONSULTING A SPREADSHEET
SELECT * FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
'Excel 12.0; Database=C:\Microsoft\Test.xls; HDR=YES; IMEX=1',
'SELECT * FROM [Plan1$]')
GO
See this output SQL script in the image below
To data group and perform other tasks for data manipulation, the ideal is always load the data into the database. You can insert data into an existing table using the INSERT statement or you can create a table through of INTO command in SELECT statement.
--CONSULTING A SPREADSHEET
SELECT *
INTO TB_EXAMPLE
FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
'Excel 12.0; Database=C:\Microsoft\Test.xls; HDR=YES; IMEX=1',
'SELECT * FROM [Plan1$]')
GO
SELECT * FROM TB_EXAMPLE
GO
****See this output SQL script in the image below
****
It's also important to check if the SQL Server Service user has access in Windows directory where Excel files are stored.
Conclusion
Have the possibility to use an alternative resource for importing data with T-SQL command is very useful, especially when we have to manipulate files in proprietary formats, as for .xlsx files where it's necessary to use the Data Provider appropriate to obtain the data correctly and with ease use.
It's important to watch out that only users that have actually need to manipulate these files can use these resources, while minimizing the vulnerability of their environment through a permission in your SQL Server.
References
- OPENROWSET (Transact-SQL)
- Import Bulk Data by Using BULK INSERT or OPENROWSET(BULK...) (SQL Server)
- OLE DB Providers Tested with SQL Server
- Excel Source
See Also
Other Languages
This article was awarded the silver medal in the TechNet Guru of April 2014