Share via


SQL Server Import/Export to CSV using R script

Introduction

In this post, let us see how to import CSV file into SQL Server / export SQL Server data into CSV file by executing R scripts within T-SQL.  There is also a suggestion on MSDN connect for implementing FOR CSV (similar to FOR XML / JSON).

 

SQLCMD can also be used to invoke SQL file / export query output to CSV file. Also for importing, If the requirement is to import some random files on adhoc basis then we can make use of Import Flat File Wizard from SQL Server Management Studio. But this approach is more flexible as it is used within t-sql.

 

We have used SQL Server 2019 evaluation edition and WideWorldImporters SQL Server sample database for this example. Below scripts will work starting from SQL Server 2016 and above (as execution of R language using T-SQL was introduced in SQL Server 2016).

 

Pre-requisites

 

Install R services and then from SSMS enable the external scripting feature. Restart the database engine and then verify the installation as mentioned in MSDN.

 

You might face “Access is denied” error while executing below scripts, to overcome that right-click on the folder where CSV files will be imported / exported and provide folder access (read, write) to "ALL_APPLICATION_PACKAGES".

Exporting SQL Server data (adhoc SQL query output or list of tables, views) into CSV file:

For this approach, we have created a scalar function and Stored procedure.

Scalar function - returns a unique table name for a session. This function is required only if the ad-hoc SQL query has to be exported to CSV file.

USE [WideWorldImporters]
GO
  
CREATE OR  ALTER FUNCTION  fngetcsvtblname() 
RETURNS NVARCHAR(128)  
AS
BEGIN
     DECLARE @tblname NVARCHAR(128)     
     SELECT @tblname = CONCAT('ExportCSV_',@@SPID,DB_ID(),'_temptbl') 
  RETURN @tblname
END;  
GO

Stored procedure - Has below three input parameters and writes the output to CSV file
i)   @ExportPath - path for exporting a CSV file
ii)  @ObjectlisttoExport - list of tables, views to be exported
iii) @Objectlistdelim  - delimiter for the above object list

 

 
USE [WideWorldImporters]
GO
CREATE  OR ALTER  PROC usp_ExportCSV (@ExportPath NVARCHAR(MAX),
                                    @ObjectlisttoExport NVARCHAR(MAX),
                                    @Objectlistdelim CHAR(1))
AS
BEGIN
SET NOCOUNT ON;
  
IF ISNULL(@ExportPath,'') <> ''
 BEGIN
  
 SELECT @ExportPath = REPLACE(@ExportPath,'\','/')
  
 DECLARE @tblname NVARCHAR(128)
 SELECT @tblname= dbo.fngetcsvtblname() 
  
 IF EXISTS (SELECT 1 FROM sys.tables WHERE name = @tblname) OR (ISNULL(@ObjectlisttoExport,'') <> '' AND ISNULL(@Objectlistdelim,'') <> '')
 BEGIN
  
    
  DECLARE @TableColList TABLE (Cols NVARCHAR(MAX),Tbl NVARCHAR(128))
  
  IF EXISTS (SELECT 1 FROM sys.tables WHERE name = @tblname)
  BEGIN
  INSERT @TableColList
  SELECT  CASE WHEN TY.name IN ('date','datetime2','datetimeoffset','time','timestamp') 
               THEN CONCAT('TRY_CONVERT(','VARCHAR(MAX),',C.name,') AS  ',QUOTENAME(C.NAME)) 
          ELSE C.name END Columns  -- To cover poor data type conversions b/n R & SQL Server
      ,CONCAT(S.name,'.',O.name) TableName
  FROM sys.objects O
  JOIN sys.schemas S
  ON S.schema_id = O.schema_id
  JOIN sys.columns C
  ON O.object_id = C.object_id
  JOIN sys.types TY 
  ON C.[user_type_id] = TY.[user_type_id]
  WHERE CONCAT(S.name,'.',O.name) = (SELECT CONCAT(schema_name(schema_id),'.',name ) FROM sys.tables WHERE name = @tblname)
  AND TY.name NOT IN ('geography','varbinary','binary','text', 'ntext', 'image', 'hierarchyid', 'xml', 'sql_variant')  -- Ignore the datatypes that are not required to be exported
  AND O.type IN ('U','V')
  END
  IF ISNULL(@ObjectlisttoExport,'') <> '' 
  BEGIN
  INSERT @TableColList
  SELECT  CASE WHEN TY.name IN ('date','datetime2','datetimeoffset','time','timestamp') 
               THEN CONCAT('TRY_CONVERT(','VARCHAR(MAX),',C.name,') AS  ',QUOTENAME(C.NAME)) 
          ELSE C.name END Columns  -- To cover poor data type conversions b/n R & SQL Server
      ,CONCAT(S.name,'.',O.name) TableName
  FROM sys.objects O
  JOIN sys.schemas S
  ON S.schema_id = O.schema_id
  JOIN sys.columns C
  ON O.object_id = C.object_id
  JOIN sys.types TY 
  ON C.[user_type_id] = TY.[user_type_id]
  WHERE CONCAT(S.name,'.',O.name) IN (SELECT value FROM STRING_SPLIT(@ObjectlisttoExport, @Objectlistdelim))
  AND TY.name NOT IN ('geography','varbinary','binary','text', 'ntext', 'image', 'hierarchyid', 'xml', 'sql_variant')  -- Ignore the datatypes that are not required to be exported
  AND O.type IN ('U','V')
  END
  
  DROP TABLE IF EXISTS #ExportTablesList
SELECT IDENTITY(INT,1,1) Rn,
tbl, 
STUFF(
(
        SELECT ', ' + C.Cols
        From @TableColList As C
        WHERE C.tbl = T.tbl
        FOR XML PATH('')
        ), 1, 2, '') AS cols
INTO #ExportTablesList
From @TableColList As T
GROUP BY tbl
  
  DECLARE @I INT = 1
      ,@SQL NVARCHAR(MAX) = N''
      ,@RScript NVARCHAR(MAX) = N''
      ,@filename NVARCHAR(MAX) = N''
   
  WHILE @I <= (SELECT MAX(Rn) FROM #ExportTablesList)
  BEGIN
                --just for testing selecting top 10, this can be removed
    SELECT @SQL = CONCAT('SELECT TOP  10',Cols,'  FROM ',tbl,';')
       ,@tblname = Tbl
    FROM #ExportTablesList WHERE Rn = @I
  
  
    SELECT @tblname = REPLACE(@tblname,'.','_')
    SELECT @filename = CASE WHEN SUBSTRING (@ExportPath,LEN(@ExportPath),1) = '/'  THEN  CONCAT(@ExportPath,@tblname,'.csv')
          ELSE CONCAT(@ExportPath,'/',@tblname,'.csv') END
   
    SET @RScript = CONCAT('write.csv(InputDataSet, file = "',@filename,'")')
      
    EXEC   sp_execute_external_script
       @language = N'R'
      ,@script = @RScript
      ,@input_data_1 = @SQL
   
    SET @tblname = ''
    SET @filename = ''
    SET @I = @I + 1
  END
  
  --After successful processing, dropping the table created for exporting adhoc sql into csv 
  SELECT @tblname= dbo.fngetcsvtblname() 
  IF EXISTS (SELECT 1 FROM sys.tables WHERE name = @tblname) 
  BEGIN
  EXEC('DROP TABLE  '+@tblname)
  END
  
  END ELSE PRINT 'No object specified for exporting to CSV or  Objectlist params are empty'
  
  END ELSE PRINT 'Export folder path need to  be mentioned'
  
END;
GO

 

Let us see the different execution results:

  1. When all the parameters are passed blank

 

2) When folder path is passed but no objects specified for exporting

 

3) Notice in the below screenshot, we are getting a unique table name using scalar function and inserting my adhoc sql output into that table and then calling the stored procedure.

 

This stored procedure exports the adhoc sql output to csv file in the mentioned path and drops the table created after successful processing.

 

 

4) I am passing the list of tables from Sales schema in WideWorldImporters database.

This stored procedure exports the list of tables data into separate csv files with table name as file name in the mentioned path.

 

5) This is to show that Stored procedure can export adhoc sql output, list of tables, views data into separate csv files simultaneously.

 

Sample execution code block:

 
USE [WideWorldImporters]
GO
SELECT  dbo.fngetcsvtblname() --function to get unique table name for inserting adhoc sql into a table
DECLARE @tblname NVARCHAR(128) = dbo.fngetcsvtblname()  
        IF EXISTS (SELECT 1 FROM sys.tables WHERE name  = @tblname) 
        BEGIN
        EXEC('DROP TABLE '+@tblname)
        END
 
/** place your adhoc sql to  be exported into  csv and  SELECT INTO  tablename provided by above function **/
SET NOCOUNT ON;
SELECT TOP  5 OL.OrderLineID,O.*
INTO ExportCSV_766_temptbl
FROM [WideWorldImporters].[Sales].[Orders] O
JOIN [Sales].[OrderLines] OL
ON O.OrderID = OL.OrderID
 
/** place your adhoc sql to  be exported into  csv and  SELECT INTO  tablename provided by above function **/
 
EXEC usp_ExportCSV 'H:\Temp\Blog\test','SAles.Orders|Sales.Invoices|Website.Customers','|'

Importing csv file into SQL Server:

If the requirement is to import on regular basis or import multiple files then we can do this using t-sql script that executes R script to read an csv or text file & load into SQL Server table.

Let us make use of files generated in the above example. CSV files will be loaded as different SQL tables created dynamically with CSV filename as a table name.

DECLARE @RScript NVARCHAR(MAX) =N' 
    library(foreach);
 
    #provide username & password
 
    sqlConnString <- "Driver=SQL Server;Server=SQLServerName; Database=WideWorldImporters;Uid=sa;Pwd=***"
 
    list_filespath <- list.files(path = CSVFilePath, pattern = "*.csv", all.files = FALSE,
           full.names = TRUE, recursive = TRUE,
           ignore.case = FALSE, include.dirs = FALSE, no.. = TRUE)
    list_filenames <- list.files(path = CSVFilePath, pattern = "*.csv", all.files = FALSE,
                        full.names = FALSE, recursive = TRUE,
                        ignore.case = FALSE, include.dirs = FALSE, no.. = TRUE)
 
  #Check file exists in path
    if (length(list_filespath) != 0) {
 
 foreach(m = 1:length(list_filespath)) %do%
 {
  #sql table name is formatted such that all special characters are removed
  sqlTable <- strsplit(list_filenames[m],split = ".csv") 
  sqlTable <- gsub("\\s", "", sqlTable)
  sqlTable <- gsub("[^[:alnum:]]", "", sqlTable)
 
  Output <- read.csv(file=list_filespath[m], header=TRUE, sep=",")
  sqlDS <- RxSqlServerData(connectionString = sqlConnString,table = sqlTable)
  rxDataStep(inData = Output, outFile = sqlDS,overwrite = TRUE)
 
 }
 }'
 
EXEC   sp_execute_external_script
      @language = N'R'
     ,@script = @RScript
     ,@params = N'@CSVFilePath NVARCHAR(MAX),@SQLServerName NVARCHAR(200)'
     ,@CSVFilePath = 'H:/Temp/Blog/test'  -- file path where csv files are placed
     ,@SQLServerName = 'LAPTOP-LCUB6HBB'  -- SQL Server name where table has to be created dynamically to hold imported csv data

Summary:

This post is just to give an overview of this new approach of importing / exporting CSV files using R script. Above solution can be tweaked further as per requirement. Also, note that no additional R packages are required to be installed for this approach.