SQL Server: Export SQL file output to Excel file with R script
Introduction
In this post, let us see how to invoke sql files in a folder and export the output of the query as Excel files using R script. This task can be accomplished using SQLCMD but SQLCMD cannot export output in Excel file format.
Explanation on how this approach works
R script invokes the powershell command using "system2" function. Powershell command then reads all the sql files in a folder(using Invoke-Sqlcmd) and writes the sql result to a text file (just as a staging). Only one staging file will be created for entire process of exporting SQL files output to Excel and will be deleted at the end of the process. Using the data in staging text file on each iteration, an excel is created using "xlsx" R package.
We have used SQL Server 2019 evaluation edition on Windows 10 64 bit and WideWorldImporters SQL Server sample database for this example. In this example, we have made use of R services installed as part of SQL Server. So we need to install required R packages and load into SQL Server R library folder as part of prerequisite steps.
Below script will work with any SQL Server version that supports powershell cmdlet (Invoke-Sqlcmd) to access the database engine.
Though prerequisite steps are bit heavy, it is one time and thereafter enable us to export to excel easily with just few lines of code.
Pre-requisites:
As we are going to make use of the powershell SQL Server cmdlet "Invoke-Sqlcmd", let us make sure SQL Server powershell modules are downloaded & installed from Powershell.
Install R services and then from SSMS enable the external scripting feature. Restart the database engine and then verify the installation as mentioned in MSDN.
You might face “Access is denied” error while executing below scripts, to overcome that right-click on folder where excel files will be imported / exported and provide folder access (Full Control) to "ALL_APPLICATION_PACKAGES".
Step 1
Check the version of R by running below code from R.exe
sessionInfo()
We have installed SQL Server 2019 so R.exe will be found in below path:
C:\Program Files\Microsoft SQL Server\MSSQL15.MSSQLSERVER\R_SERVICES\bin
Step 2
From above step we got to know that it is 64 bit, So we have downloaded Java JDK for 64 bit (.exe file) from here & installed it.
C:\Program Files\Java\jdk-11.0.1
Step 3
Now let us install rJava package from R.exe
install.packages("rJava")
Step 4
Now let us set the JAVA_HOME environment to Java JDK installed in Step 2 and load the rJava package to check if it working from R.exe.
Sys.setenv(JAVA_HOME="C:/Program Files/Java/jdk-11.0.1/")library(rJava)
Step 5
Now it's time to install & load the xlsx package from R.exe.
- xlsx package is one of the powerful R packages to read, write and format Excel files.
- It is a java-based solution and it is available for Windows, Mac and Linux.
- It works for both Excel 2007 and Excel 97/2000/XP/2003 file formats (xls and xlsx file formats).
install.packages("xlsx")
After successful installation, load xlsx package to check everything is fine.
library("xlsx")
Step 6
Copy & paste the packages downloaded into default R library from above steps to SQL Server R_services library.
We have copied xlsx,xlsxjars & rJava folders from C:\Users\Sathy\Documents\R\win-library\3.4
to C:\Program Files\Microsoft SQL Server\MSSQL15.MSSQLSERVER\R_SERVICES\library
Now as part of last prerequisite step, from SSMS let us check the installed R packages.
sp_execute_external_script @language = N'R', @script = N'OutputDataSet <- data.frame(installed.packages())'
Exporting SQL file output into Excel file
For this example, we have created below folder structure:
- Temp folder - Is a placeholder for staging files created in the intermediate process
- Script folder - Has the R script that runs this entire process
- BusinessReports folder - Where excel reports are exported
- SQL folder - Contains the SQL files to be executed and ErrorLog.txt file which contains failed SQL's log info.
For this example, we have placed four .sql files in a folder and the sql inside the file are as shown in below screenshot. All four sql queries are selecting data from three different tables in "WideWorldImporters" sample database. SQL2 has been created purposely with incorrect syntax to simulate error scenario.
Script
Below R script does everything what we have mentioned in above explanation: Just open & edit the R script to provide input file paths and DB server details and save it.
#******* Input variables ***********#
#provide double slash to escape the character
SQlpath <- "H:\\Temp\\Blog\\ImportExportExcel\\SQL"
Reportpath <- "H:\\Temp\\Blog\\ImportExportExcel\\BusinessReports"
Stgpath <- "H:\\Temp\\Blog\\ImportExportExcel\\Temp\\Stagingfile.txt"
DBServ <- "'LAPTOP-LCUB6HBB'"
DBName <- "WideWorldImporters"
#******* Input variables ***********#
#loading necessary packages
library(foreach);
Sys.setenv(JAVA_HOME="C:/Program Files/Java/jdk-11.0.1/")
library(rJava);
library("xlsx");
#processing input file paths in R format
SQlpath <- gsub("\\\\","/",SQlpath)
Reportpath <- gsub("\\\\","/",Reportpath)
Stgpath <- gsub("\\\\","/",Stgpath)
#create file to log sql files with error
SQLErrLog <- paste(SQlpath,"/ErrorLog.txt",sep="")
if (!file.exists(SQLErrLog))
{
file.create(SQLErrLog)
}
#getting list of sql files & file names from folder
list_filespath <- list.files(path = SQlpath, pattern = ".sql", all.files = FALSE,
full.names = TRUE, recursive = FALSE,
ignore.case = FALSE, include.dirs = FALSE, no.. = TRUE)
list_filenames <- list.files(path = SQlpath, pattern = ".sql", all.files = FALSE,
full.names = FALSE, recursive = FALSE,
ignore.case = FALSE, include.dirs = FALSE, no.. = TRUE)
#Check file exists in path
if (length(list_filespath) != 0) {
#creating new directory on each day for SQL output
newdir <- paste(Reportpath,"/Report","_",Sys.Date(), sep = "")
dir.create(newdir, showWarnings = FALSE)
foreach(m = 1:length(list_filespath)) %do%
{
#forming the powershell command to invoke sql files
#provide username & password in Invoke-sqlcmd if the sql server connection is not trusted
command1 <- paste("try { Invoke-Sqlcmd -ServerInstance ",DBServ,"-Database ",DBName," -QueryTimeout 65535 -ErrorAction 'Stop' -InputFile ", sep = "")
command2 <- paste("| Export-Csv ",Stgpath, sep = "")
ErrKey <- gsub("\\.", "", paste(list_filenames[m],as.numeric(Sys.time()),sep=""))
ErrInfo <- paste("This script [",list_filespath[m],"] tried to execute at ",Sys.time()," has error and failed to export. Key - ",ErrKey,sep="")
command3 <- paste("} catch {'",ErrInfo,"' | out-file ",SQLErrLog," -append}", sep = "")
command <- paste(command1,list_filespath[m],command2,command3, sep=" ")
system2("powershell", args = command)
#reading the error log file to process only sql files with no error
command <- paste("Select-String -Path ",SQLErrLog," -Pattern ",ErrKey,sep ="")
errf <- system2("powershell", args = command,stdout = TRUE)
if (length(errf) == 0)
{
#powershell Invoke-sqlcmd output to staging file in .txt format
StagingOut <- read.table(Stgpath, skip = 1, header =TRUE, sep =',')
#create excel output using .txt staging file
ExcelFilename <- paste(newdir,"/",gsub(".sql", "", list_filenames[m]),".xlsx", sep="")
#Pass excel password
write.xlsx2(StagingOut, file = ExcelFilename, sheetName=list_filenames[m],
col.names=TRUE, append=FALSE, password= "")
#Delete staging file (intermediate file before exporting to excel) if it exist
if (file.exists(Stgpath))
{
file.remove(Stgpath)
}
}
}
}
Script execution methods
We have saved the above script in Script folder, it can be either manually opened & executed from any R environment (R studio etc.) or this script can be invoked from command line.
Below screenshot shows on how to invoke R script from command Prompt by using Rscript.exe in SQL Server folder:
Goto Run -> cmd ->
"C:\Program Files\Microsoft SQL Server\MSSQL15.MSSQLSERVER\R_SERVICES\bin\Rscript.exe" "H:\Temp\Blog\ImportExportExcel\Script\SQLExportExcel.R"
This command line can also be created as a SQL agent job and scheduled as per requirement.
Script output screenshots
After executing the R script from Command Prompt or R studio or SQL Agent Job:
We can see the output of SQL files in Excel format within a directory created with date and Excel file & sheet name is SQL file name
Executing the R script on same day, just overwrites the directory & excel files without error and a separate directory is created when the script is executed on the next day.
We tested a scenario with the above script, where if one of the SQL file has incorrect syntax, script still works for all the other SQL files and runs the entire process. An entry is made in the error log file.
Note: Using password argument in write.xlsx2 function, the exported excel files can be password protected. But unfortunately writing of password protected workbooks is supported for Excel 2007 OOXML format only.
Summary
R & Python language extension was introduced in SQL Server 2016 & 2017 as part of machine learning. With support of R in Azure SQL database and Java language extension support in SQL Server 2019, this new approach can be used extensively as it easy, fast and flexible.
This post is just to give an overview of this new approach of exporting SQL output to excel files in SQL Server using R script. Based on specific requirement tweaking the solution mentioned above can cover any scenario.