Populate Excel files from Data Source
Scope
In this article, we'll see how we can use Visual Basic .NET, together with Excel Interops, to connect to a data source and extract information to be imposed on an already existant Excel model. In other words, how is it possible to write a simple Excel reporting software.
Introduction
Recently I've run into the necessity of connecting a software I've made to a complex hierarchy of pre-existant Excel files. What i was asked for, is to make my software able to compile reports as native Excels, to keep unaltered the old print model (and facilitate final users, for they will continue to receive the files they always had). Technically, the task was pretty clear: starting from the fact my application was interfacing with SQL Server, i needed to find a way to extract a certain record to write it on the Excel files that were given to me. Plus, those files can undergo a revision process, i.e. the authorized user could decide a determined value must not be printed again - for example - in A1 cell, but on C14 from now on. So my Excel-populating program needs a sort of "map" to know what-goes-where (allowing me or anybody else to quickly adapt to a modified situation).
Here I'll present a simplified version of what I'm talking about: i think it could be a useful introductive reading into inter-procedurality, having a central software which will manage a SQL Server session, while writing Excel files, basing the latter on further access to external parameters. Hope you'll enjoy it.
In order to make things as reusable as possible, we'll create a stand-alone project that will execute a SQL query on its own. Think it this way: if you have a program that could execute an external executable, feeding it some starting parameters, then we could use our project as an all-purpose Excel printer, that can be executed from other programs simply by launching it with some "switches".
Prerequisites
- Windows 7 or later
- Visual Studio 2013 Community Edition or later
- Office 2010 (it's the version used in this article: with other versions, the referenced Interop we'll see could change)
Step 1: Create a new project, add references and settings
Let's create a new Visual Basic project, selecting Windows Form. Then, go to Project » Add Reference, and select Microsoft Excel and Microsoft Office 14.0 Object Libraries. That will give us the tools to interact with our Excel models. In our main form, we need to specify in the import section the following:
Imports Microsoft.Office.Interop
That will make easier to reference the Office objects we'll use later.
Realistically, we can think our database connection will be always the same once defined, and the same can be said about the path in which our Excel models will reside. So we can define two Settings variables at application level, in order to store them in the application config file, accessing them through the My.Settings namespace.
Step 2: Create a parameter file functional to our means
Our program will surely need a sort of rules file, something from which to read the query to be executed, and what are the cells we need to write (and the specific field that will go in that particular cell). Also, we may want to indicate the type of data we need to write: for example, if we wish to paste images on ur Excel, we must have a method the application will use to determine the read data is a path to open, and not a string to be written.
For the sake of simplicity, i've chosen to use here a CSV file, which we'll process to extract what we need. Here's the sample, more on it later.
Query# SELECT 'TEXTCODE' + '?' AS Code, 'This is a sample' AS Descr
C4 # Code #TEXT
C5 # Descr #TEXT
H4 # c:\tmp\sample.jpg # IMAGE
As you can see, each line is introduced by a fixed text. In case of the first line, we'll use the string "Query" as a sort of function name, so when our program will meet it, it will know what follows is to be executed against a valid data source (we'll see how in minutes). Ignore the '?' character for now, let's focus only on the file structure. The lines from the second on are introduced by the cell name as it appears in Excel (so C4, C5, H4 are actual cells in our model). For each line, i've used the '#' character as a field separator. For the first line, the second line field is the query to be executed, while for other lines, the second parameter is the name of the query field to be written in that particular cell. A third field identifies the typology of the data: TEXT for textual data, IMAGE for paths from which retrieve an image (and that could be extended, of course).
When our program will be run, it should read a file like this, storing its information, to being able to subsequently exploit our Excel model in the correct way.
The question mark in the query is a hypothetical launch argument we need to operate with: in the example case, it will be simply a string that will be attached to TEXTCODE string, but in a real scenario, it could be an argument to be passed to a WHERE clause.
With an option file like the above, its easy to jot down a function to retrieve its contents, storing them in memory. Here follows the code, with some comments after it.
Private Structure Corrispondenze
Dim Cella As String
Dim Variabile As String
Dim Tipo As String
End Structure
Private Sub LeggiOpzioni(modello As String, parametro As String)
Using sR As New IO.StreamReader(My.Settings.ModelDirectory & "\" & modello & ".dat")
_mappa = New List(Of Corrispondenze)
While Not (sR.EndOfStream)
Dim parms() As String = sR.ReadLine.Split("#")
Select Case parms(0).Trim()
Case "Query"
_sqlCmd = parms(1).Replace("?", parametro)
Case Else
Dim item As New Corrispondenze
item.Cella = parms(0).Trim
item.Variabile = parms(1).Trim
item.Tipo = parms(2).Trim
_mappa.Add(item)
End Select
End While
End Using
End Sub
The LeggiOpzioni function requires two arguments:
- 'modello' is the name of our Excel model. I've imagined that for each Excel file to be used, that file must be accompanied by its parameter file. So, for example, we could have a "TEST.xls" file (with all its fixed text, formulas, predefined cells, style, and so on), and a "TEXT.dat" file, in which we'll define the query to be run, and what cells must be written. A model file and its rules, to put it another way.
- 'parametro' is the string/number we could want to start our program with. It's the value that will replace the question mark in the query, and - again - it could be used any way you like, as a concatenating string (like in the previous query), or as a WHERE clause match value, etc. If you've spotted a potential security flaw in this, you're right - what if a user passes a parameter containing T-SQL, special characters, and so on? Exactly, he could mess badly with our databases. It will be necessary to use a parametrized query, but since it's beyond the scope of the current article, i won't present it here, adding a bibliography link to further study the argument.
Our function will open the DAT file named as our model, read each line and splits each of them using our separator, '#'. Then it proceeds in analyzing the first field obtained. If it's "Query", we'll save in a string variable the read T-SQL query, simply substituting the question mark with the start argument (and, like i said few lines above, THIS IS BAD and must not be used in production environment, preferring a parametrized query approach). In cases which differs from the string "Query", we know we're talking about cells: so I've created a structure named Corrispondenze, in order to create a new List(Of Corrispondenze) in which to store the content of each option line, to use them later. From the first field we extract the cell name, from the second the name of a particular T-SQL field, and from the third the type of data we'll write (in our example, TEXT or IMAGE).
Step 3: Extract and write data on pre-existant Excel file
We have now all what it takes to process an Excel model. Here we'll see the subroutine which does the variable assignment and file saving step by step.
Step 3.1: Connect to a database and fill DataTable
To optimize database interaction, we'll open a connection, execute our query, and store the retrieved records in a DataTable object. That way, we could close the connection immediately after query execution, with no need at all to maintain busy our data source. We'll do that in the following way:
Dim connection As New SqlConnection(My.Settings.ConnectionString)
connection.Open()
Dim adapter As New SqlDataAdapter(_sqlCmd, connection)
Dim builder As New SqlCommandBuilder(adapter)
Dim dt As New DataTable
adapter.Fill(dt)
connection.Close()
We could summarize this snippet as: open the connection specified in My.Settings.ConnectionString, execute the given query with a DataAdapter then fill a DataTable using what DataAdapter has read. Then, close the connection. Now for our example, we'll have a single record, but the structure of DataTable object will be shaped like the query definition: in other words, in our DataTable we'll find rows and columns, the latter named as our T-SQL fields.
Step 3.2: Create and reference Excel objects
Now that we have the data to work with, it's time to write them down. But first, we need to open the model file, positioning us on a certain worksheet and - more generally - referencing the variable which will give us the possibility to access our Excel file.
Dim oExcel As Object
oExcel = CreateObject("Excel.Application")
oExcel.Workbooks.Open(My.Settings.ModelDirectory & "\" & modello & ".xlsx")
Dim oBook As Excel.Workbook
Dim oSheet As Excel.Worksheet
oBook = oExcel.ActiveWorkbook
oSheet = oExcel.Worksheets(1)
I've made use of My.Settings.ModelDirectory setting here, as the place from which our models will be read. As you can see from the last line, I've hard-coded the value of 1 as the worksheet to use, assuming our models will have a single worksheet.
Step 3.3: Writing on cells
In reading our model rules file, we've compiled a List(Of Corrispondenze) variable named _mappa, and now it's the momento to use it. We know each entry in _mappa is a cell to be written, so we loop the entire list, using each item's properties to do what we need. In the following snippet, you can see the first test is made on "Tipo" property, i.e. what kind of cell it is: textual or image? In case it's textual, we simply compile the cell specified by the property Cella with the requested content (Variable property).
For Each c As Corrispondenze In _mappa
If c.Tipo.CompareTo("IMAGE") = 0 Then
Dim oRange As Excel.Range = CType(oSheet.Range(c.Cella), Excel.Range)
Dim Left As Double = oRange.Left
Dim Top As Double = oRange.Top
Dim imagesize As Integer = 32
oSheet.Shapes.AddPicture(c.Variabile, Microsoft.Office.Core.MsoTriState.msoFalse, Microsoft.Office.Core.MsoTriState.msoCTrue, Left, Top, ImageSize, ImageSize)
Else
oSheet.Range(c.Cella).Value = dt.Rows(0).Item(c.Variabile)
End If
Next
Step 3.4: Saving the file
At the end of the loop, our model will be compiled and ready to be saved. Since we don't want to overwrite the model, which must stay intact through executions, we need to save the new file in a temporary location.
Dim resFile As String = System.IO.Path.GetTempPath() & "\" & modello & "_" & Now.Year.ToString("0000") & Now.Month.ToString("00") & Now.Day.ToString("00") & Now.Hour.ToString("00") & Now.Minute.ToString("00") & Now.Second.ToString("00") & ".xls"
resFile = resFile.Replace("\\", "\")
oExcel.DisplayAlerts = False
oBook.SaveAs(resFile, 1)
We use the GetTempPath() function to retrieve the current user's temporary path, forging a name file with the present timestamp. Using our Interop objects, we then proceed in saving a copy of our modified model (the constant 1 in SaveAs method means the file must be saved in XLS format). A final routine will release the used objects and open our saved file as a Process() to show it to the user. Let's see now the complete source code, with a live example of its use.
Step 4: Putting all together
Imports Microsoft.Office.Interop
Imports System.Data.SqlClient
Public Class MainFrm
Private Structure Corrispondenze
Dim Cella As String
Dim Variabile As String
Dim Tipo As String
End Structure
Dim _sqlCmd As String = ""
Dim _mappa As List(Of Corrispondenze)
Private Sub LeggiOpzioni(modello As String, parametro As String)
Using sR As New IO.StreamReader(My.Settings.ModelDirectory & "\" & modello & ".dat")
_mappa = New List(Of Corrispondenze)
While Not (sR.EndOfStream)
Dim parms() As String = sR.ReadLine.Split("#")
Select Case parms(0).Trim()
Case "Query"
_sqlCmd = parms(1).Replace("?", parametro)
Case Else
Dim item As New Corrispondenze
item.Cella = parms(0).Trim
item.Variabile = parms(1).Trim
item.Tipo = parms(2).Trim
_mappa.Add(item)
End Select
End While
End Using
End Sub
Private Sub LoadVariables(modello)
Try
Application.DoEvents()
Dim connection As New SqlConnection(My.Settings.ConnectionString)
connection.Open()
Dim adapter As New SqlDataAdapter(_sqlCmd, connection)
Dim builder As New SqlCommandBuilder(adapter)
Dim dt As New DataTable
adapter.Fill(dt)
connection.Close()
Dim p As New Process()
Dim ps As New ProcessStartInfo("cmd.exe", "/C taskkill.exe /IM EXCEL.EXE /F")
p.StartInfo = ps
p.Start()
Threading.Thread.Sleep(2000)
Dim oExcel As Object
oExcel = CreateObject("Excel.Application")
oExcel.Workbooks.Open(My.Settings.ModelDirectory & "\" & modello & ".xlsx")
Dim oBook As Excel.Workbook
Dim oSheet As Excel.Worksheet
oBook = oExcel.ActiveWorkbook
oSheet = oExcel.Worksheets(1)
For Each c As Corrispondenze In _mappa
If c.Tipo.CompareTo("IMAGE") = 0 Then
Dim oRange As Excel.Range = CType(oSheet.Range(c.Cella), Excel.Range)
Dim Left As Double = oRange.Left
Dim Top As Double = oRange.Top
Dim imagesize As Integer = 32
oSheet.Shapes.AddPicture(c.Variabile, Microsoft.Office.Core.MsoTriState.msoFalse, Microsoft.Office.Core.MsoTriState.msoCTrue, Left, Top, ImageSize, ImageSize)
Else
oSheet.Range(c.Cella).Value = dt.Rows(0).Item(c.Variabile)
End If
Next
Dim resFile As String = System.IO.Path.GetTempPath() & "\" & modello & "_" & Now.Year.ToString("0000") & Now.Month.ToString("00") & Now.Day.ToString("00") & Now.Hour.ToString("00") & Now.Minute.ToString("00") & Now.Second.ToString("00") & ".xls"
resFile = resFile.Replace("\\", "\")
oExcel.DisplayAlerts = False
oBook.SaveAs(resFile, 1)
ReleaseObject(oSheet)
oBook.Close(False, Type.Missing, Type.Missing)
ReleaseObject(oBook)
oExcel.Quit()
ReleaseObject(oExcel)
GC.Collect()
Dim pC As New Process
Dim pCs As New ProcessStartInfo(resFile)
pC.StartInfo = pCs
pC.Start()
Catch ex As Exception
MsgBox(ex.Message & Environment.NewLine & ex.StackTrace)
End Try
End Sub
Private Sub ReleaseObject(ByVal o As Object)
Try
While (System.Runtime.InteropServices.Marshal.ReleaseComObject(o) > 0)
End While
Catch
Finally
o = Nothing
End Try
End Sub
Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
If Environment.GetCommandLineArgs.Length < 2 Then
Application.ExitThread()
Else
Dim modello As String = Environment.GetCommandLineArgs(1).ToString
Dim parametro As String = Environment.GetCommandLineArgs(2).ToString
LeggiOpzioni(modello, parametro)
LoadVariables(modello)
Application.ExitThread()
End If
End Sub
End Class
A sample session of the running program could be seen here:
Source code
The source code used in our example could be downloaded here: https://code.msdn.microsoft.com/Populate-Excel-files-from-39ef8314
Bibliography
- SqlParameter class (to study a secure way to call a query)
- Microsoft Office Excel Interop
- System.Io.Path.GetTempPath function
Other Languages
The present article is available in the following localizations: