Document Assembly Solution for SpreadsheetML
In my last post I gave a high level overview of the architecture of the SDK. Now that we've gone over the basics, we're ready to talk about some real world solutions and end-to-end scenarios. Based on feedback I heard from you guys, I will start with scenarios and solutions around SpreadsheetML. Let me know if you have any specific scenarios or solutions you would like me to address in future posts.
In this post, I will show you how to populate a spreadsheet with data from a real SQL database and create a cool looking chart based on that data, all without using Excel client.
Scenario
Today's scenario theme is document assembly, which is all about constructing a file based on external data sources, like other files or databases.
Imagine a scenario where I'm a developer working for a fictional company called Adventure Works (based on the sample company on MSDN, which can be found here). Like other companies, my company stores all sales information in a database. Currently the Sales team uses Excel to manually connect to the database in order to pull out the latest sales information. I have been asked to create a report generation tool that takes data from a database into Excel and creates a report with a chart. The Sales team would like to run this solution on our report server in overnight batches so there is a strict requirement that we cannot use the Excel client.
Solution
Before I get into the details of my solution I want to call out a couple of things:
- I created my solution using Visual Studio 2008
- I am using the freely available Adventure Works database built for SQL Server 2005
Please note that you are not required to use either of these two technologies to build Open XML solutions. Feel free to use your favorite editor/compiler and database or data source.
This solution will leverage the Open XML SDK, so if you don't already have it, download the SDK here. Also make sure you install the Adventure Works database from the link above.
Okay, we're ready to get started. If you just want to jump straight into the code, feel free to download this solution here.
Step 1 – Setting the Project up
Once you've installed the Open XML SDK, your first step is to create a .Net project that has a reference to the SDK. In my case, I created a C# project called v2WorkbookFromDB and included a reference to DocumentFormat.OpenXML, which is the dll name of the Open XML SDK. Now we have access to the SDK.
We then need to add a connection to our database. I am a huge fan of LINQ, especially for making my life a lot easier for querying and connecting to SQL databases. If you're not already familiar with the technology, you should check out LINQ to SQL. Within Visual Studio, you can add automatically generated LINQ to SQL classes to my Adventure Works database in a matter of seconds. These classes allow you to connect and query databases without the need of ever writing SQL code. For those of you who have never tried this functionality before, let me walk you through the steps:
In your Visual Studio project, right click the project and go to Add | New Item
Select the LINQ to SQL Classes template. Choose a name and click add
Now that you have added a template, let's connect to the database. Choose your database connection in the Server Explorer window and drag the SalesTerritory table to the designer window, which should end up looking like the following
That's it. No extra coding or anything. You now have access to the database and not only that, there are .Net classes automatically generated, which allow you to easily query the database through Objects. Sweet!
For the sake of this scenario, let's say I am only interested in last year's and this year's sales for each territory.
Step 2 – Creating a Template File
The scenario I listed above talks about creating a workbook from data within my database. I could either create my workbook from the ground up or start from an existing template. I definitely recommend the latter choice. It's almost always easier to create an Open XML solution by starting from an existing template.
In this scenario, I created a workbook template in Excel that has two sheets. The first sheet contains the header rows for the information I am interested in, which in this case is "Territory Name", "Sales Last Year", and "Sales This Year". The second sheet contains the overall structure of what I would like my chart to look like.
My template looks like the following:
With this template I am left with the following core tasks:
- Insert data into the first sheet of my workbook
- Modify the ranges in the my chart so it picks up the data
Step 3 – Insert Data into my Workbook
Before we can use the SDK, we need to add the appropriate namespaces to our C# file. In this case, we need to add my Packaging and Spreadhseet API components as follows:
using DocumentFormat.OpenXml.Packaging; using DocumentFormat.OpenXml.Spreadsheet; using DocumentFormat.OpenXml; |
I know by taking a look at our template file that the first worksheet will need to contain all the necessary data. The first thing we need to do is find this worksheet within the workbook. The following code shows you how to open an existing workbook file and get access to the first worksheet. Once you have access to the worksheet we can access the SheetData Object, which represents all the cell data contained within the worksheet.
//Make a copy of the template file File.Copy("template.xlsx", "generated.xlsx", true); //Open up the copied template workbook using (SpreadsheetDocument myWorkbook = SpreadsheetDocument.Open("generated.xlsx", true)) { //Access the main Workbook part, which contains all references WorkbookPart workbookPart = myWorkbook.WorkbookPart; //Grab the first worksheet WorksheetPart worksheetPart = workbookPart.WorksheetParts.First(); //SheetData will contain all the data SheetData sheetData = worksheetPart.Worksheet.GetFirstChild<SheetData>(); ... } |
To add data to the worksheet we will need to append more rows to the SheetData object based on data from the database. The following code shows you how to query the Adventure Works database for all sales information in the SalesTerriroty table. For each row in the query we then need to create and add a new row to the worksheet based on the data from the query.
//Connect to database AdventureWorksDataContext db = new AdventureWorksDataContext(); //My data starts at row 2 int index = 2; //Select * from SalesTerritoties table var territoryQuery = from t in db.SalesTerritories select t; //For each row in my database add a row to my spreadsheeet foreach (var item in territoryQuery) { string territoryName = item.Name; decimal salesLastYear = Math.Round(item.SalesLastYear, 2); decimal salesThisYear = Math.Round(item.SalesYTD, 2); //Add a new row Row contentRow = CreateContentRow(index, territoryName, salesLastYear, salesThisYear); index++; //Append new row to sheet data sheetData.AppendChild(contentRow); } |
I know in our case we want to create a row that has three cells: the name of the territory, sales from last year, and sales for this year. The first cell is a text based cell, while the other two cells are value based cells. The following code creates a row with the three cells:
string[] headerColumns = new string[] { "A", "B", "C" }; Row CreateContentRow(int index, string territory, decimal salesLastYear, decimal salesThisYear) { //Create new row Row r = new Row(); r.RowIndex = (UInt32)index; //First cell is a text cell, so create it and append it Cell firstCell = CreateTextCell(headerColumns[0], territory, index); r.AppendChild(firstCell); //Create cells that contain data for (int i=1; i<headerColumns.Length; i++) { Cell c = new Cell(); c.CellReference = headerColumns[i] + index; CellValue v = new CellValue(); if (i == 1) v.Text = salesLastYear.ToString(); else v.Text = salesThisYear.ToString(); c.AppendChild(v); r.AppendChild(c); } return r; } |
Text based cells and value based cells are stored differently in SpreadsheetML. Take a look at this past post for more information on the differences. Based on these differences we have the following code to create text based cells:
Cell CreateTextCell(string header, string text, int index) { //Create new inline string cell Cell c = new Cell(); c.DataType = CellValues.InlineString; c.CellReference = header + index; //Add text to text cell InlineString inlineString = new InlineString(); Text t = new Text(); t.Text = text; inlineString.AppendChild(t); c.AppendChild(inlineString); return c; } |
At this point we have a fully functional workbook with data populated from our database.
Step 4 – Adjusting the Chart Data
If you take a look at the chart data within our template you will notice that the ranges for the data are fixed and reference data up to the second row in the spreadsheet as follows:
<c:f>'My Data'!$B$2</c:f>
In order to fix our chart data, we need to extend this range to include all of our new rows. The following code accomplishes this task:
void FixChartData(WorkbookPart workbookPart, int totalCount) { //Grab the appropriate chart part from template file ChartPart chartPart = workbookPart.ChartsheetParts.First().DrawingsPart.ChartParts.First(); //Change the ranges to accomodate for newly inserted data foreach (Charts.Formula formula in chartPart.ChartSpace.Descendants<Charts.Formula>()) { if (formula.Text.Contains("$2")) { string s = formula.Text.Split('$')[1]; formula.Text += ":$" + s + "$" + totalCount; } } chartPart.ChartSpace.Save(); } |
End Result
The end result is we have just created a new workbook that contains all the data from our database as well as a cool looking chart that represents the data. The generated file looks like:
Next Time
In my next few posts I am going to walk through other solutions to some key scenarios. Suggestions are always welcome.
Zeyad Rajabi
Comments
Anonymous
November 04, 2008
Nice. But could you explain how I do the same thing on Mac and Linux. We will have a need for this on those platforms. Thanks.Anonymous
November 05, 2008
Darren, Are you wondering about generating the documents on Mac and Linux, or consuming the documents once they've been generated? -BrianAnonymous
November 06, 2008
Can you give some example how to generate the spreadsheet in code? There are great need of automatically generate excel spreadsheet, yet it seems very few good example for this purpose, the sample code I saw are all require manually create many xmls to even create a simple spreadsheet which I hates it and I think it greatly slow down development effort for open xml related application. I would like to see similar method as office interop assemblies which does not requires me to create workbook/worksheet xml manually......Anonymous
November 06, 2008
HZ, Can you explain your scenario a bit more? Have you taken a look at the DocumentReflector tool that comes part of the Open XML SDK? This tool can automatically generate code based on a document. Zeyad RajabiAnonymous
November 07, 2008
The comment has been removedAnonymous
November 11, 2008
This article help me very much to export my Database contents to Excel document. Thanks Brian!Anonymous
November 11, 2008
@HZ Is this the kind of info you look for ? http://openxmldeveloper.org/forums/3825/ShowThread.aspxAnonymous
November 11, 2008
In my last post , I showed you how to create a spreadsheet based on data from a database using the OpenAnonymous
November 18, 2008
In my last two posts, I showed you how to create a spreadsheet based on data and how to read data fromAnonymous
November 21, 2008
Comme à l'accoutumé, voici une brochettes de liens de la semaine sur Open XML. Posts techniques en vracAnonymous
November 25, 2008
Over on Brian Jones' blog there are two recent articles that may be of interest to our readers here.Anonymous
January 19, 2009
First off I want to thank everyone for leaving comments suggesting future posts about the Open XML SDK.Anonymous
January 30, 2009
This is a common task for ASP.NET developers. You have a Web application where you expose data from aAnonymous
March 13, 2009
There have been several requests made by people asking how to import a chart from one document type to