Open XML Format SDK 2.0: Getting Started Best Practices

The Open XML Format allows you to generate, manipulate, or pull data from Word 2007, Excel 2007, and PowerPoint 2007 files. If you are working with Microsoft-based solutions, you can generate, manipulate, and pull data from documents using the following tools and technologies:

Note: If you are working with Java or PHP/Linux/Unix/Solaris, you can also find plenty of resources here and here.

Anyway, today I want to focus on a couple best practices for getting started with coding for Open XML Format SDK 2.0.

For the last couple of months I have been working with some Open XML gurus: Zeyad Rajabi, Linda Lanqing Brownell, Eric White, Joel Krist, and Ken Getz. We are working together on a new set of Visual How-tos for the Open XML Format SDK 2.0 and a set of 50+ code snippets for Visual Studio 2008 to be released soon on MSDN.

Quite frequently I hear from most of them a couple best practices that I’d like to share with you today: If you plan to build a solution using the Open XML Format SDK 2.0…

  1. Always start with a template for your solution
  2. Use DocumentReflector to get started with coding

Always start with the template for your solution

The very first thing we recommend you to do before you start writing code for a document solution, is to create the template. Depending on what kind of solution you want to build, you can use Word 2007, Excel 2007, or PowerPoint 2007 to create a template.

Imagine that you own a software company and you sent all your developers and consultants to the great Office and SharePoint sessions at TechEd this year. The end of the fiscal year is coming soon and you need expense data from each conference attendee ASAP. Developers and consultants plan to spend this weekend on New Orleans, so you ask them to fill out a Web-based Expense Report form that sends data to your company’s accounting database. Your accountant loves Excel 2007 and she asked you to export all expenses per/employee to Excel spreadsheets so she can review details before approving. CSV export won’t do this time because your accountant wants it all pretty and styled.  You want to figure out a way of programmatically generate the Expense Report spreadsheets by pulling data from your accounting database. You need styles and formatting as well. After evaluating lots of tools and technologies, you decide that you want to build this solution using the super cool Open XML Format SDK 2.0. Now what?

The very first thing we recommend you to do before you start writing code is create the template using Excel 2007. You may ask your accountant to design the Expense Report template using Excel 2007, or you can always download tons of great templates from Office Online. Here’s my sample expense report template:

expenseRTemplate[1]

Open the template using Excel 2007 and add some dummy data. Don’t forget to save the template as Excel Workbook file (.xlsx). Now that you have a template to start with, you can start coding the solution.

Use DocumentReflector to get started with coding

The Open XML Format SDK 2.0 download ships with a set of tools that facilitate the process of building document solutions using the SDK. One of this tools is the DocumentReflector. This tool has the ability to open an existing Open XML document and dynamically generate C# source code that uses the Open XML SDK 2.0 typesafe classes to create the document parts. Exploring the source code created by DocumentReflector is a great way to become familiar with the Open XML SDK 2.0 classes and the Office document formats. Zeyad refers to this tool as the “Macro Recorder on vitamins” and I can tell you it rocks! The main idea is that once you have a solution template, you open it using the DocumentReflector and you get a set of autogenerated classes that you can later modify with Visual Studio 2008.

To illustrate using the DocumentReflector to create the base code for the Expense Report solution, follow the next steps:

  1. Open the DocumentReflector tool: The DocumentReflector tool is located in the Tools folder under the Open XML Format SDK 2.0 installation folder. Use Windows Explorer to navigate to the Tools folder and double-click the DocumentReflector.exe file to launch the DocumentReflector tool.
  2. Select the DocumentReflector's File | Open menu and in the Open dialog browse to the folder containing the ExpenseReport.xlsx workbook created previously, select the workbook, and click on the Open button. DocumentReflector will open the workbook and display the content of the document. Clicking on the top-level Package node will display the generated code that can be used to create the entire package.

docReflector[1]

At this point the code generated by DocumentReflector can be copied and pasted into the Visual Studio solution for reuse and learning purposes.

Using Visual Studio 2008 to create a Windows console application

For testing purposes, you can create a Console application in Visual Studio 2008. Here are some generic steps:

  1. Open Visual Studio 2008.

  2. Create a new C# Windows Console Application project.

  3. Add a reference to the Open XML API assembly.

  4. Add a reference to the WindowsBase assembly.

    The next steps involve copying code generated by the DocumentReflector tool to the Visual Studio project. The DocumentReflector tool provided with the Open XML Format SDK 2.0 allows users to open a valid Open XML document, choose an XML element, part, or the whole package, and have DocumentReflector generate a C# class that can create the selected document parts using the Open XML Format SDK 2.0 classes.

  5. Select the using statements from the top of the DocumentReflector code window then copy and paste them to the top of the Program.cs file in the Visual Studio project.

[C#]

using DocumentFormat.OpenXml.Packaging;
using ap = DocumentFormat.OpenXml.ExtendedProperties;
using vt = DocumentFormat.OpenXml.VariantTypes;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Spreadsheet;
using a = DocumentFormat.OpenXml.Drawing;
using op = DocumentFormat.OpenXml.CustomProperties;

6. Select the code for all of the methods inside of the GeneratedClass class from the DocumentReflector code window then copy and paste the code as methods of the Program class in the Program.cs file in the Visual Studio project.

7. Change the copied CreatePackage method to be private static so it can be called from the static Main method.

[C#]

private static void CreatePackage(string filePath) {
    using (SpreadsheetDocument package =
        SpreadsheetDocument.Create(filePath,
SpreadsheetDocumentType.Workbook)) {
AddParts(package);
    }
}

The AddParts method creates all the parts that you need in the Expense Report spreadsheet.

[C#]

private static void AddParts(SpreadsheetDocument parent) {
var extendedFilePropertiesPart1 = parent.AddNewPart<ExtendedFilePropertiesPart>("rId3");
           GenerateExtendedFilePropertiesPart1().Save(extendedFilePropertiesPart1);

           var coreFilePropertiesPart1 = parent.AddNewPart<CoreFilePropertiesPart>("rId2");
           GenerateCoreFilePropertiesPart1(coreFilePropertiesPart1);

           var workbookPart1 = parent.AddWorkbookPart();
           GenerateWorkbookPart1().Save(workbookPart1);

           var workbookStylesPart1 = workbookPart1.AddNewPart<WorkbookStylesPart>("rId3");
           GenerateWorkbookStylesPart1().Save(workbookStylesPart1);

           var themePart1 = workbookPart1.AddNewPart<ThemePart>("rId2");
           GenerateThemePart1().Save(themePart1);

           var worksheetPart1 = workbookPart1.AddNewPart<WorksheetPart>("rId1");
           GenerateWorksheetPart1().Save(worksheetPart1);

           var spreadsheetPrinterSettingsPart1 = worksheetPart1.AddNewPart<SpreadsheetPrinterSettingsPart>("rId1");
           GenerateSpreadsheetPrinterSettingsPart1(spreadsheetPrinterSettingsPart1);

           var calculationChainPart1 = workbookPart1.AddNewPart<CalculationChainPart>("rId5");
           GenerateCalculationChainPart1().Save(calculationChainPart1);

           var sharedStringTablePart1 = workbookPart1.AddNewPart<SharedStringTablePart>("rId4");
           GenerateSharedStringTablePart1().Save(sharedStringTablePart1);

           var customFilePropertiesPart1 = parent.AddNewPart<CustomFilePropertiesPart>("rId4");
           GenerateCustomFilePropertiesPart1().Save(customFilePropertiesPart1);

       }

The GenerateSharedStringTablePart1() method contains the code that you need to create values for the table on the Expense Report. You can modify this code to iterate through the accounting database. You can pull expense report data by employee and replace the dummy data you entered.

[C#]

private static SharedStringTable GenerateSharedStringTablePart1() {
var element =
new SharedStringTable(
                 new SharedStringItem(
                     new Text("Name")),
                 new SharedStringItem(
                     new Text("Department")),
                 new SharedStringItem(
                     new Text("Manager")),
                 new SharedStringItem(
                     new Text("Position")),
                 new SharedStringItem(
                     new Text("From")),
                 new SharedStringItem(
                     new Text("To")),
                 new SharedStringItem(
                     new Text("Date")),
                 new SharedStringItem(
                     new Text("Account")),
                 new SharedStringItem(
                     new Text("Description")),
                 new SharedStringItem(
                     new Text("Transport")),
                 new SharedStringItem(
                     new Text("Fuel")),
                 new SharedStringItem(
                     new Text("Meals")),
                 new SharedStringItem(
                     new Text("Phone")),
                 new SharedStringItem(
                     new Text("Subtotal")),
                 new SharedStringItem(
                     new Text("Advances")),
                 new SharedStringItem(
                     new Text("For Office Use Only")),
                 new SharedStringItem(
                     new Text("Misc.")),
                 new SharedStringItem(
                     new Text("Hotel")),
                 new SharedStringItem(
                     new Text("Entertainment")),
                 new SharedStringItem(
                     new Text("PURPOSE:")),
                 new SharedStringItem(
                     new Text("STATEMENT NUMBER:")),
                 new SharedStringItem(
                     new Text("PAY PERIOD:")),
                 new SharedStringItem(
                     new Text("EMPLOYEE INFORMATION:")),
                 new SharedStringItem(
                     new Text("Total")),
                 new SharedStringItem(
                     new Text("APPROVED:")),
                 new SharedStringItem(
                     new Text("NOTES: "){ Space = "preserve" }),
                 new SharedStringItem(
                     new Text(" SSN"){ Space = "preserve" }),
                 new SharedStringItem(
                     new Text(" Employee ID"){ Space = "preserve" }),
                 new SharedStringItem(
                     new Text("Expense report")),
                 new SharedStringItem(
                     new Text("TechEd 2009")),
                 new SharedStringItem(
                     new Text("Erika Ehrli Cabral")),
                 new SharedStringItem(
                     new Text("123456")),
                 new SharedStringItem(
                     new Text("12345678")),
                 new SharedStringItem(
                     new Text("Office Development")),
                 new SharedStringItem(
                     new Text("JPBagel")),
                 new SharedStringItem(
                     new Text("Delicious breakfast")),
                 new SharedStringItem(
                     new Text("Developer (in my dreams)"))
){ Count = (UInt32Value)38U, UniqueCount = (UInt32Value)37U };
return element;
     }

7. Modify the Main method and add a call to the CreatePackage method.

[C#]

static void Main(string[] args){
    // Create an Excel workbook named ExpenseReportTest.xlsx
// in the current folder. You can write some code here to iterate through
// your accounting database and generate one Spreadsheet per employee.

CreatePackage("ExpenseReportTest.xlsx");
}

8. Build and run the sample. Using the code shown above the sample application will create an Excel workbook named ExpenseReportTest.xlsx located in the Visual Studio project's Debug or Release build folder depending on the selected build mode.

Opening the workbook with Excel will display a workbook that looks just like the ExpenseReport.xlsx workbook created previously.

More resources

If you are looking for more end-to-end solutions using the Open XML Format SDK 2.0, you must see this articles:

Also, as mentioned before, Brian’s/Zeyad’s blog and Eric’s blog are always a great resource for Open XML code samples.

Coming soon the set of Open XML Format SDK 2.0 VHTs and code snippets for Visual Studio 2008!

Comments

  • Anonymous
    May 14, 2009
    PingBack from http://microsoft-sharepoint.simplynetdev.com/open-xml-format-sdk-20-getting-started-best-practices/

  • Anonymous
    May 15, 2009
    This is an excellent set of resources to get started with Open XML format. Thanks! Dennis

  • Anonymous
    May 15, 2009
    So I hate to ask - any news of when an updated CTP/Beta/RC/Release of the OOXML SDK 2.0 is coming?!

  • Anonymous
    May 15, 2009
    Thank you for submitting this cool story - Trackback from DotNetShoutout

  • Anonymous
    May 18, 2009
    Top News Stories Microsoft Readies SharePoint Server 2010 Beta (PC World) Microsoft will launch an invitation

  • Anonymous
    May 19, 2009
    Does this provide the option to output the document in older formats, even though it is built in the newer one?

  • Anonymous
    May 20, 2009
    Dennis! Thanks so much for the kind comments. Sean, more CTPs coming for sure in the next few months. I don't have confirmation of release dates, but I suggest that you follow Biran/Zeyad's blog.This is the best place for updates related to API CTP releases. Guy, by installing the Compatibility Pack in addition to Microsoft Office 2000, Office XP, or Office 2003, you will be able to open, edit, and save files using the file formats new to Word, Excel, and PowerPoint 2007. The Compatibility Pack can also be used in conjunction with the Microsoft Office Word Viewer 2003, Excel Viewer 2003, and PowerPoint Viewer 2003 to view files saved in these new formats. Here's a link to the download: http://www.microsoft.com/downloads/details.aspx?familyid=941B3470-3AE9-4AEE-8F43-C6BB74CD1466&displaylang=en

  • Anonymous
    May 20, 2009
    sorry Erika, What i mean is - can i output older formats "from the code". we have clients on both office 2003 and 2007.  We cannot move development to the Open SDK if it can only output in the new format.  As clients will not undertake the rollout required to implement the compatability pack.

  • Anonymous
    May 21, 2009
    Guy, The SDK only outputs Open XML Format files, not binary formats. If you need to output 2003 binary formats and end-users of your solution can't deploy the compat pack, you can alternatively use the object model through automation. This is a good alternative if your application is running client-side. If your solution is running server-side, OM through interop is not supported.

  • Anonymous
    May 25, 2009
    Hi Erica, -not sure if my post made the cut so trying again- Excellent list of resources. I have one more: The New Excel 2007 File Format:

  • Anonymous
    May 29, 2009
    Encore cette fin de semaine, voici les quelques posts ou astuces que j’ai pu rencontrer sur la toile

  • Anonymous
    June 12, 2009
    The comment has been removed

  • Anonymous
    July 19, 2009
    Why doesn't DocumentReflector output VB code?

  • Anonymous
    July 21, 2009
    Introducing such a topic you'd like to congratulate you've let us know. Have good work

  • Anonymous
    August 21, 2009
    Great article. Thanks. Awaiting to the VB version of the DocumentReflector... Please be kind and do not forget the now out of date learners... (En attendant avec impatience la version VB du documentReflector. Je soutiens entièrement la remarque de DD)

  • Anonymous
    September 03, 2009
    Hi Great Developers,  Is it possible for any C++ developers to use OpenXML and Excel to generate SpreadSheet in C++ language. I was very disappointed to see not a single code snippet in C++ to create an Excel Spreadsheet. Can you please provide which compiles and executes and generates Excel spreadsheet? Awaiting your reply. Thanks Nxetgenguy

  • Anonymous
    October 20, 2009
    When trying to run the tools i get an exception: System.IO.FileLoadException: Could not load file or assembly 'DocumentFormat.OpenXml, Version=2.0.3930.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35' or one of its dependencies. The located assembly's manifest definition does not match the assembly reference. (Exception from HRESULT: 0x80131040) File name: 'DocumentFormat.OpenXml, Version=2.0.3930.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35' however the assembly that ships in the lib folder is of version 2.0.4330.0. Looks to me that tools that shipped with August CTP were not recompiled with the assembly version that shipped in August CTP

  • Anonymous
    March 18, 2010
    I have created an Excel 2007 spreadsheet using  DocumentFormat.OpenXml.Spreadsheet.  This file is supposed to be used as an attachment to an email message.  However, I get errors when opening it stating that this file cannot be opened by using Microsoft Excel.  "Do you want to search... online..?"  I want to save this file in Excel 2003 format so I won't get that message.  What do I do?  I don't want users having trouble reading the attached Excel doc.

  • Anonymous
    December 28, 2010
    mywayrockaway.blog2.fc2.com http://panjabi.coolog.jp/

  • Anonymous
    April 08, 2011
    The comment has been removed