Jaa


Open XML and Office Services

If you didn't see the news up on the Word team blog announcing Word Automation Services, and you have any interest in server side conversion of .docx files into .pdf or .xps, you should definitely go take a look: https://blogs.msdn.com/microsoft_office_word/archive/2009/10/26/introducing-word-automation-services.aspx

Capturing Business Processes in Office

I see a lot of different types of solutions people build to make their workgroups run more accurately and efficiently. Many of the solutions are created in order to codify one or more pieces of a business process, and as you can imagine, Office plays a huge role in these processes. For example, there may be people using Outlook for communication; Excel for analysis; Word for documentation; and just about any process ends in PowerPoint where you present the results of the work.

As we see the shift from a focus on an individual's productivity to a focus on the entire workgroup's productivity, it also means the way in which people program against our applications changes. This is why you see a lot more SharePoint development in addition to traditional Office development (I briefly mentioned this over the summer). It's also why the Open XML formats and the Open XML SDK were such important investments for us… they help the Office applications continue to play an important role in these workgroup level scenarios.

Printing and Re-Calcing on the Server

For example, one request we've heard from folks working with the Open XML formats on the server is that they want access to some core client functionality, like printing/layout in Word, or calculations in Excel. A couple weeks ago at the SharePoint conference we announced the next version of Excel Services, as well as a new service called Word Automation Services. These two services are great resources for people doing server side document assembly/manipulation with Open XML. The Open XML formats are great for consuming or generating content for dynamic document assembly scenarios, but often there is also the need to recalc the model, refresh the charts, or print the document, and you don't want to necessarily call back to the client to do this.

Server Side Document Assembly Example

A quick example to help explain this is a process I've seen a number of times, where banks were trying to do bulk generation of loan applications (thousands a day). There are end users involved in creating the template for the loan application, as well as building out the financial models for how the terms of the loan are determined, but the generation of the individual applications was a bulk server-side process based an incoming list of applicants. The combination of Open XML and content controls makes it pretty easy to create .docx files on the server, but in this scenario the banks wanted to send out either PDFs or hard copies, which meant you needed to include the Word client in the process (and it was obviously the big bottle neck).

With the new services announced at the SharePoint conference though, you get the perfect mix of software + services to solve the scenario. The end users can continue to work in the applications they are familiar with (Word & Excel) to build out the template and financial model, and the bulk generation can all happen on the server in an automated process:

Here's a quick step by step explanation of the process:

  1. Client-side: Loan Template author generates the template for the loan application, and uses content controls to specify where the data should go. He saves it up to SharePoint so others can collaborate with him.
  2. Client-side: Folks from the Legal department are able to work with in the document at the same time as the Template author because of the new co-authoring functionality in Word 2010.
  3. Client-side: A financial analyst builds up a model for determining whether an applicant should be considered and what the terms of the loan should be. This model is saved up to SharePoint as an .xlsx
  4. Server-side: As new applicants request an application, a server side process takes their data, and uses the Open XML SDK to inject it into xlsx. (Example 1; Example 2)
  5. Server-side: The financial model (.xlsx) is then sent off to Excel Services to perform calculations and pull out the results
  6. Server-side: The process takes the results of the calculation, and injects them into the Word template, using the content controls to determine what data should go where, producing a .docx file. (Example 1; Example 2)
  7. Server-side: The .docx file is passed off to Word services where a .pdf file is generated, which can either then be sent on to a high volume printer, or e-mailed directly to the applicant.
  8. Client-side: Any of the users can make updates to the documents (assuming this is allowed as part of the workflow), and those changes will automatically make their way into the bulk generation process.

So, you can see this is a pretty basic but also extremely power scenario, and if you've been reading Zeyad's examples over the past year or so you know how easy this can be with the Open XML SDK. I've been excited about these technologies for a few years now, and it's great that we can finally start talking publicly about them. Zeyad had a couple great demos at the SharePoint conference that really helped show the value of these services in combination with the Open XML formats. We'll post a video of the presentation as soon as it comes available, and will also have separate blog posts that drill into each one over the coming months.

-Brian

Comments

  • Anonymous
    November 11, 2009
    Brian: Question. Is it possible (perhaps with proper namespace markup)to add new user tags at the cell level of a sheet in the xl/worksheets/ folder? When I try I, get validation errors, and Excel repairs the sheet's xml file. I also never heard the answer to the question about preventing Excel from moving string content into the 'shairedStrings.xml' file upon saving, and the difficulty that makes reading/extraction the sheet file data. Easy of reading is more important then compation sometimes. Thanks.

  • Anonymous
    November 13, 2009
    Hey Jack, Excel does allow for custom XML markup, but not in the way that you're asking. The two ways of adding "semantics" to the cell is either through the use of named ranges, or the custom XML mapping support. In both cases, the "labels" are stored outside of the grid, and they then reference the cell or range they apply to. Excel will roundtrip this information, and even update the references properly if new rows or columns are added at runtime. There is no way to tell Excel to not use the shared string table. One option however is to post process the output to move the strings back inline. Should be an easy example to pull together (I'll see if Zeyad or I have some time to give it a shot). -Brian